peer: rotate samizdat credentials hourly (closes engineering#3437)#472
Open
myleshorton wants to merge 1 commit into
Open
peer: rotate samizdat credentials hourly (closes engineering#3437)#472myleshorton wants to merge 1 commit into
myleshorton wants to merge 1 commit into
Conversation
Currently peer.Client builds the libbox inbound exactly once per Start
and holds the same X25519 keypair / shortID / masquerade for the
entire peer process lifetime — a leaked credential (logs, telemetry,
support bundles, the route_id leakage in engineering#3440) remains
usable for hours or days.
This adds a credRotationLoop goroutine on a 1h tick. On each tick:
1. Re-register with lantern-cloud against the same (address, port)
tuple — same router-side mapping, fresh server-side row, fresh
samizdat creds.
2. Patch the new options for VPN bypass.
3. Build a new libbox service.
4. Close the old box (releases the listening port).
5. Start the new box (re-binds the same port with new creds).
6. Atomic swap of c.box, c.routeID.
7. Best-effort deregister of the prior route_id so the bandit
stops handing the old (now-invalid) creds to clients within
~immediately rather than waiting up-to-TTL for the row to
expire.
Steps 4-5 leave a brief (~hundreds of ms) window where the port is
unbound; samizdat clients see TCP RST and reconnect via the bandit.
That's the trade-off vs. the security cost of holding the same cred
for the peer process lifetime — caps blast radius from cred leakage
to ~1h regardless of how long the peer has been running.
Rotation is best-effort: a single failure logs and waits for the
next tick. The current box and creds remain serving in the failure
case so a transient register error doesn't kill the session.
Config gains CredRotationInterval (defaults to peerCredRotationInterval
= 1h) so tests drive the loop without a 1h sleep — see
TestClient_RotatesCredentialsAtInterval.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Contributor
There was a problem hiding this comment.
Pull request overview
This PR adds periodic rotation of peer “Share My Connection” samizdat credentials by re-registering with lantern-cloud, rebuilding the libbox inbound with fresh server config, swapping the active route/box, and best-effort deregistering the prior route to reduce the impact window of leaked credentials.
Changes:
- Introduces
Config.CredRotationIntervaland acredRotationLoop()goroutine that callsrotateCreds()on a schedule (default: 1 hour). - Persists port mapping details and run context on the
Clientso rotations can re-register against the same(external IP, port)tuple and rebuild libbox. - Adds
TestClient_RotatesCredentialsAtIntervalto verify repeated register/build/close behavior over a short interval.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
peer/peer.go |
Adds credential rotation loop, stores mapping/run context for reuse, and implements re-register → rebuild → swap → deregister sequence. |
peer/peer_test.go |
Extends stub server to vary register responses per call and adds a rotation-interval test. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| // channel: heartbeatLoop's done already gates Stop, and rotation | ||
| // failures are non-fatal (log + retry next tick), so there's nothing | ||
| // the Stop path needs to wait on from this goroutine. | ||
| func (c *Client) credRotationLoop(ctx context.Context, interval time.Duration) { |
Comment on lines
+531
to
+536
| c.mu.Lock() | ||
| c.box = newBox | ||
| c.routeID = regResp.RouteID | ||
| c.boxOptions = options | ||
| c.status.RouteID = regResp.RouteID | ||
| c.mu.Unlock() |
Comment on lines
+538
to
+545
| // Deregister the prior route so the bandit stops handing the old | ||
| // (now-invalid) creds to clients. Best-effort: the prior row will | ||
| // expire from its TTL anyway, but explicit deregister cuts the | ||
| // stale-creds window from up-to-TTL down to ~immediately. | ||
| if err := c.cfg.API.Deregister(ctx, oldRouteID); err != nil { | ||
| slog.Warn("deregister prior route after rotation", | ||
| "err", err, "old_route_id", oldRouteID) | ||
| } |
| c.box = newBox | ||
| c.routeID = regResp.RouteID | ||
| c.boxOptions = options | ||
| c.status.RouteID = regResp.RouteID |
Comment on lines
+490
to
+516
| regResp, err := c.cfg.API.Register(ctx, RegisterRequest{ | ||
| ExternalIP: externalIP, | ||
| ExternalPort: extPort, | ||
| InternalPort: intPort, | ||
| }) | ||
| if err != nil { | ||
| return fmt.Errorf("re-register: %w", err) | ||
| } | ||
| options, err := ensurePeerOutboundsBypassVPN(regResp.ServerConfig) | ||
| if err != nil { | ||
| return fmt.Errorf("patch sing-box options: %w", err) | ||
| } | ||
|
|
||
| c.mu.Lock() | ||
| runCtx := c.runCtx | ||
| c.mu.Unlock() | ||
| if runCtx == nil { | ||
| // Stop happened between the unlock above and here. Skip the | ||
| // build to avoid spinning up a libbox tied to a torn-down ctx. | ||
| // The new register row is harmless — server-side reaper will | ||
| // deprecate it after TTL since no heartbeat will arrive. | ||
| return errors.New("client stopped during rotation") | ||
| } | ||
| newBox, err := c.cfg.BuildBoxService(runCtx, options) | ||
| if err != nil { | ||
| return fmt.Errorf("build new sing-box: %w", err) | ||
| } |
Comment on lines
+519
to
+528
| // If newBox.Start fails after oldBox.Close, we lost the listener | ||
| // and the next heartbeat / rotation tick is the recovery point. | ||
| if closeErr := oldBox.Close(); closeErr != nil { | ||
| slog.Warn("close old box during rotation", "err", closeErr) | ||
| } | ||
| if err := newBox.Start(); err != nil { | ||
| // Catastrophic: port is now unbound. Leave c.box pointing at | ||
| // oldBox so a future Stop tries to close it (idempotent on | ||
| // already-closed); the next rotation tick will try again. | ||
| return fmt.Errorf("start new sing-box: %w", err) |
Comment on lines
+521
to
+529
| if closeErr := oldBox.Close(); closeErr != nil { | ||
| slog.Warn("close old box during rotation", "err", closeErr) | ||
| } | ||
| if err := newBox.Start(); err != nil { | ||
| // Catastrophic: port is now unbound. Leave c.box pointing at | ||
| // oldBox so a future Stop tries to close it (idempotent on | ||
| // already-closed); the next rotation tick will try again. | ||
| return fmt.Errorf("start new sing-box: %w", err) | ||
| } |
| // Each rotation deregisters the prior route — N rotations => | ||
| // N deregisters (initial register is not preceded by one). | ||
| rotations := srv.registerCount.Load() - 1 | ||
| assert.GreaterOrEqual(t, srv.deregisterCount.Load(), rotations-1, |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Closes the C2 finding from the Share My Connection security review (engineering#3437): peer.Client builds the libbox inbound exactly once per Start and holds the same samizdat creds for the entire peer process lifetime, so a leaked credential remains usable for hours or days.
This adds a `credRotationLoop` goroutine on a 1h tick that re-registers, rebuilds the libbox inbound with fresh creds, and deregisters the prior route. Caps blast radius from credential leakage to ~1h regardless of how long the peer has been running.
Sequence per rotation
Steps 4-5 leave a brief (~hundreds of ms) window where the port is unbound; samizdat clients see TCP RST and reconnect via the bandit. Acceptable trade-off vs. the security cost of indefinite cred lifetime.
Failure handling
Rotation is best-effort. A single failure (transient register error, network blip) logs and waits for the next tick. The current box and creds remain serving in the failure case so a one-off transient doesn't kill the session.
Test plan
Related
Part of the Share My Connection security review series:
🤖 Generated with Claude Code