[Bugfix #717] Fall back to Tower terminal_sessions when local state misses a builder#720
[Bugfix #717] Fall back to Tower terminal_sessions when local state misses a builder#720waleedkadous merged 6 commits intomainfrom
Conversation
…ate misses a builder afx attach only consulted the local state.db builders table, so any builder that existed in Tower's terminal_sessions registry but not in local state (e.g., spawned from a different session, or after state.db was cleared) would error out with "Builder not found" — even though afx status and afx send could see and reach it. Added a Tower-registry fallback to findBuilderById and findBuilderByIssue. The fallback queries terminal_sessions for type='builder' rows in the current workspace and reconstructs a minimal Builder from role_id/cwd/label. The reconstructed object is enough to drive findShellperSocket, which is what attach actually needs. Local-state lookup is unchanged when it succeeds. Added 4 regression tests covering the fallback paths for both --issue and --project (exact + substring match), plus the negative case where neither registry has the builder.
…root CMAP review (Gemini, Codex) flagged a pre-existing latent bug that the first commit's Tower fallback inherited: findShellperSocket queried `WHERE workspace_path = ?` using `normalizeWorkspacePath(builder.worktree)` (e.g. .builders/spir-118) but terminal_sessions.workspace_path is set to config.workspaceRoot at spawn time. The query never matched a builder, so socket discovery always fell through to a directory scan that returns the FIRST shellper socket — i.e. potentially the wrong builder when multiple are running. Switched the query to use config.workspaceRoot. The existing test that asserted the broken argument signature was updated, and a new end-to-end terminal-mode test was added that exercises the Tower fallback + socket lookup together.
Architect ReviewExcellent work — thank you. The fix is clean, and the CMAP review process did exactly what it's supposed to: Gemini and Codex caught the related The Tower-row → Builder reconstruction is sensible (placeholder fields only matter in All 6 CI checks green. Approved — please merge. |
|
@timeleft-- this is now merged to |
Summary
Fixes #717
Root Cause
Two related bugs combined to make
afx attachunreliable:Lookup miss:
findBuilderById/findBuilderByIssueonly consulted the localstate.db builderstable. Tower's terminal registry (global.db terminal_sessions) was never consulted, so a builder visible toafx statusandafx sendcould still error as "Builder not found" fromafx attach -p <id>whenever the local row was missing — e.g. spawned in a different session, lost during astate.dbreset, or never written due to a spawn-time race.Socket attribution (uncovered by CMAP review):
findShellperSocketqueriedterminal_sessions WHERE workspace_path = ?usingnormalizeWorkspacePath(builder.worktree)— but Tower storesworkspace_path = config.workspaceRootat spawn time. The SQLite lookup never matched a builder, so socket discovery always fell through to a directory scan that returns the first shellper socket — i.e. potentially the wrong builder when multiple are running. The first commit's Tower fallback inherited this bug.Fix
Commit 1: Tower-registry fallback in
findBuilderByIdandfindBuilderByIssue. When local lookup misses, queryterminal_sessions WHERE workspace_path = ? AND type = 'builder'and reconstruct a minimalBuilderfromrole_id(id),cwd(worktree), andlabel(name). For--issue, parse thebugfix-<n>suffix fromrole_id.Commit 2: Switch
findShellperSocket's SQLite query to usegetConfig().workspaceRootinstead ofbuilder.worktree, so socket discovery correctly identifies the right shellper for the right builder.Local-state lookup is unchanged when it succeeds; existing happy-path behavior is preserved.
Test Plan
--issuefalls back to Tower when local state has no match--projectexact match falls back to Tower--projectsubstring match falls back to Tower--projecterrors when neither registry knows the builderfindShellperSocketarg signature was correctedCMAP Review
findShellperSocketworkspace-root mismatch as critical. Fixed in commit 2.--browsermode. Both fixed in commit 2 (terminal-mode end-to-end test added).findShellperSocketissue as pre-existing/unrelated; commit 2 chooses to fix it anyway since the fallback inherits the same trap.