Skip to content

[Bugfix #717] Fall back to Tower terminal_sessions when local state misses a builder#720

Merged
waleedkadous merged 6 commits intomainfrom
builder/bugfix-717
May 2, 2026
Merged

[Bugfix #717] Fall back to Tower terminal_sessions when local state misses a builder#720
waleedkadous merged 6 commits intomainfrom
builder/bugfix-717

Conversation

@waleedkadous
Copy link
Copy Markdown
Contributor

@waleedkadous waleedkadous commented May 2, 2026

Summary

Fixes #717

Root Cause

Two related bugs combined to make afx attach unreliable:

  1. Lookup miss: findBuilderById / findBuilderByIssue only consulted the local state.db builders table. Tower's terminal registry (global.db terminal_sessions) was never consulted, so a builder visible to afx status and afx send could still error as "Builder not found" from afx attach -p <id> whenever the local row was missing — e.g. spawned in a different session, lost during a state.db reset, or never written due to a spawn-time race.

  2. Socket attribution (uncovered by CMAP review): findShellperSocket queried terminal_sessions WHERE workspace_path = ? using normalizeWorkspacePath(builder.worktree) — but Tower stores workspace_path = config.workspaceRoot at spawn time. The SQLite lookup never matched a builder, so socket discovery always fell through to a directory scan that returns the first shellper socket — i.e. potentially the wrong builder when multiple are running. The first commit's Tower fallback inherited this bug.

Fix

Commit 1: Tower-registry fallback in findBuilderById and findBuilderByIssue. When local lookup misses, query terminal_sessions WHERE workspace_path = ? AND type = 'builder' and reconstruct a minimal Builder from role_id (id), cwd (worktree), and label (name). For --issue, parse the bugfix-<n> suffix from role_id.

Commit 2: Switch findShellperSocket's SQLite query to use getConfig().workspaceRoot instead of builder.worktree, so socket discovery correctly identifies the right shellper for the right builder.

Local-state lookup is unchanged when it succeeds; existing happy-path behavior is preserved.

Test Plan

  • 28 attach tests pass (23 pre-existing + 5 new)
  • New regression tests:
    • --issue falls back to Tower when local state has no match
    • --project exact match falls back to Tower
    • --project substring match falls back to Tower
    • --project errors when neither registry knows the builder
    • End-to-end terminal-mode attach for a Tower-only builder verifies SQLite is queried with workspace ROOT (not worktree) and finds the correct socket
  • Pre-existing test that asserted the broken findShellperSocket arg signature was corrected
  • TypeScript check passes
  • Net diff: 226 LOC across both commits (under 300 LOC bugfix threshold)

CMAP Review

Model Verdict Notes
Gemini REQUEST_CHANGES → addressed Flagged the findShellperSocket workspace-root mismatch as critical. Fixed in commit 2.
Codex REQUEST_CHANGES → addressed Same finding as Gemini, plus noted that the first commit's tests only exercised --browser mode. Both fixed in commit 2 (terminal-mode end-to-end test added).
Claude APPROVE Verified the workspace_path semantics for the new fallback are correct. Noted the findShellperSocket issue as pre-existing/unrelated; commit 2 chooses to fix it anyway since the fallback inherits the same trap.

…ate misses a builder

afx attach only consulted the local state.db builders table, so any
builder that existed in Tower's terminal_sessions registry but not in
local state (e.g., spawned from a different session, or after state.db
was cleared) would error out with "Builder not found" — even though
afx status and afx send could see and reach it.

Added a Tower-registry fallback to findBuilderById and
findBuilderByIssue. The fallback queries terminal_sessions for
type='builder' rows in the current workspace and reconstructs a
minimal Builder from role_id/cwd/label. The reconstructed object is
enough to drive findShellperSocket, which is what attach actually
needs. Local-state lookup is unchanged when it succeeds.

Added 4 regression tests covering the fallback paths for both --issue
and --project (exact + substring match), plus the negative case where
neither registry has the builder.
…root

CMAP review (Gemini, Codex) flagged a pre-existing latent bug that the
first commit's Tower fallback inherited:

  findShellperSocket queried `WHERE workspace_path = ?` using
  `normalizeWorkspacePath(builder.worktree)` (e.g. .builders/spir-118)
  but terminal_sessions.workspace_path is set to config.workspaceRoot
  at spawn time. The query never matched a builder, so socket discovery
  always fell through to a directory scan that returns the FIRST
  shellper socket — i.e. potentially the wrong builder when multiple
  are running.

Switched the query to use config.workspaceRoot. The existing test that
asserted the broken argument signature was updated, and a new
end-to-end terminal-mode test was added that exercises the Tower
fallback + socket lookup together.
@waleedkadous
Copy link
Copy Markdown
Contributor Author

Architect Review

Excellent work — thank you. The fix is clean, and the CMAP review process did exactly what it's supposed to: Gemini and Codex caught the related findShellperSocket workspace_path bug that the original ticket never named, and you addressed both REQUEST_CHANGES with a focused commit 2 plus a proper end-to-end regression test rather than rebutting them.

The Tower-row → Builder reconstruction is sensible (placeholder fields only matter in displayBuilderList, which the fallback path doesn't reach), and the corrected workspace_path query closes a real multi-builder mis-attribution risk.

All 6 CI checks green. Approved — please merge.

@waleedkadous waleedkadous merged commit c809719 into main May 2, 2026
6 checks passed
@waleedkadous waleedkadous deleted the builder/bugfix-717 branch May 2, 2026 03:43
@waleedkadous
Copy link
Copy Markdown
Contributor Author

@timeleft-- this is now merged to main. When you get a chance, could you pull latest and confirm afx attach works for you in the scenario you originally hit? Thanks again for the well-researched bug report — the diagnostic detail in the issue made this a really clean fix.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

afx attach can't find builders registered only in Tower's terminal registry

1 participant