feat: tmux persist mode for CLI runners #285

Open
xukai92 wants to merge 9 commits into akashgit:main from xukai92:feat/tmux-persist-mode

xukai92 (Collaborator) commented May 19, 2026

Summary

  • Add --tmux-persist flag that launches agents interactively in tmux windows instead of headless subprocesses
  • Agents run via script for output capture, with tmux wait-for for async blocking — factory gets (stdout, return_code) as before, but the user can attach to the tmux session and continue chatting after the agent finishes
  • One tmux session per project (factory-persist-<name>), one window per agent invocation
  • Configurable via CLI flag, FACTORY_TMUX_PERSIST env var, or ~/.factory/config.toml (five-tier precedence)
  • Falls back to normal headless mode when tmux is unavailable
  • Bob runner logs a warning (no session resume support)
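
The summary does not list the precedence tiers, so the following is only a minimal sketch of how a boolean setting could be resolved from a CLI flag, then the FACTORY_TMUX_PERSIST env var, then a config value, then a default. It shows three sources rather than the five tiers mentioned above. The name `resolve_tmux_persist`, its parameters, and the accepted truthy strings are assumptions for illustration and are not taken from the PR.

```python
import os
from typing import Mapping, Optional

_TRUTHY = {"1", "true", "yes", "on"}  # assumed spellings of "enabled"


def resolve_tmux_persist(
    cli_flag: bool,
    env: Optional[Mapping[str, str]] = None,
    config_value: Optional[bool] = None,
) -> bool:
    """Hypothetical resolver: CLI flag > env var > config value > default False."""
    if cli_flag:
        # An explicit --tmux-persist on the command line wins.
        return True
    env = os.environ if env is None else env
    raw = env.get("FACTORY_TMUX_PERSIST")
    if raw is not None and raw.strip() != "":
        # A set, non-empty env var overrides the config file either way.
        return raw.strip().lower() in _TRUTHY
    if config_value is not None:
        # Value read from a config file such as ~/.factory/config.toml.
        return config_value
    return False
```

Passing the environment as a mapping keeps the function easy to test without touching the real process environment.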

Test plan

  • 16 unit tests in test_tmux_persist.py covering tmux_available, _strip_ansi, run_in_tmux (session creation, window reuse, timeout, ANSI stripping, error handling), and ClaudeRunner delegation/fallback
  • Updated mock signatures in test_agents.py for tmux_persist kwarg
  • Full suite: 1739 passed, 0 failed
  • Manual: factory agent researcher --task "list files" --project /tmp/test --tmux-persist
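
Output recorded by script contains raw terminal escape sequences, which is presumably why the tests cover _strip_ansi. A minimal sketch of one way to strip them is below. The function name `strip_ansi`, the regular expression, and the CRLF normalization are assumptions and may differ from what the PR implements.

```python
import re

# Order matters: OSC must be tried before the generic two-character escapes,
# because "ESC ]" would otherwise match the last alternative on its own.
_ANSI_RE = re.compile(
    r"\x1B\][^\x07\x1B]*(?:\x07|\x1B\\)"  # OSC sequence, ended by BEL or ESC \
    r"|\x1B\[[0-?]*[ -/]*[@-~]"           # CSI sequence (colours, cursor moves)
    r"|\x1B[@-Z\\-_]"                     # remaining two-character escapes
)


def strip_ansi(text: str) -> str:
    """Remove common ANSI escape sequences and normalize CRLF to LF."""
    return _ANSI_RE.sub("", text).replace("\r\n", "\n")
```

For example, `strip_ansi("\x1b[31mred\x1b[0m")` returns `"red"`.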

🤖 Generated with Claude Code

xukai92 and others added 7 commits May 8, 2026 18:50
The Agent tool doesn't go through the factory runner, so events
aren't emitted automatically. The skill now calls factory emit
before/after each agent invocation for observability parity.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds a new --tmux-persist flag to the `ceo` and `agent` CLI subcommands
that opens a tmux window with `claude --resume <session_id>` after each
headless agent invocation completes successfully. This allows operators
to inspect and continue agent sessions interactively after batch runs.

The flag threads through the full call chain: CLI -> ceo_completion ->
agents/runner -> runners (claude/bob). Bob runner logs a warning since
it has no session resume capability.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…audeRunner

Addresses review findings:
- Add capture_output=True to new-window/new-session subprocess calls
- Add 3 integration tests verifying ClaudeRunner passes --session-id
  and calls open_resume_window on success (skips on failure)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace the headless-then-resume approach with direct interactive
execution inside tmux windows. Uses `script` for output capture and
`tmux wait-for` for async blocking. Adds FACTORY_TMUX_PERSIST env var
and config.toml support via five-tier precedence resolution.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
codecov Bot commented May 19, 2026

Codecov Report

❌ Patch coverage is 92.23301% with 8 lines in your changes missing coverage. Please review.
✅ Project coverage is 87.34%. Comparing base (46e393b) to head (6a2e635).
⚠️ Report is 6 commits behind head on main.

Files with missing lines            Patch %   Lines
factory/runners/_tmux_persist.py    95.06%    4 Missing ⚠️
factory/runners/claude.py           83.33%    2 Missing ⚠️
factory/cli.py                      87.50%    1 Missing ⚠️
factory/runners/bob.py              50.00%    1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #285      +/-   ##
==========================================
+ Coverage   87.28%   87.34%   +0.06%     
==========================================
  Files          56       57       +1     
  Lines        8009     8166     +157     
==========================================
+ Hits         6991     7133     +142     
- Misses       1018     1033      +15     

xukai92 (Collaborator, Author) commented May 19, 2026

@codex can you review this?

Copilot AI left a comment

Pull request overview

Adds an optional “tmux persist” execution mode for CLI agent runs, allowing Claude agents to run inside a per-project tmux session/window while still returning (stdout, return_code) to the factory process.

Changes:

  • Introduces a --tmux-persist flag plus config/env resolution and plumbs tmux_persist through agent/CEO invocation APIs.
  • Adds a new tmux-backed runner path for ClaudeRunner.headless() (with fallback when tmux isn’t available) and a warning-only behavior for Bob.
  • Adds unit tests for tmux availability, output capture/ANSI stripping, timeout behavior, and Claude delegation/fallback.

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 7 comments.

File                                Description
factory/runners/_tmux_persist.py    Implements tmux window/session launching, output capture via script, and async waiting via tmux wait-for.
factory/runners/claude.py           Adds tmux_persist option to headless() and delegates to tmux implementation when enabled/available.
factory/runners/bob.py              Accepts tmux_persist arg and logs a warning that it’s unsupported.
factory/runners/protocol.py         Extends runner protocol to include tmux_persist.
factory/agents/runner.py            Plumbs tmux_persist through invoke_agent / invoke_agents_parallel.
factory/ceo_completion.py           Plumbs tmux_persist through CEO completion-guard respawns.
factory/cli.py                      Adds --tmux-persist flag, resolves it via config/env, and passes into agent/CEO invocations.
factory/user_config.py              Documents tmux_persist in the config template.
tests/test_tmux_persist.py          Adds unit tests for tmux persist behavior and ClaudeRunner integration.
tests/test_agents.py                Updates mocked invoke_agent signatures to include tmux_persist.
Comments suppressed due to low confidence (1)

factory/cli.py:2836

  • Same as the other --tmux-persist help: it currently says "after headless agent completion" even though the mode runs the agent inside tmux. Please reword for accuracy/consistency with the implementation.
    p.add_argument("--tmux-persist", action="store_true", default=False,
                    help="Open a tmux resume window after headless agent completion (claude only)")

Comment thread factory/runners/_tmux_persist.py Outdated
Comment on lines +103 to +117
try:
    wait_proc = await asyncio.create_subprocess_exec(
        "tmux", "wait-for", signal,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    await asyncio.wait_for(wait_proc.wait(), timeout=timeout)
except asyncio.TimeoutError:
    subprocess.run(
        ["tmux", "kill-window", "-t", f"{session}:{window}"],
        capture_output=True,
    )
    logger.error("tmux agent timed out after %ss: role=%s", timeout, role)
    _cleanup(tmpdir)
    return f"Agent timed out after {timeout}s", 1
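
In the excerpt above, the timeout branch kills the tmux window but leaves the tmux wait-for child process running. The sketch below shows one way to wait on a subprocess with a timeout and then kill and reap it. It is a generic illustration, not the PR's code: `wait_with_timeout` and `SLEEP_ARGV` are hypothetical names, and a sleeping Python process stands in for `tmux wait-for` so the example runs without tmux.

```python
import asyncio
import contextlib
import sys
from typing import Optional, Sequence

# Stand-in for ["tmux", "wait-for", signal]: a child that blocks for a while.
SLEEP_ARGV = [sys.executable, "-c", "import time; time.sleep(30)"]


async def wait_with_timeout(argv: Sequence[str], timeout: float) -> Optional[int]:
    """Run argv; return its exit code, or None if it was killed on timeout."""
    proc = await asyncio.create_subprocess_exec(
        *argv,
        stdout=asyncio.subprocess.DEVNULL,
        stderr=asyncio.subprocess.DEVNULL,
    )
    try:
        return await asyncio.wait_for(proc.wait(), timeout=timeout)
    except asyncio.TimeoutError:
        # Kill the child, then await it so no stray process or zombie remains.
        with contextlib.suppress(ProcessLookupError):
            proc.kill()
        await proc.wait()
        return None
```

Calling `asyncio.run(wait_with_timeout(SLEEP_ARGV, 0.5))` returns `None` once the child has been killed and reaped.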
Comment thread factory/runners/protocol.py Outdated
Comment thread factory/agents/runner.py Outdated
Comment thread factory/ceo_completion.py Outdated
Comment thread factory/cli.py Outdated
Comment thread tests/test_tmux_persist.py Outdated
xukai92 and others added 2 commits May 19, 2026 02:30
- Place task positional arg after flags in claude command (consistent
  with interactive_run)
- Kill and await tmux wait-for subprocess on timeout to prevent
  stray processes
- Update all docstrings and CLI help text to reflect the actual
  behavior (run in tmux, not open resume window)
- Rename test to match its assertion

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The user needs to know which tmux session to attach to, and the
default 600s timeout is too short for interactive sessions where
the user is meant to chat.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@xukai92 xukai92 requested a review from akashgit May 19, 2026 03:24
akashgit (Owner) left a comment

Code Review — PR #285: tmux persist mode

Overview

Adds a --tmux-persist flag that launches agents interactively in tmux windows instead of headless subprocesses, using script for output capture and tmux wait-for for async blocking. Well-structured feature with good test coverage (16 tests). However, there are two bugs that will prevent this from working on macOS, plus a gap that will crash the codex runner.


Critical

1. script -c is Linux-only — will fail on macOS

_tmux_persist.py:82 generates:

script -q -c <cmd> <logfile>

The -c flag is a util-linux extension. macOS's script does not support it:

$ script -c "echo test" /dev/null
script: illegal option -- c

macOS syntax is script -q <logfile> <command ...>. The wrapper script needs platform detection:

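import platform
import shlex

# script_line is an illustrative name for the line written into the wrapper script
if platform.system() == "Darwin":
    script_line = f"script -q {shlex.quote(str(logfile))} {claude_cmd}\n"
else:
    script_line = f"script -q -c {shlex.quote(claude_cmd)} {shlex.quote(str(logfile))}\n"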

2. CodexRunner.headless() missing tmux_persist parameter — TypeError at runtime

The Runner protocol (protocol.py) and invoke_agent (runner.py:142) now pass tmux_persist=tmux_persist to runner.headless(), but CodexRunner.headless() was not updated to accept it. When using --runner codex, this will crash:

TypeError: CodexRunner.headless() got an unexpected keyword argument 'tmux_persist'

Fix: Add tmux_persist: bool = False to CodexRunner.headless() signature (with a warning like the bob runner, or just ignore it).
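
A minimal sketch of that fix, using stand-in classes rather than the real ones: `Runner` and `CodexRunnerSketch` are hypothetical and carry far fewer parameters than the actual headless() signatures, and the returned string is a placeholder for a real invocation.

```python
import asyncio
import logging
from typing import Protocol, Tuple

logger = logging.getLogger("runner_sketch")


class Runner(Protocol):
    """Simplified stand-in for the runner protocol."""

    async def headless(
        self, prompt: str, *, tmux_persist: bool = False
    ) -> Tuple[str, int]: ...


class CodexRunnerSketch:
    """Accepts tmux_persist so callers can pass it uniformly, but ignores it."""

    async def headless(
        self, prompt: str, *, tmux_persist: bool = False
    ) -> Tuple[str, int]:
        if tmux_persist:
            # Warn instead of raising, mirroring the behavior described for Bob.
            logger.warning("tmux_persist is not supported by this runner")
        # Placeholder for the real headless invocation.
        return f"ran: {prompt}", 0
```

With the keyword accepted, `asyncio.run(CodexRunnerSketch().headless("hi", tmux_persist=True))` returns `("ran: hi", 0)` instead of raising TypeError.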


Major

3. factory run not wired up

--tmux-persist is added to the factory agent and factory ceo parsers, but not factory run. The _run_single_cycle function (line 2874) calls invoke_agent without tmux_persist. Since factory run is the primary entry point for continuous improvement, this is a gap in feature coverage.

4. timeout silently ignored when tmux_persist is active

ClaudeRunner.headless() receives timeout from the caller but does not forward it to run_in_tmux():

# claude.py — timeout is available but not passed
return await run_in_tmux(
    prompt, task, cwd, role, _find_project_path(cwd),
    model=model,
    dangerously_skip_permissions=dangerously_skip_permissions,
)

run_in_tmux defaults to 86400s (24h). The caller's timeout (e.g. 7200s from cmd_ceo) is silently dropped. Either forward timeout or document the override.


Minor

5. _find_project_path belongs in _tmux_persist.py

This function is added as a module-level utility in claude.py but is only used for the tmux persist path. It duplicates project-root resolution logic. Move it to _tmux_persist.py where it's consumed.

6. Session name collision

session = f"{_SESSION_PREFIX}{project_path.name}"

Two projects at different paths with the same directory name (e.g. /a/myapp and /b/myapp) would share a tmux session. Consider hashing the full path or using a longer suffix.
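
One possible shape for the hashing suggestion, as a sketch only: `session_name` and `SESSION_PREFIX` are illustrative names, the prefix value is taken from the PR summary, and the 8-character SHA-256 suffix is an arbitrary choice.

```python
import hashlib
import re
from pathlib import Path

SESSION_PREFIX = "factory-persist-"  # prefix as described in the PR summary


def session_name(project_path: Path) -> str:
    """Build a session name that stays distinct for same-named directories."""
    resolved = project_path.expanduser().resolve()
    # A short digest of the full path disambiguates /a/myapp from /b/myapp.
    digest = hashlib.sha256(str(resolved).encode("utf-8")).hexdigest()[:8]
    # "." and ":" are separators in tmux target syntax, so replace anything
    # outside a conservative character set.
    safe = re.sub(r"[^A-Za-z0-9_-]", "_", resolved.name) or "root"
    return f"{SESSION_PREFIX}{safe}-{digest}"
```

Keeping the directory name in front of the digest leaves the session recognizable in `tmux ls` while still avoiding the collision.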


What looks good

  • Clean separation: _tmux_persist.py is a well-scoped private module
  • The tmux wait-for signaling pattern is solid — non-polling async wait
  • ANSI stripping for clean output capture
  • Graceful fallback when tmux is unavailable
  • Bob runner warns instead of crashing
  • Test coverage is thorough — session creation, window reuse, timeout, fallback, delegation
  • Five-tier config precedence reused correctly via _resolve_tmux_persist
