Skip to content

Phase 2: split machine-response transport from human-message transport#160

Merged
wwind123 merged 1 commit into
mainfrom
codex/issue-153-machine-response-transport
May 25, 2026
Merged

Phase 2: split machine-response transport from human-message transport#160
wwind123 merged 1 commit into
mainfrom
codex/issue-153-machine-response-transport

Conversation

@wwind123
Copy link
Copy Markdown
Owner

Summary

  • add additive response_file_text and message_text channels to AgentResult while keeping text as the effective response
  • keep Claude and Gemini on file-first transport, and add the same machine-response-file contract plus explicit fallback priority to Codex
  • extend backend tests to prove noisy stdout is ignored when a valid response file is present and that legacy fallbacks still work

Testing

  • python3 -m pytest tests/test_agent_loop.py

Fixes #153

@wwind123
Copy link
Copy Markdown
Owner Author

Implemented issue #153 on branch codex/issue-153-machine-response-transport and opened PR #160 against main.

The change keeps AgentResult.text as the current effective response while adding separate response_file_text and message_text channels. Claude and Gemini now populate both explicitly, and Codex now uses the same machine-response-file contract with the approved priority chain: response file, then --output-last-message, then raw stdout fallback. The backend tests were extended to cover file-first behavior, Codex fallback tiers, dry-run behavior, and usage extraction from noisy stdout.

Tests: python3 -m pytest tests/test_agent_loop.py -k 'claude_backend_prefers_response_file or gemini_backend_prefers_response_file or codex_backend_prefers_response_file or codex_backend_prefers_last_message or codex_backend_uses_stdout or codex_backend_dry_run or claude_review_loop_prefers_public_response_file_over_stdout or gemini_review_loop_prefers_public_response_file_over_stdout or test_extract_codex_usage_reads_turn_completed_jsonl' passed; python3 -m pytest tests/test_agent_loop.py passed.

-- OpenAI Codex

@wwind123
Copy link
Copy Markdown
Owner Author

Review: Phase 2 — Split machine-response transport from human-message transport

Reviewed HEAD 50a109f. All 325 tests pass locally (python3 -m pytest tests/test_agent_loop.py). GitHub CI is still in-progress at time of review.


Summary

The PR correctly implements the Phase 2 contract from #153/#140:

  • AgentResult gains additive response_file_text and message_text fields; text remains the orchestrator-facing field with no behavior change (fully backward-compatible).
  • Claude and Gemini backends explicitly populate all three fields with correct semantics.
  • Codex gains the machine-response-file contract via with_public_response_file_instruction, with the approved three-tier priority chain: response file → --output-last-message file → raw stdout fallback.
  • The FakeRunner test harness is updated to call _maybe_write_public_response_file in the Codex path of run_with_log, consistent with how it was already handled for Claude and Gemini.
  • No orchestrator behavior is changed (stays scoped to the transport layer).

The 7 new tests cover all code paths (file present/absent for each backend, last-message fallback, stdout fallback, dry-run). The existing 318 tests continue to pass.


Non-blocking notes

_read_codex_message_file duplicates read_public_response_file (codex.py:85–91): The two functions are functionally identical — read file, strip, return None if empty. The distinction is conceptual (one is a "response file", the other a "message file"), but a shared private helper or a rename could eliminate the duplication if desired.

response_path allocated before the dry-run guard (codex.py:114): public_response_path(config, "codex") creates a temporary directory before the if config.dry_run: branch. The dry-run path never reads from or writes to response_path, leaving the directory orphaned. Moving the allocation to after the if config.dry_run: return (or inside the non-dry-run branch) would be cleaner.

Neither item affects correctness; deferring both to a follow-up is fine.


No blocking issues. The implementation is correct, well-scoped, and fully tested.

-- Anthropic Claude

@wwind123
Copy link
Copy Markdown
Owner Author

Phase 2 Review: Split machine-response transport from human-message transport

The changes successfully implement the transport split across all supported backends. The implementation is clean, maintains backward compatibility, and includes comprehensive test coverage for the new transport channels.

Summary of changes

  • AgentResult Enhancement: Added response_file_text and message_text fields to distinguish between machine-readable payloads and human-facing logs/messages.
  • Backend Updates: Claude, Gemini, and Codex backends now all support the public response file contract.
  • Codex Parity: Codex now uses the same machine-response-file pattern as the other backends, using --output-last-message as its legacy fallback channel.
  • Robust Fallbacks: The main text result in AgentResult correctly prefers the response file but falls back to captured message text if the file is absent or empty.
  • Improved Testing: Added backend-specific tests verifying that noisy stdout is ignored when a valid machine response file is provided.

Observations

  • In CodexBackend.run, the dry-run branch continues to use the raw prompt instead of prompt_with_response_file. While functionally harmless for dry runs, using the modified prompt would make the dry-run output more representative of the actual execution. This is a minor point and does not block the PR.

The PR meets all exit criteria and effectively prepares the transport layer for the subsequent live flow migration.

-- Google Gemini

@wwind123
Copy link
Copy Markdown
Owner Author

GitHub PR checks are still pending for PR #160.

Reviewer approvals do not make this PR merge-ready until GitHub PR checks are green, or the PR explicitly states that only a local subset passed.

  • Pending checks: test (in_progress)
  • Required status checks are not configured on the PR base branch.

-- coding-review-agent-loop

@wwind123 wwind123 merged commit 5258dea into main May 25, 2026
1 check passed
@wwind123 wwind123 deleted the codex/issue-153-machine-response-transport branch May 25, 2026 06:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

#140 Phase 2: Split machine-response transport from human-message transport

1 participant