fix(opencode): repair chatgpt subscription hang watchdog #29388

Closed
liuhaoyang wants to merge 5 commits into anomalyco:dev from liuhaoyang:fix/chatgpt-subscription-hang-watchdog

Conversation

@liuhaoyang liuhaoyang commented May 26, 2026

Fixes #29420

Summary

Repair the stream watchdog for ChatGPT subscription streams to properly detect and recover from hung connections.

Changes

  • Provider request timeout (provider.ts): Replace broken AbortSignal.timeout() with manual setTimeout + AbortController for a 5-minute default request timeout. Properly clean up timers in finally block and re-throw abort reasons on cancellation.
  • Stream watchdog (processor.ts): Add watchStream() wrapper with separate first-byte (30s) and idle (120s) timeouts using Promise.race. Wire into the LLM stream consumption path and cap retries at 3 attempts.
  • Timeout error handling (message-v2.ts): Convert DOMException(TimeoutError) from watchdog into retryable APIError so the retry loop can recover.
  • Retry classification (retry.ts, error.ts): Extend transient error detection to recognize 429/502/503/504 status codes, rate-limit messages, overloaded/gateway-timeout patterns, and deeply-nested ChatGPT error bodies (with depth protection against circular refs).
  • Abort signal propagation (codex.ts): Thread AbortSignal through exchangeCodeForTokens(), refreshAccessToken(), and waitForOAuthCallback() so Codex auth operations respect caller cancellation.
  • Subagent timeout (task.ts): Add configurable 5-minute default timeout on subagent tasks with input validation rejecting non-positive values.
  • Tests: Add processor watchdog tests (first-byte stall, idle stall, early compaction cleanup), retry classification tests for transient errors, Codex signal propagation test, subagent timeout tests, and timeout constant assertion.
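
The first-byte and idle timeouts described for the stream watchdog can be illustrated with a minimal sketch. This is not the PR's `watchStream()` implementation: the signature, the `StreamTimeoutError` class, and the `WatchOptions` shape are assumptions made for illustration. Only the idea of racing each read against a timer with `Promise.race`, and the 30s and 120s values, come from the description above.

```typescript
// Hypothetical sketch of a stream watchdog; names and shapes are assumptions,
// not the code in processor.ts.
class StreamTimeoutError extends Error {
  constructor(
    readonly phase: "first-byte" | "idle",
    readonly ms: number,
  ) {
    super(`stream ${phase} timeout after ${ms}ms`)
    this.name = "TimeoutError"
  }
}

interface WatchOptions {
  firstByteMs: number // the PR description mentions 30s
  idleMs: number // the PR description mentions 120s
}

async function* watchStream<T>(source: AsyncIterable<T>, opts: WatchOptions): AsyncGenerator<T> {
  const iterator = source[Symbol.asyncIterator]()
  let phase: "first-byte" | "idle" = "first-byte"
  try {
    while (true) {
      const current = phase
      const ms = current === "first-byte" ? opts.firstByteMs : opts.idleMs
      // The executor runs synchronously, so onTimeout is set before the timer can fire.
      let onTimeout: (error: Error) => void = () => {}
      const timeout = new Promise<never>((_, reject) => {
        onTimeout = reject
      })
      const timer = setTimeout(() => onTimeout(new StreamTimeoutError(current, ms)), ms)
      let result: IteratorResult<T>
      try {
        // Whichever settles first wins: the next chunk or the timer.
        result = await Promise.race([iterator.next(), timeout])
      } finally {
        clearTimeout(timer) // always clear so no timer outlives the read
      }
      if (result.done) return
      phase = "idle" // after the first chunk, switch to the idle budget
      yield result.value
    }
  } finally {
    // Best-effort cleanup without awaiting: a hung source may never settle.
    void iterator.return?.()?.catch(() => {})
  }
}
```

A caller would wrap the provider stream, for example `for await (const chunk of watchStream(stream, { firstByteMs: 30_000, idleMs: 120_000 }))`, and treat a `StreamTimeoutError` as a retryable failure.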

Test Plan

  • Verify watchdog triggers on hung subscription streams (first-byte and idle timeouts)
  • Verify normal streams are not affected by the watchdog
  • Verify timeout recovery works correctly (retry then fail after max attempts)
  • Verify provider request timeout fires after 5 minutes of no response
  • Verify abort signal propagates through Codex token refresh
  • Verify subagent tasks time out and cancel child session
  • Verify 429/502/503/504 errors are classified as retryable
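
The retry classification checks above could be exercised against a classifier along the following lines. This is a sketch under stated assumptions, not the logic in retry.ts or error.ts: the function name `isTransient`, the message patterns, the nested keys it inspects, and the depth limit are all hypothetical. The status codes and the idea of depth protection against circular references come from the PR description.

```typescript
// Hypothetical sketch of transient-error classification; not the PR's code.
const RETRYABLE_STATUS = new Set([429, 502, 503, 504])
// Assumed message patterns; the real list is not shown in the PR description.
const RETRYABLE_MESSAGE = /rate.?limit|overloaded|gateway.?time-?out|too many requests/i
const MAX_DEPTH = 5 // assumed limit on how deep nested error bodies are searched
const NESTED_KEYS = ["error", "cause", "body", "response"] // assumed wrapper keys

function isTransient(error: unknown, depth = 0, seen = new Set<object>()): boolean {
  if (typeof error === "string") return RETRYABLE_MESSAGE.test(error)
  if (typeof error !== "object" || error === null) return false
  // Stop on excessive nesting or on an object we have already visited (a cycle).
  if (depth > MAX_DEPTH || seen.has(error)) return false
  seen.add(error)

  const record = error as Record<string, unknown>
  const status = record.status ?? record.statusCode
  if (typeof status === "number" && RETRYABLE_STATUS.has(status)) return true
  if (typeof record.message === "string" && RETRYABLE_MESSAGE.test(record.message)) return true

  // Recurse into common wrapper fields where a provider may nest the real error.
  for (const key of NESTED_KEYS) {
    if (isTransient(record[key], depth + 1, seen)) return true
  }
  return false
}
```

Tracking visited objects in a `Set` handles true cycles, while the depth cap bounds the work on very deep but acyclic bodies.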

@github-actions (Contributor)

Hey! Your PR title fix/chatgpt subscription hang watchdog doesn't follow conventional commit format.

Please update it to start with one of:

  • feat: or feat(scope): new feature
  • fix: or fix(scope): bug fix
  • docs: or docs(scope): documentation changes
  • chore: or chore(scope): maintenance tasks
  • refactor: or refactor(scope): code refactoring
  • test: or test(scope): adding or updating tests

Where scope is the package name (e.g., app, desktop, opencode).

See CONTRIBUTING.md for details.

@liuhaoyang liuhaoyang changed the title fix/chatgpt subscription hang watchdog fix(opencode): repair chatgpt subscription hang watchdog May 26, 2026
@github-actions github-actions Bot added needs:compliance ("This means the issue will auto-close after 2 hours.") and removed needs:title labels May 26, 2026
@github-actions (Contributor)

This PR doesn't fully meet our contributing guidelines and PR template.

What needs to be fixed:

  • PR description is missing required template sections. Please use the PR template.

Please edit this PR description to address the above within 2 hours, or it will be automatically closed.

If you believe this was flagged incorrectly, please let a maintainer know.

@github-actions (Contributor)

Thanks for your contribution!

This PR doesn't have a linked issue. All PRs must reference an existing issue.

Please:

  1. Open an issue describing the bug/feature (if one doesn't exist)
  2. Add Fixes #<number> or Closes #<number> to this PR description

See CONTRIBUTING.md for details.

@github-actions (Contributor)

This pull request has been automatically closed because it was not updated to meet our contributing guidelines within the 2-hour window.

Feel free to open a new pull request that follows our guidelines.

@github-actions github-actions Bot removed the needs:compliance ("This means the issue will auto-close after 2 hours.") label May 26, 2026
@github-actions github-actions Bot closed this May 26, 2026
@github-actions github-actions Bot added needs:compliance ("This means the issue will auto-close after 2 hours.") and removed needs:issue labels May 26, 2026
Labels

needs:compliance ("This means the issue will auto-close after 2 hours.")

Development

Successfully merging this pull request may close these issues.

fix(opencode): chatgpt subscription stream hangs due to watchdog timeout