Add Codex OAuth provider #486
Conversation
Greptile Summary

This PR adds a `codex/` LLM provider that lets Strix use the local Codex CLI ChatGPT OAuth login instead of requiring an API key.
Confidence Score: 3/5

The new provider works end-to-end, but a first-time user who hasn't run `codex login` will sit through the full retry backoff before the auth error surfaces. The core OAuth and SSE logic in strix/llm/llm.py and strix/llm/codex_oauth.py is where the issues below were found.

Important Files Changed
Prompt To Fix All With AI

Fix the following 3 code review issues. Work through them one at a time, proposing concise fixes.
---
### Issue 1 of 3
strix/llm/llm.py:300-328
**Permanent auth errors trigger full retry backlog**
`CodexOAuthError` has no `status_code` attribute, so `_should_retry` returns `True` for every instance (the `code is None` branch). This means errors like "Codex auth file not found. Run `codex login` first." or "Codex OAuth request was unauthorized. Run `codex login` again." will be retried up to `max_retries` (default 5) times with exponential backoff — producing roughly 62 seconds of silent waiting before the error is finally surfaced to the user. A first-time user who hasn't run `codex login` will sit through the full retry loop every time.
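A concise resolution (a sketch only, not the project's actual implementation) is to give `CodexOAuthError` a `status_code` and mark permanent auth failures as non-retryable. The `_should_retry` signature and the retryable-status list below are assumptions and may differ from what strix/llm/llm.py really does:

```python
# Sketch: carry an HTTP-style status code on CodexOAuthError so the retry logic
# can distinguish permanent auth failures from transient ones.
class CodexOAuthError(Exception):
    def __init__(self, message: str, status_code: int | None = None) -> None:
        super().__init__(message)
        self.status_code = status_code


def _should_retry(exc: Exception) -> bool:
    # Permanent auth problems (missing auth file, expired login) should surface
    # immediately instead of burning through max_retries with backoff.
    if isinstance(exc, CodexOAuthError) and exc.status_code in (401, 403):
        return False
    code = getattr(exc, "status_code", None)
    if code is None:
        return True  # unknown failures stay retryable, as before
    return code in (408, 429, 500, 502, 503, 504)  # illustrative retryable set
```

Raising the "auth file not found" and "unauthorized" cases with `status_code=401` would then surface them on the first attempt instead of after roughly 62 seconds of backoff.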
### Issue 2 of 3
strix/llm/llm.py:317-328
**Double yield emits full raw content before processing**
When `content` is non-empty, two `LLMResponse` objects are yielded back-to-back: first the complete raw response (including unstripped `<thinking>` blocks and raw tool-call XML), then the processed version. In the regular `_stream` path, intermediate yields are partial incremental chunks, so the "raw" intermediate content is always shorter than the final. Here the first yield is the full, unprocessed text, which means any consumer that renders each yielded response would flash the complete raw output (with thinking tags) and then immediately overwrite it with the cleaned version.
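A minimal fix (a sketch reusing the helper names from the diff, not necessarily the intended design) is to normalize first and yield only the processed response; whether an intermediate raw yield is still useful depends on how consumers render the stream:

```python
# Sketch: replacement for the tail of _stream_codex_oauth (llm.py lines 317-328).
# Strip thinking blocks and normalize tool calls before a single yield, so the
# raw content (with <thinking> tags and raw tool-call XML) is never emitted.
if content:
    content = _THINKING_BLOCK_RE.sub("", content)
    content = normalize_tool_format(content)
    content = fix_incomplete_tool_call(_truncate_to_first_function(content))

    yield LLMResponse(
        content=content,
        tool_invocations=parse_tool_invocations(content),
        thinking_blocks=None,
    )
```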
### Issue 3 of 3
strix/llm/codex_oauth.py:205-209
**Non-standard `version` request header**
The `"version": "strix-codex-oauth"` header is not a recognized HTTP header name. Standard practice for identifying a client is `User-Agent`. OpenAI's own SDKs use `User-Agent` for client versioning. Using an arbitrary lowercase key named `version` is unlikely to be read correctly by the server, and if the intent is to mark traffic for observability or routing, it may silently have no effect.
Reviews (1): Last reviewed commit: "Add Codex OAuth provider"
strix/llm/llm.py (lines 300-328):

```python
async def _stream_codex_oauth(
    self, messages: list[dict[str, Any]]
) -> AsyncIterator[LLMResponse]:
    self._total_stats.requests += 1
    model = self.config.codex_model or self.config.model_name
    content, usage = await asyncio.to_thread(
        complete_codex_oauth,
        model,
        messages,
        self._reasoning_effort,
        self.config.timeout,
    )

    if usage:
        self._total_stats.input_tokens += usage.get("input_tokens", 0)
        self._total_stats.output_tokens += usage.get("output_tokens", 0)

    if content:
        yield LLMResponse(content=content)

        content = _THINKING_BLOCK_RE.sub("", content)
        content = normalize_tool_format(content)
        content = fix_incomplete_tool_call(_truncate_to_first_function(content))

        yield LLMResponse(
            content=content,
            tool_invocations=parse_tool_invocations(content),
            thinking_blocks=None,
        )
```
**Permanent auth errors trigger full retry backlog**

`CodexOAuthError` has no `status_code` attribute, so `_should_retry` returns `True` for every instance (the `code is None` branch). This means errors like "Codex auth file not found. Run `codex login` first." or "Codex OAuth request was unauthorized. Run `codex login` again." will be retried up to `max_retries` (default 5) times with exponential backoff — producing roughly 62 seconds of silent waiting before the error is finally surfaced to the user. A first-time user who hasn't run `codex login` will sit through the full retry loop every time.
strix/llm/llm.py (lines 317-328):

```python
if content:
    yield LLMResponse(content=content)

    content = _THINKING_BLOCK_RE.sub("", content)
    content = normalize_tool_format(content)
    content = fix_incomplete_tool_call(_truncate_to_first_function(content))

    yield LLMResponse(
        content=content,
        tool_invocations=parse_tool_invocations(content),
        thinking_blocks=None,
    )
```
**Double yield emits full raw content before processing**

When `content` is non-empty, two `LLMResponse` objects are yielded back-to-back: first the complete raw response (including unstripped `<thinking>` blocks and raw tool-call XML), then the processed version. In the regular `_stream` path, intermediate yields are partial incremental chunks, so the "raw" intermediate content is always shorter than the final. Here the first yield is the full, unprocessed text, which means any consumer that renders each yielded response would flash the complete raw output (with thinking tags) and then immediately overwrite it with the cleaned version.
strix/llm/codex_oauth.py (lines 205-209):

```python
headers = {
    "Authorization": f"Bearer {credentials.access_token}",
    "Accept": "text/event-stream",
    "Content-Type": "application/json",
    "version": "strix-codex-oauth",
```
**Non-standard `version` request header**

The `"version": "strix-codex-oauth"` header is not a recognized HTTP header name. Standard practice for identifying a client is `User-Agent`. OpenAI's own SDKs use `User-Agent` for client versioning. Using an arbitrary lowercase key named `version` is unlikely to be read correctly by the server, and if the intent is to mark traffic for observability or routing, it may silently have no effect.
Summary

Adds a `codex/` LLM provider that lets Strix use the local Codex CLI ChatGPT OAuth login instead of requiring an API key. Users can run `codex login` and then point Strix at a `codex/` model.