fix(llm): stream structured fallback and expose fetch error cause by cloud5418 · Pull Request #49 · ExplosiveCoderflome/AI-Novel-Writing-Assistant

cloud5418 · 2026-05-11T03:15:24Z

Problem

Long-output structured invocations (e.g. `novel.volume.skeleton@v2`, volume strategy planning, long chapter macro plans) failed with:

```
[STRUCTURED_OUTPUT:transport_error] [novel.volume.skeleton@v2.fallback] Request timed out after 200000ms.
```

Root cause: third-party OpenAI-compatible aggregator proxies (commonly used to serve gpt-5.x reasoning models in China) cut idle non-stream connections after ~125s. The reasoning + JSON generation regularly exceeds that window for prompts that produce >5KB output, so the request times out without a single response byte and the application can only report a bare `fetch failed` after its own fallback timeout fires.

A separate but related issue: when the underlying `fetch failed` did have a useful Node `error.cause` (e.g. `code=ENOTFOUND`, `UND_ERR_SOCKET`, `ECONNRESET`), the existing classifier dropped it and the user only saw `[STRUCTURED_OUTPUT:transport_error] fetch failed`, making diagnosis impossible.

Fix

Stream the direct-transport fallback

`invokeStructuredPromptJsonViaDirectOpenAICompatible` (the prompt_json fallback path that runs raw `fetch` without LangChain) now:

- sends `stream: true` in the request body
- parses Server-Sent Events into the same string the existing parser consumes
- falls back to JSON parsing when the upstream ignores `stream: true` (i.e., the response `content-type` is neither `text/event-stream` nor `application/x-ndjson`)

Because chunks keep the socket alive, the 125s idle cap no longer trips on long generations.
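As a rough illustration of the SSE step, the sketch below concatenates `delta.content` from OpenAI-style stream chunks into a single string. The name `collectSseContent` and the `StreamChunk` shape are assumptions made for this example, not the repository's actual code, and the real fallback presumably reads the body incrementally rather than from one buffered string.

```typescript
// Hypothetical helper: names and shapes are illustrative, not the repository's code.
interface StreamChunk {
  choices?: Array<{
    delta?: { content?: string | null };
    finish_reason?: string | null;
  }>;
}

/** Concatenates `delta.content` from an OpenAI-style SSE body into one string. */
function collectSseContent(sseBody: string): { content: string; finishReason: string | null } {
  let content = "";
  let finishReason: string | null = null;
  for (const rawLine of sseBody.split(/\r?\n/)) {
    const line = rawLine.trim();
    // Only `data:` lines carry payloads; comment lines (": keepalive") and blanks are skipped.
    if (line.indexOf("data:") !== 0) continue;
    const payload = line.slice("data:".length).trim();
    if (payload === "" || payload === "[DONE]") continue;
    let chunk: StreamChunk;
    try {
      chunk = JSON.parse(payload) as StreamChunk;
    } catch {
      continue; // tolerate one malformed event instead of failing the whole response
    }
    const choice = chunk.choices?.[0];
    if (typeof choice?.delta?.content === "string") content += choice.delta.content;
    if (choice?.finish_reason) finishReason = choice.finish_reason;
  }
  return { content, finishReason };
}
```

The returned `content` is what would be handed to the existing JSON parser, so the downstream parsing path does not need to know whether the response was streamed.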

Expose `error.cause`

- New `describeNetworkErrorCause` walks up to 6 levels of `error.cause` to extract undici/Node fields (`code`/`errno`/`syscall`/`hostname`/`address`/`port`).
- New `enrichErrorMessageWithCause` appends `(cause: code=…)` to the message when one is found and not already present.
- `wrapStructuredInvokeError` uses `enrichErrorMessageWithCause` instead of bare `error.message` and forwards the original error as the `cause` of `StructuredOutputError`.
- `summarizeStructuredOutputFailure` now appends the parsed cause (and a "check upstream / proxy" hint) when category is `transport_error`.
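The cause walk can be pictured roughly as follows. This is a minimal sketch under stated assumptions: the field list and the depth limit of 6 come from the description above, while the function names (suffixed `Sketch`), the exact output format, and the way levels are counted are illustrative guesses rather than the PR's implementation.

```typescript
// Illustrative sketch only; not the repository's describeNetworkErrorCause.
const CAUSE_FIELDS = ["code", "errno", "syscall", "hostname", "address", "port"];
const MAX_CAUSE_DEPTH = 6;

/** Returns e.g. "code=ECONNRESET syscall=read", or null when nothing useful is found. */
function describeCauseSketch(error: unknown): string | null {
  let current: unknown = error;
  // Inspect at most MAX_CAUSE_DEPTH objects, starting with the error itself (assumed counting).
  for (let depth = 0; depth < MAX_CAUSE_DEPTH; depth++) {
    if (typeof current !== "object" || current === null) return null;
    const record = current as Record<string, unknown>;
    const parts: string[] = [];
    for (const field of CAUSE_FIELDS) {
      const value = record[field];
      if (typeof value === "string" || typeof value === "number") {
        parts.push(`${field}=${value}`);
      }
    }
    if (parts.length > 0) return parts.join(" ");
    current = record["cause"]; // descend one level and try again
  }
  return null;
}

/** Appends "(cause: …)" unless no cause was found or the message already contains it. */
function enrichMessageSketch(error: unknown): string {
  const message = error instanceof Error ? error.message : String(error);
  const cause = describeCauseSketch(error);
  if (cause === null || message.indexOf(cause) !== -1) return message;
  return `${message} (cause: ${cause})`;
}
```

Stopping at the first level that carries any known field keeps the message short; a real implementation might prefer to merge fields from several levels.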

Verification

End-to-end against the production-style aggregator (sub2api → gpt-5.4):

| Scenario | Result |
| --- | --- |
| Non-stream 10-volume skeleton POST | socket cut at 125s, 0 bytes (curl 56) |
| Streamed 10-volume skeleton POST (this change) | 154s, 1.0MB SSE, finish_reason=stop, 10 volumes parsed |
| `invokeStructuredLlmDetailed` full strategy seq | strategy[0/1] aborted at ~43s as before, fallback (this change) completes in 137.6s with full 10-volume JSON; previous code hit 200s fallback timeout |

`pnpm --filter @ai-novel/server typecheck` clean.

Relevant unit tests pass (these were the ones touching the transport fallback path):

- `tests/storyMacroFallback.test.js`
- `tests/directorBookContractFallback.test.js`
- `tests/directorCandidateFallback.test.js`
- `tests/directorRecoverySampleAudit.test.js`

Compatibility / Risk

- Adding `stream: true` is OpenAI-protocol-standard; aggregators tested (sub2api, ccp) accept it. Providers that ignore the flag still return JSON and the existing JSON path handles them.
- SSE parsing is contained to the fallback transport. The primary LangChain `ChatOpenAI` invocation is unchanged.
- The `cause` propagation is additive: message strings are only extended, never replaced. The existing classifier still returns `transport_error` for the same inputs.
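To make the first compatibility point concrete, the following sketch shows one way a response could be routed to the SSE or JSON path based on its `content-type`. Both helper names are hypothetical, and the non-streamed body is assumed to follow the usual chat completion shape (`choices[0].message.content`).

```typescript
// Hypothetical helpers: names are illustrative and not taken from the repository.
type BodyMode = "sse" | "json";

/** Chooses a parser from the response Content-Type header value. */
function pickBodyMode(contentType: string | null | undefined): BodyMode {
  // Drop parameters such as "; charset=utf-8" before comparing.
  const mediaType = ((contentType || "").split(";")[0] || "").trim().toLowerCase();
  const streamed = mediaType === "text/event-stream" || mediaType === "application/x-ndjson";
  return streamed ? "sse" : "json";
}

/** Reads the assistant text from a non-streamed chat completion body. */
function extractNonStreamContent(body: string): string {
  const parsed = JSON.parse(body) as {
    choices?: Array<{ message?: { content?: string | null } }>;
  };
  const content = parsed.choices?.[0]?.message?.content;
  if (typeof content !== "string") {
    throw new Error("chat completion body has no choices[0].message.content string");
  }
  return content;
}
```

Defaulting to the JSON path when the header is missing or unrecognized means a provider that ignores `stream: true` is handled the same way as before the change.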

Why now

Reported failures on `novel.volume.skeleton@v2` and earlier `auto-director` rhythm/chapter-split stages in production. With this change, long-running structured invocations against proxied gpt-5.x reasoning models can complete instead of timing out at the proxy idle limit.

- remove polluted .gitignore entry `m[[]0])` (between .env.local and dev.db)
- ignore .cursor/, *.log, tmp/ to prevent IDE/log noise from leaking
- git rm --cached .codex-backups/ and .cursor/ (kept on disk)
- delete root-level empty .codex placeholder file

Equivalent re-implementation of fork c6d30e0's headers feature on top of upstream main. The original commit's task-execution-log routing fix is intentionally dropped because upstream's auto-director rewrite removed the /:taskId vs /execution-logs collision that the fix targeted.

- ModelRouteConfig.requestHeadersText column with prisma + sqlite migrations
- Server: parseRequestHeadersText utility, threaded through ResolvedModel and resolveLLMClientOptions; applied at anthropicClient, connectivity, factory (defaultHeaders for OpenAI), structuredInvoke, routes/llm
- Client: textarea on settings page (per-route + bulk), with deferred connectivity probing carried over from the same upstream commit
- Tests: parser unit, modelRouter user override, llmProviders parsing
- Release notes: 2026-05-09 entry

Set open-pull-requests-limit to 0 temporarily. Dependabot's internal scan state is locked to the pre-sync default branch (chore/dependabot-2026-05-09); all PRs it opens are ahead 30 / behind 148 vs main. Pausing prevents new stale-baseline PRs while we clean the queue, then this commit will be reverted.

Bumps [actions/checkout](https://github.com/actions/checkout) from 4 to 6. - [Release notes](https://github.com/actions/checkout/releases) - [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md) - [Commits](actions/checkout@v4...v6) --- updated-dependencies: - dependency-name: actions/checkout dependency-version: '6' dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <support@github.com>

Bumps the typescript-tooling group with 2 updates: [typescript](https://github.com/microsoft/TypeScript) and [@types/node](https://github.com/DefinitelyTyped/DefinitelyTyped/tree/HEAD/types/node). Updates `typescript` from 5.9.3 to 6.0.3 - [Release notes](https://github.com/microsoft/TypeScript/releases) - [Commits](microsoft/TypeScript@v5.9.3...v6.0.3) Updates `@types/node` from 25.3.3 to 25.6.2 - [Release notes](https://github.com/DefinitelyTyped/DefinitelyTyped/releases) - [Commits](https://github.com/DefinitelyTyped/DefinitelyTyped/commits/HEAD/types/node) --- updated-dependencies: - dependency-name: typescript dependency-version: 6.0.3 dependency-type: direct:development update-type: version-update:semver-major dependency-group: typescript-tooling - dependency-name: "@types/node" dependency-version: 25.6.2 dependency-type: direct:development update-type: version-update:semver-minor dependency-group: typescript-tooling ... Signed-off-by: dependabot[bot] <support@github.com>

- direct-transport prompt_json fallback now sends `stream: true` and parses Server-Sent Events; this bypasses third-party aggregator proxies that cut idle non-stream connections after ~125s, which was causing volume-skeleton (and similar long-output) requests to fail with `[STRUCTURED_OUTPUT:transport_error] ... timed out after 200000ms` even though the model was still reasoning.
- response handling keeps the existing JSON path as a fallback when the upstream ignores `stream:true` (no `text/event-stream` content type), so providers that do not support streaming still work.
- when a low-level `fetch failed` happens, walk `error.cause` to expose undici-style fields (`code`/`errno`/`syscall`/`host`) in the user-visible message instead of bare "fetch failed".
- `StructuredOutputError` now accepts/forwards a `cause`, and `summarizeStructuredOutputFailure` appends the parsed cause + a hint to retry/check upstream when category is `transport_error`.

Verified end-to-end against an OpenAI-compatible aggregator:

- non-stream 10-volume skeleton request: socket cut at 125s, 0 bytes.
- streamed request (this change): 154s total, 1.0MB SSE, finish_reason stop, 10 volumes parsed.
- `invokeStructuredLlmDetailed` runs the full strategy sequence and the stream-mode fallback returns a complete result in 137.6s where prior runs hit the 200s fallback timeout.

`pnpm --filter @ai-novel/server typecheck` clean. Relevant fallback unit tests (story_macro / director_book_contract / director_candidate / director_recovery_sample_audit) still pass.

cloud5418 and others added 16 commits May 9, 2026 15:51

fix: harden desktop auto-update security

9dcd6a0

- require signed public desktop release workflow
- add trusted GitHub feed and minimum version gates
- validate staged updater metadata and add desktop security tests

fix(desktop): close updater audit review gaps

0233fb5

- persist minimum update version into packaged runtime config
- treat prerelease versions as below the matching stable floor
- remove deprecated desktop-v public release path

ci(deps): add Dependabot config for npm and github-actions (ExplosiveCoderflome#22)

a7e6ef5

docs(superpowers): add directory README

aa0c448

Cherry-picked from PR ExplosiveCoderflome#21 squash (e2800a4). Standalone file with no upstream collision; the rest of e2800a4 (README cleanup, package.json dev:log removal) is no longer applicable on top of upstream main.

Revert "chore(deps): pause Dependabot npm group during baseline reset"

7a7dd28

This reverts commit 94b6859.

Merge pull request ExplosiveCoderflome#26 from cloud5418/dependabot/github_actions/actions/checkout-6

4ba4dde

ci(deps)(deps): bump actions/checkout from 4 to 6

Merge pull request ExplosiveCoderflome#27 from cloud5418/dependabot/npm_and_yarn/typescript-tooling-d818b2e23e

4329173

chore(deps)(deps-dev): bump the typescript-tooling group with 2 updates

fix: harden desktop structured auto-director fallback

5473792

Merge pull request ExplosiveCoderflome#33 from cloud5418/codex/structured-protocol-governance-desktop-fix

1a02f96

fix: harden desktop structured auto-director fallback

fix: stabilize desktop packaging and structured fallback followups

843198a

cloud5418 wants to merge 16 commits into ExplosiveCoderflome:main from cloud5418:fix/structured-fallback-streaming-and-cause
