fix(core): honor Retry-After header on retried model calls#1283
fix(core): honor Retry-After header on retried model calls#1283truffle-dev wants to merge 2 commits into
Conversation
The retry loop in `executeWithModelFallback` always used local exponential backoff capped at 10 seconds, regardless of what the server asked for. Under shared provider contention this caused concurrent agents to converge their retry windows into the same window the provider had just told them to wait past, amplifying load on already-overloaded endpoints. Move the retry-delay math into a small `retry-after` module that parses both delta-seconds and HTTP-date forms (RFC 7231 §7.1.3), takes the server hint as a floor, keeps the exponential floor as a backpressure baseline, and caps at 5 minutes so a misconfigured or hostile server cannot pin the agent for hours. Closes VoltAgent#1276.
🦋 Changeset detectedLatest commit: 8aa662b The changes in this PR will be included in the next version bump. This PR includes changesets to release 1 package
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
📝 WalkthroughWalkthroughThis PR adds RFC 7231 ChangesRetry-After Header Support
🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 ESLint
ESLint skipped: no ESLint configuration detected in root package.json. To enable, add Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@packages/core/src/agent/retry-after.ts`:
- Around line 69-76: getRetryAfterMs currently only checks
headers["retry-after"] and headers["Retry-After"], which misses mixed-case
names; change the lookup to be case-insensitive by normalizing header keys
(e.g., iterate Object.keys(responseHeaders) and compare key.toLowerCase() ===
"retry-after") or build a lower-cased map before fetching the value, then pass
the found raw value to parseRetryAfter; update references in getRetryAfterMs to
use the normalized lookup of responseHeaders rather than the two exact keys.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 12f4e233-59be-4d06-ac53-723b2cc3d0dd
📒 Files selected for processing (5)
.changeset/honor-retry-after-header.mdpackages/core/src/agent/agent.spec.tspackages/core/src/agent/agent.tspackages/core/src/agent/retry-after.spec.tspackages/core/src/agent/retry-after.ts
`responseHeaders` is normalized to lowercase by the AI SDK, but providers that build the bag from a raw `fetch` Response can leak any casing through, so the lookup needs to match RFC 7230 §3.2 case-insensitively instead of only checking `retry-after` and `Retry-After`. Adds three regression tests for mixed-case spellings.
PR Checklist
Bugs / Features
What is the current behavior?
The retry loop in
executeWithModelFallback(the single retry-delay site forstreamText/generateText/streamObject/generateObjectafter their AI-SDK-internal retries are disabled withmaxRetries: 0) always used local exponential backoff capped at 10 seconds:APICallErrorcarries the provider's response headers, but they are dropped on the floor. So when a provider responds 429 withRetry-After: 30, the agent tries again in 1–10 seconds and gets rate-limited again, and N concurrent agents under the same provider key converge their retry windows into roughly the same instant.What is the new behavior?
Move the retry-delay math into a small
retry-aftermodule:parseRetryAfter(value, nowMs?)understands both forms in RFC 7231 §7.1.3 (delta-seconds and HTTP-date).getRetryAfterMs(error, nowMs?)pulls the header offerror.responseHeadersin either case (lowercase or canonical).computeRetryDelayMs(error, attemptIndex, nowMs?)returnsmax(serverHint, exponentialFloor)when a header is present, keeping the exponential floor as a backpressure baseline soRetry-After: 0still spaces things out. Result is capped at 5 minutes so a misconfigured or hostile server can't pin the agent.Then
agent.tscallscomputeRetryDelayMs(error, attemptIndex)instead of computing the delay inline. The hook surface, log shape, and retry-vs-fallback decision are unchanged.Tests added:
retry-after.spec.ts— 18 unit tests covering parsing edge cases (delta-seconds, HTTP-date, malformed values, past dates, safety cap, missing header, lowercase/canonical precedence).agent.spec.ts— one integration test that verifies aRetry-After: 30on a 429-shaped error flows through tosetTimeoutas 30000 ms.fixes #1276
Notes for reviewers
Math.max(serverHint, exponentialFloor)choice is deliberate: a server that returnsRetry-After: 0should still wait the exponential floor on subsequent attempts, otherwise a hot-loop retry storm is possible. If you prefer "server hint wins absolutely," I'm happy to flip it.MAX_RETRY_AFTER_MS) is tunable; the value matches what most HTTP clients use as a sane upper bound. I kept it as a module-local constant rather than a config knob to avoid expanding the public surface in this PR.executeWithModelFallbackalready disables AI SDK internal retries (maxRetries: 0) for all four entry points, so this is the single retry-delay site that needs the change.Summary by cubic
Honor the provider’s
Retry-Afterheader on model retries to respect server backoff and reduce retry storms. Parses both header forms, matches the header name case-insensitively, uses the server hint as a floor with a 5-minute cap; no API changes.Retry-Afterfromerror.responseHeadersinexecuteWithModelFallback(delta-seconds or HTTP-date); case-insensitive lookup.retry-afterhelpers (parseRetryAfter,getRetryAfterMs,computeRetryDelayMs) and tests; agent now callscomputeRetryDelayMs. Fixes agent.ts — 429 retry path ignores Retry-After, coordinated amplification under shared provider contention #1276.Written for commit 8aa662b. Summary will update on new commits. Review in cubic
Summary by CodeRabbit
Bug Fixes
Retry-Afterguidance as a minimum, while preserving exponential backoff and enforcing a 5-minute safety cap.Tests
Retry-Aftervalues, exponential fallback behavior, and delay clamping.Documentation