fix(core): honor Retry-After header on retried model calls #1283

Open
truffle-dev wants to merge 2 commits into VoltAgent:main from truffle-dev:fix/retry-after-header-in-429-path-1276

Conversation

@truffle-dev
Contributor

@truffle-dev truffle-dev commented May 14, 2026

PR Checklist

  • The commit message follows the conventional-commit convention

Bugs / Features

What is the current behavior?

The retry loop in executeWithModelFallback is the single retry-delay site for streamText / generateText / streamObject / generateObject, because their AI-SDK-internal retries are disabled with maxRetries: 0. It always uses local exponential backoff capped at 10 seconds:

const retryDelayMs = Math.min(1000 * 2 ** attemptIndex, 10000);

APICallError carries the provider's response headers, but they are dropped on the floor. So when a provider responds 429 with Retry-After: 30, the agent tries again in 1–10 seconds and gets rate-limited again, and N concurrent agents under the same provider key converge their retry windows into roughly the same instant.
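To make the mismatch concrete, here is a minimal sketch that tabulates the delays produced by the inline expression quoted above. The function name `legacyRetryDelayMs` is hypothetical and only wraps that expression for illustration.

```typescript
// Hypothetical wrapper around the inline delay expression quoted above.
function legacyRetryDelayMs(attemptIndex: number): number {
  return Math.min(1000 * 2 ** attemptIndex, 10000);
}

// Delays for the first five attempts: 1s, 2s, 4s, 8s, then the 10s cap.
const legacyDelays: number[] = [0, 1, 2, 3, 4].map(legacyRetryDelayMs);
console.log(legacyDelays); // [1000, 2000, 4000, 8000, 10000]

// A provider that sends Retry-After: 30 is asking for 30000 ms.
// No attempt under the old formula ever waits that long.
const serverHintMs = 30_000;
const everWaitsLongEnough: boolean = legacyDelays.some((d) => d >= serverHintMs);
console.log(everWaitsLongEnough); // false
```

Because every agent computes the same deterministic sequence, agents that hit the 429 at about the same time also retry at about the same time.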

What is the new behavior?

Move the retry-delay math into a small retry-after module:

  • parseRetryAfter(value, nowMs?) understands both forms in RFC 7231 §7.1.3 (delta-seconds and HTTP-date).
  • getRetryAfterMs(error, nowMs?) pulls the header off error.responseHeaders in either case (lowercase or canonical).
  • computeRetryDelayMs(error, attemptIndex, nowMs?) returns max(serverHint, exponentialFloor) when a header is present, keeping the exponential floor as a backpressure baseline so Retry-After: 0 still spaces things out. Result is capped at 5 minutes so a misconfigured or hostile server can't pin the agent.
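The three functions above could look roughly like the sketch below. It is not the code in this PR: the function names, `nowMs` parameter, and `MAX_RETRY_AFTER_MS` come from the description, while the bodies, the `undefined`-on-failure return contract, and the names `MAX_BACKOFF_MS` and `ErrorWithHeaders` are assumptions made for illustration.

```typescript
// Sketch only: names follow the PR description, bodies are assumptions.
const MAX_RETRY_AFTER_MS = 5 * 60 * 1000; // 5-minute safety cap
const MAX_BACKOFF_MS = 10_000; // hypothetical name for the existing 10 s backoff cap

// Hypothetical shape: only the responseHeaders field matters here.
type ErrorWithHeaders = { responseHeaders?: Record<string, string | undefined> };

// Returns a delay in ms, or undefined for a missing or malformed value (assumed contract).
function parseRetryAfter(value: string | undefined, nowMs: number = Date.now()): number | undefined {
  if (value === undefined) return undefined;
  const trimmed = value.trim();

  // Form 1: delta-seconds, a non-negative decimal integer.
  if (/^\d+$/.test(trimmed)) {
    return Math.min(Number(trimmed) * 1000, MAX_RETRY_AFTER_MS);
  }

  // Form 2: HTTP-date. Every HTTP-date format starts with a day name, so
  // anything that does not start with a letter is rejected before Date.parse.
  if (!/^[A-Za-z]/.test(trimmed)) return undefined;
  const dateMs = Date.parse(trimmed);
  if (Number.isNaN(dateMs)) return undefined;

  // Past dates normalize to 0; far-future dates are clamped to the cap.
  return Math.min(Math.max(dateMs - nowMs, 0), MAX_RETRY_AFTER_MS);
}

function getRetryAfterMs(error: unknown, nowMs: number = Date.now()): number | undefined {
  const headers = (error as ErrorWithHeaders | null | undefined)?.responseHeaders;
  if (!headers) return undefined;
  // Lowercase spelling first, then the canonical one, as described above.
  return parseRetryAfter(headers["retry-after"] ?? headers["Retry-After"], nowMs);
}

function computeRetryDelayMs(error: unknown, attemptIndex: number, nowMs: number = Date.now()): number {
  const exponentialFloor = Math.min(1000 * 2 ** attemptIndex, MAX_BACKOFF_MS);
  const serverHint = getRetryAfterMs(error, nowMs);
  if (serverHint === undefined) return exponentialFloor; // no usable header: old behavior
  return Math.min(Math.max(serverHint, exponentialFloor), MAX_RETRY_AFTER_MS);
}

// Usage with plain objects standing in for a 429-shaped error.
console.log(computeRetryDelayMs({ responseHeaders: { "retry-after": "30" } }, 0)); // 30000
console.log(computeRetryDelayMs({ responseHeaders: { "retry-after": "0" } }, 2)); // 4000
console.log(computeRetryDelayMs({ responseHeaders: { "retry-after": "3600" } }, 0)); // 300000
console.log(computeRetryDelayMs({}, 1)); // 2000
```

`Date.parse` is more lenient than the RFC grammar in some engines, so a stricter implementation might validate the HTTP-date shape explicitly before parsing.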

Then agent.ts calls computeRetryDelayMs(error, attemptIndex) instead of computing the delay inline. The hook surface, log shape, and retry-vs-fallback decision are unchanged.

Tests added:

  • retry-after.spec.ts — 18 unit tests covering parsing edge cases (delta-seconds, HTTP-date, malformed values, past dates, safety cap, missing header, lowercase/canonical precedence).
  • agent.spec.ts — one integration test that verifies a Retry-After: 30 on a 429-shaped error flows through to setTimeout as 30000 ms.

fixes #1276

Notes for reviewers

  • The Math.max(serverHint, exponentialFloor) choice is deliberate: a server that returns Retry-After: 0 should still wait the exponential floor on subsequent attempts, otherwise a hot-loop retry storm is possible. If you prefer "server hint wins absolutely," I'm happy to flip it.
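The trade-off in the first note can be seen by comparing the two policies side by side for a server that keeps answering Retry-After: 0. This is an illustrative sketch: `hintAsFloor` and `hintWins` are hypothetical names, and the 5-minute cap is left out because the hints used here are far below it.

```typescript
// Exponential backoff with the existing 10 s cap.
const exponentialFloorMs = (attemptIndex: number): number =>
  Math.min(1000 * 2 ** attemptIndex, 10_000);

// Policy chosen in this PR: the server hint is a floor on top of the backoff.
const hintAsFloor = (hintMs: number, attemptIndex: number): number =>
  Math.max(hintMs, exponentialFloorMs(attemptIndex));

// Alternative policy: the server hint wins absolutely.
const hintWins = (hintMs: number, _attemptIndex: number): number => hintMs;

const attemptIndexes = [0, 1, 2, 3];

// Server says Retry-After: 0 on every response.
console.log(attemptIndexes.map((i) => hintAsFloor(0, i))); // [1000, 2000, 4000, 8000]
console.log(attemptIndexes.map((i) => hintWins(0, i))); // [0, 0, 0, 0]

// Server says Retry-After: 30: both policies agree.
console.log(attemptIndexes.map((i) => hintAsFloor(30_000, i))); // [30000, 30000, 30000, 30000]
```

With "server hint wins," a zero hint removes all spacing between attempts; with the floor, retries still back off.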
  • 5-minute safety cap (MAX_RETRY_AFTER_MS) is tunable; the value matches what most HTTP clients use as a sane upper bound. I kept it as a module-local constant rather than a config knob to avoid expanding the public surface in this PR.
  • executeWithModelFallback already disables AI SDK internal retries (maxRetries: 0) for all four entry points, so this is the single retry-delay site that needs the change.

Summary by cubic

Honor the provider’s Retry-After header on model retries to respect server backoff and reduce retry storms. Parses both header forms, matches the header name case-insensitively, uses the server hint as a floor with a 5-minute cap; no API changes.

Written for commit 8aa662b. Summary will update on new commits. Review in cubic

Summary by CodeRabbit

  • Bug Fixes

    • Retry logic for model calls now respects provider Retry-After guidance as a minimum, while preserving exponential backoff and enforcing a 5-minute safety cap.
  • Tests

    • Added unit tests covering parsing and enforcement of Retry-After values, exponential fallback behavior, and delay clamping.
  • Documentation

    • Added a changeset summarizing the updated retry behavior.

Review Change Stack

The retry loop in `executeWithModelFallback` always used local exponential
backoff capped at 10 seconds, regardless of what the server asked for.
Under shared provider contention this caused concurrent agents to retry
inside the very window the provider had just told them to wait out,
amplifying load on already-overloaded endpoints.

Move the retry-delay math into a small `retry-after` module that parses
both delta-seconds and HTTP-date forms (RFC 7231 §7.1.3), takes the server
hint as a floor, keeps the exponential floor as a backpressure baseline,
and caps at 5 minutes so a misconfigured or hostile server cannot pin the
agent for hours.

Closes VoltAgent#1276.
@changeset-bot

changeset-bot Bot commented May 14, 2026

🦋 Changeset detected

Latest commit: 8aa662b

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package
| Name | Type |
| --- | --- |
| @voltagent/core | Patch |

@coderabbitai
Contributor

coderabbitai Bot commented May 14, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: b834cf8e-a026-497a-b3d7-f8191c3b8b7f

📥 Commits

Reviewing files that changed from the base of the PR and between b6f5b8c and 8aa662b.

📒 Files selected for processing (2)
  • packages/core/src/agent/retry-after.spec.ts
  • packages/core/src/agent/retry-after.ts

📝 Walkthrough

This PR adds RFC 7231 Retry-After parsing utilities and integrates them into Agent retry logic so server-provided Retry-After values serve as a minimum delay alongside exponential backoff, with a 5-minute safety cap; tests and a changeset were added.

Changes

Retry-After Header Support

| Layer / File(s) | Summary |
| --- | --- |
| Retry-After RFC 7231 parsing utilities: packages/core/src/agent/retry-after.ts, packages/core/src/agent/retry-after.spec.ts | parseRetryAfter parses delta-seconds or HTTP-date values to milliseconds (validates, normalizes past dates to 0, clamps to a 5-minute max); getRetryAfterMs extracts Retry-After from error responseHeaders case-insensitively; computeRetryDelayMs returns max(exponentialBackoff, serverHint). Tests cover parsing, header lookup, floor semantics, and clamping. |
| Agent model retry integration: packages/core/src/agent/agent.ts, packages/core/src/agent/agent.spec.ts | Agent imports computeRetryDelayMs and replaces its inline exponential backoff with the utility; a new unit test verifies that a retryable 429 with retry-after: 30 schedules a 30s delay before retry. |
| Changeset documentation: .changeset/honor-retry-after-header.md | Documents switch from local exponential backoff (10s cap) to honoring provider Retry-After hints as a minimum delay with a 5-minute safety cap. |

🎯 3 (Moderate) | ⏱️ ~25 minutes

🐰 I sniffed the header on the breeze,

"Wait a while," it said with ease.
No more huddled rush and race—
We'll hop back gently, keep our pace. 🥕✨

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The PR title clearly and concisely describes the main change: honoring the Retry-After header on retried model calls, which directly addresses the core issue being fixed. |
| Description check | ✅ Passed | The description is comprehensive and well-structured, covering current vs. new behavior, design choices, test additions, and reviewer notes, with proper checklist completion. |
| Linked Issues check | ✅ Passed | The PR successfully implements all core objectives from #1276: reads and honors Retry-After header, uses it as minimum retry delay with exponential backoff floor, caps at 5 minutes, and falls back to exponential backoff when absent. |
| Out of Scope Changes check | ✅ Passed | All code changes are directly related to honoring the Retry-After header. The new retry-after module, agent.ts modification, and comprehensive tests are all in-scope for the stated objective. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 ESLint

If the error stems from missing dependencies, add them to the package.json file. For unrecoverable errors (e.g., due to private dependencies), disable the tool in the CodeRabbit configuration.

ESLint skipped: no ESLint configuration detected in root package.json. To enable, add eslint to devDependencies.


Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@packages/core/src/agent/retry-after.ts`:
- Around line 69-76: getRetryAfterMs currently only checks
headers["retry-after"] and headers["Retry-After"], which misses mixed-case
names; change the lookup to be case-insensitive by normalizing header keys
(e.g., iterate Object.keys(responseHeaders) and compare key.toLowerCase() ===
"retry-after") or build a lower-cased map before fetching the value, then pass
the found raw value to parseRetryAfter; update references in getRetryAfterMs to
use the normalized lookup of responseHeaders rather than the two exact keys.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 12f4e233-59be-4d06-ac53-723b2cc3d0dd

📥 Commits

Reviewing files that changed from the base of the PR and between 08414ed and b6f5b8c.

📒 Files selected for processing (5)
  • .changeset/honor-retry-after-header.md
  • packages/core/src/agent/agent.spec.ts
  • packages/core/src/agent/agent.ts
  • packages/core/src/agent/retry-after.spec.ts
  • packages/core/src/agent/retry-after.ts

Comment thread packages/core/src/agent/retry-after.ts
Contributor

@cubic-dev-ai cubic-dev-ai Bot left a comment

No issues found across 5 files

`responseHeaders` is normalized to lowercase by the AI SDK, but providers
that build the bag from a raw `fetch` Response can leak any casing
through, so the lookup needs to match RFC 7230 §3.2 case-insensitively
instead of only checking `retry-after` and `Retry-After`.

Adds three regression tests for mixed-case spellings.
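A case-insensitive lookup of the kind this commit describes might be sketched as follows. The helper name `findHeader` and its signature are hypothetical, not taken from the PR; the approach is the key-normalizing scan suggested in the review comment.

```typescript
// Hypothetical helper: find a header value regardless of how its name is cased.
function findHeader(
  headers: Record<string, string | undefined> | undefined,
  name: string,
): string | undefined {
  if (!headers) return undefined;
  const wanted = name.toLowerCase();
  for (const key of Object.keys(headers)) {
    // Header field names are case-insensitive, so compare lowercased keys.
    if (key.toLowerCase() === wanted) return headers[key];
  }
  return undefined;
}

// All of these spellings resolve to the same header.
console.log(findHeader({ "retry-after": "30" }, "retry-after")); // "30"
console.log(findHeader({ "Retry-After": "30" }, "retry-after")); // "30"
console.log(findHeader({ "RETRY-AFTER": "30" }, "retry-after")); // "30"
console.log(findHeader({ "ReTrY-aFtEr": "30" }, "retry-after")); // "30"
console.log(findHeader({ "content-type": "application/json" }, "retry-after")); // undefined
```

If a header bag ever contained two differently cased copies of the name, this scan returns whichever key is enumerated first; a real implementation would need to decide on a precedence rule.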

Development

Successfully merging this pull request may close these issues.

agent.ts — 429 retry path ignores Retry-After, coordinated amplification under shared provider contention

1 participant