Skip to content

fix(design): honor Retry-After header in variants 429 handler#1337

Closed
stedfn wants to merge 1 commit intogarrytan:mainfrom
stedfn:fix/design-variants-retry-after-1244
Closed

fix(design): honor Retry-After header in variants 429 handler#1337
stedfn wants to merge 1 commit intogarrytan:mainfrom
stedfn:fix/design-variants-retry-after-1244

Conversation

@stedfn
Copy link
Copy Markdown
Contributor

@stedfn stedfn commented May 6, 2026

Closes #1244.

Problem

design/src/variants.ts's 429 handler in generateVariant discards the Retry-After response header and falls through to a local exponential schedule (2s/4s/8s). On image-generation batches, that can burn all retry attempts inside the provider's cooldown window and the request never recovers.

Fix

Parse Retry-After per RFC 7231 — both delta-seconds and HTTP-date. Honored waits capped at 60s to bound stalls from hostile or buggy headers. Delta-seconds validated as digits-only (rejects 2abc). When honored (including 0 and past dates — "retry now"), skip the next iteration's leading exponential sleep so we don't double-wait. Invalid or missing headers fall through to the existing exponential schedule unchanged.

Behavior matrix

Header Behavior
Retry-After: 5 wait 5s, skip leading on next attempt
Retry-After: 999999 capped to 60s, skip leading
Retry-After: 2abc invalid, fall through to exponential
Retry-After: 0 wait 0, skip leading (retry immediately)
Retry-After: <past HTTP-date> wait 0, skip leading
Retry-After: <future date> wait diff capped at 60s, skip leading
no header fall through to existing exponential

Test plan

  • New: design/test/variants-retry-after.test.ts — 5 cases covering delta-seconds, HTTP-date, invalid, no-header, and Retry-After: 0. Each asserts both the call count and the 1st-to-2nd call timing gap so the fall-through and the no-double-wait properties are both verified.
  • generateVariant now accepts an optional fetchFn parameter (default globalThis.fetch) so the tests can inject a stub. Production call sites unchanged.
  • Run: bun test design/test/variants-retry-after.test.ts — all 5 pass in ~8s locally.

Notes

  • Cap value (60s) is exercised by code review (Math.min(..., MAX_RETRY_AFTER_MS)); not asserted at runtime to keep the test under 10s.
  • HTTP-date test uses a 3s offset to avoid second-resolution flake (HTTP dates have second resolution; Date.now() is ms).

Closes garrytan#1244.

The 429 handler in `generateVariant` discarded the `Retry-After` response
header and fell straight through to a local exponential schedule (2s/4s/8s).
In image-generation batches, that burns retry attempts inside the provider's
cooldown window and the request never recovers.

Now we parse `Retry-After` per RFC 7231 — both delta-seconds (`Retry-After: 5`)
and HTTP-date (`Retry-After: Fri, 31 Dec 1999 23:59:59 GMT`). Honored waits
are capped at 60s to bound stalls from hostile or buggy headers. Delta-seconds
are validated as digits-only (rejects `2abc`). When `Retry-After` is honored
(including 0 / past-date "retry now"), the next iteration's leading exponential
sleep is skipped so we don't double-wait. Invalid or missing headers fall
through to the existing exponential schedule unchanged.

Behavior matrix:

| Header                          | Behavior                                  |
|---------------------------------|-------------------------------------------|
| Retry-After: 5                  | wait 5s, skip leading on next attempt     |
| Retry-After: 999999             | capped to 60s, skip leading               |
| Retry-After: 2abc               | invalid, fall through to exponential      |
| Retry-After: 0                  | wait 0, skip leading (retry immediately)  |
| Retry-After: <past HTTP-date>   | wait 0, skip leading                      |
| Retry-After: <future date>      | wait diff capped at 60s, skip leading     |
| no header                       | fall through to existing exponential      |

`generateVariant` now accepts an optional `fetchFn` parameter (defaults to
`globalThis.fetch`) so tests can inject a stub. Production call sites are
unchanged.

Tests cover the five behavior buckets above, asserting both the 1st-to-2nd
call timing gap and call counts. All five pass in ~8s.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@garrytan
Copy link
Copy Markdown
Owner

Thanks @stedfn — your fix shipped in v1.30.0.0 (#1391) with credit in the CHANGELOG. Closing since it's already on main. Appreciate the contribution.

@garrytan garrytan closed this May 10, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

design/src/variants.ts: 429 retry path ignores Retry-After header

2 participants