Skip to content

fix(core): prevent uncontrolled retry loop during fallback#27008

Open
adamfweidman wants to merge 5 commits into
mainfrom
fix-retry-loop-429
Open

fix(core): prevent uncontrolled retry loop during fallback#27008
adamfweidman wants to merge 5 commits into
mainfrom
fix-retry-loop-429

Conversation

@adamfweidman
Copy link
Copy Markdown
Contributor

Summary

Fixes an uncontrolled retry loop bug where a single user message gets re-sent 30-50+ times to the fallback model when the primary model returns a 429 error.

Details

The bug was caused by retryWithBackoff resetting the attempt counter and bypassing the delay on fallback, even when falling back to the same model. The fix ensures that it only skips the delay when transitioning to a new model, and respects the backoff delay when retrying the same model.

Related Issues

Fixes #23889

How to Validate

Run the reproduction test added in packages/core/src/utils/retry.test.ts:
npm test -w @google/gemini-cli-core -- src/utils/retry.test.ts
Verify that it waits between retries when falling back to the same model.

Pre-Merge Checklist

  • Updated relevant documentation and README (if needed)
  • Added/updated tests (if needed)
  • Noted breaking changes (if any)
  • Validated on required platforms/methods:
    • MacOS
      • npm run
      • npx
      • Docker
      • Podman
      • Seatbelt
    • Windows
      • npm run
      • npx
      • Docker
    • Linux
      • npm run
      • npx
      • Docker

The retry/fallback path was re-enqueuing the pending user message without deduplication or delay when falling back to the same model, causing a multiplication loop. This fix ensures that we only retry immediately when transitioning to a new model, and respect the backoff delay when retrying the same model.

TAG=agy

CONV=37b3cdb6-8437-4ced-ae4e-3d04054bacb1
@adamfweidman adamfweidman requested a review from a team as a code owner May 13, 2026 18:23
@github-actions
Copy link
Copy Markdown

⚠️ Invocation Drift Warning

The following file pairs should generally be kept in sync during the AgentSession migration:

  • non-interactive CLI: packages/cli/src/nonInteractiveCli.ts was modified but packages/cli/src/nonInteractiveCliAgentSession.ts was not.

If this is intentional (e.g., a bug fix specific to one implementation), you can ignore this comment.

This check will be removed once the legacy implementations are deleted.

@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 13, 2026

Size Change: +1.1 kB (0%)

Total Size: 34.1 MB

Filename Size Change
./bundle/chunk-CVJA4NGW.js 0 B -2.79 MB (removed) 🏆
./bundle/chunk-DTYQGBIG.js 0 B -49.2 kB (removed) 🏆
./bundle/chunk-LS2TQJ35.js 0 B -3.8 kB (removed) 🏆
./bundle/chunk-OCY5N34Q.js 0 B -3.43 kB (removed) 🏆
./bundle/chunk-TPN6AQRQ.js 0 B -659 kB (removed) 🏆
./bundle/chunk-UWM6DCB5.js 0 B -14.8 MB (removed) 🏆
./bundle/chunk-YHFK23HJ.js 0 B -12.5 kB (removed) 🏆
./bundle/chunk-ZDASTAYR.js 0 B -19.5 kB (removed) 🏆
./bundle/core-T7OOWSQO.js 0 B -49.3 kB (removed) 🏆
./bundle/devtoolsService-WC4FSMQQ.js 0 B -28 kB (removed) 🏆
./bundle/gemini-3NCAN3KL.js 0 B -587 kB (removed) 🏆
./bundle/interactiveCli-RD2SJ4LZ.js 0 B -1.3 MB (removed) 🏆
./bundle/liteRtServerManager-QOEWEFPI.js 0 B -2.11 kB (removed) 🏆
./bundle/oauth2-provider-K4WBTHNC.js 0 B -9.16 kB (removed) 🏆
./bundle/chunk-4XWYMCLE.js 2.79 MB +2.79 MB (new file) 🆕
./bundle/chunk-ASUFRT44.js 14.8 MB +14.8 MB (new file) 🆕
./bundle/chunk-D3ZCTCE2.js 19.5 kB +19.5 kB (new file) 🆕
./bundle/chunk-EGGVUBVW.js 659 kB +659 kB (new file) 🆕
./bundle/chunk-FUXXYIEQ.js 3.43 kB +3.43 kB (new file) 🆕
./bundle/chunk-G54EV4DR.js 12.5 kB +12.5 kB (new file) 🆕
./bundle/chunk-PTK3QWOE.js 49.2 kB +49.2 kB (new file) 🆕
./bundle/chunk-YRLE2FNI.js 3.8 kB +3.8 kB (new file) 🆕
./bundle/core-R3WQBOZ2.js 49.3 kB +49.3 kB (new file) 🆕
./bundle/devtoolsService-6BL247S4.js 28 kB +28 kB (new file) 🆕
./bundle/gemini-5UWZYUJ6.js 587 kB +587 kB (new file) 🆕
./bundle/interactiveCli-HMM5O5XS.js 1.3 MB +1.3 MB (new file) 🆕
./bundle/liteRtServerManager-BILPACCQ.js 2.11 kB +2.11 kB (new file) 🆕
./bundle/oauth2-provider-4PZESDEV.js 9.16 kB +9.16 kB (new file) 🆕
ℹ️ View Unchanged
Filename Size Change
./bundle/bundled/third_party/index.js 8 MB 0 B
./bundle/chunk-34MYV7JD.js 2.45 kB 0 B
./bundle/chunk-5AUYMPVF.js 858 B 0 B
./bundle/chunk-5PS3AYFU.js 1.18 kB 0 B
./bundle/chunk-664ZODQF.js 124 kB 0 B
./bundle/chunk-7BNI4AA2.js 1.98 MB 0 B
./bundle/chunk-DAHVX5MI.js 206 kB 0 B
./bundle/chunk-IUUIT4SU.js 56.5 kB 0 B
./bundle/chunk-RJTRUG2J.js 39.8 kB 0 B
./bundle/cleanup-PEUPB5A6.js 0 B -932 B (removed) 🏆
./bundle/devtools-36NN55EP.js 696 kB 0 B
./bundle/dist-T73EYRDX.js 356 B 0 B
./bundle/events-XB7DADIJ.js 418 B 0 B
./bundle/examples/hooks/scripts/on-start.js 188 B 0 B
./bundle/examples/mcp-server/example.js 1.43 kB 0 B
./bundle/gemini.js 5.1 kB 0 B
./bundle/getMachineId-bsd-TXG52NKR.js 1.55 kB 0 B
./bundle/getMachineId-darwin-7OE4DDZ6.js 1.55 kB 0 B
./bundle/getMachineId-linux-SHIFKOOX.js 1.34 kB 0 B
./bundle/getMachineId-unsupported-5U5DOEYY.js 1.06 kB 0 B
./bundle/getMachineId-win-6KLLGOI4.js 1.72 kB 0 B
./bundle/memoryDiscovery-URNH4AK4.js 980 B 0 B
./bundle/multipart-parser-KPBZEGQU.js 11.7 kB 0 B
./bundle/node_modules/@google/gemini-cli-devtools/dist/client/main.js 222 kB 0 B
./bundle/node_modules/@google/gemini-cli-devtools/dist/src/_client-assets.js 229 kB 0 B
./bundle/node_modules/@google/gemini-cli-devtools/dist/src/index.js 13.4 kB 0 B
./bundle/node_modules/@google/gemini-cli-devtools/dist/src/types.js 132 B 0 B
./bundle/sandbox-macos-permissive-open.sb 890 B 0 B
./bundle/sandbox-macos-permissive-proxied.sb 1.31 kB 0 B
./bundle/sandbox-macos-restrictive-open.sb 3.36 kB 0 B
./bundle/sandbox-macos-restrictive-proxied.sb 3.56 kB 0 B
./bundle/sandbox-macos-strict-open.sb 4.82 kB 0 B
./bundle/sandbox-macos-strict-proxied.sb 5.02 kB 0 B
./bundle/src-QVCVGIUX.js 47 kB 0 B
./bundle/start-TWOLEIBW.js 0 B -652 B (removed) 🏆
./bundle/tree-sitter-7U6MW5PS.js 274 kB 0 B
./bundle/tree-sitter-bash-34ZGLXVX.js 1.84 MB 0 B
./bundle/cleanup-26S6KJQZ.js 932 B +932 B (new file) 🆕
./bundle/start-QZ3VTWDD.js 652 B +652 B (new file) 🆕

compressed-size-action

@gemini-code-assist
Copy link
Copy Markdown
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses an issue where persistent 429 errors caused an uncontrolled retry loop when falling back to the same model. By differentiating between switching to a new model and retrying the existing one, the system now correctly applies backoff delays, preventing excessive request spamming.

Highlights

  • Retry Loop Correction: Modified the retry logic to prevent uncontrolled loops by ensuring that retries to the same model respect backoff delays, rather than bypassing them.
  • Fallback Logic Update: Updated the fallback handler to return the specific model name instead of a boolean, allowing the retry mechanism to distinguish between switching to a new model and retrying the current one.
  • Regression Testing: Added a new test case in retry.test.ts to verify that retries to the same model correctly implement wait times.
Ignored Files
  • Ignored by pattern: .gemini/** (1)
    • .gemini/settings.json
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Extracts the duplicated delay and backoff logic in retryWithBackoff into a helper function applyBackoffDelay to improve maintainability.

TAG=agy

CONV=37b3cdb6-8437-4ced-ae4e-3d04054bacb1
@adamfweidman
Copy link
Copy Markdown
Contributor Author

/gemini review

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request adds debug logging for non-interactive mode and updates the fallback handler to return model names instead of booleans. The core change modifies the retry logic to apply exponential backoff when retrying with the same model, rather than retrying immediately. Review feedback identified a critical logic flaw where the attempt counter is reset even when the model remains the same, which could lead to infinite retry loops and bypass maximum attempt limits. The reviewer recommended unifying the model resolution logic and ensuring the attempt counter is only reset when a new model is actually introduced.

Comment on lines +310 to 335
const currentModel = getAvailabilityContext?.()?.policy.model;

const fallbackModel = await onPersistent429(
authType,
classifiedError,
);
if (fallbackModel) {
attempt = 0; // Reset attempts and retry with the new model.
currentDelay = initialDelayMs;
continue;

// Only continue (immediate retry) if fallbackModel is a NEW model
if (
typeof fallbackModel === 'string' &&
fallbackModel !== currentModel
) {
continue;
} else {
// If it's the same model (or a boolean retry signal), wait before retrying
currentDelay = await applyBackoffDelay(
currentDelay,
maxDelayMs,
signal,
);
continue;
}
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-high high

The fallback logic in retryWithBackoff contains a logic flaw that can lead to an infinite retry loop, bypassing the maxAttempts limit. This occurs because the attempt counter is unconditionally reset to 0 when a fallback model is suggested, even if the model remains the same. This violates the requirement for recursive handlers to have a mechanism to limit retry attempts. Additionally, the logic should use the resolved primary model from the availability context to ensure consistency and avoid re-evaluating conditions. Resetting currentDelay when falling back to the same model is too aggressive; exponential backoff should continue unless a new model is actually introduced.

            const fallbackModel = await onPersistent429(
              authType,
              classifiedError,
            );
            if (fallbackModel) {
              const currentContext = getAvailabilityContext?.();
              const currentModel = currentContext?.policy.model;

              if (
                typeof fallbackModel === 'string' &&
                fallbackModel !== currentModel
              ) {
                attempt = 0;
                currentDelay = initialDelayMs;
                continue;
              } else {
                const jitter = currentDelay * 0.3 * (Math.random() * 2 - 1);
                const delayWithJitter = Math.max(0, currentDelay + jitter);
                await delay(delayWithJitter, signal);
                currentDelay = Math.min(maxDelayMs, currentDelay * 2);
                continue;
              }
            }
References
  1. A recursive error/reconnect handler is acceptable as long as it includes a mechanism to limit the maximum number of retry attempts to prevent infinite loops.
  2. Avoid overlapping checks in model resolution logic by relying on condensed sources of truth. Use the resolved primary model directly rather than re-evaluating conditions.
  3. In retry loops, re-evaluate dynamic conditions like the maximum number of attempts on each iteration to ensure the logic adapts to potential state changes.

Comment on lines +376 to 401
const currentModel = getAvailabilityContext?.()?.policy.model;

const fallbackModel = await onPersistent429(
authType,
classifiedError,
);
if (fallbackModel) {
attempt = 0; // Reset attempts and retry with the new model.
currentDelay = initialDelayMs;
continue;

// Only continue (immediate retry) if fallbackModel is a NEW model
if (
typeof fallbackModel === 'string' &&
fallbackModel !== currentModel
) {
continue;
} else {
// If it's the same model (or a boolean retry signal), wait before retrying
currentDelay = await applyBackoffDelay(
currentDelay,
maxDelayMs,
signal,
);
continue;
}
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-high high

This block duplicates the logic flaw where the attempt counter is prematurely reset, potentially causing infinite loops. To ensure consistency and reduce logic duplication, this model resolution and retry logic should be unified. The check should rely on the resolved primary model from the context rather than static constants or redundant evaluations.

              const fallbackModel = await onPersistent429(
                authType,
                classifiedError,
              );
              if (fallbackModel) {
                const currentContext = getAvailabilityContext?.();
                const currentModel = currentContext?.policy.model;

                if (
                  typeof fallbackModel === 'string' &&
                  fallbackModel !== currentModel
                ) {
                  attempt = 0;
                  currentDelay = initialDelayMs;
                  continue;
                } else {
                  const jitter = currentDelay * 0.3 * (Math.random() * 2 - 1);
                  const delayWithJitter = Math.max(0, currentDelay + jitter);
                  await delay(delayWithJitter, signal);
                  currentDelay = Math.min(maxDelayMs, currentDelay * 2);
                  continue;
                }
              }
References
  1. Recursive handlers must include a mechanism to limit the maximum number of retry attempts to prevent infinite loops.
  2. Avoid overlapping checks in model resolution logic and reduce logic duplication by using the resolved primary model directly.

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request updates the fallback handler to return model identifiers and introduces a backoff mechanism in the retry logic when the fallback model is identical to the current one. A new test case was added to verify this behavior. Review feedback identified a potential infinite loop vulnerability if the current model is undefined, which would cause the logic to skip the necessary backoff delay and reset retry attempts prematurely.

Comment on lines +321 to +334
if (
typeof fallbackModel === 'string' &&
fallbackModel !== currentModel
) {
continue;
} else {
// If it's the same model (or a boolean retry signal), wait before retrying
currentDelay = await applyBackoffDelay(
currentDelay,
maxDelayMs,
signal,
);
continue;
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-high high

The retry logic is vulnerable to a Denial of Service (DoS) due to uncontrolled retry loops and non-exponential backoff. This occurs because the attempt counter and currentDelay are reset before checking if the fallbackModel is new. Specifically, if currentModel is undefined, the condition fallbackModel !== currentModel incorrectly evaluates to true, leading to immediate retries without delay. This violates the repository rule requiring a mechanism to limit retry attempts to prevent infinite loops.

              // Only continue (immediate retry) if fallbackModel is a NEW model
              if (
                typeof fallbackModel === 'string' &&
                currentModel !== undefined &&
                fallbackModel !== currentModel
              ) {
                continue;
              } else {
                // If it's the same model (or a boolean retry signal), wait before retrying
                currentDelay = await applyBackoffDelay(
                  currentDelay,
                  maxDelayMs,
                  signal,
                );
                continue;
              }
References
  1. A recursive error/reconnect handler is acceptable as long as it includes a mechanism to limit the maximum number of retry attempts to prevent infinite loops.

Comment on lines +387 to +400
if (
typeof fallbackModel === 'string' &&
fallbackModel !== currentModel
) {
continue;
} else {
// If it's the same model (or a boolean retry signal), wait before retrying
currentDelay = await applyBackoffDelay(
currentDelay,
maxDelayMs,
signal,
);
continue;
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-high high

This section contains a similar Denial of Service (DoS) vulnerability. The attempt counter and currentDelay are reset before verifying if the fallbackModel is new. This, combined with the backoff delay being skipped if currentModel is undefined, can lead to an infinite loop and broken exponential backoff. This violates the repository rule requiring a mechanism to limit retry attempts to prevent infinite loops.

                // Only continue (immediate retry) if fallbackModel is a NEW model
                if (
                  typeof fallbackModel === 'string' &&
                  currentModel !== undefined &&
                  fallbackModel !== currentModel
                ) {
                  continue;
                } else {
                  // If it's the same model (or a boolean retry signal), wait before retrying
                  currentDelay = await applyBackoffDelay(
                    currentDelay,
                    maxDelayMs,
                    signal,
                  );
                  continue;
                }
References
  1. A recursive error/reconnect handler is acceptable as long as it includes a mechanism to limit the maximum number of retry attempts to prevent infinite loops.

@gemini-cli gemini-cli Bot added priority/p1 Important and should be addressed in the near term. area/core Issues related to User Interface, OS Support, Core Functionality area/agent Issues related to Core Agent, Tools, Memory, Sub-Agents, Hooks, Agent Quality labels May 13, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/agent Issues related to Core Agent, Tools, Memory, Sub-Agents, Hooks, Agent Quality area/core Issues related to User Interface, OS Support, Core Functionality priority/p1 Important and should be addressed in the near term.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Bug: User message duplicated 30-50x during pro → flash model fallback

1 participant