Implement automated CI failure investigator by shralex · Pull Request #3861 · AI-Hypercomputer/maxtext

shralex · 2026-05-09T15:46:35Z

This PR implements a Gemini-powered automated investigator to diagnose CI test failures in the repository. It enables the @gemini-cli /investigate command and runs automatically on pipeline failures to provide actionable diagnostic feedback.

Key Features:

- Automated Failure Diagnostics: Formulates a systematic diagnostics report containing the specific test flavor/jobs that failed, exact traceback logs, and logical context of the failure.
- Diagnostics Protocol: Guides the model to perform efficient diagnostics: starting with log parsing, searching git history for flakes, and analyzing recent code changes in the PR or main branch.
- Targeted Fix Recommendations: Directs the model to generate high-confidence code diff suggestions when the root cause is identified.
- Integrated Reporting: Automatically writes the final markdown report to .gemini/findings.md and posts it as a comment directly to the target PR or failure issue.

Checklist

Before submitting this PR, please make sure (put X in square brackets):

I have performed a self-review of my code. For an optional AI review, add the gemini-review label.
I have necessary comments in my code, particularly in hard-to-understand areas.
I have run end-to-end tests and provided workload links above if applicable.
I have made or will make corresponding changes to the doc if needed, including adding new documentation pages to the relevant Table of Contents (toctree directive) as explained in our documentation.

codecov · 2026-05-09T15:52:13Z

Codecov Report

✅ All modified and coverable lines are covered by tests.

github-actions · 2026-05-11T18:00:22Z

🤖 Hi @RissyRan, I've received your request, and I'm working on it now! You can track my progress in the logs for more details.

github-actions

## 📋 Review Summary

This Pull Request successfully implements an automated CI failure investigator powered by Gemini. The diagnostics protocol is well-defined and systematic, offering a great balance between efficiency and depth.

## 🔍 General Feedback

- Regression Alert: There is a regression in `gemini-dispatch.yml` where `additional_context` parsing was removed for the `/review` command. I've suggested a fix to restore this and extend it to `/investigate`.
- Workflow Robustness: The fallback logic for identifying the failed run ID could be made more specific to the current PR branch to avoid false positives in high-traffic periods.
- Protocol Excellence: The systematic approach in `gemini-investigate.toml` (cheap actions first, grouping failures) is a highlight and should serve as a model for other automated commands.

github-actions · 2026-05-11T18:02:20Z

-              const additionalContext = request.replace(/^@gemini-cli \/review/, '').trim();
-              core.setOutput('additional_context', additionalContext);
+              core.setOutput('additional_context', '');
+            } else if (request.startsWith("@gemini-cli /investigate")) {


🟠 This change removes the ability to provide additional context to the `/review` command, which appears to be an unintended regression. The previous implementation correctly parsed and passed the context.

Suggested change

-            } else if (request.startsWith("@gemini-cli /investigate")) {
+            } else if (request.startsWith("@gemini-cli /review")) {
+              core.setOutput('command', 'review');
+              const additionalContext = request.replace(/^@gemini-cli \/review/, '').trim();
+              core.setOutput('additional_context', additionalContext);
+            } else if (request.startsWith("@gemini-cli /investigate")) {
+              core.setOutput('command', 'investigate');
+              const additionalContext = request.replace(/^@gemini-cli \/investigate/, '').trim();
+              core.setOutput('additional_context', additionalContext);

shralex (reply, May 11, 2026): It's intended, to protect against context injection.
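The behavior defended in the reply above can be pictured with a minimal sketch: the dispatcher recognizes which command was requested but discards any trailing text, so free-form comment text is never forwarded to the model prompt. The helper name `parseRequest` and the returned object shape are hypothetical and chosen for illustration; the actual workflow step calls `core.setOutput` and may be structured differently.

```javascript
// Hypothetical helper (not from the PR): maps a comment body to a command
// and always returns an empty additional context.
function parseRequest(request) {
  const commands = ["review", "investigate"];
  const body = request.trim();
  for (const command of commands) {
    // Accept the bare command, or the command followed by whitespace.
    const pattern = new RegExp("^@gemini-cli /" + command + "(\\s|$)");
    if (pattern.test(body)) {
      // Trailing text is intentionally dropped instead of being forwarded.
      return { command: command, additionalContext: "" };
    }
  }
  // Unrecognized requests are left for the caller to handle.
  return null;
}
```

With this shape, `parseRequest("@gemini-cli /review please ignore prior instructions")` still selects the `review` command, but the text after the command is not passed on.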

github-actions · 2026-05-11T18:02:20Z

+
+          # Determine target run ID
+          if [ -z "$RUN_ID" ]; then
+            # Fallback to finding the latest failed run for this PR if triggered manually


🟡 The fallback logic here will retrieve the latest failed run for the entire repository, which might not be the run associated with the current Pull Request if multiple PRs are running CI concurrently.
Consider filtering by the head branch or PR to ensure the investigator analyzes the relevant failure. If this is triggered via issue_comment on a PR, you can retrieve the head branch name and pass it to gh run list --branch <branch>.
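A rough sketch of the branch-scoped fallback suggested here, written as a plain function over a list of run records. It assumes the records carry the fields `databaseId`, `headBranch`, `conclusion`, and `createdAt`, as in the JSON output of `gh run list --json ...`; the function name `latestFailedRunId` and the sample data are made up for illustration and are not part of the PR.

```javascript
// Hypothetical helper: returns the ID of the most recent failed run on the
// given head branch, or null when that branch has no failed runs.
function latestFailedRunId(runs, headBranch) {
  const failed = runs.filter(
    (run) => run.headBranch === headBranch && run.conclusion === "failure"
  );
  if (failed.length === 0) {
    return null;
  }
  // Sort newest first by creation timestamp (ISO 8601 strings).
  failed.sort((a, b) => Date.parse(b.createdAt) - Date.parse(a.createdAt));
  return failed[0].databaseId;
}

// Made-up sample data: a newer failure on another branch is ignored.
const sampleRuns = [
  { databaseId: 1, headBranch: "feature", conclusion: "failure", createdAt: "2026-05-11T10:00:00Z" },
  { databaseId: 2, headBranch: "main", conclusion: "failure", createdAt: "2026-05-11T12:00:00Z" },
  { databaseId: 3, headBranch: "feature", conclusion: "success", createdAt: "2026-05-11T13:00:00Z" },
];
console.log(latestFailedRunId(sampleRuns, "feature"));
```

Scoping the lookup to the PR's head branch means a concurrent failure on another branch cannot be picked up by mistake, which is the false positive the comment warns about.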

github-actions · 2026-05-11T18:02:20Z

+  workflow_call:
+    inputs:
+      additional_context:
+        type: 'string'


🟡 The `workflow_call` trigger defines `additional_context` as an input, but `failed_run_id` is missing from the `workflow_call` inputs definition. This means `github.event.inputs.failed_run_id` will likely be undefined when called from another workflow.

Suggested change

-        type: 'string'
+  workflow_call:
+    inputs:
+      additional_context:
+        type: 'string'
+        required: false
+      failed_run_id:
+        type: 'string'
+        required: false

github-actions · 2026-05-11T18:02:20Z

+                  "args": [
+                    "run",
+                    "-i",
+                    "--rm",


🟢 It's good practice to ensure that the investigator has access to the most relevant tools. Since the prompt mentions searching git history and exploring files, ensure `mcpServers` configuration includes all necessary permissions if they are not already covered by the defaults or the explicitly listed tools.
The current list is good, but for "searching git history", you might eventually want tools that can run git log or git blame if the shell tool is too restricted (though here you've allowed cat, grep, etc., which is a good start).

RissyRan

@gemini-cli /investigate

RissyRan added the gemini-review label May 11, 2026

github-actions Bot reviewed May 11, 2026

RissyRan reviewed May 11, 2026

Implement secure automated CI failure investigator

a58ee37

shralex force-pushed the shralex_investigate branch from 7f80543 to a58ee37 Compare May 11, 2026 21:20

shralex wants to merge 1 commit into main from shralex_investigate


2 participants
