feat(skills): add ci skill for automated failure replication#23720

Merged
mattKorwel merged 12 commits into main from mk-ci-skill
Mar 25, 2026

@mattKorwel
Collaborator

@mattKorwel mattKorwel commented Mar 24, 2026

Description

This PR introduces a specialized CI Skill to the repository, designed to streamline the developer feedback loop during the final stages of PR preparation.

The Problem (Before)

Previously, a developer pushing a change had a fragmented experience:

  1. Passive Waiting: Pushing to CI meant switching to a browser to monitor GitHub Actions or periodically running gh run list.
  2. Manual Triage: If a job failed, the developer had to manually dig through multi-megabyte logs to find the specific failing test file or lint error.
  3. Context Switching: Once a failure was identified, the developer had to manually construct the correct npm test -w <package> -- <path> command to reproduce it locally.
  4. Noise: CI logs are often filled with npm deprecation warnings and Git sync noise, making it easy to miss the actual root cause.

The Solution (After)

The new ci skill provides a high-signal, automated bridge between remote CI and local development:

  • Fail-Fast Monitoring: The status workflow provides a real-time, single-line status in the terminal and exits immediately upon the first failure.
  • Automated Replication: The replicate workflow (default) not only monitors CI but also automatically runs the local commands (tests, lint) needed to reproduce a failure as soon as it happens.
  • Smart Extraction: It uses a specialized script (ci.mjs) to parse logs via the GitHub API, filter out noise, and generate exact, copy-pasteable (or auto-executed) commands tailored to the monorepo structure.
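The fail-fast behavior described above can be sketched roughly as follows. This is an illustrative sketch, not the code in ci.mjs: the helper names (`firstFailedJob`, `statusLine`, `monitor`) and the injected `fetchJobs` function are hypothetical, and the job objects are assumed to carry the `status` and `conclusion` fields that the GitHub REST API returns for workflow jobs.

```javascript
// Sketch only: helper names are hypothetical; job objects are assumed to have
// the `status` / `conclusion` fields returned by the GitHub REST API.

/** Returns the first completed job whose conclusion is "failure", or null. */
function firstFailedJob(jobs) {
  return (
    jobs.find((j) => j.status === 'completed' && j.conclusion === 'failure') ??
    null
  );
}

/** Builds a one-line summary such as "2/3 done, 1 running, 1 failed". */
function statusLine(jobs) {
  const done = jobs.filter((j) => j.status === 'completed').length;
  const running = jobs.filter((j) => j.status === 'in_progress').length;
  const failed = jobs.filter((j) => j.conclusion === 'failure').length;
  return `${done}/${jobs.length} done, ${running} running, ${failed} failed`;
}

/**
 * Polls `fetchJobs` (an injected async function, assumed to return the current
 * job list) until one job fails or every job has completed.
 */
async function monitor(fetchJobs, { intervalMs = 5000, log = console.log } = {}) {
  while (true) {
    const jobs = await fetchJobs();
    log(statusLine(jobs));
    const failed = firstFailedJob(jobs);
    if (failed) return { ok: false, failed }; // fail fast: stop at the first failure
    if (jobs.length > 0 && jobs.every((j) => j.status === 'completed')) {
      return { ok: true, failed: null };
    }
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

Injecting `fetchJobs` keeps the polling loop independent of how the job list is obtained, so it can be exercised with canned data.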

Developer Journey

  1. Push: Developer pushes their final commits for a PR.
  2. Activate: Developer asks Gemini CLI to "monitor CI" or "replicate failures."
  3. Automatic Triage: The agent enters a fail-fast monitor. If CI fails, the agent immediately reproduces the failure locally without the developer needing to touch a browser.
  4. Immediate Fix: The developer (or the agent) can immediately apply the fix based on the local reproduction.

Changes

  • Created .gemini/skills/ci/SKILL.md defining the status and replicate workflows.
  • Added .gemini/skills/ci/scripts/ci.mjs, a portable Node.js tool for high-performance CI monitoring and log parsing.

Verification

  • Verified the script correctly identifies failures in this repository's monorepo structure (mapping paths to @google/gemini-cli and @google/gemini-cli-core).
  • Packaged and validated the skill using the skill-creator toolchain.
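As a rough illustration of the path mapping that was verified, the sketch below turns an absolute path from a CI log into an `npm test -w <package> -- <path>` command. The package names come from this PR; the `packages/cli` and `packages/core` directory prefixes, the runner path in the usage note, and the function names are assumptions made for the example, not details taken from ci.mjs.

```javascript
// Hypothetical workspace table: package names are from the PR description,
// the directory prefixes are assumed for illustration.
const WORKSPACES = [
  { dir: 'packages/core/', name: '@google/gemini-cli-core' },
  { dir: 'packages/cli/', name: '@google/gemini-cli' },
];

/**
 * Maps an absolute path from a CI log to { workspace, relPath },
 * or returns null when the path is outside every known workspace.
 */
function mapToWorkspace(absPath) {
  const normalized = absPath.replace(/\\/g, '/'); // tolerate Windows separators
  for (const { dir, name } of WORKSPACES) {
    const idx = normalized.lastIndexOf('/' + dir);
    if (idx !== -1) {
      // Keep only the part of the path that is relative to the workspace root.
      return { workspace: name, relPath: normalized.slice(idx + 1 + dir.length) };
    }
  }
  return null;
}

/** Builds the local reproduction command for a failing test file, or null. */
function reproCommand(absPath) {
  const hit = mapToWorkspace(absPath);
  return hit ? `npm test -w ${hit.workspace} -- ${hit.relPath}` : null;
}
```

For example, under these assumptions `reproCommand('/home/runner/work/gemini-cli/gemini-cli/packages/core/src/tools/shell.test.ts')` yields `npm test -w @google/gemini-cli-core -- src/tools/shell.test.ts`.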

Technical Challenges & Why This Wasn't Easy

A significant hurdle in building this tool was handling the "race condition" of GitHub Actions logs.

  • The "Incomplete Logs" Problem: Standard gh run view --log commands frequently fail with the message "logs are not complete until run is done", making it impossible to perform fail-fast analysis while jobs are still in progress.
  • API Deep Dive: To work around this, the tool skips the standard CLI log view and instead calls the GitHub Actions REST API directly for individual job logs. This lets it extract failures while the workflow is still running, which is critical for a "fail-fast" monitor.
  • Monorepo Mapping: The script also manages the complexity of mapping absolute paths from CI logs back to the correct npm workspace (@google/gemini-cli vs @google/gemini-cli-core) to generate valid local commands.
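The per-job approach can be sketched as below. The two routes used (list jobs for a workflow run, download job logs) are documented GitHub REST API endpoints; everything else, including the function names, the `fetchImpl` parameter, and the error handling, is a hypothetical illustration rather than the logic in ci.mjs.

```javascript
// Sketch only: function names and the injected `fetchImpl` are hypothetical.
const API = 'https://api.github.com';

/** URL that lists the jobs of one workflow run. */
function jobsUrl(owner, repo, runId) {
  return `${API}/repos/${owner}/${repo}/actions/runs/${runId}/jobs?per_page=100`;
}

/** URL that returns (via redirect) the plain-text log of one job. */
function jobLogsUrl(owner, repo, jobId) {
  return `${API}/repos/${owner}/${repo}/actions/jobs/${jobId}/logs`;
}

/**
 * Fetches logs only for jobs that have already failed, without waiting for
 * the rest of the run. `fetchImpl` defaults to the global fetch (Node 18+).
 */
async function fetchFailedJobLogs({ owner, repo, runId, token, fetchImpl = fetch }) {
  const headers = {
    Accept: 'application/vnd.github+json',
    Authorization: `Bearer ${token}`,
  };
  const res = await fetchImpl(jobsUrl(owner, repo, runId), { headers });
  if (!res.ok) throw new Error(`jobs request failed: ${res.status}`);
  const { jobs } = await res.json();
  const failed = jobs.filter((j) => j.conclusion === 'failure');
  const logs = [];
  for (const job of failed) {
    const logRes = await fetchImpl(jobLogsUrl(owner, repo, job.id), { headers });
    if (!logRes.ok) continue; // this job's log is not available yet; skip it
    logs.push({ job: job.name, text: await logRes.text() });
  }
  return logs;
}
```

Requesting logs job by job is what allows a failed job to be inspected while sibling jobs in the same run are still in progress.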

@github-actions

github-actions bot commented Mar 24, 2026

Size Change: -4 B (0%)

Total Size: 26.3 MB

| Filename | Size | Change |
| --- | --- | --- |
| ./bundle/chunk-23DI7L5V.js | 0 B | -3.4 kB (removed) 🏆 |
| ./bundle/chunk-RGAW74XF.js | 0 B | -14.6 MB (removed) 🏆 |
| ./bundle/chunk-XN6LIP7Z.js | 0 B | -3.64 MB (removed) 🏆 |
| ./bundle/core-V2XIMZSL.js | 0 B | -43.4 kB (removed) 🏆 |
| ./bundle/devtoolsService-UUX2LQ3U.js | 0 B | -27.7 kB (removed) 🏆 |
| ./bundle/gemini-O3YTTRVA.js | 0 B | -521 kB (removed) 🏆 |
| ./bundle/interactiveCli-R244XGXP.js | 0 B | -1.62 MB (removed) 🏆 |
| ./bundle/oauth2-provider-PTOK7KZE.js | 0 B | -9.16 kB (removed) 🏆 |
| ./bundle/chunk-7FRIZULL.js | 14.6 MB | +14.6 MB (new file) 🆕 |
| ./bundle/chunk-GJ3N3BT2.js | 3.64 MB | +3.64 MB (new file) 🆕 |
| ./bundle/chunk-S5FQZGLM.js | 3.4 kB | +3.4 kB (new file) 🆕 |
| ./bundle/core-HM6E57XQ.js | 43.4 kB | +43.4 kB (new file) 🆕 |
| ./bundle/devtoolsService-UBGMEFGP.js | 27.7 kB | +27.7 kB (new file) 🆕 |
| ./bundle/gemini-IQME5ZET.js | 521 kB | +521 kB (new file) 🆕 |
| ./bundle/interactiveCli-LKEJ2JBY.js | 1.62 MB | +1.62 MB (new file) 🆕 |
| ./bundle/oauth2-provider-U2THIJBO.js | 9.16 kB | +9.16 kB (new file) 🆕 |
ℹ️ View Unchanged
| Filename | Size | Change |
| --- | --- | --- |
| ./bundle/chunk-34MYV7JD.js | 2.45 kB | 0 B |
| ./bundle/chunk-5AUYMPVF.js | 858 B | 0 B |
| ./bundle/chunk-664ZODQF.js | 124 kB | 0 B |
| ./bundle/chunk-DAHVX5MI.js | 206 kB | 0 B |
| ./bundle/chunk-IUUIT4SU.js | 56.5 kB | 0 B |
| ./bundle/chunk-IV2KUFMZ.js | 1.96 MB | 0 B |
| ./bundle/chunk-RJTRUG2J.js | 39.8 kB | 0 B |
| ./bundle/cleanup-R4BVQ3OU.js | 0 B | -856 B (removed) 🏆 |
| ./bundle/devtools-36NN55EP.js | 696 kB | 0 B |
| ./bundle/dist-T73EYRDX.js | 356 B | 0 B |
| ./bundle/gemini.js | 2.06 kB | 0 B |
| ./bundle/getMachineId-bsd-TXG52NKR.js | 1.55 kB | 0 B |
| ./bundle/getMachineId-darwin-7OE4DDZ6.js | 1.55 kB | 0 B |
| ./bundle/getMachineId-linux-SHIFKOOX.js | 1.34 kB | 0 B |
| ./bundle/getMachineId-unsupported-5U5DOEYY.js | 1.06 kB | 0 B |
| ./bundle/getMachineId-win-6KLLGOI4.js | 1.72 kB | 0 B |
| ./bundle/memoryDiscovery-CXSTQXLK.js | 922 B | 0 B |
| ./bundle/multipart-parser-KPBZEGQU.js | 11.7 kB | 0 B |
| ./bundle/node_modules/@google/gemini-cli-devtools/dist/client/main.js | 221 kB | 0 B |
| ./bundle/node_modules/@google/gemini-cli-devtools/dist/src/_client-assets.js | 227 kB | 0 B |
| ./bundle/node_modules/@google/gemini-cli-devtools/dist/src/index.js | 11.5 kB | 0 B |
| ./bundle/node_modules/@google/gemini-cli-devtools/dist/src/types.js | 132 B | 0 B |
| ./bundle/sandbox-macos-permissive-open.sb | 890 B | 0 B |
| ./bundle/sandbox-macos-permissive-proxied.sb | 1.31 kB | 0 B |
| ./bundle/sandbox-macos-restrictive-open.sb | 3.36 kB | 0 B |
| ./bundle/sandbox-macos-restrictive-proxied.sb | 3.56 kB | 0 B |
| ./bundle/sandbox-macos-strict-open.sb | 4.82 kB | 0 B |
| ./bundle/sandbox-macos-strict-proxied.sb | 5.02 kB | 0 B |
| ./bundle/src-QVCVGIUX.js | 47 kB | 0 B |
| ./bundle/tree-sitter-7U6MW5PS.js | 274 kB | 0 B |
| ./bundle/tree-sitter-bash-34ZGLXVX.js | 1.84 MB | 0 B |
| ./bundle/cleanup-2G26VFWA.js | 856 B | +856 B (new file) 🆕 |

compressed-size-action

@mattKorwel
Collaborator Author

Skill in action:

image

Happy Path:

image

Catching a failure mode and detailing what tests to run:

image

@mattKorwel
Collaborator Author

mattKorwel commented Mar 24, 2026

ci.skill.zip

📦 Skill Distribution & Packaging

To facilitate sharing and installation, I have packaged this skill into a .skill file. While the source is available in the repository, users can also distribute this as a standalone archive.

🛠️ How to Package

If you make changes to the source in .gemini/skills/ci, you can re-package it using the built-in skill-creator toolchain:

# node <path-to-creator>/package_skill.cjs <source-folder> <output-folder>
node /Users/mattkorwel/.gcli/nightly/node_modules/@google/gemini-cli/bundle/builtin/skill-creator/scripts/package_skill.cjs .gemini/skills/ci .gemini/dist

🚀 How to Install

Reviewers can install and test this skill immediately using the following commands:

Project-Specific (Workspace Scope):

gemini skills install .gemini/dist/ci.skill --scope workspace

Machine-Wide (User Scope):

gemini skills install .gemini/dist/ci.skill --scope user

🔄 Activation

After installation, reload your session to enable the expertise:

  1. Run /skills reload in your interactive Gemini CLI session.
  2. Verify with /skills list.

@mattKorwel mattKorwel marked this pull request as ready for review March 24, 2026 22:34
@mattKorwel mattKorwel requested a review from a team as a code owner March 24, 2026 22:34
@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a new CI skill to enhance the developer experience by automating the detection and local replication of continuous integration failures. It aims to eliminate manual monitoring and log parsing, allowing developers to quickly identify and fix issues by reproducing them instantly in their local environment.

Highlights

  • New CI Skill: Introduced a new CI Skill designed to automate the replication of CI failures locally, significantly improving the developer feedback loop.
  • Fail-Fast Monitoring: Implemented a 'status' workflow that provides real-time CI status in the terminal and exits immediately upon the first failure.
  • Automated Replication: Developed a 'replicate' workflow that automatically executes local commands to reproduce CI failures as soon as they occur, eliminating manual triage.
  • Advanced Log Parsing: Utilized a specialized Node.js script ('ci.mjs') to parse GitHub Actions logs directly via the REST API, bypassing limitations of standard CLI tools and enabling fail-fast analysis.
  • Monorepo Support: Incorporated logic to correctly map CI log paths to the appropriate npm workspace within the monorepo, ensuring accurate local command generation.
Ignored Files
  • Ignored by pattern: .gemini/** (2)
    • .gemini/skills/ci/SKILL.md
    • .gemini/skills/ci/scripts/ci.mjs

gemini-code-assist[bot]

This comment was marked as outdated.

@mattKorwel
Collaborator Author

/gemini review

@gemini-cli gemini-cli bot added the status/need-issue label Mar 24, 2026
@mattKorwel mattKorwel enabled auto-merge March 24, 2026 23:47
@mattKorwel mattKorwel self-assigned this Mar 24, 2026
@@ -0,0 +1,66 @@
---
name: ci
description:
Member

IIRC there's a command, gh run watch, which watches for run failures. Is the idea that your skill saves us time by not waiting for the run to complete before marking it failed?

Collaborator Author

@mattKorwel mattKorwel Mar 25, 2026

that is the core posit, yes. gh run watch --compact will list any jobs that have failed (but not provide details on what or why), but gh run view --log and gh run view --job --log (which show test details) block until all jobs are done. that means even though a test might fail at minute two, we wait another 10-plus minutes to get the details, unless you manually go dig through the ui, which is very un-agentic ;)

Collaborator Author

also, at least in my experience, gh run view --log and the model do a really bad job of efficiently parsing through the logs. this is orders of magnitude quicker and takes several fewer round trips.
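To illustrate the kind of log reduction being discussed, here is a minimal sketch that strips timestamps, drops noise lines, and pulls out failing test files before anything is handed to the model. The noise patterns and the `FAIL <path>` line shape are assumptions chosen for the example (they resemble common test-runner and Actions output), and the function names are hypothetical; none of this is taken from ci.mjs.

```javascript
// Sketch only: patterns and names are assumptions for illustration.
const TIMESTAMP = /^\d{4}-\d{2}-\d{2}T[\d:.]+Z\s?/; // leading ISO timestamp
const NOISE = [/npm warn deprecated/i, /^\[command\].*\bgit\b/, /^hint:/];
const FAIL_LINE = /^\s*FAIL\s+(\S+\.(?:test|spec)\.[cm]?[jt]sx?)/;

/** Strips timestamps and drops blank or noisy lines; returns the kept lines. */
function cleanLog(text) {
  return text
    .split(/\r?\n/)
    .map((line) => line.replace(TIMESTAMP, ''))
    .filter((line) => line.trim() !== '' && !NOISE.some((re) => re.test(line)));
}

/** Returns the unique test files named on "FAIL <path>" lines, in log order. */
function failingTestFiles(text) {
  const files = new Set();
  for (const line of cleanLog(text)) {
    const m = FAIL_LINE.exec(line);
    if (m) files.add(m[1]);
  }
  return [...files];
}
```

Reducing a multi-megabyte log to a short list of file paths locally is what saves the round trips: the model only sees the few lines that matter.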

process.exit(0);
}

while (true) {
Member

nit: maybe we can have the agent break this up into subroutines for readability.

Collaborator Author

great idea for future revision!

@gundermanc
Member

Maybe a good idea to have the pr-creator skill leverage this skill to watch the PR to make sure it passes all checks?

@mattKorwel
Collaborator Author

the failing e2e test is a known flake. will merge main as soon as that fix lands to skip the flake.

@mattKorwel mattKorwel disabled auto-merge March 25, 2026 00:42
@mattKorwel mattKorwel merged commit f74f2b0 into main Mar 25, 2026
7 checks passed
@mattKorwel mattKorwel deleted the mk-ci-skill branch March 25, 2026 00:43

Labels

status/need-issue Pull requests that need to have an associated issue.

2 participants