feat(bot): implement issue-fixer skill and mandate selection#26951
feat(bot): implement issue-fixer skill and mandate selection#26951gundermanc wants to merge 59 commits into
Conversation
|
Size Change: -4 B (0%) Total Size: 33.8 MB
ℹ️ View Unchanged
|
- Updates the `issue-fixer` skill to strictly forbid the use of local tools (like lint, tsc, grep) for finding tasks. - Updates `scheduled.md` to clarify that local tools are for verification only.
- Replaces `gh issue list` with the `--json` flag to prevent default pagers (like `less`) from waiting for user input and hanging the CI run. - Adds `--no-pager` to `gh run view`.
- Sets `GH_PAGER: ''` in the workflow environment variables to ensure any `gh` CLI commands executed by the bot or in bash scripts do not hang waiting for user input in the headless CI environment.
This adds a settings.json file to the bot's configuration to explicitly allow GH_TOKEN and GITHUB_TOKEN to bypass the environment variable redaction engine. This is required because the bot runs in GitHub Actions, which enforces strict redaction mode by default. The workflow file was also simplified by removing the wrapper script that was previously used to bypass redaction.
- Update the publish workflow to read labels from a pr-labels.txt file and apply them to the PR using the gh CLI. - Update the prs skill to instruct the bot to write labels to pr-labels.txt. - Update the issue-fixer and metrics skills to explicitly request the application of their respective labels in pr-labels.txt.
This removes the `NEVER_ALLOWED_NAME_PATTERNS` filter from `getSecureSanitizationConfig`. Previously, if a user explicitly added a variable like `GH_TOKEN` to their `allowedEnvironmentVariables` in `settings.json`, it would be silently dropped during configuration parsing because it matched the `NEVER_ALLOWED_NAME_PATTERNS` regex. This change ensures that explicit user allowlists take precedence over heuristic name-based pattern matching, while still maintaining the strict blocklist for known highly-sensitive system variables (`NEVER_ALLOWED_ENVIRONMENT_VARIABLES`).
The pr-labels.txt file was being correctly generated by the bot but was not being passed to the Publish Artifacts job because it was omitted from the Archive Brain Data step.
- Added a 60 minute timeout to the Reasoning job in the workflow. - Updated the issue-fixer skill to instruct the bot to wrap 'npm run preflight' with the linux 'timeout' utility to prevent infinite loops from hanging the entire LLM agent.
The non-interactive CLI defaults to infinite `maxSessionTurns`. This causes the bot to get stuck in infinite trial-and-error loops when it struggles to fix a failing test suite, only stopping when the 60-minute GitHub Actions timeout kills the runner. By setting `maxSessionTurns: 30` in the bot's configuration, the LLM session will gracefully abort with an error if it cannot find a solution within a reasonable number of attempts.
Updated the issue-fixer skill to instruct the bot to search one top-level folder at a time and avoid problematic directories with large data files (memory-tests, last_brain_data) to prevent grep_search timeouts.
- Removed the `maxSessionTurns: 30` limit from the bot's configuration as requested. - Added `if: always()` to the `Archive Brain Data` step in the workflow to ensure logs (telemetry, debug) are available even when the agent fails or times out.
This fix was recovered from a timed-out bot run. It addresses issue #26979 where the CLI would crash if a user provided an extremely long path string in an @ command (e.g. @/aaa...a). Changes: - Updated 'robustRealpath' in 'packages/core/src/utils/paths.ts' to catch and gracefully handle 'ENAMETOOLONG' and 'EINVAL' errors from fs.realpathSync and fs.lstatSync. - Added a defensive try-catch block to 'checkPermissions' in 'packages/cli/src/ui/hooks/atCommandProcessor.ts' to prevent long path strings from crashing the CLI during @ command parsing. - Added regression unit tests to verify the fix.
- Pin publish job to the specific SHA from reasoning job to avoid drift. - Use git apply --3way --ignore-whitespace for better conflict resolution.
The reasoning job now captures the HEAD commit SHA prior to patch generation (`patch_base_sha`) and passes it to the publish job. The publish job then checks out this precise SHA rather than the original `target_sha`. This ensures that if the agent checks out an existing PR branch to resolve comments, the generated patch is applied cleanly against the tip of that branch instead of causing a merge conflict with main.
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request enhances the Gemini CLI Bot by implementing a specialized 'issue-fixer' skill and providing maintainers with greater control over bot behavior through mandate selection. These changes streamline the bot's ability to manage repository backlogs while reinforcing security policies and refining the internal logic for environment variable handling and tool usage. Highlights
Ignored Files
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request updates the environment sanitization logic, refines bot skill workflows with stricter operational constraints—such as mandatory skill activation and test timeouts—and adjusts the CI policy. Review feedback highlights critical security regressions in the sanitization service, specifically noting that moving allowlist checks before secret pattern matching permits sensitive data leakage. The reviewer also flagged the removal of 'value-first' security tests as a major regression and pointed out that allowlisting tokens in settings is both ineffective due to name filtering and poses a security risk.
I am having trouble creating individual review comments. Click here to see my feedback.
packages/core/src/services/environmentSanitization.ts (172)
The allowedSet check should be restored here, after the value-based pattern matching, to maintain the 'value-first' security policy.
if (allowedSet?.has(key)) {
return false;
}
packages/core/src/services/environmentSanitization.test.ts (233-273)
The removal of the 'value-first security' test suite is a major regression. These tests ensure that high-entropy secrets are caught regardless of the variable name. Deleting them, especially alongside the logic changes that weaken this protection, significantly reduces the security posture of the environment sanitization service. These tests must be restored.
packages/core/src/services/environmentSanitization.ts (153-155)
The change in sanitizeEnvironment to check the allowedSet before NEVER_ALLOWED_VALUE_PATTERNS introduces a significant security regression. This 'value-first' security model is crucial for defense-in-depth. Previously, even allowlisted variables were checked for sensitive data. Now, any variable in the user-configurable allowedSet can bypass these safety checks, potentially leaking sensitive values like private keys, GitHub tokens, JWTs, or certificates, even if their names don't match forbidden patterns. The allowedSet check should remain after the value-based pattern matching to ensure all values are scanned for secrets.
Additionally, note that adding GH_TOKEN to the allowed list in settings.json will not work as intended because getSecureSanitizationConfig (line 217) filters out any variables matching NEVER_ALLOWED_NAME_PATTERNS (which includes /TOKEN/i).
if (value) {
for (const pattern of NEVER_ALLOWED_VALUE_PATTERNS) {
if (pattern.test(value)) {
return true;
}
}
}
if (allowedSet?.has(key)) {
return false;
}References
- Security checks, such as an extension allowlist, should be implemented in a 'fail-closed' manner. If an item's validity cannot be verified, it should be rejected by default.
tools/gemini-cli-bot/.gemini/settings.json (4)
Adding GH_TOKEN and GITHUB_TOKEN to the allowed list here will not have the desired effect. The getSecureSanitizationConfig function in environmentSanitization.ts explicitly filters out any variables matching NEVER_ALLOWED_NAME_PATTERNS, which includes /TOKEN/i. Furthermore, allowing these tokens to pass through sanitization is a security risk if the sanitized environment is sent to the LLM. The bot should access these tokens through a secure tool execution mechanism rather than by exposing them in the sanitized environment context.
References
- Security checks should be implemented in a 'fail-closed' manner. If an item's validity cannot be verified, it should be rejected by default.
| - Use the `prs` skill to list all open PRs labeled `bot-fix`. | ||
| - If any require attention (CI failure, requested changes), focus your entire run on resolving ONE of them. | ||
| - Do NOT start a new issue fix if an existing PR needs work. | ||
| 2. **Search for Candidates**: If no PRs need attention, search for `effort/small` issues: `gh issue list --label "effort/small" --limit 10 --json number,title,url`. |
There was a problem hiding this comment.
i think we only care about open issues? so adding status:open to the search query?
|
|
||
| 1. **Inventory & Drive PRs**: | ||
| - **ACTIVATE SKILLS**: You MUST call `activate_skill(name="memory")` and `activate_skill(name="prs")` before continuing. | ||
| - Use the `prs` skill to list all open PRs labeled `bot-fix`. |
There was a problem hiding this comment.
it might worth also checking if the issue associated with the pr is still open or closed already.
| @@ -21,17 +38,22 @@ If you are proposing fixes and PR creation is enabled (per the System Directive) | |||
| unrelated refactor, or a metrics script update. Metrics and fixes MUST | |||
| be in separate PRs. | |||
| 2. **Generate PR Description**: Use the `write_file` tool to create | |||
There was a problem hiding this comment.
also worth using pr template to keep consistency?
Summary
This PR implements the
issue-fixerskill for the Gemini CLI Bot and adds the ability to manually select the bot's mandate (auto, issue-fixer, metrics, interactive) when triggering the workflow viaworkflow_dispatch. It also updates the CI policy to allow theactivate_skilltool.Details
issue-fixerskill that enables the bot to proactively identify and fixeffort/smallissues and maintain existingbot-fixPRs.gemini-cli-bot-brain.ymlworkflow to include amandatechoice input. This allows maintainers to explicitly test specific bot behaviors.PROMPT_FILEandMANDATEbased on the input override and improved logging for better visibility in Action logs.activate_skillto theci-policy.tomlto ensure the bot can utilize its specialized skills in the CI environment.main.Related Issues
Related to #26717
How to Validate
🧠 Gemini CLI Bot: Brainworkflow on this branch.issue-fixerorinteractivefrom the mandate dropdown and verify in the logs that the correct prompt and mandate are selected.activate_skilltool call succeeds (or no longer fails with "Tool not found").Pre-Merge Checklist