Add commit analyst for commit-driven pipeline orchestration#60
Conversation
Add the ability to trigger docs-orchestrator workflows from code repository commits instead of JIRA tickets. This enables automated documentation generation when code changes land on main.

New components:
- `git_commit_reader.py`: commit-level API script (list, diff, range-diff, files)
- `docs-workflow-commits-ready`: gate skill to poll for new commits
- `commit-analyst` agent: extracts change signals and grades doc impact
- `docs-workflow-commit-analysis`: step skill wrapping the commit-analyst
- `docs-workflow-create-pr`: optional step to create PR/MR in docs repo
- `docs-commit-workflow.yaml`: 8-step workflow definition

Modified components:
- `docs-orchestrator`: `--commits` and `--create-pr` flags, short-circuit on None impact, commit marker updates, `source_type` in progress file
- `requirements-analyst`: accepts commit analysis as input source
- `docs-workflow-requirements`: `--commit-analysis` argument and prompt

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Commits where all changed files are excluded by git_filters.yaml (tests, CI, lock files, etc.) cannot have doc impact. The new --drop-empty flag on git_commit_reader.py list removes these commits before the LLM sees them, saving analysis time and cost. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
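The exclusion described above can be sketched as a path-glob filter. This is a minimal sketch, assuming `git_filters.yaml` holds a flat list of exclude globs (the actual schema and patterns are not shown in this PR, so both are assumptions here):

```python
from fnmatch import fnmatch

# Hypothetical exclude patterns; the real git_filters.yaml schema may differ.
EXCLUDE_GLOBS = ["tests/*", "*.lock", ".github/*", "ci/*"]

def is_relevant(path: str) -> bool:
    """A file is relevant unless it matches an exclude glob."""
    return not any(fnmatch(path, pattern) for pattern in EXCLUDE_GLOBS)

def drop_empty(commits: list[dict]) -> list[dict]:
    """Drop commits whose changed files are all excluded (--drop-empty)."""
    return [c for c in commits if any(is_relevant(f) for f in c["files"])]
```

A commit touching only `tests/` and lock files would be filtered out before analysis, while a commit with at least one relevant file survives.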
**Note: Reviews paused.** It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review.
**Walkthrough**

Adds a commit-driven documentation workflow to docs-tools: new commit-analysis agent/spec, commit-ready gate script, git commit reader CLI, commit-oriented orchestrator steps and workflow YAML, create-PR automation, README additions, and plugin version bumps to 0.0.16.
**Sequence Diagram**

```mermaid
sequenceDiagram
    participant CI as CI/CD Pipeline
    participant Gate as commits-ready Gate
    participant Reader as git_commit_reader
    participant Orch as docs-orchestrator
    participant CommitA as commit-analyst Agent
    participant Steps as Workflow Steps
    participant PRAPI as GitHub/GitLab API
    participant Marker as Repo Marker File

    CI->>Gate: call commits-ready-check.sh (--repo, --branch, --base-path)
    Gate->>Marker: read last_processed_sha
    Gate->>Reader: list commits (--drop-empty ...)
    Reader-->>Gate: commit list / filtered_stats
    Gate->>Marker: check for existing workflow progress
    Gate-->>CI: return JSON {ready: true/false, batch}

    alt commits ready
        CI->>Orch: invoke docs-orchestrator (--commits <repo> <shas> [, --create-pr])
        Orch->>CommitA: dispatch commit-analyst (repo + SHAs)
        CommitA-->>Orch: write analysis.md (includes DOC_IMPACT)
        alt DOC_IMPACT: None
            Orch->>Marker: update marker (last_processed_sha, last_processed_at)
            Orch-->>CI: complete (short-circuit)
        else impact != None
            Orch->>Steps: run requirements → planning → prepare-branch → writing → reviews
            Steps-->>Orch: produce artifacts
            Orch->>PRAPI: call create_pr.py to open PR/MR
            PRAPI-->>Orch: return PR/MR URL
            Orch->>Marker: update marker & commit state
            Orch-->>CI: complete with PR/MR URL
        end
    else no new commits
        Gate-->>CI: return ready: false
    end
```
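The gate's JSON contract in the diagram can be sketched in Python. Only the `ready` and `batch` fields are taken from the diagram; the identifier scheme below is an assumption for illustration:

```python
import json

def gate_decision(commits: list[str]) -> str:
    """Return the gate's JSON response, given the commits found after
    last_processed_sha (oldest first). Empty input means not ready."""
    if not commits:
        return json.dumps({"ready": False})
    batch = {
        # Hypothetical identifier scheme: first..last short SHAs.
        "identifier": f"commits-{commits[0][:7]}-{commits[-1][:7]}",
        "commits": commits,
    }
    return json.dumps({"ready": True, "batch": batch})
```

The CI examples later in this PR consume exactly this shape with `jq -r '.ready'` and `jq -r '.batch.commits | join(",")'`.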
**Estimated code review effort**: 🎯 4 (Complex) | ⏱️ ~65 minutes
**Important: Pre-merge checks failed.** Please resolve all errors before merging. Addressing warnings is optional.

❌ Failed checks (1 error)
✅ Passed checks (7 passed)
Actionable comments posted: 9
🧹 Nitpick comments (2)
plugins/docs-tools/skills/docs-orchestrator/defaults/docs-commit-workflow.yaml (1)

**44-48: Consider adding `technical-review` as an input dependency for `create-pr`.**

The `create-pr` step currently depends on `style-review` and `prepare-branch`, but not `technical-review`. This means a PR could be created before technical accuracy is confirmed. If the workflow intends to ensure both reviews pass before PR creation, consider:

```diff
   - name: create-pr
     skill: docs-tools:docs-workflow-create-pr
     description: Create PR/MR with documentation changes
     when: create_pr
-    inputs: [style-review, prepare-branch]
+    inputs: [technical-review, style-review, prepare-branch]
```

If the current behavior is intentional (e.g., style review implicitly follows technical review in practice), this can be ignored.
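A dependency check of the kind this nitpick implies can be sketched as a validation pass over the workflow steps. The step shape (`name`, `inputs`) is taken from the YAML excerpt; the validation rule itself is an assumption:

```python
def check_inputs_defined(steps: list[dict]) -> list[str]:
    """Return an error for any step whose inputs reference a step that
    is not defined earlier in the workflow."""
    errors, seen = [], set()
    for step in steps:
        for dep in step.get("inputs", []):
            if dep not in seen:
                errors.append(f"{step['name']}: input '{dep}' not defined earlier")
        seen.add(step["name"])
    return errors
```

Run against a workflow that lists `create-pr` before `technical-review` is defined, this would surface the missing dependency rather than silently ordering the PR creation too early.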
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `plugins/docs-tools/skills/docs-orchestrator/defaults/docs-commit-workflow.yaml` around lines 44 - 48: The create-pr step (name: create-pr, skill: docs-tools:docs-workflow-create-pr) currently lists inputs [style-review, prepare-branch]; update its inputs to include technical-review so the step waits for the technical review to complete before creating the PR, i.e., modify the inputs array on the create-pr step to include "technical-review" alongside "style-review" and "prepare-branch".

plugins/docs-tools/skills/docs-workflow-create-pr/scripts/create_pr.py (1)
**31-36: Cross-skill import is fragile but functional.**

The relative path resolution for importing `load_env_file` from `git-pr-reader` works but is brittle: it will break if the directory structure changes. Consider documenting this dependency or extracting `load_env_file` to a shared utilities module.

```python
sys.path.insert(
    0,
    str(pathlib.Path(__file__).resolve().parent.parent.parent / "git-pr-reader" / "scripts"),
)
from git_pr_reader import load_env_file
```

This is acceptable for now given the internal plugin structure.
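If `load_env_file` were extracted into a shared module as suggested, it might look like the sketch below. The real function's parsing rules are not shown in this PR, so the simple KEY=VALUE reader here is an assumption:

```python
import os

def load_env_file(path: str) -> dict:
    """Parse simple KEY=VALUE lines, ignoring blanks and # comments.
    Missing files yield an empty mapping rather than an error."""
    env = {}
    if not os.path.exists(path):
        return env
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            env[key.strip()] = value.strip().strip('"').strip("'")
    return env
```

Placing this in one shared `scripts` package would let both `create_pr.py` and `git_pr_reader.py` import it without `sys.path` manipulation.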
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@plugins/docs-tools/skills/docs-workflow-create-pr/scripts/create_pr.py` around lines 31 - 36, The cross-skill import using sys.path.insert and a relative path built from pathlib.Path(__file__).resolve().parent.parent.parent / "git-pr-reader" / "scripts" to import load_env_file is brittle; either extract load_env_file into a shared utilities package used by both plugins or add a clear comment and documentation near the sys.path.insert explaining the dependency and why the path is chosen (include the exact relative path and that it must remain in sync), and ensure the import line (from git_pr_reader import load_env_file) remains after the documented sys.path insertion so future refactors know to update or replace it with the shared module.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@plugins/docs-tools/agents/commit-analyst.md`:
- Around line 102-105: The short-circuit marker is inconsistent: Step 5
specifies `<!-- DOC_IMPACT: None -->` but the output template still shows `<!--
DOC_IMPACT: ... | NONE -->`; update the documentation in commit-analyst.md so
both the short-circuit step and the output template use the exact token `<!--
DOC_IMPACT: None -->` (replace any occurrences of `<!-- DOC_IMPACT: ... | NONE
-->` or `NONE` variants with the exact `None` token) to ensure the orchestrator
matches deterministically.
In `@plugins/docs-tools/agents/requirements-analyst.md`:
- Line 124: Update the cross-reference text so it uses fully qualified skill
names: replace mentions of jira-reader and git-pr-reader with
docs-tools:jira-reader and docs-tools:git-pr-reader respectively (so the line
reads that additional context should be fetched using docs-tools:jira-reader and
docs-tools:git-pr-reader). Ensure this rule is applied exactly as typed to match
the repo-wide skill naming convention referenced in the guidelines.
In `@plugins/docs-tools/README.md`:
- Around line 321-380: The CI examples don’t persist
`.claude/docs/.commit-markers/<repo-slug>.json`, causing repeated discovery of
the same commits; update the pipeline steps that run the claude commands (the
shell block that sets RESULT/READY/IDENTIFIER/COMMITS and calls the
docs-orchestrator) to restore the marker directory before checkout processing
and save it after processing (use your CI’s cache/artifact/volume actions), or
alternatively add and document a persistent base path flag to the orchestrator
invocation (e.g., pass a --base-path or env var pointing to a mounted persistent
directory instead of the repo working tree) and ensure the same path is restored
between scheduled runs so last_processed_sha is preserved.
In `@plugins/docs-tools/skills/docs-workflow-commit-analysis/SKILL.md`:
- Around line 39-60: The dispatch step currently relies on the subagent
docs-tools:commit-analyst to write <OUTPUT_FILE>, which is non-deterministic;
change the step so the parent skill captures the dispatched agent's analysis
response and writes it itself to a deterministic file name (analysis.md).
Specifically, after calling the agent with subagent_type
docs-tools:commit-analyst and the given prompt, read the agent's returned
content (response body), validate it, and perform the file write to analysis.md
(instead of assuming the subagent will create <OUTPUT_FILE>); ensure the
verification step checks the locally written analysis.md. Use the existing
dispatch/verify flow and reference the dispatch invocation and <OUTPUT_FILE>
placeholders when updating SKILL.md instructions.
In
`@plugins/docs-tools/skills/docs-workflow-commits-ready/scripts/commits-ready-check.sh`:
- Around line 30-33: The script currently sets COMMIT_READER by resolving a
hard-coded relative path via SCRIPT_DIR and "../../git-pr-reader/..."; update it
to use the repo-wide cross-skill root variable instead: replace the
COMMIT_READER assignment to reference
"${CLAUDE_PLUGIN_ROOT}/skills/git-pr-reader/scripts/git_commit_reader.py"
(preserve quoting) so the script uses ${CLAUDE_PLUGIN_ROOT} for cross-skill
lookup rather than ../../ traversal; keep SCRIPT_DIR if used elsewhere but
ensure COMMIT_READER points to the CLAUDE_PLUGIN_ROOT-based path and still
points to the correct git_commit_reader.py filename.
- Around line 81-84: The script currently swallows jq failures when reading the
marker file (the SINCE_SHA assignment block using jq on MARKER_FILE), causing
malformed markers to be ignored and triggering a full replay; change this to
detect and fail on malformed marker instead: run jq without the "|| true"
fallback, capture jq's exit code or output, and if jq fails or returns
empty/invalid JSON, log an error (using your existing logging or stderr) and
exit non-zero to stop the gate; specifically modify the SINCE_SHA assignment
block that calls jq -r '.last_processed_sha // empty' "$MARKER_FILE" so it
validates jq succeeded and the value is non-empty before proceeding.
In `@plugins/docs-tools/skills/docs-workflow-create-pr/scripts/create_pr.py`:
- Around line 94-112: The fallback regex in read_branch_info currently uses a
character class that only lists lowercase letters which can fail for uppercase
branch names; update the fallback pattern (the second re.search assigned to
branch_match) to use an explicit case-inclusive class like [A-Za-z0-9_/-]+ (or a
more permissive pattern such as \S+ if you prefer) so uppercase characters are
accepted, keep or remove re.IGNORECASE as desired, and ensure the
branch_match.group(1) extraction still returns the branch name from branch_file
content.
In `@plugins/docs-tools/skills/git-pr-reader/scripts/git_commit_reader.py`:
- Around line 406-415: If reader.get_commit_files(c["sha"]) raises, don't
fabricate counts and keep the commit actionable; instead, in the except block
set c["files_changed"]=0 and c["relevant_files_changed"]=0, do not increment
total_files or relevant_files, and add a marker like c["files_fetch_error"]=True
(and log the exception) so downstream logic (e.g., --drop-empty handling) can
drop or skip the commit; update the except branch that currently surrounds
get_commit_files/should_include_file to implement this behavior rather than
using c.get("files_changed", 0).
- Around line 78-93: The platform detection is too permissive for GitHub and too
strict for self‑hosted GitLab: change the conditional that currently uses if
"github.com" in host to an exact check (host == "github.com") so unrelated hosts
like notgithub.com don't match, keep an explicit host == "gitlab.com" branch,
and add a fallback branch that treats any other non‑GitHub host as a self‑hosted
GitLab (reusing the existing GitLab logic that builds owner, repo, owner_repo
from path_parts); update the conditions around host, owner, repo, and owner_repo
to reflect this.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: cab2e3e5-c379-4a9c-b881-4c2de96bbd6b
📒 Files selected for processing (14)
- .claude-plugin/marketplace.json
- plugins/docs-tools/.claude-plugin/plugin.json
- plugins/docs-tools/README.md
- plugins/docs-tools/agents/commit-analyst.md
- plugins/docs-tools/agents/requirements-analyst.md
- plugins/docs-tools/skills/docs-orchestrator/SKILL.md
- plugins/docs-tools/skills/docs-orchestrator/defaults/docs-commit-workflow.yaml
- plugins/docs-tools/skills/docs-workflow-commit-analysis/SKILL.md
- plugins/docs-tools/skills/docs-workflow-commits-ready/SKILL.md
- plugins/docs-tools/skills/docs-workflow-commits-ready/scripts/commits-ready-check.sh
- plugins/docs-tools/skills/docs-workflow-create-pr/SKILL.md
- plugins/docs-tools/skills/docs-workflow-create-pr/scripts/create_pr.py
- plugins/docs-tools/skills/docs-workflow-requirements/SKILL.md
- plugins/docs-tools/skills/git-pr-reader/scripts/git_commit_reader.py
```yaml
docs-commit-check:
  stage: docs
  rules:
    - if: $CI_PIPELINE_SOURCE == "schedule"
  script:
    - |
      RESULT=$(claude -p "Skill: docs-tools:docs-workflow-commits-ready, \
        args: \"--repo https://github.com/org/code-repo --branch main\"")
      READY=$(echo "$RESULT" | jq -r '.ready')
      if [ "$READY" != "true" ]; then
        echo "No doc-impacting commits found."
        exit 0
      fi
      IDENTIFIER=$(echo "$RESULT" | jq -r '.batch.identifier')
      COMMITS=$(echo "$RESULT" | jq -r '.batch.commits | join(",")')
      claude -p "Skill: docs-tools:docs-orchestrator, \
        args: \"$IDENTIFIER --commits https://github.com/org/code-repo $COMMITS --create-pr\""
```

#### GitHub Actions example

```yaml
name: Docs from Commits
on:
  schedule:
    - cron: '0 */6 * * *'  # Every 6 hours

jobs:
  docs-from-commits:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Install dependencies
        run: |
          npm install -g @anthropic-ai/claude-code
          python3 -m pip install PyGithub python-gitlab jira pyyaml ratelimit requests beautifulsoup4 html2text

      - name: Check and process commits
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        run: |
          RESULT=$(claude -p "Skill: docs-tools:docs-workflow-commits-ready, \
            args: \"--repo https://github.com/org/code-repo --branch main\"")
          READY=$(echo "$RESULT" | jq -r '.ready')
          if [ "$READY" != "true" ]; then
            echo "No doc-impacting commits found."
            exit 0
          fi
          IDENTIFIER=$(echo "$RESULT" | jq -r '.batch.identifier')
          COMMITS=$(echo "$RESULT" | jq -r '.batch.commits | join(",")')
          claude -p "Skill: docs-tools:docs-orchestrator, \
            args: \"$IDENTIFIER --commits https://github.com/org/code-repo $COMMITS --create-pr\""
```

### Marker files

The commit-driven workflow tracks which commits have been processed using marker files at `.claude/docs/.commit-markers/<repo-slug>.json`. The orchestrator updates the marker on workflow completion, so failed workflows don't skip commits on retry.
**The scheduled CI examples need persistent marker storage.**

Both examples start from a fresh checkout but never restore or save `.claude/docs/.commit-markers/`. As written, the next scheduled run loses `last_processed_sha` and rediscovers the same commit range, which can reopen duplicate workflows or PRs. Please show a cache/artifact/volume restore-save step, or move marker storage to a persistent `--base-path`.
🧰 Tools
🪛 markdownlint-cli2 (0.21.0)
[warning] 321-321: Code block style
Expected: indented; Actual: fenced
(MD046, code-block-style)
[warning] 343-343: Code block style
Expected: indented; Actual: fenced
(MD046, code-block-style)
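The marker read/update cycle this comment relies on can be sketched like this. Field names (`last_processed_sha`, `last_processed_at`) come from the review above; the atomic-write detail is an assumption about good practice, not a description of the PR's implementation:

```python
import json
import os
import tempfile
from datetime import datetime, timezone

def read_marker(path: str) -> "str | None":
    """Return last_processed_sha, or None when no marker exists yet."""
    if not os.path.exists(path):
        return None
    with open(path) as f:
        return json.load(f).get("last_processed_sha")

def update_marker(path: str, sha: str) -> None:
    """Write the marker atomically so a crash never leaves partial JSON."""
    os.makedirs(os.path.dirname(path), exist_ok=True)
    payload = {
        "last_processed_sha": sha,
        "last_processed_at": datetime.now(timezone.utc).isoformat(),
    }
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path))
    with os.fdopen(fd, "w") as f:
        json.dump(payload, f)
    os.replace(tmp, path)
```

Whatever directory `path` lives in is exactly what the scheduled CI job must persist between runs (cache, artifact, or mounted volume), otherwise `read_marker` returns None and the gate replays old commits.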
### 2. Dispatch agent

Dispatch the `docs-tools:commit-analyst` agent with the following prompt.

**Agent tool parameters:**
- `subagent_type`: `docs-tools:commit-analyst`
- `description`: `Analyze commits for documentation impact`

**Prompt:**

> Analyze the following commits for documentation impact.
>
> **Repository**: `<REPO_URL>`
> **Commits**: `<SHA1>`, `<SHA2>`, ...
>
> Save your complete analysis to: `<OUTPUT_FILE>`
>
> Follow your standard analysis methodology (fetch commit data, parse messages, analyze diffs for change signals, grade doc impact). Format the output as structured markdown for the requirements-analyst.

### 3. Verify output

After the agent completes, verify the output file exists at `<OUTPUT_FILE>`.
**Have the step write analysis.md itself.**

This step verifies `<OUTPUT_FILE>` after dispatch, but the actual write is delegated to `docs-tools:commit-analyst`. That agent does not advertise Write/Edit tools in the provided context, so file creation depends on ad-hoc shell output inside the subagent. Capture the agent response here and write it in the step to make the contract deterministic.
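The contract change this comment asks for can be sketched as: validate the dispatched agent's returned text, then write it from the parent step. The `DOC_IMPACT` marker check mirrors the token discussed elsewhere in this review; treating its presence as the validation rule is an assumption:

```python
import os
import re

def persist_analysis(agent_response: str, out_dir: str) -> str:
    """Validate the agent's response and write it to a deterministic
    analysis.md, instead of trusting the subagent to create the file."""
    if not re.search(r"<!--\s*DOC_IMPACT:\s*\w+\s*-->", agent_response):
        raise ValueError("agent response is missing the DOC_IMPACT marker")
    os.makedirs(out_dir, exist_ok=True)
    out_path = os.path.join(out_dir, "analysis.md")
    with open(out_path, "w") as f:
        f.write(agent_response)
    return out_path
```

With this shape, the verify step checks a file the parent itself created, so a subagent without Write tools can no longer silently break the workflow.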
```sh
# Resolve git_commit_reader.py relative to this script
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
COMMIT_READER="${SCRIPT_DIR}/../../git-pr-reader/scripts/git_commit_reader.py"
```
🛠️ Refactor suggestion | 🟠 Major

**Resolve the sibling reader via CLAUDE_PLUGIN_ROOT.**

This reaches into another skill with `../../git-pr-reader/...`. The repo convention is to use `${CLAUDE_PLUGIN_ROOT}` for cross-skill script calls instead of hard-coded relative traversal.

Suggested fix:

```diff
-# Resolve git_commit_reader.py relative to this script
-SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
-COMMIT_READER="${SCRIPT_DIR}/../../git-pr-reader/scripts/git_commit_reader.py"
+: "${CLAUDE_PLUGIN_ROOT:?CLAUDE_PLUGIN_ROOT must be set}"
+COMMIT_READER="${CLAUDE_PLUGIN_ROOT}/skills/git-pr-reader/scripts/git_commit_reader.py"
```

As per coding guidelines: "Cross-skill and cross-command script calls must use ${CLAUDE_PLUGIN_ROOT} for paths like python3 ${CLAUDE_PLUGIN_ROOT}/skills/git-pr-reader/scripts/git_pr_reader.py".
📝 Committable suggestion

‼️ IMPORTANT: Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

```sh
: "${CLAUDE_PLUGIN_ROOT:?CLAUDE_PLUGIN_ROOT must be set}"
COMMIT_READER="${CLAUDE_PLUGIN_ROOT}/skills/git-pr-reader/scripts/git_commit_reader.py"
```
```python
def read_branch_info(base_path: str) -> dict:
    """Read branch info from the prepare-branch step output."""
    branch_file = os.path.join(base_path, "prepare-branch", "branch-info.md")
    if not os.path.exists(branch_file):
        raise FileNotFoundError(f"Branch info not found at {branch_file}")

    with open(branch_file) as f:
        content = f.read()

    # Extract branch name from the markdown
    branch_match = re.search(r"\*\*Branch\*\*:\s*`?([^\s`]+)`?", content)
    if not branch_match:
        # Try alternative format
        branch_match = re.search(r"branch[:\s]+`?([a-z0-9_/-]+)`?", content, re.IGNORECASE)

    if not branch_match:
        raise ValueError(f"Cannot extract branch name from {branch_file}")

    return {"branch": branch_match.group(1), "raw": content}
```
**Branch name extraction fallback may miss uppercase branch names.**

The fallback regex on line 107 uses a character class that lists only lowercase letters (`[a-z0-9_/-]`); branch names containing uppercase letters could fail to match.

Suggested fix:

```diff
-    branch_match = re.search(r"branch[:\s]+`?([a-z0-9_/-]+)`?", content, re.IGNORECASE)
+    branch_match = re.search(r"branch[:\s]+`?([A-Za-z0-9_/-]+)`?", content, re.IGNORECASE)
```
+ branch_match = re.search(r"branch[:\s]+`?([A-Za-z0-9_/-]+)`?", content, re.IGNORECASE)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| def read_branch_info(base_path: str) -> dict: | |
| """Read branch info from the prepare-branch step output.""" | |
| branch_file = os.path.join(base_path, "prepare-branch", "branch-info.md") | |
| if not os.path.exists(branch_file): | |
| raise FileNotFoundError(f"Branch info not found at {branch_file}") | |
| with open(branch_file) as f: | |
| content = f.read() | |
| # Extract branch name from the markdown | |
| branch_match = re.search(r"\*\*Branch\*\*:\s*`?([^\s`]+)`?", content) | |
| if not branch_match: | |
| # Try alternative format | |
| branch_match = re.search(r"branch[:\s]+`?([a-z0-9_/-]+)`?", content, re.IGNORECASE) | |
| if not branch_match: | |
| raise ValueError(f"Cannot extract branch name from {branch_file}") | |
| return {"branch": branch_match.group(1), "raw": content} | |
| def read_branch_info(base_path: str) -> dict: | |
| """Read branch info from the prepare-branch step output.""" | |
| branch_file = os.path.join(base_path, "prepare-branch", "branch-info.md") | |
| if not os.path.exists(branch_file): | |
| raise FileNotFoundError(f"Branch info not found at {branch_file}") | |
| with open(branch_file) as f: | |
| content = f.read() | |
| # Extract branch name from the markdown | |
| branch_match = re.search(r"\*\*Branch\*\*:\s*`?([^\s`]+)`?", content) | |
| if not branch_match: | |
| # Try alternative format | |
| branch_match = re.search(r"branch[:\s]+`?([A-Za-z0-9_/-]+)`?", content, re.IGNORECASE) | |
| if not branch_match: | |
| raise ValueError(f"Cannot extract branch name from {branch_file}") | |
| return {"branch": branch_match.group(1), "raw": content} |
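The suggested case-inclusive class can be exercised on a mixed-case branch name (the branch name below is a made-up example):

```python
import re

content = "Created branch: `Docs/Update-API-60` from main"

# Explicit case-inclusive class, as suggested in the review.
match = re.search(r"branch[:\s]+`?([A-Za-z0-9_/-]+)`?", content, re.IGNORECASE)
print(match.group(1))  # Docs/Update-API-60
```

Worth noting: in Python, `re.IGNORECASE` also makes the lowercase class `[a-z]` match uppercase letters, so the explicit `[A-Za-z...]` class mainly makes the intent readable rather than changing matching behavior.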
```python
if "github.com" in host:
    platform = "github"
    owner = path_parts[0]
    repo = path_parts[1].removesuffix(".git")
    owner_repo = f"{owner}/{repo}"
elif "gitlab" in host:
    platform = "gitlab"
    # GitLab supports nested groups: group/subgroup/project
    repo = path_parts[-1].removesuffix(".git")
    owner = "/".join(path_parts[:-1])
    owner_repo = f"{owner}/{repo}"
else:
    raise ValueError(
        f"Cannot determine platform from URL: {url}\n"
        "Supported: github.com, gitlab.com (or self-hosted GitLab)"
    )
```
**The URL parser does not actually support generic self-hosted GitLab.**

`elif "gitlab" in host` only works when the hostname literally contains `gitlab`, while `if "github.com" in host` also accepts unrelated hosts like `notgithub.com`. That contradicts the supported-URL contract in the docstring/help text and can route valid repos to the wrong backend.
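The exact-host matching proposed for this finding can be sketched end-to-end (the self-hosted host below is a made-up example; the real code sets more fields than this sketch returns):

```python
from urllib.parse import urlparse

def detect_platform(url: str) -> tuple:
    """Return (platform, owner_repo) using exact-host matching, with any
    non-GitHub host treated as self-hosted GitLab."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    path_parts = [p for p in parsed.path.strip("/").split("/") if p]
    if host == "github.com":
        owner, repo = path_parts[0], path_parts[1].removesuffix(".git")
        return "github", f"{owner}/{repo}"
    # gitlab.com and any other host: GitLab semantics, nested groups allowed.
    repo = path_parts[-1].removesuffix(".git")
    return "gitlab", f"{'/'.join(path_parts[:-1])}/{repo}"
```

This closes both gaps at once: `notgithub.com` no longer matches the GitHub branch, and a self-hosted GitLab whose hostname never mentions "gitlab" still parses.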
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@plugins/docs-tools/skills/git-pr-reader/scripts/git_commit_reader.py` around
lines 78 - 93, The platform detection is too permissive for GitHub and too
strict for self‑hosted GitLab: change the conditional that currently uses if
"github.com" in host to an exact check (host == "github.com") so unrelated hosts
like notgithub.com don't match, keep an explicit host == "gitlab.com" branch,
and add a fallback branch that treats any other non‑GitHub host as a self‑hosted
GitLab (reusing the existing GitLab logic that builds owner, repo, owner_repo
from path_parts); update the conditions around host, owner, repo, and owner_repo
to reflect this.
```python
try:
    files = reader.get_commit_files(c["sha"])
    all_count = len(files)
    rel_count = sum(1 for f in files if should_include_file(f["filename"], filters))
    c["files_changed"] = all_count
    c["relevant_files_changed"] = rel_count
    total_files += all_count
    relevant_files += rel_count
except Exception:
    c["relevant_files_changed"] = c.get("files_changed", 0)
```
**Don't turn file lookup failures into actionable commits.**

If `get_commit_files()` fails here, the code keeps the commit and fabricates `relevant_files_changed` from a fallback field. With `--drop-empty`, an auth/rate-limit/API failure can therefore start the workflow on unverified commits.
Suggested fix:

```diff
-        except Exception:
-            c["relevant_files_changed"] = c.get("files_changed", 0)
+        except Exception as exc:
+            raise RuntimeError(
+                f"Failed to load changed files for commit {c['sha']}: {exc}"
+            ) from exc
```

🧰 Tools
🪛 Ruff (0.15.6)
[warning] 414-414: Do not catch blind exception: Exception
(BLE001)
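The alternative behavior described in the AI prompt for this finding, marking the commit instead of raising, might look like the sketch below. The commit dict fields follow the excerpt; the logger name and the elided filtering are assumptions:

```python
import logging

logger = logging.getLogger("git_commit_reader")

def annotate_files(c: dict, fetch_files) -> None:
    """Mark a commit non-actionable when its file list cannot be fetched,
    instead of fabricating counts from a fallback field."""
    try:
        files = fetch_files(c["sha"])
        c["files_changed"] = len(files)
        c["relevant_files_changed"] = len(files)  # relevance filtering elided here
    except Exception:
        logger.exception("failed to load files for %s", c["sha"])
        c["files_changed"] = 0
        c["relevant_files_changed"] = 0
        c["files_fetch_error"] = True  # lets --drop-empty skip this commit
```

Either approach (raising, or marking and dropping) keeps an API failure from promoting unverified commits into the workflow; which one fits depends on whether a partial batch is acceptable.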
Repos use either main or master as their default branch. Instead of hardcoding main, detect the default branch from the GitHub/GitLab API when --branch is not specified. The gate script also omits --branch by default so auto-detection flows through. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Separate stderr from stdout in commits-ready-check.sh so python-gitlab warnings don't corrupt the JSON output. Suppress the pagination warning by passing get_all=False. Use the resolved branch name from the reader output in all gate script JSON responses. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
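The stdout/stderr separation described above can be illustrated with a small wrapper: diagnostics are forwarded to our own stderr, and only stdout is parsed as JSON. This is a sketch of the principle, not the shell script's actual implementation, and the reader invocation shown in the test is illustrative:

```python
import json
import subprocess
import sys

def run_reader(cmd: list) -> dict:
    """Run the reader, forward any warnings to our stderr, and parse only
    stdout as JSON so library warnings cannot corrupt the payload."""
    proc = subprocess.run(cmd, capture_output=True, text=True, check=True)
    if proc.stderr:
        print(proc.stderr, file=sys.stderr, end="")
    return json.loads(proc.stdout)
```

The shell equivalent is simply making sure the Python reader's warnings go to fd 2 and the gate script only captures fd 1 into the variable it later feeds to `jq`.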
Actionable comments posted: 1
🧹 Nitpick comments (2)

plugins/docs-tools/skills/git-pr-reader/scripts/git_commit_reader.py (2)

**224-226: Unused `branch` parameter in `get_range_files` methods.**

Both `GitHubCommitReader.get_range_files` (line 225) and `GitLabCommitReader.get_range_files` (line 352) declare a `branch` parameter that is never used in the method body. Consider removing it or documenting why it's reserved for future use.

♻️ Proposed fix for GitHubCommitReader:

```diff
     def get_range_files(
-        self, from_sha: str, to_sha: str, branch: str = "main"
+        self, from_sha: str, to_sha: str
     ) -> List[Dict[str, Any]]:
```

Also applies to: 351-353
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@plugins/docs-tools/skills/git-pr-reader/scripts/git_commit_reader.py` around lines 224 - 226, The branch parameter is declared but unused in both GitHubCommitReader.get_range_files and GitLabCommitReader.get_range_files; remove the branch: str = "main" parameter from both method signatures (and any corresponding interface/abstract method declaration) and update all internal callers to stop passing a branch argument, or if removal breaks external compatibility, mark it as intentionally unused by renaming to _branch and add a short comment/docstring stating it's reserved for future use; ensure the change is mirrored across overridden/related methods so signatures remain consistent.
166-177: Remove unused `found_since` variable.

The variable `found_since` is assigned on line 167 but never used anywhere in the function. This appears to be leftover from an earlier implementation.

♻️ Proposed fix

```diff
 results = []
-found_since = since_sha is None
 collected = 0
 for commit in commits_iter:
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@plugins/docs-tools/skills/git-pr-reader/scripts/git_commit_reader.py` around lines 166 - 177, Remove the unused variable assignment "found_since = since_sha is None" in the function that iterates over commits (the loop using commits_iter); it's dead code leftover—delete that single assignment and ensure no other references to found_since remain in the function (e.g., around the for commit in commits_iter block and related logic that checks since_sha).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@plugins/docs-tools/skills/docs-workflow-commits-ready/scripts/commits-ready-check.sh`:
- Around line 52-53: Remove unused variables or implement their intended
behavior: if you want a dry-run mode keep the --dry-run case and the DRY_RUN
variable and wire it into the script's side-effecting steps (e.g., in functions
that mutate state or call external commands like git push/notify, check DRY_RUN
and only print the action instead of executing); otherwise remove the --dry-run
branch and DRY_RUN assignment. Likewise either remove the ALL_SHAS,
FILTERED_COUNT, and FILTERED_MAP variables if they are vestigial, or use them
where filtering/aggregation is intended (populate ALL_SHAS with the list of
candidate commit SHAs, update FILTERED_MAP when filtering commits, and set
FILTERED_COUNT from its length) so downstream logic references them. Ensure
references use the exact names ALL_SHAS, FILTERED_MAP, FILTERED_COUNT and update
any conditional or reporting output to reflect dry-run behavior if kept.
---
Nitpick comments:
In `@plugins/docs-tools/skills/git-pr-reader/scripts/git_commit_reader.py`:
- Around line 224-226: The branch parameter is declared but unused in both
GitHubCommitReader.get_range_files and GitLabCommitReader.get_range_files;
remove the branch: str = "main" parameter from both method signatures (and any
corresponding interface/abstract method declaration) and update all internal
callers to stop passing a branch argument, or if removal breaks external
compatibility, mark it as intentionally unused by renaming to _branch and add a
short comment/docstring stating it's reserved for future use; ensure the change
is mirrored across overridden/related methods so signatures remain consistent.
- Around line 166-177: Remove the unused variable assignment "found_since =
since_sha is None" in the function that iterates over commits (the loop using
commits_iter); it's dead code leftover—delete that single assignment and ensure
no other references to found_since remain in the function (e.g., around the for
commit in commits_iter block and related logic that checks since_sha).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: ddbc1748-8e43-4a6c-900b-c3e7b70dfd86
📒 Files selected for processing (3)
- plugins/docs-tools/skills/docs-workflow-commits-ready/SKILL.md
- plugins/docs-tools/skills/docs-workflow-commits-ready/scripts/commits-ready-check.sh
- plugins/docs-tools/skills/git-pr-reader/scripts/git_commit_reader.py
✅ Files skipped from review due to trivial changes (1)
- plugins/docs-tools/skills/docs-workflow-commits-ready/SKILL.md
- Fix DOC_IMPACT marker case inconsistency (NONE → None) to match orchestrator short-circuit pattern - Add CI marker persistence tip to README commit-driven workflow docs - Log warning to stderr on file enrichment failures in git_commit_reader Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Actionable comments posted: 1
♻️ Duplicate comments (1)
plugins/docs-tools/skills/git-pr-reader/scripts/git_commit_reader.py (1)
427-430: ⚠️ Potential issue | 🟡 Minor

Error handling on file fetch still masks failures for `--drop-empty` logic.

When `get_commit_files()` fails, the code falls back to using `c.get("files_changed", 0)` (which may be the line-change count from `stats.total`) for both `files_changed` and `relevant_files_changed`. This means:

- Failed fetches are indistinguishable from successful ones in the output
- With `--drop-empty`, commits where the API failed won't be dropped even if they have no relevant files

Consider adding an explicit error marker (e.g., `c["files_fetch_error"] = True`) so downstream consumers can identify and handle these cases appropriately.

🛡️ Suggested improvement

```diff
 except Exception as exc:
     print(f"Warning: failed to fetch files for {c['sha'][:7]}: {exc}", file=sys.stderr)
-    c["files_changed"] = c.get("files_changed", 0)
-    c["relevant_files_changed"] = c.get("files_changed", 0)
+    c["files_changed"] = 0
+    c["relevant_files_changed"] = 0
+    c["files_fetch_error"] = True
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@plugins/docs-tools/skills/git-pr-reader/scripts/git_commit_reader.py` around lines 427 - 430, The except block that handles failures from get_commit_files currently masks errors by populating c["files_changed"] and c["relevant_files_changed"] from stats, making failed fetches indistinguishable and preventing --drop-empty from dropping those commits; update that except handler to set a clear error marker c["files_fetch_error"]=True, keep or set c["files_changed"] to the available fallback value, and explicitly set c["relevant_files_changed"]=0 (or None if upstream expects that) so downstream logic (including --drop-empty) can detect the fetch failure via files_fetch_error and treat the commit as having no relevant files; locate this change in the except handling around the get_commit_files call where c is modified.
🧹 Nitpick comments (1)
plugins/docs-tools/skills/git-pr-reader/scripts/git_commit_reader.py (1)
167: Dead code: `found_since` is assigned but never used.

This variable is set but never referenced in the function body.

♻️ Suggested fix

```diff
 results = []
-found_since = since_sha is None
 collected = 0
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@plugins/docs-tools/skills/git-pr-reader/scripts/git_commit_reader.py` at line 167, The variable found_since is assigned but never used; remove the unused assignment "found_since = since_sha is None" from the function to eliminate dead code, or if the original intent was to track whether we've encountered since_sha, use found_since in the function's control flow (e.g., replace any implicit checks with this boolean) so the variable is actually referenced; ensure references are to the exact variable names found_since and since_sha when applying the fix.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@plugins/docs-tools/skills/git-pr-reader/scripts/git_commit_reader.py`:
- Line 189: The field "files_changed" is populated with commit.stats.total
(which is total line changes), so update the code that builds commit records
(the object using "files_changed") to use len(commit.files) instead of
commit.stats.total to represent the number of files changed; alternatively, if
you intend to keep line-change counts, rename the field to something like
"lines_changed" and preserve commit.stats.total. Locate the creation of the
commit dict/record where "files_changed" is assigned (the usage of
commit.stats.total) and replace with len(commit.files) or rename the key
consistently, and ensure any downstream callers such as list_commits() or
cmd_list() are updated to expect the new key or value semantics.
---
Duplicate comments:
In `@plugins/docs-tools/skills/git-pr-reader/scripts/git_commit_reader.py`:
- Around line 427-430: The except block that handles failures from
get_commit_files currently masks errors by populating c["files_changed"] and
c["relevant_files_changed"] from stats, making failed fetches indistinguishable
and preventing --drop-empty from dropping those commits; update that except
handler to set a clear error marker c["files_fetch_error"]=True, keep or set
c["files_changed"] to the available fallback value, and explicitly set
c["relevant_files_changed"]=0 (or None if upstream expects that) so downstream
logic (including --drop-empty) can detect the fetch failure via
files_fetch_error and treat the commit as having no relevant files; locate this
change in the except handling around the get_commit_files call where c is
modified.
---
Nitpick comments:
In `@plugins/docs-tools/skills/git-pr-reader/scripts/git_commit_reader.py`:
- Line 167: The variable found_since is assigned but never used; remove the
unused assignment "found_since = since_sha is None" from the function to
eliminate dead code, or if the original intent was to track whether we've
encountered since_sha, use found_since in the function's control flow (e.g.,
replace any implicit checks with this boolean) so the variable is actually
referenced; ensure references are to the exact variable names found_since and
since_sha when applying the fix.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 9e2abf59-391b-421c-8a21-128d806120f7
📒 Files selected for processing (3)
- plugins/docs-tools/README.md
- plugins/docs-tools/agents/commit-analyst.md
- plugins/docs-tools/skills/git-pr-reader/scripts/git_commit_reader.py
✅ Files skipped from review due to trivial changes (1)
- plugins/docs-tools/README.md
🚧 Files skipped from review as they are similar to previous changes (1)
- plugins/docs-tools/agents/commit-analyst.md
```python
"author": commit.commit.author.name if commit.commit.author else "unknown",
"date": commit.commit.author.date.isoformat() if commit.commit.author else None,
"parents": [p.sha for p in commit.parents],
"files_changed": commit.stats.total if commit.stats else 0,
```
🧩 Analysis chain
🌐 Web query:
PyGithub commit stats.total vs files count
💡 Result:
In PyGithub, for a Commit object, commit.stats is a CommitStats object from the GitHub API's /repos/{owner}/{repo}/commits/{ref} endpoint. The GitHub API response includes a 'stats' object with fields: 'additions' (lines added), 'deletions' (lines deleted), and 'total' (total lines changed, i.e., additions + deletions). There is also a separate 'files' array listing changed files, each with per-file stats. PyGithub's CommitStats.total corresponds to the API's stats.total (total line changes). The number of files changed is len(commit.files), which may be limited to 300 by default (paginated up to 3000). They differ: stats.total is total lines changed across all files; files count is the number of unique files modified in the commit.
Citations:
- 1: https://pygithub.readthedocs.io/en/latest/github_objects/CommitStats.html
- 2: https://pygithub.readthedocs.io/en/v1.57/github_objects/Commit.html
- 3: https://github.com/PyGithub/PyGithub/blob/main/github/Commit.py
- 4: https://docs.github.com/en/rest/commits/commits
🏁 Script executed:
cat -n plugins/docs-tools/skills/git-pr-reader/scripts/git_commit_reader.py | sed -n '180,200p'

Repository: redhat-documentation/redhat-docs-agent-tools
Length of output: 1144
🏁 Script executed:
# Check the get_commit_files method to understand file handling
cat -n plugins/docs-tools/skills/git-pr-reader/scripts/git_commit_reader.py | sed -n '199,220p'

Repository: redhat-documentation/redhat-docs-agent-tools
Length of output: 1051
🏁 Script executed:
# Search for usages of "files_changed" in the codebase
rg "files_changed" plugins/docs-tools/skills/git-pr-reader/ -n

Repository: redhat-documentation/redhat-docs-agent-tools
Length of output: 1207
🏁 Script executed:
# Check lines around 423-424 to see what all_count and rel_count are
cat -n plugins/docs-tools/skills/git-pr-reader/scripts/git_commit_reader.py | sed -n '400,440p'

Repository: redhat-documentation/redhat-docs-agent-tools
Length of output: 1889
🏁 Script executed:
# Check what list_commits returns and if it's used before being enriched
cat -n plugins/docs-tools/skills/git-pr-reader/scripts/git_commit_reader.py | sed -n '1,50p'

Repository: redhat-documentation/redhat-docs-agent-tools
Length of output: 2015
🏁 Script executed:
# Look at the full list_commits method to understand the flow
rg "def list_commits" plugins/docs-tools/skills/git-pr-reader/scripts/git_commit_reader.py -A 30

Repository: redhat-documentation/redhat-docs-agent-tools
Length of output: 2146
commit.stats.total returns total line changes, not file count.
In PyGithub, stats.total is the sum of additions and deletions (total line changes), not the number of files changed. This creates a semantic mismatch with the field name files_changed. Using commit.stats.total here means the field contains incorrect data when consumed directly from list_commits() or if the downstream enrichment in cmd_list() fails. Consider using len(commit.files) for accurate file counts, or rename the field to clarify that it represents line changes rather than file count.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@plugins/docs-tools/skills/git-pr-reader/scripts/git_commit_reader.py` at line
189, The field "files_changed" is populated with commit.stats.total (which is
total line changes), so update the code that builds commit records (the object
using "files_changed") to use len(commit.files) instead of commit.stats.total to
represent the number of files changed; alternatively, if you intend to keep
line-change counts, rename the field to something like "lines_changed" and
preserve commit.stats.total. Locate the creation of the commit dict/record where
"files_changed" is assigned (the usage of commit.stats.total) and replace with
len(commit.files) or rename the key consistently, and ensure any downstream
callers such as list_commits() or cmd_list() are updated to expect the new key
or value semantics.
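The distinction the web query confirms is easy to demonstrate with plain data (dicts standing in for PyGithub's `Commit` attributes; the field values are made up for illustration):

```python
# commit.stats.total counts changed *lines* (additions + deletions);
# the number of changed *files* is len(commit.files).
commit = {
    "stats": {"additions": 120, "deletions": 30, "total": 150},
    "files": ["reader.py", "README.md", "tests/test_reader.py"],
}

lines_changed = commit["stats"]["total"]  # 150 -- line changes, not a file count
files_changed = len(commit["files"])      # 3   -- the actual file count
print(lines_changed, files_changed)       # → 150 3
```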
- Remove unused variables in commits-ready-check.sh (DRY_RUN, ALL_SHAS, FILTERED_COUNT, FILTERED_MAP) - Remove dead found_since assignment in git_commit_reader.py - Prefix unused branch parameter as _branch in get_range_files - Fix files_changed initial value (was line count, now 0 like GitLab) - Simplify exception handler to not mask --drop-empty defaults Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Actionable comments posted: 1
♻️ Duplicate comments (4)
plugins/docs-tools/skills/git-pr-reader/scripts/git_commit_reader.py (2)
418-429: ⚠️ Potential issue | 🟠 Major

Do not keep commits actionable when file fetch fails.

If `get_commit_files()` fails, the commit keeps the default include behavior (`relevant_files_changed` unset plus a fallback of `1` in the drop logic), which can trigger downstream analysis on unverified commits.

Proposed fix

```diff
 except Exception as exc:
     print(f"Warning: failed to fetch files for {c['sha'][:7]}: {exc}", file=sys.stderr)
-    # Leave relevant_files_changed unset; --drop-empty defaults to keeping commit
+    c["files_changed"] = 0
+    c["relevant_files_changed"] = 0
+    c["files_fetch_error"] = True
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@plugins/docs-tools/skills/git-pr-reader/scripts/git_commit_reader.py` around lines 418 - 429, When reader.get_commit_files(c["sha"]) raises, don't leave c["relevant_files_changed"] unset (which currently falls back to 1); instead explicitly mark the commit as having no relevant files so it won't be considered actionable: inside the except block set c["relevant_files_changed"] = 0 (and also set c["files_changed"] = 0 if you want total_files accounting to exclude this commit), and avoid incrementing total_files or relevant_files for this commit (update/remove the total_files/relevant_files increments accordingly). This change touches the error handler around reader.get_commit_files, and references c["relevant_files_changed"], c["files_changed"], total_files, relevant_files and should_include_file for locating the logic.
78-93: ⚠️ Potential issue | 🟠 Major

Fix platform detection for GitHub and self-hosted GitLab hosts.

`if "github.com" in host` matches unrelated hosts (e.g., `notgithub.com`), while `elif "gitlab" in host` rejects valid self-hosted GitLab domains that do not contain `gitlab`.

Proposed fix

```diff
-    if "github.com" in host:
+    if host == "github.com":
         platform = "github"
         owner = path_parts[0]
         repo = path_parts[1].removesuffix(".git")
         owner_repo = f"{owner}/{repo}"
-    elif "gitlab" in host:
+    elif host == "gitlab.com" or host != "github.com":
         platform = "gitlab"
         # GitLab supports nested groups: group/subgroup/project
         repo = path_parts[-1].removesuffix(".git")
         owner = "/".join(path_parts[:-1])
         owner_repo = f"{owner}/{repo}"
-    else:
-        raise ValueError(
-            f"Cannot determine platform from URL: {url}\n"
-            "Supported: github.com, gitlab.com (or self-hosted GitLab)"
-        )
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@plugins/docs-tools/skills/git-pr-reader/scripts/git_commit_reader.py` around lines 78 - 93, The platform detection incorrectly uses substring checks on host; update the logic that examines host (the host variable used with path_parts and url) to use more precise domain/suffix matching: detect GitHub when host == "github.com" or host.endswith(".github.com"); detect GitLab when host == "gitlab.com" or host.endswith(".gitlab.com"); for self-hosted GitLab allow a configurable list (e.g., GITLAB_HOSTS) or as a fallback treat any non-GitHub host with a standard repo path (path_parts length >=2 and repo.endswith(".git")/no suffix) as GitLab; keep the existing owner/repo extraction (owner = path_parts[0] / repo = path_parts[1].removesuffix(".git") for GitHub, and repo = path_parts[-1].removesuffix(".git"), owner = "/".join(path_parts[:-1]) for GitLab) and raise the same ValueError only when none of these detection methods match.

plugins/docs-tools/skills/docs-workflow-commits-ready/scripts/commits-ready-check.sh (2)
29-31: ⚠️ Potential issue | 🟠 Major

Use `${CLAUDE_PLUGIN_ROOT}` for cross-skill script resolution.

Hard-coded `../../git-pr-reader/...` is brittle and violates the repository rule for cross-skill script calls.

Proposed fix

```diff
-# Resolve git_commit_reader.py relative to this script
-SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
-COMMIT_READER="${SCRIPT_DIR}/../../git-pr-reader/scripts/git_commit_reader.py"
+: "${CLAUDE_PLUGIN_ROOT:?CLAUDE_PLUGIN_ROOT must be set}"
+COMMIT_READER="${CLAUDE_PLUGIN_ROOT}/skills/git-pr-reader/scripts/git_commit_reader.py"
```

As per coding guidelines "Cross-skill and cross-command script calls must use `${CLAUDE_PLUGIN_ROOT}` for paths like `python3 ${CLAUDE_PLUGIN_ROOT}/skills/git-pr-reader/scripts/git_pr_reader.py`".
Verify each finding against the current code and only fix it if needed. In `@plugins/docs-tools/skills/docs-workflow-commits-ready/scripts/commits-ready-check.sh` around lines 29 - 31, Replace the brittle relative path assigned to COMMIT_READER (currently built from SCRIPT_DIR and "../../git-pr-reader/...") with a cross-skill-aware invocation using the CLAUDE_PLUGIN_ROOT environment variable; update the COMMIT_READER reference so it invokes the git_pr_reader script via python3 ${CLAUDE_PLUGIN_ROOT}/skills/git-pr-reader/scripts/git_pr_reader.py (keep SCRIPT_DIR unchanged if still needed elsewhere) so all cross-skill script calls follow the repository guideline.
79-82: ⚠️ Potential issue | 🟠 Major

Fail fast on malformed marker JSON instead of silently replaying.

`jq ... || true` on Line 81 suppresses marker corruption and causes reprocessing from scratch.

Proposed fix

```diff
 if [[ -z "$SINCE_SHA" ]]; then
     if [[ -f "$MARKER_FILE" ]]; then
-        SINCE_SHA=$(jq -r '.last_processed_sha // empty' "$MARKER_FILE" 2>/dev/null || true)
+        if ! jq empty "$MARKER_FILE" >/dev/null 2>&1; then
+            echo "{\"error\": \"Invalid marker file: $MARKER_FILE\"}"
+            exit 1
+        fi
+        SINCE_SHA=$(jq -r '.last_processed_sha // empty' "$MARKER_FILE")
     fi
 fi
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@plugins/docs-tools/skills/docs-workflow-commits-ready/scripts/commits-ready-check.sh` around lines 79 - 82, The script silently ignores jq errors when reading MARKER_FILE into SINCE_SHA (the command using jq -r '.last_processed_sha // empty' "$MARKER_FILE" 2>/dev/null || true), which causes corrupted marker JSON to be treated as empty and replayed; change this so malformed JSON causes failure: remove the "|| true" suppression, propagate jq failures (or use jq -e) and on error print a clear message including MARKER_FILE and exit with a non-zero status so the problem is surfaced instead of silently replaying.
🧹 Nitpick comments (1)
plugins/docs-tools/skills/docs-workflow-commits-ready/scripts/commits-ready-check.sh (1)
86-95: Build reader invocation as an argument array to avoid splitting bugs.

Constructing `READER_ARGS` as a single string and expanding it unquoted can break on edge-case argument values.

Proposed refactor

```diff
-READER_ARGS="list $REPO --max $MAX_COMMITS --no-merges --drop-empty --json"
+READER_ARGS=(list "$REPO" --max "$MAX_COMMITS" --no-merges --drop-empty --json)
 if [[ -n "$BRANCH" ]]; then
-    READER_ARGS="$READER_ARGS --branch $BRANCH"
+    READER_ARGS+=(--branch "$BRANCH")
 fi
 if [[ -n "$SINCE_SHA" ]]; then
-    READER_ARGS="$READER_ARGS --since $SINCE_SHA"
+    READER_ARGS+=(--since "$SINCE_SHA")
 fi
@@
-COMMIT_OUTPUT=$(python3 "$COMMIT_READER" $READER_ARGS 2>"$COMMIT_STDERR") || {
+COMMIT_OUTPUT=$(python3 "$COMMIT_READER" "${READER_ARGS[@]}" 2>"$COMMIT_STDERR") || {
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@plugins/docs-tools/skills/docs-workflow-commits-ready/scripts/commits-ready-check.sh` around lines 86 - 95, The READER_ARGS string expansion is unsafe and can split on spaces; change READER_ARGS into a bash array (e.g., declare -a READER_ARGS) and build it as READER_ARGS=(list "$REPO" --max "$MAX_COMMITS" --no-merges --drop-empty --json), append branch/sha with READER_ARGS+=(--branch "$BRANCH") and READER_ARGS+=(--since "$SINCE_SHA") when set, then invoke the reader with python3 "$COMMIT_READER" "${READER_ARGS[@]}" 2>"$COMMIT_STDERR" so arguments are preserved; keep COMMIT_STDERR and COMMIT_OUTPUT usage the same.
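The splitting hazard behind this nitpick can be reproduced in a few lines (values are illustrative, not from the gate script):

```shell
ARGS_STRING='--branch "my branch"'   # string form: the quotes are literal characters
ARGS_ARRAY=(--branch "my branch")    # array form: one argument with the space intact
printf '[%s]' $ARGS_STRING; echo     # → [--branch]["my][branch"]
printf '[%s]' "${ARGS_ARRAY[@]}"; echo  # → [--branch][my branch]
```

The unquoted string expansion splits on whitespace and keeps the quote characters, so any argument containing a space reaches the reader mangled; the quoted array expansion preserves each argument exactly.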
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@plugins/docs-tools/skills/git-pr-reader/scripts/git_commit_reader.py`:
- Around line 170-176: The loop currently breaks on collected >= max_commits
only when since_sha is None, causing --max to be ignored if --since is provided;
update the logic in the loop that inspects commits (variables: collected,
max_commits, since_sha, commit.sha) so that you check and break on collected >=
max_commits regardless of since_sha (i.e., perform the max_commits check before
or independently of the since_sha checks), ensuring you still break if
commit.sha equals or startswith since_sha.
---
Duplicate comments:
In
`@plugins/docs-tools/skills/docs-workflow-commits-ready/scripts/commits-ready-check.sh`:
- Around line 29-31: Replace the brittle relative path assigned to COMMIT_READER
(currently built from SCRIPT_DIR and "../../git-pr-reader/...") with a
cross-skill-aware invocation using the CLAUDE_PLUGIN_ROOT environment variable;
update the COMMIT_READER reference so it invokes the git_pr_reader script via
python3 ${CLAUDE_PLUGIN_ROOT}/skills/git-pr-reader/scripts/git_pr_reader.py
(keep SCRIPT_DIR unchanged if still needed elsewhere) so all cross-skill script
calls follow the repository guideline.
- Around line 79-82: The script silently ignores jq errors when reading
MARKER_FILE into SINCE_SHA (the command using jq -r '.last_processed_sha //
empty' "$MARKER_FILE" 2>/dev/null || true), which causes corrupted marker JSON
to be treated as empty and replayed; change this so malformed JSON causes
failure: remove the "|| true" suppression, propagate jq failures (or use jq -e)
and on error print a clear message including MARKER_FILE and exit with a
non-zero status so the problem is surfaced instead of silently replaying.
In `@plugins/docs-tools/skills/git-pr-reader/scripts/git_commit_reader.py`:
- Around line 418-429: When reader.get_commit_files(c["sha"]) raises, don't
leave c["relevant_files_changed"] unset (which currently falls back to 1);
instead explicitly mark the commit as having no relevant files so it won't be
considered actionable: inside the except block set c["relevant_files_changed"] =
0 (and also set c["files_changed"] = 0 if you want total_files accounting to
exclude this commit), and avoid incrementing total_files or relevant_files for
this commit (update/remove the total_files/relevant_files increments
accordingly). This change touches the error handler around
reader.get_commit_files, and references c["relevant_files_changed"],
c["files_changed"], total_files, relevant_files and should_include_file for
locating the logic.
- Around line 78-93: The platform detection incorrectly uses substring checks on
host; update the logic that examines host (the host variable used with
path_parts and url) to use more precise domain/suffix matching: detect GitHub
when host == "github.com" or host.endswith(".github.com"); detect GitLab when
host == "gitlab.com" or host.endswith(".gitlab.com"); for self-hosted GitLab
allow a configurable list (e.g., GITLAB_HOSTS) or as a fallback treat any
non-GitHub host with a standard repo path (path_parts length >=2 and
repo.endswith(".git")/no suffix) as GitLab; keep the existing owner/repo
extraction (owner = path_parts[0] / repo = path_parts[1].removesuffix(".git")
for GitHub, and repo = path_parts[-1].removesuffix(".git"), owner =
"/".join(path_parts[:-1]) for GitLab) and raise the same ValueError only when
none of these detection methods match.
---
Nitpick comments:
In
`@plugins/docs-tools/skills/docs-workflow-commits-ready/scripts/commits-ready-check.sh`:
- Around line 86-95: The READER_ARGS string expansion is unsafe and can split on
spaces; change READER_ARGS into a bash array (e.g., declare -a READER_ARGS) and
build it as READER_ARGS=(list "$REPO" --max "$MAX_COMMITS" --no-merges
--drop-empty --json), append branch/sha with READER_ARGS+=(--branch "$BRANCH")
and READER_ARGS+=(--since "$SINCE_SHA") when set, then invoke the reader with
python3 "$COMMIT_READER" "${READER_ARGS[@]}" 2>"$COMMIT_STDERR" so arguments are
preserved; keep COMMIT_STDERR and COMMIT_OUTPUT usage the same.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: e4a17921-d1be-4320-b516-0fb360920374
📒 Files selected for processing (2)
- plugins/docs-tools/skills/docs-workflow-commits-ready/scripts/commits-ready-check.sh
- plugins/docs-tools/skills/git-pr-reader/scripts/git_commit_reader.py
```python
if collected >= max_commits and since_sha is None:
    break

if since_sha and commit.sha == since_sha:
    break
if since_sha and commit.sha.startswith(since_sha):
    break
```
--max is currently ignored for GitHub when --since is provided.
On Line 170, the cap only applies when since_sha is None, so a long history can return unbounded commits despite --max.
Proposed fix
```diff
-    if collected >= max_commits and since_sha is None:
+    if collected >= max_commits:
         break
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@plugins/docs-tools/skills/git-pr-reader/scripts/git_commit_reader.py` around
lines 170 - 176, The loop currently breaks on collected >= max_commits only when
since_sha is None, causing --max to be ignored if --since is provided; update
the logic in the loop that inspects commits (variables: collected, max_commits,
since_sha, commit.sha) so that you check and break on collected >= max_commits
regardless of since_sha (i.e., perform the max_commits check before or
independently of the since_sha checks), ensuring you still break if commit.sha
equals or startswith since_sha.
The create-pr step previously assumed the branch was already pushed to the remote. Add commit_and_push() to stage, commit, and push documentation changes before creating the PR/MR via the API. This closes the gap that prevented --create-pr from working in CI. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
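A minimal sketch of what such a stage/commit/push sequence can look like in shell. The branch name, path, and commit message are illustrative, and it assumes it runs inside a checkout that has an `origin` remote:

```shell
BRANCH="docs-update"
git checkout -b "$BRANCH"
git add docs/                      # stage only the generated documentation
if git diff --cached --quiet; then
    echo "No documentation changes to commit" >&2
else
    git commit -m "docs: automated update from commit workflow"
    git push -u origin "$BRANCH"   # branch must exist on the remote before the PR/MR API call
fi
```

The `git diff --cached --quiet` guard keeps the step idempotent: if nothing was staged, it skips the commit and push instead of failing.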
- Add technical-review to create-pr input dependencies in workflow YAML - Add --draft + --create-pr mutual exclusion check in orchestrator - Replace git add -A with targeted staging that excludes process artifacts (analysis, requirements, plan, reviews) from the PR commit - Fix setup-hooks.sh to use SCRIPT_DIR instead of CLAUDE_PLUGIN_ROOT env var, which is not guaranteed outside hook/MCP subprocesses Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@@ -0,0 +1,195 @@
#!/bin/bash
Could we just update the existing git-pr-reader to fetch commits?
Could the existing https://github.com/redhat-documentation/redhat-docs-agent-tools/blob/main/plugins/docs-tools/agents/requirements-analyst.md be updated to also take code repo commit ranges as an input?
The story of a code change becoming documentation:
are new commits after that, it says "yes, ready!" and hands over the list of commit SHAs.
If None → stop here, update the bookmark, done.
the manifests page" and "remove deleted targets from the reference table."
The whole thing is like a factory assembly line: each step reads the output of the previous step and produces input for the next one. If a step fails, the progress is saved and
you can resume from where it left off.
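That assembly-line-with-a-bookmark behavior can be sketched in a few lines of Python. The step names and the progress-file shape here are illustrative, not the orchestrator's actual format:

```python
import json
import pathlib

STEPS = ["commits-ready", "commit-analysis", "requirements", "create-pr"]
PROGRESS = pathlib.Path("progress.json")

def run_pipeline(run_step):
    """Run each step in order, skipping steps already recorded as done,
    and checkpoint progress after every successful step."""
    done = json.loads(PROGRESS.read_text())["done"] if PROGRESS.exists() else []
    for step in STEPS:
        if step in done:
            continue  # resume: this step finished in an earlier run
        run_step(step)  # may raise; progress saved so far stays on disk
        done.append(step)
        PROGRESS.write_text(json.dumps({"done": done}))
```

If a step raises, the progress file still records everything that completed, so the next invocation picks up at the failed step instead of starting over.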