73 changes: 73 additions & 0 deletions .github/workflows/drift-check.yml
@@ -0,0 +1,73 @@
name: Config drift check

on:
  schedule:
    # Weekly Monday 06:00 UTC. Matches the existing weekly hygiene cadence.
    - cron: '0 6 * * 1'
  workflow_dispatch:
    inputs:
      policy:
        description: 'Policy file path'
        type: string
        default: 'scripts/drift-policy.yaml'

permissions:
  contents: read
  issues: write

jobs:
  check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-python@v5
        with:
          python-version: '3.x'

      - name: Install PyYAML
        run: pip install pyyaml

      - name: Run drift detector
        id: drift
        env:
          GH_TOKEN: ${{ secrets.DRIFT_CHECK_TOKEN || secrets.GITHUB_TOKEN }}
        run: |
          # The script reads gh CLI which is preinstalled on ubuntu-latest.
          # GH_TOKEN above gives it auth.
          python scripts/drift_check.py \
            --policy "${{ inputs.policy || 'scripts/drift-policy.yaml' }}" \
            --output drift-report.md \
            --manifest drift-manifest.json || echo "DRIFT_DETECTED=1" >> $GITHUB_ENV

🔴 HIGH RISK

The workflow incorrectly sets DRIFT_DETECTED=1 for any non-zero exit code. Since the script uses exit code 1 for drift and 2+ for system/config errors, this conflates actual drift with infrastructure failures (e.g., GITHUB_TOKEN expiration). Update the 'Run drift detector' step to only set the flag when the exit code is exactly 1.

          echo "----- report start -----"
Comment on lines +31 to +42

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

# Locate drift_check.py
find . -name "drift_check.py" -o -name "*drift*" -type f | head -20

Repository: ANcpLua/github-settings-automation

Length of output: 167


🏁 Script executed:

# Look for the script
git ls-files | grep -i drift

Repository: ANcpLua/github-settings-automation

Length of output: 161


🏁 Script executed:

cat -n scripts/drift_check.py

Repository: ANcpLua/github-settings-automation

Length of output: 15777


Differentiate detector exit code 1 (drift) from 2 (failure).

The || echo "DRIFT_DETECTED=1" handler maps every non-zero exit to drift, including auth/config/script failures. The detector explicitly returns exit code 2 for configuration or authentication errors (documented in the script's docstring), but the workflow conflates this with drift detection. This causes real breakage to be silently misclassified as drift and allows the workflow to continue instead of failing.

Capture the exit code and only set DRIFT_DETECTED=1 when the exit code is exactly 1; propagate exit code 2 to fail the workflow.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/drift-check.yml around lines 31 - 42, Replace the blanket
"|| echo DRIFT_DETECTED=1" with explicit exit-code handling: run python
scripts/drift_check.py (the detector), capture its exit code, and if the code is
exactly 1 write "DRIFT_DETECTED=1" to $GITHUB_ENV; if the code is 2 re-exit with
code 2 so the job fails (propagate failures), otherwise exit/continue normally
for code 0. Target the Run drift detector step and the invocation of
drift_check.py, and ensure you reference DRIFT_DETECTED and $GITHUB_ENV when
setting the environment variable.
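
The exit-code handling requested above could also live in a small Python wrapper instead of shell branching. The following is a minimal sketch, not code from this PR: `classify_exit_code` and `run_and_record` are hypothetical helpers, and the exit-code meanings (0 clean, 1 drift, 2 configuration or auth error) are taken from the README in this change.

```python
import subprocess
from typing import Sequence, Tuple


def classify_exit_code(code: int) -> Tuple[bool, int]:
    """Map the detector's exit code to (drift_detected, step_exit_code)."""
    if code == 0:
        return (False, 0)   # clean run
    if code == 1:
        return (True, 0)    # drift: flag it, but let the step succeed
    return (False, code)    # 2 or anything else: propagate the failure


def run_and_record(cmd: Sequence[str], env_file: str) -> int:
    """Run the detector; append DRIFT_DETECTED=1 to env_file only on exit code 1."""
    result = subprocess.run(list(cmd))
    drift, step_code = classify_exit_code(result.returncode)
    if drift:
        with open(env_file, "a", encoding="utf-8") as fh:
            fh.write("DRIFT_DETECTED=1\n")
    return step_code
```

In a workflow step, `env_file` would be the path GitHub Actions exposes in the `GITHUB_ENV` environment variable, and the step would end with `sys.exit(step_code)` so that an auth or configuration error fails the job instead of opening a drift issue.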

          cat drift-report.md
          echo "----- report end -----"

      - name: Upload report
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: drift-report
          path: |
            drift-report.md
            drift-manifest.json
          retention-days: 90

      - name: Open issue on drift
        if: env.DRIFT_DETECTED == '1'
        env:
          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        run: |
          # Find existing open drift issue to update rather than spam new ones
          existing=$(gh issue list --repo "$GITHUB_REPOSITORY" \
            --label config-drift --state open \
            --json number --jq '.[0].number')
          if [ -n "$existing" ]; then
            gh issue comment "$existing" --repo "$GITHUB_REPOSITORY" \
              --body-file drift-report.md
          else
            gh issue create --repo "$GITHUB_REPOSITORY" \
              --title "Config drift detected ($(date -I))" \
              --label config-drift \
              --body-file drift-report.md
          fi
65 changes: 65 additions & 0 deletions README.md
@@ -153,3 +153,68 @@ carry every scope above.
`renovate-config`, `ancplua-docs`, `dotcov`) is a separate
coordinated move; sibling `O-ANcppLua/ANcpLua.OtelConventions.Api`
already lives in the org.


## Config drift detector

`scripts/drift_check.py` audits a watchlist of shared configuration files across all listed repositories and reports **semantic** drift — not byte-level. Two files with different whitespace, different JSON key order, or different YAML flow style collapse to the same equivalence class.
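
To illustrate the JSON case, here is a minimal sketch of how two byte-different files can collapse to one equivalence class. It is an assumption about the approach, not the code in `scripts/drift_check.py`; `norm_json` and `fingerprint` are hypothetical names.

```python
import hashlib
import json


def norm_json(text: str) -> str:
    """Parse JSON and re-serialise it in a single canonical form."""
    return json.dumps(json.loads(text), sort_keys=True, separators=(",", ":"))


def fingerprint(normalised: str) -> str:
    """Short content hash for comparing normalised files (hypothetical helper)."""
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()[:12]


a = '{"extends": ["config:recommended"], "automerge": true}'
b = '{\n  "automerge": true,\n  "extends": ["config:recommended"]\n}\n'
c = '{"extends": ["config:recommended"], "automerge": false}'

assert norm_json(a) == norm_json(b)   # whitespace and key order ignored
assert norm_json(a) != norm_json(c)   # a real semantic difference
```

Note that array element order stays significant in this sketch: only object keys are sorted.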

## What "semantic" means per file type

| File type | Normaliser | Ignores |
|---|---|---|
| `*.json`, `renovate.json`, `package.json`, `.markdownlint.json` | `json` | Whitespace, key order, trailing newlines |
| `*.yaml`, `*.yml`, `.coderabbit.yaml`, `dependabot.yml` | `yaml` | Whitespace, key order, flow vs block style |
| `*.xml`, `*.props`, `*.targets`, `nuget.config`, `Directory.Build.props` | `xml` | Insignificant whitespace, attribute order |
| `.editorconfig`, `.gitmodules`, `.npmrc`, `.globalconfig` | `ini` | Comments, section ordering, key ordering |
| `.gitignore`, `.dockerignore`, `.gitattributes`, `.markdownlintignore` | `lines` | Comments, blank lines, duplicate lines, ordering |
| `LICENSE`, `build.sh`, `build.cmd` | `raw` | Trailing whitespace |

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Align raw normalizer docs with actual behavior.

norm_raw uses .strip(), so it drops leading and trailing whitespace, not just trailing whitespace. Document this precisely to avoid report misinterpretation.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@README.md` at line 171, Update the README entry for the `raw` normalizer to
accurately reflect `norm_raw`'s behavior: state that `norm_raw` uses Python's
`.strip()` to remove both leading and trailing whitespace (not just trailing),
and adjust the table cell text from "Trailing whitespace" to something like
"Removes leading and trailing whitespace" so docs match the actual `norm_raw`
implementation.
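
The behaviour described in this comment can be reproduced with a short sketch. The body of `norm_raw` below is an assumption reconstructed from the policy file's description ("bytes as UTF-8 text, .strip()"); the real function is not shown in this diff.

```python
def norm_raw(data: bytes) -> str:
    """Decode as UTF-8 and strip whitespace at both ends of the whole text."""
    return data.decode("utf-8").strip()


plain = b"MIT License\n\nPermission is hereby granted\n"
padded = b"\n\n  MIT License\n\nPermission is hereby granted\n\n\n"

# Leading as well as trailing whitespace is dropped, so these compare equal.
assert norm_raw(plain) == norm_raw(padded)

# Whitespace inside the text, including at the end of interior lines, still counts.
assert norm_raw(b"MIT License \nText") != norm_raw(b"MIT License\nText")
```

Because `.strip()` acts on the text as a whole, "trailing whitespace" here does not mean per-line trailing whitespace.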


Override the auto-detected normaliser per entry in `scripts/drift-policy.yaml` with `normalizer: <name>`.

## Running locally

```bash
cd <this repo>
pip install pyyaml
python scripts/drift_check.py \
--policy scripts/drift-policy.yaml \
--output drift-report.md \
--manifest drift-manifest.json
```

The script exits with `0` if every watched path has a single semantic cluster, `1` if any drift is found, and `2` on a configuration or authentication error.

It reads from GitHub via the `gh` CLI, so `gh auth status` must report an authenticated session.

## CI

`.github/workflows/drift-check.yml` runs the detector every Monday 06:00 UTC. On drift, it opens (or updates) an issue labelled `config-drift` with the report inline. Artefacts (`drift-report.md`, `drift-manifest.json`) are retained for 90 days.

To trigger manually: Actions → Config drift check → Run workflow.

## Adding a repo or a file path

Edit `scripts/drift-policy.yaml`:

```yaml
repos:
  - ANcpLua/<new-repo>   # add here

watch:
  - path: <new-file>     # add here
    # optional: normalizer: json
```

The detector auto-picks the right normaliser from the file name; only specify `normalizer:` if you want to override.
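
A sketch of how file-name-based selection might work, using the patterns from the table above. `pick_normaliser` is a hypothetical function, and both the match order and the `raw` fallback for unlisted names are assumptions.

```python
from fnmatch import fnmatchcase

# Patterns copied from the README table; order and fallback are assumptions.
NORMALISER_PATTERNS = [
    ("json", ["*.json"]),
    ("yaml", ["*.yaml", "*.yml"]),
    ("xml", ["*.xml", "*.props", "*.targets", "nuget.config"]),
    ("ini", [".editorconfig", ".gitmodules", ".npmrc", ".globalconfig"]),
    ("lines", [".gitignore", ".dockerignore", ".gitattributes", ".markdownlintignore"]),
]


def pick_normaliser(path, override=None):
    """Return the normaliser name for a watched path."""
    if override:
        return override                      # explicit `normalizer:` wins
    name = path.rsplit("/", 1)[-1]           # match on the file name only
    for normaliser, patterns in NORMALISER_PATTERNS:
        if any(fnmatchcase(name, pattern) for pattern in patterns):
            return normaliser
    return "raw"                             # assumed fallback


assert pick_normaliser(".github/dependabot.yml") == "yaml"
assert pick_normaliser("Directory.Build.props") == "xml"
assert pick_normaliser(".prettierrc", override="json") == "json"
```

Under these assumptions, an extensionless file such as `.prettierrc` would fall through to `raw`, which is the kind of case where an explicit `normalizer:` entry is useful.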

## Interpreting the report

The report groups each watched path into **semantic clusters**. The largest cluster is marked `**canonical** (majority)`; smaller clusters are `drift #N`.

- **One cluster** → clean, all repos semantically equivalent for this file.
- **Two or more clusters** → drift. The smaller ones likely need to be aligned with the canonical (or the variation is legitimate and should be allowlisted — see below).
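
The clustering described above can be sketched as grouping repositories by their normalised content. This is an illustration only: `cluster` and `label` are hypothetical helpers, the repository names are placeholders, and the tie-break by repository name is an assumption.

```python
from collections import defaultdict


def cluster(normalised_by_repo):
    """Group repos with identical normalised content, largest group first."""
    groups = defaultdict(list)
    for repo, content in normalised_by_repo.items():
        groups[content].append(repo)
    clusters = [sorted(repos) for repos in groups.values()]
    # Largest cluster first; ties broken by first repo name (assumption).
    clusters.sort(key=lambda repos: (-len(repos), repos[0]))
    return clusters


def label(clusters):
    """Name the first cluster canonical and the rest drift #N."""
    return [
        ("canonical (majority)" if i == 0 else f"drift #{i}", repos)
        for i, repos in enumerate(clusters)
    ]


sample = {"org/a": "x", "org/b": "x", "org/c": "y"}
for name, repos in label(cluster(sample)):
    print(name, repos)
```

A path with a single cluster is clean; any additional cluster is reported as drift.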

### Legitimate variation

Some drift is intentional (e.g. anti-self-bump rules in `renovate.json` referencing each repo's own package name). For now, read the report and ignore those rows; a future revision could add an allowlist mechanism if false positives become a maintenance burden.
98 changes: 98 additions & 0 deletions scripts/drift-policy.yaml
@@ -0,0 +1,98 @@
# drift-policy.yaml — Watchset for the semantic config-drift detector.
#
# Add repos to the `repos:` list and file paths to the `watch:` list.
# Normalisers are auto-detected from filename; override per entry with
# `normalizer: <name>` if needed.
#
# Available normalisers:
# json — parse + sort-keys + canonical dump (whitespace and key-order
# ignored)
# yaml — parse + sort-keys + canonical dump
# xml — parse + sort attributes + strip insignificant whitespace
# ini — section[key]=value pairs, comments stripped, sorted
# lines — line-based; comments and blank lines stripped, lines deduped
# and sorted (.gitignore, .dockerignore, .gitattributes etc.)
# raw — bytes as UTF-8 text, .strip()

repos:
# ANcpLua framework
- ANcpLua/ANcpLua.NET.Sdk
- ANcpLua/ANcpLua.Roslyn.Utilities
- ANcpLua/ANcpLua.Analyzers
- ANcpLua/ANcpLua.Agents
- ANcpLua/ANcpLua.OpenTelemetry.SemanticConventions.Analyzers

# OTel-adjacent
- O-ANcppLua/ANcpLua.OtelConventions.Api
- O-ANcppLua/Nuke.OpenTelemetry.Conventions
- ANcpLua/typespec-otel-semconv

# Apps and libs
- ANcpLua/ErrorOrX
- ANcpLua/dotcov
- ANcpLua/TourPlanner
- ANcpLua/TourPlanner-Angular
- ANcpLua/Paperless
- ANcpLua/nhmw-digital-collection

# Docs and meta
- ANcpLua/ancplua-claude-plugins
- ANcpLua/ancplua-docs

# qyl — to enable later
# - O-ANcppLua/qyl

watch:
# Renovate + dependency-bot configs
- path: renovate.json
- path: .github/dependabot.yml

# Editor + linter configs
- path: .editorconfig
- path: .globalconfig
- path: .markdownlint.json
- path: .markdownlintignore
- path: .prettierrc

# Reviewer/automation configs
- path: .coderabbit.yaml
- path: .codecov.yml
- path: codecov.yml

# Repo hygiene
- path: .gitattributes
- path: .gitignore
- path: .dockerignore
- path: .gitmodules

# .NET specific
- path: nuget.config
- path: Directory.Build.props
- path: Directory.Packages.props
- path: Directory.Build.targets
- path: Version.props
- path: global.json

# Node
- path: .npmrc
- path: package.json
# legitimate to differ — each repo has its own deps; surface for awareness
# only. Drift is expected.

# Shared workflows
- path: .github/workflows/auto-merge.yml
- path: .github/workflows/coderabbit-autofix.yml
- path: .github/workflows/claude-code-review.yml
- path: .github/workflows/coderabbit.yml
- path: .github/workflows/ci.yml

# Build host
- path: build.sh
- path: build.cmd
- path: build.ps1

# License / agent docs (compared raw — surfaces accidental Mintlify-style
# default attribution leftovers)
- path: LICENSE
- path: CLAUDE.md
- path: AGENTS.md