refactor(crossref_review): consume TSV from mathlib4's post-build lint#39
Open
kim-em wants to merge 36 commits into
Open
refactor(crossref_review): consume TSV from mathlib4's post-build lint#39kim-em wants to merge 36 commits into
kim-em wants to merge 36 commits into
Conversation
kim-em
added a commit
to kim-em/mathlib4
that referenced
this pull request
May 22, 2026
When a PR touches `Mathlib/**/*.lean` and adds `@[stacks ...]`, `@[kerodon ...]`, or `@[wikidata ...]` attributes, this workflow posts (or updates, or deletes) a single bot comment with one row per added tag, showing the upstream label / description fetched from the source database, the Mathlib declaration's first-line signature, and the author's optional attribute comment. CI fails when any tag cannot be resolved upstream and emits a warning (not a failure) when only transient network errors occur. Trusted-script execution pattern (same as `PR_summary.yml`): * The PR head is checked out into `pr-branch/` and used as DATA only — no code from it ever runs in this `pull_request_target` context. * The base ref is checked out into `base/`; its `scripts/extract-crossref-tags.lean` and `scripts/crossref-snippet.lean` are the trusted versions we actually invoke, and its `lean-toolchain` is what `lake env lean --run` picks up. * `mathlib-ci` is checked out via the shared `get-mathlib-ci` action; its `scripts/crossref_review/crossref-pr-comment.py` orchestrator (added in leanprover-community/mathlib-ci#39) drives the extract → snippet → format pipeline. * The resulting comment file is piped through `scripts/pr_summary/update_PR_comment.sh` for post-once / update-in-place behaviour. A startup gate exits cleanly when the base ref pre-dates the companion Lean scripts (e.g. against an older master), so the workflow doesn't fail spuriously. `docs/workflows.md` gains a new entry; `scripts/README.md` points at the mathlib-ci location for the orchestrator. Stacked on leanprover-community#39664 (LSP widget) and depends on the orchestrator landing in leanprover-community/mathlib-ci#39. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2 tasks
kim-em
added a commit
to kim-em/mathlib4
that referenced
this pull request
May 22, 2026
When a PR touches `Mathlib/**/*.lean` and adds `@[stacks ...]`, `@[kerodon ...]`, or `@[wikidata ...]` attributes, this workflow posts (or updates, or deletes) a single bot comment with one row per added tag, showing the upstream label / description fetched from the source database, the Mathlib declaration's first-line signature, and the author's optional attribute comment. CI fails when any tag cannot be resolved upstream and emits a warning (not a failure) when only transient network errors occur. Trusted-script execution pattern (same as `PR_summary.yml`): * The PR head is checked out into `pr-branch/` and used as DATA only — no code from it ever runs in this `pull_request_target` context. * The base ref is checked out into `base/`; its `scripts/extract-crossref-tags.lean` and `scripts/crossref-snippet.lean` are the trusted versions we actually invoke, and its `lean-toolchain` is what `lake env lean --run` picks up. * `mathlib-ci` is checked out via the shared `get-mathlib-ci` action; its `scripts/crossref_review/crossref-pr-comment.py` orchestrator (added in leanprover-community/mathlib-ci#39) drives the extract → snippet → format pipeline. * The resulting comment file is piped through `scripts/pr_summary/update_PR_comment.sh` for post-once / update-in-place behaviour. A startup gate exits cleanly when the base ref pre-dates the companion Lean scripts (e.g. against an older master), so the workflow doesn't fail spuriously. `docs/workflows.md` gains a new entry; `scripts/README.md` points at the mathlib-ci location for the orchestrator. Stacked on leanprover-community#39664 (LSP widget) and depends on the orchestrator landing in leanprover-community/mathlib-ci#39. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Contributor
Author
|
Here's what the bot's comment renders as, generated by feeding the orchestrator a synthetic file with one valid + one bogus tag from each database. (No quoting trick — this is the literal contents of Cross-reference tags added by this PRWikidata
Stacks
Kerodon
⚠ One or more tags above could not be found upstream. Please verify the tag identifier or remove the attribute. |
kim-em
added a commit
to kim-em/mathlib4
that referenced
this pull request
May 22, 2026
When a PR touches `Mathlib/**/*.lean` and adds `@[stacks ...]`, `@[kerodon ...]`, or `@[wikidata ...]` attributes, this workflow posts (or updates, or deletes) a single bot comment with one row per added tag, showing the upstream label / description fetched from the source database, the Mathlib declaration's first-line signature, and the author's optional attribute comment. CI fails when any tag cannot be resolved upstream and emits a warning (not a failure) when only transient network errors occur. Trusted-script execution pattern (same as `PR_summary.yml`): * The PR head is checked out into `pr-branch/` and used as DATA only — no code from it ever runs in this `pull_request_target` context. * The base ref is checked out into `base/`; its `scripts/extract-crossref-tags.lean` and `scripts/crossref-snippet.lean` are the trusted versions we actually invoke, and its `lean-toolchain` is what `lake env lean --run` picks up. * `mathlib-ci` is checked out via the shared `get-mathlib-ci` action; its `scripts/crossref_review/crossref-pr-comment.py` orchestrator (added in leanprover-community/mathlib-ci#39) drives the extract → snippet → format pipeline. * The resulting comment file is piped through `scripts/pr_summary/update_PR_comment.sh` for post-once / update-in-place behaviour. A startup gate exits cleanly when the base ref pre-dates the companion Lean scripts (e.g. against an older master), so the workflow doesn't fail spuriously. `docs/workflows.md` gains a new entry; `scripts/README.md` points at the mathlib-ci location for the orchestrator. Stacked on leanprover-community#39664 (LSP widget) and depends on the orchestrator landing in leanprover-community/mathlib-ci#39. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
kim-em
added a commit
to kim-em/mathlib4
that referenced
this pull request
May 22, 2026
When a PR touches `Mathlib/**/*.lean` and adds `@[stacks ...]`, `@[kerodon ...]`, or `@[wikidata ...]` attributes, this workflow posts (or updates, or deletes) a single bot comment with one row per added tag, showing the upstream label / description fetched from the source database, the Mathlib declaration's first-line signature, and the author's optional attribute comment. CI fails when any tag cannot be resolved upstream and emits a warning (not a failure) when only transient network errors occur. Trusted-script execution pattern (same as `PR_summary.yml`): * The PR head is checked out into `pr-branch/` and used as DATA only — no code from it ever runs in this `pull_request_target` context. * The base ref is checked out into `base/`; its `scripts/extract-crossref-tags.lean` and `scripts/crossref-snippet.lean` are the trusted versions we actually invoke, and its `lean-toolchain` is what `lake env lean --run` picks up. * `mathlib-ci` is checked out via the shared `get-mathlib-ci` action; its `scripts/crossref_review/crossref-pr-comment.py` orchestrator (added in leanprover-community/mathlib-ci#39) drives the extract → snippet → format pipeline. * The resulting comment file is piped through `scripts/pr_summary/update_PR_comment.sh` for post-once / update-in-place behaviour. A startup gate exits cleanly when the base ref pre-dates the companion Lean scripts (e.g. against an older master), so the workflow doesn't fail spuriously. `docs/workflows.md` gains a new entry; `scripts/README.md` points at the mathlib-ci location for the orchestrator. Stacked on leanprover-community#39664 (LSP widget) and depends on the orchestrator landing in leanprover-community/mathlib-ci#39. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
kim-em
added a commit
to kim-em/mathlib4
that referenced
this pull request
May 22, 2026
When a PR touches `Mathlib/**/*.lean` and adds `@[stacks ...]`, `@[kerodon ...]`, or `@[wikidata ...]` attributes, this workflow posts (or updates, or deletes) a single bot comment with one row per added tag, showing the upstream label / description fetched from the source database, the Mathlib declaration's first-line signature, and the author's optional attribute comment. CI fails when any tag cannot be resolved upstream and emits a warning (not a failure) when only transient network errors occur. Trusted-script execution pattern (same as `PR_summary.yml`): * The PR head is checked out into `pr-branch/` and used as DATA only — no code from it ever runs in this `pull_request_target` context. * The base ref is checked out into `base/`; its `scripts/extract-crossref-tags.lean` and `scripts/crossref-snippet.lean` are the trusted versions we actually invoke, and its `lean-toolchain` is what `lake env lean --run` picks up. * `mathlib-ci` is checked out via the shared `get-mathlib-ci` action; its `scripts/crossref_review/crossref-pr-comment.py` orchestrator (added in leanprover-community/mathlib-ci#39) drives the extract → snippet → format pipeline. * The resulting comment file is piped through `scripts/pr_summary/update_PR_comment.sh` for post-once / update-in-place behaviour. A startup gate exits cleanly when the base ref pre-dates the companion Lean scripts (e.g. against an older master), so the workflow doesn't fail spuriously. `docs/workflows.md` gains a new entry; `scripts/README.md` points at the mathlib-ci location for the orchestrator. Stacked on leanprover-community#39664 (LSP widget) and depends on the orchestrator landing in leanprover-community/mathlib-ci#39. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
kim-em
added a commit
to kim-em/mathlib4
that referenced
this pull request
May 22, 2026
When a PR touches `Mathlib/**/*.lean` and adds `@[stacks ...]`, `@[kerodon ...]`, or `@[wikidata ...]` attributes, this workflow posts (or updates, or deletes) a single bot comment with one row per added tag, showing the upstream label / description fetched from the source database, the Mathlib declaration's first-line signature, and the author's optional attribute comment. CI fails when any tag cannot be resolved upstream and emits a warning (not a failure) when only transient network errors occur. Trusted-script execution pattern (same as `PR_summary.yml`): * The PR head is checked out into `pr-branch/` and used as DATA only — no code from it ever runs in this `pull_request_target` context. * The base ref is checked out into `base/`; its `scripts/extract-crossref-tags.lean` and `scripts/crossref-snippet.lean` are the trusted versions we actually invoke, and its `lean-toolchain` is what `lake env lean --run` picks up. * `mathlib-ci` is checked out via the shared `get-mathlib-ci` action; its `scripts/crossref_review/crossref-pr-comment.py` orchestrator (added in leanprover-community/mathlib-ci#39) drives the extract → snippet → format pipeline. * The resulting comment file is piped through `scripts/pr_summary/update_PR_comment.sh` for post-once / update-in-place behaviour. A startup gate exits cleanly when the base ref pre-dates the companion Lean scripts (e.g. against an older master), so the workflow doesn't fail spuriously. `docs/workflows.md` gains a new entry; `scripts/README.md` points at the mathlib-ci location for the orchestrator. Stacked on leanprover-community#39664 (LSP widget) and depends on the orchestrator landing in leanprover-community/mathlib-ci#39. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…prover-community#6) Add a local composite action that mints a GitHub App installation token using OIDC and signing from an Azure Key Vault. This lets us avoid storing private keys as GitHub secrets and use federated credentials to control access to them. To make this fast, we avoid using azure/login + az keyvault key sign (this is very slow) and use straight HTTP calls. Our flow is simple enough that this is manageable without a lot of complexity.
Mint an Azure storage access token via GitHub OIDC for cache uploads. To be used by the mathlib CI.
…ommunity#8) Add retry-with-backoff for all Zulip API calls, and exit gracefully (instead of crashing) when the API returns errors after retries. Previously, the script would crash with `KeyError: 'messages'` when Zulip rate-limited a `get_messages` call, because the error response has no `messages` key. This caused CI failures on unrelated PRs. Also documents that consuming workflows should use `continue-on-error: true` as a safety net. Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
…r-community#14) The Zulip instructions tell users to clone mathlib4, but `git checkout nightly-testing` fails in a fresh mathlib4 clone since that branch only exists in mathlib4-nightly-testing. Fix by fetching nightly-testing from the URL first, then creating the local branch if it doesn't exist. Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…over-community#15) The regex `[A-Za-z0-9_]*` excluded dots, so `backward.isDefEq.respectTransparency` was reported as just `backward.isDefEq`, and `backward.privateInPublic.warn` was lumped into `backward.privateInPublic`. Add `.` to the character class so the full option name is captured, and add a trailing space to the count pattern to prevent substring overlap between e.g. `backward.privateInPublic` and `backward.privateInPublic.warn`. Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…r-community#16) This script created the wrong diffs in lean-community/mathlib4#36982. This bug can be fixed by simply adding `public` to the list of modifiers to be dropped.
…munity#18) This PR makes `merge-lean-testing-pr.sh` create or reuse a real `nightly-testing` remote and ensures the local `nightly-testing` branch tracks `nightly-testing/nightly-testing`. This avoids the broken state where running the script in a fresh `mathlib4` clone leaves a later plain `git push` targeting `origin/nightly-testing` instead of `leanprover-community/mathlib4-nightly-testing`. :robot: Prepared with OpenAI Codex
…ommunity#19) Info messages don't indicate problems: the detailed breakdown in the Zulip report is pure noise (entries like "96 × info: stdout:" or "1 × info: Found git remote for proofwidgets at ..."). The top-line "Info messages: N" summary is retained in the counts bullet list. See https://leanprover.zulipchat.com/#narrow/channel/428973-nightly-testing/topic/Docgen.20status.20updates/near/586226267
…identity (leanprover-community#20) The comment-deduplication filter still searched for comments authored by `leanprover-community-bot`, but comments are now posted via the `mathlib-lean-pr-testing` GitHub App, whose user login is `mathlib-lean-pr-testing[bot]`. The lookup never matched, so the script always fell through to posting a new comment instead of editing the existing one. See https://leanprover.zulipchat.com/#narrow/channel/428973-nightly-testing/topic/Bot.20messages/near/585690908
…eanprover-community#22) The PR summary comment previously advised running `./scripts/reporting/technical-debt-metrics.sh pr_summary`, which was a stale relative path assuming the script was in `mathlib4`. Update both the in-script comment and the printed instructions to give a copy-pasteable recipe that clones `mathlib-ci` as a sibling of `mathlib4` and invokes the script from there. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…rface errors in PR comment (leanprover-community#26) * fix: replay auto commits in isolated worktrees and surface errors in PR comment The commit verification script had several bugs that combined to produce unhelpful PR comments and cascading false-positive failures: - Per-commit replay shared the main checkout, so any untracked file an earlier auto commit's command produced (e.g., from a `> hits.txt` redirect) blocked the next commit's `git checkout --detach $parent` with "untracked working tree files would be overwritten". The "fix" to earlier commit N+1 then reported a confusing checkout error instead of its real failure. - JSON output was hand-rolled with `echo "{ \"...\": \"$subject\" }"` and only escaped quotes. Subjects containing literal newlines (created by some commit-message generators) produced unparseable JSON, which then caused the comment-rendering step to fail silently with `jq: parse error: Invalid escape`. - The comment surfaced no actual error output. Authors saw "❌ Failed" and had to dig through CI logs to learn what went wrong. This rewrite: - Replays each auto commit in a fresh `git worktree add --detach $parent` under a single mktemp'd parent. Worktrees are removed after each iteration, on EXIT (trap), and `git worktree prune` runs at startup so any stale state from interrupted runs is cleared. The same isolation applies to the transient-commit cherry-pick replay. - Captures combined stdout+stderr via `tee` so live CI logs still show the command's output while we also save it for the JSON report. - Builds JSON via `jq -nc --arg ...` so commit subjects with arbitrary bytes (newlines, backslashes, control chars) round-trip safely. - Records a `failure_kind` (`command_failed`, `timed_out`, or `tree_mismatch`) plus the last 4KB / 40 lines of output, and for tree mismatches a truncated `git diff-tree --stat`. The summary script dispatches on `failure_kind` (no keyword matching on error text — the underlying tool's message is the diagnosis) and renders each failed commit as a `<details>` block containing the actual output. - Caps total comment size at 50KB, falling back to "see CI logs" links for late-truncated content. - Replaces the literal `✅`/`❌` escape strings (which only rendered correctly inside `jq` output) with real UTF-8 ✅/❌. - Drops the bash-4 `mapfile` and associative-array dependencies; both scripts now run on bash 3.2+, easing local testing. - Recovers from unparseable JSON: if `verify_commits.sh` crashes before writing the JSON file, the summary script emits a fallback comment pointing to the CI logs rather than producing empty output. Smoke-tested against the failing run for leanprover-community/mathlib4#38480. The new comment shows the actual bash parse error / xargs / sed / git rm output for each commit independently, with no cascading "checkout would overwrite" noise. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix: allow TIMEOUT_SECONDS to be overridden via environment variable Hard-coded 600s makes the timeout-handling path effectively unreachable from test harnesses that build a synthetic git history and need a quick artificial timeout. Use the same `${VAR:-default}` pattern already used for TRANSIENT_PREFIX et al., so a test scenario can set TIMEOUT_SECONDS=1 and exercise the timed_out failure_kind path in seconds. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…y#24) * feat(olean_diff.py): script for comparing oleans and generating a markdown comment * adjust error messages
…nprover-community#25) If we start counting this number as part of the tech debt metrics, this should encourage us to make more effective use of the module system. Tested locally on Mathlib by running `grep '^@\[expose\] | grep -v ...` and building the regex until it matched all the forms of exposed public sections.
Fix the local run instructions of declarations_diff.sh and import_trans_difference.sh so that they point so this repo. This style is already used in technical-debt-metrics.sh, which had the correct instructions.
…er-community#23) leanprover-community#19 removed info messages, but the weekly linting report in both Mathlib and CSLib make use of this (see leanprover-community/mathlib4#37236). Instead, make this an off by default option. I will follow up with Mathlib and CSLib PRs enabling this for the weekly linting reports.
…rover-community#21) This PR adds a counter for `set_option warning.simp.varHead false` occurrences to the weekly tech debt report. The `warning.simp.varHead` option silences a warning introduced in Lean about simp lemmas whose LHS has a metavariable as its head. Places where this warning is silenced are tech debt we want to track over time. Prompted by discussion on leanprover-community/mathlib4-nightly-testing#209 which adds several `set_option warning.simp.varHead false in` uses. 🤖 Prepared with Claude Code Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
leanprover-community#29) * fix: require all CI outcomes to succeed before adding `builds-mathlib` Previously the `builds-mathlib` label was added as soon as `TEST_OUTCOME == "success"`, regardless of whether `LINT_OUTCOME`, `BUILD_OUTCOME`, `NOISY_OUTCOME`, `ARCHIVE_OUTCOME`, or `COUNTEREXAMPLES_OUTCOME` had failed. The success-message branch had the same bug, short-circuiting all the failure-message cases below it. As a result, lean4 PRs whose Mathlib CI ran with a failing `simp` linter step (but a succeeding `test` step) were labelled `builds-mathlib` and shown a green "successfully built against this PR" comment, hiding the lint failure from PR authors and reviewers. This PR requires all six step outcomes (`BUILD`, `NOISY`, `ARCHIVE`, `COUNTEREXAMPLES`, `LINT`, `TEST`) to be `success` before the `builds-mathlib` label is added or the green-check status comment is posted. The `breaks-mathlib` branch already triggered on any failure, so its behaviour is unchanged; the asymmetry is now closed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * refactor: drive labels and status comment from one aggregate state Previously the label-mutation block and the comment-message block ran their own outcome classifications. After the previous commit, the label block knew only `all-success` and `any-failure`, while the message block also handled `cancelled` and a fall-through case. That left two real problems: * `cancelled`, `skipped`, or empty outcomes posted a yellow comment but did not touch labels, so a stale `builds-mathlib` or `breaks-mathlib` from an earlier run would persist after a reattempt. * When one outcome was `failure` and another `cancelled`, labels said `breaks-mathlib` but the comment said "was cancelled", hiding the failure. This refactor classifies the run once into one of `success`, `failure`, `cancelled`, or `other`, with `failure` taking precedence over `cancelled`. Both the labels and the message branch off the same state. The non-success/non-failure paths now explicitly remove both `builds-mathlib` and `breaks-mathlib` so labels cannot drift out of sync with reality. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This PR adds a "level" attribute to the tech debt reports posted weekly on Zulip and under every PR. The intention is that "weak" tech debt is allowed to increase in a PR (when reasonable), while "strong" tech debt should only increase in exceptional cases (for example, Lean updates or because we expose a new source of tech debt). I played around with a few formatting options for the report and settled on this one: show the strong tech debt and then the weak tech debt in two different tables. That should make it clear for the reader (and possible automation) if we care about a particular change or not.
…eanprover-community#31) This category of tech debt will give false positives until leanprover/lean4#13595 is merged (hopefully in the next 2 weeks).
…ver-community#32) The label/comment automation used `curl -L -s ...`, where `-s` swallows both progress output and error responses. When the GitHub App token lacked permission for the target repo (as happened with batteries-pr-testing branches before mathlib4#38945), `curl` would print the API's 4xx body nowhere and exit 0, so the script's `set -e` never tripped and the script silently misreported success ("Removing label X" / "Adding label Y" with no actual API effect). Switch all 7 curl calls to `-sS --fail-with-body`: silent on success, prints the response body and exits non-zero on HTTP errors. Combined with the script's existing `set -euo pipefail`, any future API failure now fails the workflow step loudly with a diagnostic message. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…er-community#33) * chore: route Zulip notifications to nightly-testing-mathlib This PR retargets the regression-report and docgen-status workflows, and the adaptation-PR helper script, from the shared `nightly-testing` channel to the new `nightly-testing-mathlib` channel, in line with the proposal to split `#nightly-testing` into per-project channels for Mathlib, Batteries, and Cslib. 🤖 Prepared with Claude Code * chore: update echo string to reference new channel name Codex review caught that the prose `echo` still referenced the old `#nightly-testing` channel even though the actual `zulip-send` call was already retargeted at `#nightly-testing-mathlib`. 🤖 Prepared with Claude Code
…eanprover-community#34) After leanprover-community#32 added `--fail-with-body` to the curl calls, every label-API 4xx aborts the script under `set -euo pipefail`. That's the desired behaviour for `add_label`, but `remove_label` returns 404 when the label isn't currently on the PR — which is the script's desired post-condition. The whole labelling step then aborts before reaching the `add_label` calls, so PRs end up with no `builds-mathlib` / `breaks-mathlib` label even when the build's outcome is clear. Inspect the HTTP status manually instead and accept 200 or 404; keep `add_label` strict. Co-authored-by: Kim Morrison <kim.morrison@anu.edu.au> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…community#36) In GNU grep BRE, `\|` is the alternation operator, so `grep "\|strong$"` matches every line (the empty alternative matches anything). As a result both `report strong` and `report weak` emitted the full table, and the weekly Zulip post showed every category twice. Use a literal `|` so only rows ending in `|strong` or `|weak` are kept. Co-authored-by: Kim Morrison <kim@lean-fro.org> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…nprover-community#35) The technical debt metrics seem to have broken recently, and in the CI log there is a fatal Git error about not detaching `HEAD`. The log does not show which command precisely failed. Looking at the set of error messages that Git produces, this can be traced to `git checkout` (which `git switch` is the modern version of). So replacing `git switch` and `git checkout` with `git switch --detach` should hopefully fix the issue. Example of broken tech debt reports: * https://github.com/leanprover-community/mathlib4/actions/runs/25331716666/job/74266774924#step:11:133 Interestingly `git checkout` in the transitive imports script does not seem to break anything. My suspicion is the `git checkout -q` switch turns the advice message into a fatal error? I'd have to look at the Git source code in more detail since the man page is not very clear. In any case, let's just turn it into `git switch --detach` there too for consistency.
Add `scripts/crossref_review/crossref-pr-comment.py` — the trusted orchestrator driven by mathlib4's `.github/workflows/crossref_review.yml` (introduced in a stacked mathlib4 PR). It runs the Lean extract / snippet scripts from the mathlib4 base checkout against the PR checkout as data, formats a Markdown table of every `@[stacks ...]`, `@[kerodon ...]`, and `@[wikidata ...]` attribute added by the PR, and exits 0 / 1 / 2 / 3 so the workflow can distinguish "nothing to post" / "post and pass" / "post and fail" / "post but treat as a network blip." The script writes its comment body to a file whose first line is the sentinel title used by `scripts/pr_summary/update_PR_comment.sh` to dedupe — so the workflow can pipe it straight through that helper for post-once / update-in-place behaviour. Pure stdlib only, so the CI job can run it without `pip install`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
kim-em
added a commit
to kim-em/mathlib4
that referenced
this pull request
May 22, 2026
When a PR touches `Mathlib/**/*.lean` and adds `@[stacks ...]`, `@[kerodon ...]`, or `@[wikidata ...]` attributes, this workflow posts (or updates, or deletes) a single bot comment with one row per added tag, showing the upstream label / description fetched from the source database, the Mathlib declaration's first-line signature, and the author's optional attribute comment. CI fails when any tag cannot be resolved upstream and emits a warning (not a failure) when only transient network errors occur. Trusted-script execution pattern (same as `PR_summary.yml`): * The PR head is checked out into `pr-branch/` and used as DATA only — no code from it ever runs in this `pull_request_target` context. * The base ref is checked out into `base/`; its `scripts/extract-crossref-tags.lean` and `scripts/crossref-snippet.lean` are the trusted versions we actually invoke, and its `lean-toolchain` is what `lake env lean --run` picks up. * `mathlib-ci` is checked out via the shared `get-mathlib-ci` action; its `scripts/crossref_review/crossref-pr-comment.py` orchestrator (added in leanprover-community/mathlib-ci#39) drives the extract → snippet → format pipeline. * The resulting comment file is piped through `scripts/pr_summary/update_PR_comment.sh` for post-once / update-in-place behaviour. A startup gate exits cleanly when the base ref pre-dates the companion Lean scripts (e.g. against an older master), so the workflow doesn't fail spuriously. `docs/workflows.md` gains a new entry; `scripts/README.md` points at the mathlib-ci location for the orchestrator. Stacked on leanprover-community#39664 (LSP widget) and depends on the orchestrator landing in leanprover-community/mathlib-ci#39. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…t step The mathlib4 cross-reference review pipeline is moving from a text-based extractor running independently in a `pull_request_target` workflow to an elaboration-based lint step that runs in the build pipeline's `post_steps` job after `lake exe cache get`. The lint step emits a machine-readable TSV (db / tag / declName / file / comment) as an artifact, and a `workflow_run`-triggered companion downloads it here. This commit rewrites `scripts/crossref_review/crossref-pr-comment.py` so it: * Takes `--tsv` (the path to the artifact) instead of two checkout paths. * Defensively parses the TSV (unknown databases dropped, missing fields padded). Nothing from PR-derived input is executed. * Fetches snippets directly via Python's `urllib` — Wikidata via the batched `wbgetentities` API, Stacks/Kerodon via the documented Gerby `/data/tag/<TAG>/content/statement` endpoint. Gerby fetches are parallelised with `ThreadPoolExecutor(max_workers=8)` to keep the workflow under a minute on PRs that touch large files. * Constructs the Markdown comment in trusted code (no longer relies on PR-derived Markdown), still using the `### Cross-reference tags added by this PR` sentinel so `update_PR_comment.sh` can dedupe. * Exits 0 / 1 / 2 / 3 with the same grammar as before. Stacked on leanprover-community/mathlib4#39666 upcoming force-push, which adds the post-build lint step and rewires `crossref_review.yml` as `workflow_run`-triggered. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
kim-em
added a commit
to kim-em/mathlib4
that referenced
this pull request
May 22, 2026
Replace the text-based cross-reference extractor with an elaboration-
based lint step that runs in the build pipeline's `post_steps` job
and a `workflow_run`-triggered companion that posts the bot comment.
**Why.** The previous text-based extractor (a ~250 LOC byte-level
scanner in `scripts/crossref.lean`) was a workaround for not having a
built Mathlib environment in the pull_request_target workflow. Now
that we run as a post-build lint step, we have a fully elaborated
environment and can read `Mathlib.CrossRef.tagExt` directly — cutting
~250 LOC and eliminating the fidelity gap with the Lean parser.
**Pieces:**
* `scripts/crossref.lean` — the `extract` subcommand is gone. New
`check --diff <range> --tsv <path>` subcommand uses
`importModules (loadExts := true)` (NOT `withImportModules`, which
passes `loadExts := false` and would leave `tagExt` empty), walks
`tagExt`, looks up each tagged declaration's source module via
`env.getModuleFor?`, and filters by `git diff --name-only` of the
PR range. Output is machine-readable TSV
(`db\ttag\tdeclName\tfile\tcomment`). No Markdown, no network,
no snippets — those are the privileged workflow's job.
* `.github/workflows/build_template.yml` — adds three new steps to
`post_steps`, after the import-graph upload:
- Run `lake env lean --run scripts/crossref.lean check ...`
against the diff between the PR base and HEAD.
- Write a `crossref-context.json` with PR number + head SHA so
the companion workflow doesn't have to rely on the unreliable
`workflow_run.pull_requests[0].number`.
- Upload both as a single `crossref-context` artifact.
* `.github/workflows/crossref_review.yml` — rewritten. Old trigger
`pull_request_target` → new trigger `workflow_run`. Downloads the
artifact, resolves the PR number through a three-stage fallback
chain (artifact JSON → workflow_run payload → `gh pr list` head
SHA match), calls the mathlib-ci orchestrator (which now consumes
the TSV and fetches snippets in Python), posts/updates/deletes
via `update_PR_comment.sh`, and fails the check on `missing` tags.
**Privilege model.** The lint step runs with the PR's permissions and
produces only machine-readable TSV. The `workflow_run` job runs in the
base ref's permission scope (with `pull-requests: write`) and never
executes anything from the artifact — it parses the TSV defensively,
fetches snippets server-side from three fixed upstream APIs, and
builds the Markdown comment in trusted code.
Companion mathlib-ci PR:
leanprover-community/mathlib-ci#39
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Codex's review of the stacked refactor flagged that `md_escape` only handled pipes and newlines, leaving PR-author comment strings (the optional `"…"` after a `@[wikidata QID "…"]` attribute) free to render arbitrary Markdown / HTML / `@mentions` in the bot's PR comment. * Split into `md_cell_escape` (for upstream-derived snippets — Wikidata labels, Stacks/Kerodon statements — that already passed through the upstream's own sanitiser; we just collapse line breaks and neutralise pipes) and `md_author_escape` (for PR-author comments — backslash- escapes Markdown metacharacters, HTML-entity-escapes angle brackets, inserts a zero-width space after `@` to defang mentions, and neutralises pipes). * `_doc_url(file, decl_name)` replaces the inline string interpolation for the doc-gen link. It does a suffix-only `.lean → .html` rewrite (so `foo.leanblah.lean` is handled) and URL-encodes the path and fragment so unusual characters in declaration names don't break the link. * Wikipedia / Stacks / Kerodon tag URLs are also URL-encoded. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
kim-em
added a commit
to kim-em/mathlib4
that referenced
this pull request
May 22, 2026
Replace the text-based cross-reference extractor with an elaboration-
based lint step that runs in the build pipeline's `post_steps` job
and a `workflow_run`-triggered companion that posts the bot comment.
**Why.** The previous text-based extractor (a ~250 LOC byte-level
scanner in `scripts/crossref.lean`) was a workaround for not having a
built Mathlib environment in the pull_request_target workflow. Now
that we run as a post-build lint step, we have a fully elaborated
environment and can read `Mathlib.CrossRef.tagExt` directly — cutting
~250 LOC and eliminating the fidelity gap with the Lean parser.
**Pieces:**
* `scripts/crossref.lean` — the `extract` subcommand is gone. New
`check --diff <range> --tsv <path>` subcommand uses
`importModules (loadExts := true)` (NOT `withImportModules`, which
passes `loadExts := false` and would leave `tagExt` empty), walks
`tagExt`, looks up each tagged declaration's source module via
`env.getModuleFor?`, and filters by `git diff --name-only` of the
PR range. Output is machine-readable TSV
(`db\ttag\tdeclName\tfile\tcomment`). No Markdown, no network,
no snippets — those are the privileged workflow's job.
* `.github/workflows/build_template.yml` — adds three new steps to
`post_steps`, after the import-graph upload:
- Run `lake env lean --run scripts/crossref.lean check ...`
against the diff between the PR base and HEAD.
- Write a `crossref-context.json` with PR number + head SHA so
the companion workflow doesn't have to rely on the unreliable
`workflow_run.pull_requests[0].number`.
- Upload both as a single `crossref-context` artifact.
* `.github/workflows/crossref_review.yml` — rewritten. Old trigger
`pull_request_target` → new trigger `workflow_run`. Downloads the
artifact, resolves the PR number through a three-stage fallback
chain (artifact JSON → workflow_run payload → `gh pr list` head
SHA match), calls the mathlib-ci orchestrator (which now consumes
the TSV and fetches snippets in Python), posts/updates/deletes
via `update_PR_comment.sh`, and fails the check on `missing` tags.
**Privilege model.** The lint step runs with the PR's permissions and
produces only machine-readable TSV. The `workflow_run` job runs in the
base ref's permission scope (with `pull-requests: write`) and never
executes anything from the artifact — it parses the TSV defensively,
fetches snippets server-side from three fixed upstream APIs, and
builds the Markdown comment in trusted code.
Companion mathlib-ci PR:
leanprover-community/mathlib-ci#39
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
The mathlib4 cross-reference review pipeline is moving from a text-based extractor running independently in a
pull_request_targetworkflow to an elaboration-based lint step that runs in the build pipeline'spost_stepsjob. The lint step emits a machine-readable TSV (db / tag / declName / file / comment) as an artifact, and aworkflow_run-triggered companion downloads it.This PR rewrites
scripts/crossref_review/crossref-pr-comment.pyto consume the new TSV instead of driving extraction itself:--tsv(the artifact path) instead of two checkout paths.urllib— Wikidata via the batchedwbgetentitiesAPI, Stacks/Kerodon via the documented Gerby/data/tag/<TAG>/content/statementendpoint. Gerby fetches are parallelised withThreadPoolExecutor(max_workers=8)to keep the workflow under a minute on PRs that touch large files.### Cross-reference tags added by this PRsentinel soupdate_PR_comment.shcan dedupe.Stacked on the companion mathlib4 PR: leanprover-community/mathlib4#39666
🤖 Prepared with Claude Code