Skip to content

feat: scripts for !downstream-check PR-validation directive#37

Open
marcelolynch wants to merge 9 commits into
masterfrom
marcelo/2026/05/DownstreamPrValidation
Open

feat: scripts for !downstream-check PR-validation directive#37
marcelolynch wants to merge 9 commits into
masterfrom
marcelo/2026/05/DownstreamPrValidation

Conversation

@marcelolynch
Copy link
Copy Markdown
Collaborator

Adds the mathlib-ci side of a new !downstream-check directive that lets reviewers validate a mathlib4 PR against curated downstream Lean projects from the PR conversation itself.

A user comments on a mathlib4 PR, say:

!downstream-check FLT, Toric --merge-branch, carleson@v1.2.3

The companion pr_check_downstream.yml workflow in mathlib4 (separate PR) catches the comment, calls the scripts here, then dispatches the heavy build work to leanprover-community/downstream-reports. Results come back as a comment on the PR.

The grammar is (for each comma-separated entry): <name-or-slug>[@<rev>] [--merge-branch]

  • <name-or-slug> short inventory name (e.g. FLT) or owner/repo slug
  • @<rev> (optional.) Any git refspec for the downstream's checkout.
  • --merge-branch: (optional per-entry). Flips that entry from the default LKG mode (cherry-pick PR onto the downstream's last-known-good mathlib) to merge mode (build against the PR's merge tree).

marcelolynch and others added 8 commits May 8, 2026 15:01
Per the new rebase-onto-LKG mode in downstream-reports, allow each name
in a /check-downstream comment to carry an optional '@LKG' suffix:

  /check-downstream FLT@lkg, Toric    # FLT in LKG mode, Toric in merge mode
  /check-downstream all@lkg           # every enabled downstream, LKG mode

validate_names.sh strips the suffix before inventory lookup, then
preserves it verbatim in resolved_names so the dispatched workflow on
the downstream-reports side can route each entry. all@lkg expands to
every enabled downstream with the suffix applied (still gated to
OWNER/MEMBER, as 'all' already is). Any non-@LKG suffix is rejected with
a clear ::error:: annotation.

post_ack_comment.sh needs no code change — it already wraps each name
token in backticks unchanged, so '@LKG' renders correctly in the ack.

See docs/internal/pr-validation-workflow.md in downstream-reports for
the full design.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The downstream-reports validation workflow used to take a head_repo
input so it could clone the fork on a fork PR. That turned out to be
unnecessary: GitHub's refs/pull/N/merge ref (and the API-mirrored
merge_commit_sha that we forward) lives on leanprover-community/mathlib4
even for fork PRs, and the merge commit's two parents already identify
the PR base and head. The fork is never needed.

Drop head_repo and head_sha from validate_names.sh's outputs and from
the docs. The dispatched workflow no longer needs them; nothing else
consumes them.

Paired with the corresponding downstream-reports and mathlib4 PRs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Paired with the downstream-reports / mathlib4 PRs that revise the
`!downstream-check` directive's grammar.

validate_names.sh:
  * Drop `all` / `all@lkg` shortcuts and the OWNER/MEMBER gating
    they required. Authorisation is gated upstream in mathlib4.
  * Drop `@lkg` suffix handling. LKG is the default mode now.
  * Parse each comma-separated entry as `<name>[@<rev>] [--merge-branch]`:
    bare name validated against inventory, suffix preserved verbatim,
    flag accepted only when it equals --merge-branch.
  * Stop accepting --author-association (no longer needed).
  * Stop emitting head_repo / head_sha (already removed in a previous
    commit; reinforced here in the docstring).

post_ack_comment.sh: render tokens with `@<rev>` / `--merge-branch`
inside backticks (no code change to the awk; updated wording and
docstring to clarify mode/rev are visible to the user).

README: rewrite the Comment grammar section, examples, and outputs
table.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
downstream-reports now POSTs one dispatch-level comment carrying every
matrix entry (summary table + per-entry sections), rather than one
comment per entry. Update the ack wording and the README so the user
expects a single follow-up comment, not a stream of them.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The ack comment used a single `<!-- pr-check-downstream:ack -->` marker
keyed only on the PR, so a second `!downstream-check` comment on the
same PR would edit-in-place over the first dispatch's ack — clobbering
its entry list and run link, and racing across concurrent dispatches.

Match the result-comment behaviour from downstream-reports@fa5116b:
always POST. Each dispatch leaves its own ack in the conversation,
pinned by its own run URL, so the audit trail of what got triggered
survives multiple dispatches.

Also rename the workflow step in the mathlib4 side from `Post / update
ack comment` to `Post ack comment` to match the new behaviour.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ackticks

Two minor polishes surfaced by a review pass:

* `validate_names.sh`: pre-check `.downstreams` with `jq -e` before
  pulling the name list. A truncated curl, an HTML 200 from a CDN
  error page, or an inventory schema regression now surface as a
  clean `::error::inventory at <url> is malformed or missing
  .downstreams` annotation rather than raw `jq: error (at …): Cannot
  iterate over null` stderr that GitHub treats as a generic step
  failure.

* `post_ack_comment.sh`: strip stray backticks from each entry before
  wrapping it in `` ` ``. `git check-ref-format` actually allows
  backticks in refnames (unusual, but legal), and a backtick mid-token
  would close the Markdown code span and let user-controlled text
  render unquoted. Stripping them keeps the rendered ack always
  matching the user's request and immune to markdown-escape mischief
  in the rev portion.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ndency

`validate_names.sh` no longer fetches downstream-reports' inventory file
to validate names against it. Bare tokens are passed through verbatim to
the dispatched workflow, which holds the inventory and is the single
source of truth for name resolution. Two consequences:

* mathlib-ci stays inventory-agnostic — no more cross-repo curl on every
  `!downstream-check`, no schema-drift surface between the two repos.
* The grammar accepts either the inventory short name (e.g. `FLT`) or
  the downstream's `owner/repo` slug (e.g. `leanprover-community/FLT`)
  for the same downstream. The dispatched build_matrix.py resolves
  both forms against `downstreams[*].name` and `downstreams[*].repo`
  respectively and reports a clear error from there on unknown tokens.

The script keeps grammar checks (empty entry, empty rev after `@`,
unknown flags) — those are independent of the inventory.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@bryangingechen
Copy link
Copy Markdown
Contributor

A few questions:

  • should these scripts be part of a standalone action (written in something that's not bash shell)? It might make it easier for you to iterate / test / maintain.
  • if there's an error in the comment syntax or if something else goes wrong, could you make it so that the bot still replies with a comment on how to fix it? (I tried to do this with splice-bot, for example)

Replaces the bash scripts (`validate_names.sh` + `post_ack_comment.sh`)
with two composite actions under `.github/actions/`, each backed by
testable Python:

* `check-downstream-validate` (`grammar.py`, `validate.py`,
  `error_comment.py`) — parses the directive grammar, resolves the
  PR's `merge_commit_sha`, and POSTs a user-facing comment to the PR
  when the directive is malformed.  The PR author sees what was wrong
  inline instead of having to dig through the workflow log.
* `check-downstream-ack` (`ack.py`) — renders + POSTs the dispatch ack
  comment.  Same backtick-stripping in entry rendering as before.

Grammar errors surface as comments here; runtime errors (unknown
downstream, build failures, etc.) come from the dispatched
downstream-reports workflow on its own.  The two repos stay decoupled
— this side never reads downstream-reports' inventory.

Testing:

* `tests/pr_check_downstream/` — 43 pytest unit tests covering the
  grammar parser (every valid form, every GrammarError variant,
  dedup, round-trip through serialize), the error-comment renderer
  (mention, fallback greeting, hint placement, usage reminder), the
  ack renderer (heading, short-SHA, backtick stripping), and the
  validate orchestration (with HTTP mocked out).
* `.github/workflows/test_pr_check_downstream.yml` — runs pytest on
  every push / PR touching the action sources or tests.

The bash scripts are gone; `scripts/pr_check_downstream/README.md`
now points readers at the two action directories and the design doc
in `downstream-reports`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@bryangingechen bryangingechen self-assigned this May 14, 2026
@marcelolynch
Copy link
Copy Markdown
Collaborator Author

marcelolynch commented May 15, 2026

I applied Bryan's feedback in the last commit.

Tested the latest changes here

https://github.com/leanprover-community/mathlib4/actions/runs/25934042795/job/76234806969
https://github.com/leanprover-community/mathlib4/actions/runs/25933977404/job/76234592891

I confirmed both comments were posted (but I deleted them so I lost the screenshot, ha)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants