Skip to content

feat: add reusable check-sprint-on-merge workflow#42

Open
lml2468 wants to merge 6 commits into
mainfrom
feat/check-sprint-reusable
Open

feat: add reusable check-sprint-on-merge workflow#42
lml2468 wants to merge 6 commits into
mainfrom
feat/check-sprint-reusable

Conversation

@lml2468
Copy link
Copy Markdown
Contributor

@lml2468 lml2468 commented May 23, 2026

Summary

Adds a reusable workflow that blocks PR merges when the Sprint field on the Octo Board (project #2) is not set or does not match the current sprint iteration.

This is part of the Sprint discipline enforcement initiative: every PR merged to main must belong to the active sprint.

How it works

  1. Queries the GitHub Projects GraphQL API for all sprint iterations.
  2. Determines the current sprint based on startDate and duration (using Asia/Shanghai timezone).
  3. Queries the PR's Sprint field value from the project board.
  4. Fails with an actionable error if Sprint is missing or mismatched — two distinct messages:
    • Type A (Sprint not set / PR not on board): tells developer to set Sprint then re-run
    • Type B (Sprint set to wrong iteration): tells developer which sprint is current and which was set

Security

  • Uses pull_request_target — runs in base branch context, never checks out PR code.
  • permissions: {} — GITHUB_TOKEN carries zero scopes; all API access goes through PROJECT_TOKEN PAT (read:project + read:org).

Branch protection note

After merge, configure each business repo's main branch protection to require status check: check-sprint / check-sprint

Copy link
Copy Markdown

@Jerry-Xin Jerry-Xin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Project relevance gate passes: this reusable workflow is in scope for the Mininglamp-OSS .github repo, but the current design does not reliably enforce the “current sprint at merge time” policy.

🔴 Blocking

  • 🔴 Critical — .github/workflows/reusable-check-sprint.yml:120 computes the current sprint only when the check run executes, but branch protection will continue accepting an older successful check on the same commit after the sprint changes. A PR can pass on the last day of Sprint W21 and still be merged unchanged during Sprint W22, violating the stated requirement that every PR merged to main belongs to the active sprint. This needs a merge-time freshness strategy, such as merge queue support, a required check that is re-created/re-run after sprint rollover, or another mechanism that prevents stale green checks from satisfying branch protection.

💬 Non-blocking

  • 🟡 Warning — .github/workflows/reusable-check-sprint.yml:144 fetches only projectItems(first: 10) and does not paginate. If a PR is attached to more than 10 projects, the Octo Board item may be omitted and the workflow will fail with the Type A message even when Sprint is set. Prefer pagination or a higher limit with explicit handling.
  • 🟡 Warning — .github/workflows/reusable-check-sprint.yml:142 interpolates repo-name directly into a GraphQL string. Repository names are normally constrained, but reusable workflow inputs are still caller-controlled. Using GraphQL variables would remove this class of query-construction risk and avoid escaping edge cases.
  • 🔵 Suggestion — .github/workflows/reusable-check-sprint.yml:117 assumes organization, projectV2, field, and configuration are all present. If the token lacks access or the field is renamed, the workflow will emit a Python traceback instead of an actionable GitHub Actions error.

✅ Highlights

  • The PR is relevant to this repository and aligns with the existing Octo Board automation.
  • permissions: {} and avoiding checkout are appropriate for the intended pull_request_target security model.
  • The workflow cleanly distinguishes missing Sprint from mismatched Sprint, which should make failures actionable once the merge-time enforcement gap is fixed.

yujiawei
yujiawei previously approved these changes May 23, 2026
Copy link
Copy Markdown

@yujiawei yujiawei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review — PR #42 (.github)

Summary

A new reusable workflow that blocks PR merges when the Sprint field on the Octo Board (project #2) is missing or does not match the current sprint iteration. Single-file change, ~195 lines, no checkout of PR code, dedicated read-only PAT, empty permissions:. The security model is sound and the logic is straightforward.

No P0 / P1 blockers found. Approving with a list of P2 hardening suggestions worth addressing in a follow-up (or in this PR, at the author's discretion).

What I checked

  • pull_request_target exposure surface (PR code execution, secret bleed, env-var injection, GraphQL injection)
  • PROJECT_TOKEN scope and blast radius
  • Sprint matching logic (date boundaries, timezone, missing iterations)
  • Status check correctness (re-run UX, SHA-binding gaps, post-merge state drift)
  • Reusability / parametrization
  • Failure modes (transient GraphQL errors, oversized projectItems page)

Security review (the part that mattered for the security_sensitive routing)

The pull_request_target choice is justified here: the check must run on fork PRs and needs access to PROJECT_TOKEN to query a private org project board. A pull_request-only approach would silently skip forks and defeat the gate. The risk surface is contained by:

  1. No checkoutactions/checkout is absent. PR-controlled code never executes. (lines 33–69)
  2. permissions: {}GITHUB_TOKEN carries zero scopes; the only credential used is PROJECT_TOKEN. (line 33)
  3. Token least-privilegeread:project + read:org. Even if the token were exfiltrated, it cannot mutate the project board, write to repos, or post comments.
  4. Env-var isolation for Python heredocPR_NUMBER, REPO_NAME, PROJECT_TOKEN flow env:os.environ[...] inside a single-quoted heredoc (<<'PYTHON_SCRIPT'), so shell-level interpolation is disabled and the only attacker-controlled input that reaches Python is the integer PR number (coerced via int(...) on line 48) and the repo name (from caller — see GraphQL-injection note below).

This matches the org's post-2026-05-16 hardening pattern (single-quoted heredoc + env-var indirection + no checkout). Good.

Findings

P2 — Defense-in-depth: prefer GraphQL variables over f-string interpolation

reusable-check-sprint.yml:135-156

data = graphql(f"""
{{
  repository(owner: "{REPO_OWNER}", name: "{REPO_SHORT}") {{
    pullRequest(number: {PR_NUMBER}) {{
      ...

REPO_OWNER / REPO_SHORT are interpolated directly into the GraphQL query string. In practice callers will pass ${{ github.repository }} (which GitHub guarantees is safe), so this is not exploitable today. But for a "reusable" workflow that future callers will copy without scrutiny, this is the wrong default. Use parameterised queries:

def graphql(query, variables=None):
    payload = {"query": query, "variables": variables or {}}
    ...

data = graphql(
    """
    query($owner: String!, $name: String!, $number: Int!) {
      repository(owner: $owner, name: $name) {
        pullRequest(number: $number) { ... }
      }
    }
    """,
    {"owner": REPO_OWNER, "name": REPO_SHORT, "number": PR_NUMBER},
)

Eliminates the injection class entirely.

P2 — Hardcoded org and project number undermine "reusable"

reusable-check-sprint.yml:53-55, 75

BOARD_URL = "https://github.com/orgs/Mininglamp-OSS/projects/2"
PROJECT_NUMBER = 2
...
organization(login: "Mininglamp-OSS") { projectV2(number: 2) { ... } }

The workflow advertises itself as reusable but is hardwired to one org and one board. Any future repo or sub-org reuse requires a fork. Promote org-login, project-number, and a derived board-url to workflow_call.inputs with the current values as defaults — keeps existing callers working while making the workflow actually reusable.

P2 — Caller trigger contract is undocumented; re-run UX depends on it

The reusable workflow's two error messages ("…then re-run this check in Actions UI") only work as advertised if the caller declares the right triggers. If a caller forgets synchronize, a fresh push after a developer sets Sprint will not re-evaluate the check, and the stale failure remains bound to the old SHA. Add a # Caller usage: example header block showing the expected trigger:

on:
  pull_request_target:
    types: [opened, synchronize, reopened]

Also worth documenting the status check name that callers must wire into branch protection. The job name check-sprint and the PR body's check-sprint / check-sprint only match if the caller's job is also named check-sprint. Pin this in the workflow header so callers don't guess.

P2 — Post-green Sprint mutation is not re-checked

The Sprint field on the project board can be edited after the workflow goes green for a given SHA. Branch protection won't re-evaluate, so a PR can be approved with Sprint=N, then have Sprint cleared, then merged. This is a fundamental limitation of GH project boards (no webhook → workflow re-trigger), not a bug in this PR, but it's worth either:

  • documenting the gap in the workflow header so admins aren't surprised, or
  • recommending GH's "require branches to be up to date before merging" branch protection setting, which forces a fresh push (and thus re-check) close to merge time.

P2 — Sprint gap blocks all PRs with an opaque error

reusable-check-sprint.yml:111-118

If today falls between iterations (gap day, or board configured with gaps), current_sprint is None and every PR fails with "Could not determine the current sprint iteration." This is a hard CI-wide outage with a non-actionable error for the average developer. Two options:

  • Treat consecutive-iteration coverage as an invariant and add a self-test workflow that verifies it daily.
  • Soften the message: tell developers who to ping (project admin) and what to fix (add an iteration covering today on the Octo Board).

P2 — No retry on transient GraphQL failures

reusable-check-sprint.yml:27-50

A single network blip or transient 5xx from api.github.com/graphql will block merge. Wrap graphql() in a simple retry (2 retries, exponential backoff, only on connection errors / 5xx) so the gate doesn't flap.

P2 — projectItems(first: 10) is a silent ceiling

reusable-check-sprint.yml:138

If a PR happens to be added to >10 projects and project #2 is not in the first 10 returned, the check incorrectly reports Type A ("Sprint not set"). Unlikely today, but easy to harden:

  • request first: 20 (cheap and covers any realistic case), or
  • iterate with pageInfo.hasNextPage / endCursor until project #2 is found.

Nit — Use zoneinfo instead of subprocess for timezone

reusable-check-sprint.yml:55-62

def get_today_shanghai() -> date:
    result = subprocess.run(["date", "+%Y-%m-%d"], ..., env={**os.environ, "TZ": "Asia/Shanghai"})
    return date.fromisoformat(result.stdout.strip())

Stdlib equivalent, no subprocess, no TZ env leakage:

from datetime import datetime
from zoneinfo import ZoneInfo
def get_today_shanghai() -> date:
    return datetime.now(ZoneInfo("Asia/Shanghai")).date()

Items I want a human to double-check before merge

  • PROJECT_TOKEN provisioning — confirm the PAT actually has read:project + read:org and is scoped to a service account, not a personal token tied to a real user (otherwise the gate breaks the day that person leaves).
  • Branch protection wiring across business repos — the PR body says "configure each business repo's main branch protection to require status check: check-sprint / check-sprint." Verify the exact <caller-job-name> / check-sprint string matches what GitHub will surface for each caller workflow.
  • Project board iteration coverage — verify the Octo Board has iterations defined contiguously through, say, 2026-12-31, so the "Could not determine current sprint" error path doesn't surprise anyone next quarter.

Verdict

Approving. The security model is right (no checkout, scoped PAT, empty permissions, env-var isolation), the logic is correct, and all my findings are non-blocking improvements / defense-in-depth. The P2 items are worth picking up either inline or as a follow-up — none of them justify holding this PR.

- FIX1 (Jerry-Xin critical): document stale-check limitation with
  recommended mitigation (Require branches to be up to date) and
  add enforce-freshness input for commit-timestamp guard
- FIX2 (both reviewers): replace f-string GraphQL interpolation
  with proper parameterised variables
- FIX3 (yujiawei nit): use zoneinfo.ZoneInfo instead of subprocess
  for Asia/Shanghai date
- FIX4 (Jerry-Xin warning): increase projectItems limit 10→20
- FIX5 (Jerry-Xin suggestion): wrap all data access in try/except
  with actionable error messages
- FIX6 (yujiawei P2): add retry with exponential backoff (3 attempts)
  for transient GraphQL/network failures
- FIX7 (yujiawei P2): improve sprint-gap error message with today's
  date and link to project settings
- FIX8 (yujiawei P2): add caller usage + status check name + token
  scopes to workflow header
- FIX9 (yujiawei P2): promote org-login and project-number to
  configurable inputs with defaults
@lml2468
Copy link
Copy Markdown
Contributor Author

lml2468 commented May 23, 2026

Thanks for the thorough reviews! All feedback addressed in the latest commit (a1879fd):

Jerry-Xin (blocking → resolved):

  • 🔴 Stale check: documented the known limitation prominently in the header with recommended mitigation ("Require branches to be up to date" branch protection setting). Added optional enforce-freshness input (default false) — when enabled, fails if PR head commit predates the current sprint start date, covering the residual risk case.
  • 🟡 projectItems(first: 10) → increased to first: 20
  • 🟡 GraphQL f-string interpolation → replaced with proper GraphQL variables throughout
  • 🔵 Bare data access → wrapped both sprint and PR data access in try/except (KeyError, TypeError) with actionable error messages

yujiawei (P2 suggestions → all addressed):

  • P2 GraphQL variables ✅ (same as above)
  • P2 Hardcoded org/project → promoted to org-login and project-number inputs with defaults
  • P2 Caller contract → added header block with caller usage example, expected trigger types, and status check name for branch protection
  • P2 Stale post-green mutation → documented in header + enforce-freshness option
  • P2 Sprint gap message → now includes today's date and links to project settings
  • P2 Retry → added 3-attempt exponential backoff for transient curl/5xx failures
  • P2 first: 10 ceiling → increased to first: 20
  • Nit timezone → switched to zoneinfo.ZoneInfo (stdlib, no subprocess)

Jerry-Xin
Jerry-Xin previously approved these changes May 23, 2026
Copy link
Copy Markdown

@Jerry-Xin Jerry-Xin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR is in scope for this repository and the workflow is ready to merge; I found no blocking bugs or security issues.

💬 Non-blocking

🟡 Warning: reusable-check-sprint.yml uses projectItems(first: 20) without pagination, and line 343 matches only by project number. A PR attached to more than 20 projects, or another project numbered 2, could be missed or misidentified. Consider querying the target project id and comparing IDs, with pagination.

🟡 Warning: reusable-check-sprint.yml shows uses: ./.github/workflows/reusable-check-sprint.yml, which only works if every caller repo has a local copy. For a central .github reusable workflow, the example should use something like Mininglamp-OSS/.github/.github/workflows/reusable-check-sprint.yml@main or a pinned tag.

🔵 Suggestion: reusable-check-sprint.yml could handle non-JSON / non-2xx responses explicitly. Today a malformed response can produce a Python traceback instead of a clean Actions error.

✅ Highlights

The relevance gate passes: shared GitHub Actions workflows are appropriate for Mininglamp-OSS/.github.

Security posture is sound for pull_request_target: no checkout, no PR code execution, permissions: {}, and API access isolated to PROJECT_TOKEN.

Validation performed: read full gh pr diff, reviewed the workflow with line numbers and related repo workflow patterns, parsed YAML successfully, and ran actionlint v1.7.12 successfully against the new workflow.

yujiawei
yujiawei previously approved these changes May 23, 2026
Copy link
Copy Markdown

@yujiawei yujiawei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review — PR #42 (.github)

Summary

Re-review of the new head commit (a1879fd) for the reusable Sprint-on-merge check. The author has comprehensively addressed every blocking and non-blocking item from the prior round: stale-check residual risk is now documented up front and gated by an opt-in enforce-freshness input; GraphQL queries are fully parameterised; bare data access is guarded; org/project are inputs with sane defaults; the caller contract (triggers, status-check name) is documented in the header; and a 3-attempt exponential-backoff retry is in place. The security model is sound (no checkout, permissions: {}, dedicated read-only PAT, env-var isolation from a single-quoted heredoc) and the iteration math is correct.

No P0 / P1 blockers found. Approving with a small set of P2 / nit suggestions worth picking up in a follow-up — none of them justify holding this PR.

What I re-checked

  • Resolution of every prior P0–P2 item against the new diff
  • pull_request_target exposure surface on the reusable layer (token reaches Python only via env:, no shell interpolation, single-quoted heredoc)
  • permissions: {} caps the inherited GITHUB_TOKEN to zero scopes for this job, regardless of caller perms
  • Retry logic (HTTP 5xx + non-zero curl exit; 4xx falls through to body parsing — surfaced via the outer try/except (KeyError, TypeError))
  • Iteration boundary math ([startDate, startDate + duration)) and Shanghai-date evaluation
  • enforce-freshness head-commit timestamp guard (committedDate parsing, Shanghai conversion, < sprint_start comparison)
  • projectItems(first: 20) ceiling is sufficient for realistic project-board attachment counts

Prior items — status on this commit

Prior finding Status
🔴 Jerry-Xin: stale check on sprint rollover Resolved — prominent header callout + recommended branch-protection mitigation + optional enforce-freshness for residual case
🟡 Jerry-Xin: projectItems(first: 10) ceiling Resolved — raised to first: 20
🟡 Jerry-Xin: GraphQL string interpolation Resolved — fully parameterised with $owner / $name / $number / $orgLogin / $projectNumber
🔵 Jerry-Xin: bare data access traceback Resolved — both queries wrapped in try/except (KeyError, TypeError) with actionable hints
P2 yujiawei: GraphQL variables Resolved (same as above)
P2 yujiawei: hardcoded org / project Resolved — org-login, project-number are inputs with defaults
P2 yujiawei: caller trigger contract Resolved — header has caller usage block, expected trigger types, status-check name
P2 yujiawei: post-green Sprint mutation Resolved (documented + enforce-freshness)
P2 yujiawei: sprint-gap error opacity Resolved — message now includes today's date and links to project settings
P2 yujiawei: no retry on transient failures Resolved — 3-attempt exponential backoff on curl exit / HTTP 5xx
Nit yujiawei: subprocess for timezone Resolved — switched to zoneinfo.ZoneInfo

Remaining findings (non-blocking)

P2 — enforce-freshness relies on committer-controlled committedDate

reusable-check-sprint.yml (freshness branch)

committed_date_str = pr["commits"]["nodes"][0]["commit"]["committedDate"]
...
if committed_date < sprint_start:
    sys.exit(1)

committedDate is a git field set by whoever made the commit (GIT_COMMITTER_DATE). A PR author can stamp an arbitrary recent date on an old commit and bypass the freshness intent. This is fine for the documented use case (defending against the accidental "branch up-to-date through sprint rollover" race) but the header should make the threat model explicit — i.e. it is a UX/correctness guard, not an adversarial control. If a stricter guard is ever needed, GitHub also exposes pushedDate on commits via REST, which is harder for the author to spoof.

P2 — 4xx responses fall through to body parsing

reusable-check-sprint.yml (graphql helper)

The retry path only triggers on non-zero curl exit or HTTP ≥ 500. A 401/403/404 currently falls through to json.loads(body_str) and then body["data"], raising KeyError. The outer caller catches it, but the surfaced message ("Failed to read Sprint iterations. Check that PROJECT_TOKEN has read:project + read:org scope …") only happens to be roughly right for 401, and is misleading for 404 or other 4xx. A small explicit branch ("HTTP 401 → token invalid / lacks scopes", "HTTP 403 → token denied access to the org or project", "HTTP 404 → project / repo not found") would make this much more actionable in CI.

P2 — projectItems(first: 20) is still a silent ceiling

Raising from 10 → 20 covers any realistic case today, but if a future PR ends up attached to more than 20 projects and project #2 is not in the first page, the check will incorrectly report "Sprint not set". Either iterate with pageInfo.hasNextPage / endCursor until the configured project is found, or add a defensive log when nodes length equals the limit so the truncation is at least visible.

P2 — MAX_RETRIES = 3 includes the initial attempt

The comment / commit message says "3-attempt exponential backoff", and the loop does run 3 times, but only 2 of those are retries (after attempt 0 / 1). Worth either renaming MAX_RETRIES to MAX_ATTEMPTS for clarity, or documenting "3 attempts (2 retries)" in the docstring — this is a future-you readability nit.

Nit — import subprocess inside the helper

def graphql(query: str, variables: dict = None) -> dict:
    import subprocess

Module cache makes this functionally fine, but for a top-level helper used on every call it's odd. Promote to the top of the script with the rest of the imports.

Nit — header ASCII box

The big banner box is informative, but for .github/workflows/*.yml viewed in IDEs without a monospace gutter it will eventually drift if anyone edits inside it. Consider a plain #-comment paragraph instead — same information, no box to keep aligned.

Items I want a human to double-check before merge

  • PROJECT_TOKEN ownership — confirm the PAT is a service-account token (not a personal token tied to a real user) with exactly read:project + read:org. The previous review flagged this; please verify once more before relying on this check in branch protection across business repos.
  • Branch-protection wiring — the documented status check name is check-sprint / check-sprint. This only matches when the caller's job is also named check-sprint. Worth one quick pass over the first couple of business-repo callers to confirm the literal string lines up before the gate is enforced.
  • Project board iteration coverage — verify the Octo Board has iterations defined contiguously through at least the end of the current quarter, so the "No active sprint covers today" error path is not the first signal that someone forgot to roll iterations forward.

Verdict

Approving. Security model is right (no checkout, permissions: {}, scoped PAT, env-var-isolated heredoc), Sprint logic and freshness guard are correct, and all the prior round's concerns are visibly resolved. The remaining items are P2 polish / defensive-coding suggestions that can land in a follow-up.

- FIX A: switch from project-number matching to project-ID matching
  (fetch projectV2.id in step 1, compare item.project.id in step 2);
  add pagination loop (max 5 pages, first: 20) with pageInfo cursor
  so PRs attached to >20 projects are handled correctly
- FIX B: fix caller usage example in header — was './.github/...'
  (only works with a local copy), now correctly shows the central
  reusable path Mininglamp-OSS/.github/.github/workflows/...@main
- FIX C: wrap json.loads() in try/except JSONDecodeError so proxy
  HTML error pages produce a clean ::error:: message instead of a
  Python traceback
@lml2468 lml2468 dismissed stale reviews from yujiawei and Jerry-Xin via c7aa6eb May 23, 2026 13:47
@lml2468
Copy link
Copy Markdown
Contributor Author

lml2468 commented May 23, 2026

Round 2 fixes pushed (c7aa6eb) — addressing Jerry-Xin's second-round warnings:

FIX A — Project ID matching + pagination (🟡 Warning resolved)

  • Step 1 now fetches projectV2 { id } and stores the project node ID
  • Step 2 matches item["project"]["id"] == project_id instead of comparing number
  • Pagination loop added: up to 5 pages × 20 items = 100 project attachments handled, following pageInfo { hasNextPage endCursor } until item is found or exhausted

FIX B — Caller usage example (🟡 Warning resolved)

  • Header now shows Mininglamp-OSS/.github/.github/workflows/reusable-check-sprint.yml@main — the correct central-reusable path, not the per-repo-local path

FIX C — Non-JSON response handling (🔵 Suggestion resolved)

  • json.loads(body_str) wrapped in try/except json.JSONDecodeError — proxy HTML error pages now produce a clean ::error:: message with HTTP code and truncated response, no traceback

yujiawei
yujiawei previously approved these changes May 23, 2026
Copy link
Copy Markdown

@yujiawei yujiawei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review — PR #42 (.github)

Independent review of feat: add reusable check-sprint-on-merge workflow (head c7aa6eb).

Summary

A focused, well‑documented reusable workflow that gates merges on a correctly‑set Sprint field on Project V2 #2. The security model is the right one for a PR gate that needs Project access: pull_request_target, no checkout of PR code, permissions: {} on the job, and a dedicated PROJECT_TOKEN PAT for the read‑only GraphQL calls. Sprint‑window math, timezone handling, and the optional freshness check all look correct. No P0/P1 issues. A handful of P2 suggestions below.

1. Verification

  • ✅ Security model is sound for a pull_request_target workflow:
    • No actions/checkout step anywhere in reusable-check-sprint.yml → PR code is never fetched or executed.
    • Job‑level permissions: {} (line 89) reduces GITHUB_TOKEN to zero scopes.
    • All API calls go through PROJECT_TOKEN, scoped to read:project + read:org.
    • No ${{ … }} interpolation of PR‑controlled fields into shell or Python source. PR‑derived inputs (pr-number, repo-name) flow only through env: and are consumed by Python via os.environ[…], so command‑injection vectors are closed.
  • ✅ Sprint window math is correct: start = date.fromisoformat(it["startDate"]), end = start + timedelta(days=it["duration"]), start <= today < end (lines 217–222) — half‑open interval, matches GitHub's iteration semantics.
  • ✅ Timezone handling is consistent: both today (line 196) and committed_date (lines 350–352) are normalized to Asia/Shanghai before comparison.
  • ✅ HTTP error handling: retries only on transient failures (curl exit ≠ 0, HTTP 5xx) with exponential backoff up to MAX_RETRIES = 3; 4xx and GraphQL‑level errors fail fast with a clear annotation.
  • ✅ Caller usage example uses pull_request_target (not pull_request), matching the secrets requirement.
  • ✅ Two distinct error messages for "Sprint not set" vs "Sprint mismatch" — actionable for developers.
  • enforce-freshness correctly compares head commit committedDate (UTC → Asia/Shanghai) against the current sprint_start.

2. Findings

P0 — Blockers

None.

P1 — Functional bugs / merge blockers

None.

P2 — Suggestions (non‑blocking)

  1. enforce-freshness defaults to false while the file's own header documents the residual stale‑check risk that this flag is designed to close (lines 17–22 vs. lines 75–79). The default means most callers will adopt the workflow without that mitigation. Consider one of:

    • Flip the default to true so callers opt out, not in.
    • Or, add a one‑line note in the README/usage example calling out which repos should set it (e.g. anything with long‑lived branches that could survive across a sprint boundary).
  2. PR project‑items pagination caps silently at MAX_PROJECT_PAGES = 5 (= 100 items). If a PR is somehow attached to >100 project items and the Octo Board entry is on a later page, board_item stays None and the PR is reported as "Sprint not set" with no way for the developer to tell apart "I forgot to set it" from "the script gave up looking". This is extremely unlikely in practice, but a print("::warning::reached project items pagination cap") after the loop would surface the truncation if it ever happens.

  3. Two near‑duplicate GraphQL query strings for the PR projectItems fetch (one with commits(last: 1), one without — lines 246–273 / 275–294). Functionally fine, but a single query that always selects commits(last: 1) and only consumes the field when ENFORCE_FRESHNESS is true would shrink the surface area for future drift between the two branches.

  4. Token is passed via curl -H "Authorization: bearer $TOKEN" rather than via --header @-/stdin or a file. Process listing on the runner exposes the token to anything else running on that runner during the brief subprocess.run window. Not exploitable in this design (no PR code runs in the same job), so this is style, not security — but if you want belt‑and‑suspenders, curl -H @- <<< "Authorization: bearer $TOKEN" keeps the secret off argv.

  5. board_item.get("sprint") on line 380board_item is the raw dict from projectItems.nodes, so .get is fine; just noting the chained None checks correctly handle "PR is on the board, but Sprint field has no value" vs. "PR is not on the board". Both paths print the same Type A error message ("Sprint not set"), which matches the PR description and is intentional.

  6. No unit tests / dry‑run mode. The script is small and self‑contained, so tests aren't strictly required, but a --dry-run (or pure‑Python testable function around pick_current_sprint(iterations, today) + evaluate(board_item, current_sprint)) would let future changes land with confidence. Consider for the next iteration.

3. Items for human verification (security_sensitive)

  • Confirm PROJECT_TOKEN in the org/repo secrets is scoped to read:project + read:org only — and not, e.g., a maintainer's broad PAT. The workflow can't enforce least‑privilege at the secret level; that's an org‑settings audit.
  • Confirm branch‑protection on each consuming repo will be configured to require the check-sprint / check-sprint status check, and to "Require branches to be up to date before merging" — both are out of band for this PR but are part of the security posture this workflow assumes (per the in‑file docstring on lines 12–16).
  • The default pull_request_target event in the usage example listens to labeled, unlabeled — fine for re‑evaluating the gate, but be aware that any caller adopting this verbatim will run the workflow on every label flip. Acceptable because the job only does read‑only GraphQL, but worth noting if cost/quotas matter.

4. Additional observations

  • File is placed alongside other reusable-*.yml workflows in the org .github repo, matching the existing convention (e.g. reusable-codeql.yml, reusable-pr-labeler.yml).
  • The workflow's own CI (actionlint, No tabs in workflow files) is green at this SHA.
  • The Python heredoc style is consistent with how the script will run on ubuntu-latest runners (Python 3.9+ guaranteed, zoneinfo available).

Verdict

APPROVED. No P0/P1 issues. The P2 items above are improvements, not merge blockers — recommend tracking #1 (enforce-freshness default) as a follow‑up so the documented stale‑check risk is closed by default across the org.

Jerry-Xin
Jerry-Xin previously approved these changes May 23, 2026
Copy link
Copy Markdown

@Jerry-Xin Jerry-Xin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reusable workflow is in scope for this .github repository and is ready to merge; I found no blocking correctness or security issues.

💬 Non-blocking

  • 🟡 Warning — .github/workflows/reusable-check-sprint.yml:52 and .github/workflows/reusable-check-sprint.yml:84: the documented token scopes mention only read:project and read:org, but the workflow also queries repository PR data and commits. For private repositories, document the required repository read access as well.

  • 🟡 Warning — .github/workflows/reusable-check-sprint.yml:260: MAX_PROJECT_PAGES = 5 caps lookup at 100 project items. That is probably fine operationally, but if a PR is attached to more than 100 projects, this can falsely report “Sprint not set.”

  • 🔵 Suggestion — .github/workflows/reusable-check-sprint.yml:354: the optional freshness guard uses the Git commit committedDate, which can be controlled by whoever creates the commit. It is useful as a stale-check mitigation, but should not be described as a hard freshness guarantee.

✅ Highlights

  • Uses workflow_call cleanly and avoids checking out PR code.
  • permissions: {} plus explicit PROJECT_TOKEN use keeps the default token minimized.
  • Current sprint matching by iteration ID is stronger than title matching.
  • Pagination over projectItems avoids the common first-page-only bug.
  • The embedded Python compiles cleanly after YAML block indentation stripping.

Per 梦林: we have exactly one project (number=2), so matching by
project number is correct and sufficient. The FIX-A from the previous
commit introduced unnecessary complexity (project node-ID lookup,
5-page pagination loop, after-cursor variables) that adds cognitive
overhead without any correctness benefit for our setup.

Retained from round-2:
- FIX B: fix caller usage example in header to central reusable path
- FIX C: wrap json.loads in try/except JSONDecodeError
@lml2468 lml2468 dismissed stale reviews from Jerry-Xin and yujiawei via 2adb4d8 May 23, 2026 14:04
@lml2468
Copy link
Copy Markdown
Contributor Author

lml2468 commented May 23, 2026

Reverted the over-engineered FIX-A from the previous commit (2adb4d8).

Context: 梦林 confirmed we have exactly one project (number = 2) and this will remain so by design. Project-number matching is therefore correct and sufficient — there is no ambiguity to resolve.

What's reverted:

  • Removed project node-ID lookup (projectV2 { id }) from step 1
  • Removed 5-page pagination loop with after/endCursor from step 2
  • Reverted project { id number } back to project { number }
  • Reverted item matching to item["project"]["number"] == PROJECT_NUMBER

Still kept from round 2:

  • ✅ FIX B: caller usage example corrected to Mininglamp-OSS/.github/.github/workflows/reusable-check-sprint.yml@main
  • ✅ FIX C: json.loads wrapped in try/except JSONDecodeError with clean error output

Current state: all round-1 fixes + FIX B + FIX C. Simple, correct, no unnecessary complexity.

Copy link
Copy Markdown

@Jerry-Xin Jerry-Xin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR is relevant to this .github repository, but the reusable merge gate does not reliably identify the intended Octo Board project.

🔴 Blocking

  • 🔴 Critical — .github/workflows/reusable-check-sprint.yml:262 and .github/workflows/reusable-check-sprint.yml:347: the workflow only reads projectItems(first: 20) and then matches projects by project.number. Project numbers are scoped to their owner, and the connection is paginated. A PR can be correctly added to Mininglamp-OSS project #2 with the right Sprint, but this check can still inspect a different project numbered 2, or miss the Octo Board entirely if it is beyond the first 20 project items. Since this is intended as a branch-protection gate, that creates false merge blocks. Query the configured org project id in Step 1, page through projectItems until that exact project.id is found, and compare by ID rather than number.

💬 Non-blocking

  • 🟡 Warning — .github/workflows/reusable-check-sprint.yml:323: enforce-freshness uses the head commit’s committed date as the freshness signal. That is acceptable as an optional guard, but it can reject legitimate carried-over work whose latest commit predates the sprint start even after the Sprint field is corrected. The documentation should make this operational tradeoff explicit before teams enable it.
  • 🔵 Suggestion — .github/workflows/reusable-check-sprint.yml:122: consider moving the inline Python into a small checked-in script once this grows further. That would make unit-style tests around iteration selection, pagination, and error cases much easier.

✅ Highlights

  • The pull_request_target security notes are clear, and the workflow avoids checking out or executing PR code.
  • The error messages distinguish missing Sprint from mismatched Sprint and give actionable remediation steps.

yujiawei
yujiawei previously approved these changes May 23, 2026
Copy link
Copy Markdown

@yujiawei yujiawei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review — PR #42 (.github)

1. Verification Summary

Check Result Evidence
pull_request_target security model documented and respected reusable-check-sprint.yml:24-32 — explicit comment + zero actions/checkout, no PR code execution
permissions: {} zeroes GITHUB_TOKEN scopes reusable-check-sprint.yml:84
Secret PROJECT_TOKEN declared required: true on the reusable boundary reusable-check-sprint.yml:75-78
Sprint-rollover stale-check limitation acknowledged with mitigation reusable-check-sprint.yml:3-22 (header), enforce-freshness opt-in at 262-281
Two distinct error paths (Type A: unset / Type B: mismatch) match PR description reusable-check-sprint.yml:294-301 (unset) vs 307-313 (mismatch)
Asia/Shanghai timezone applied consistently to "today" and freshness compare reusable-check-sprint.yml:175-178 and 269-271 both use ZoneInfo("Asia/Shanghai")
Iteration interval is half-open [start, end) reusable-check-sprint.yml:184-187 — no off-by-one between adjacent sprints
Retry policy is bounded and skips 4xx / GraphQL errors reusable-check-sprint.yml:113-152MAX_RETRIES=3, exponential backoff 1·2^attempt, no retry on logical errors

No P0 / P1 blockers found. Verdict: APPROVED with the P2 hardening suggestions below.

2. Findings

P2 — Hardening / quality (non-blocking)

P2-1. PROJECT_TOKEN passed on the curl command line
reusable-check-sprint.yml:117-124 builds the Authorization: bearer <token> header as a positional argv to curl, which makes the secret transiently visible in /proc/<curl-pid>/cmdline to any other process owned by the runner user. On a GitHub-hosted runner the practical exposure window is small (single-tenant, single-step job, ephemeral VM) and permissions: {} already neutralises GITHUB_TOKEN, so this is not an immediate exploit path. But the standard hardening is one of:

  • drop curl for urllib.request and pass the header via the Request object (token never leaves the Python process); or
  • keep curl and feed the header from a temp file via -H "@/tmp/auth" or --config /tmp/curlrc.
    Recommend the urllib option since it also removes the subprocess import inside graphql() (131).

P2-2. No per-request curl timeout
reusable-check-sprint.yml:115-127 has no --max-time / --connect-timeout. The job-level timeout-minutes: 5 (82) is the only backstop, so a hung TCP socket can burn the whole budget on a single attempt and starve the remaining retries. Adding e.g. --connect-timeout 10 --max-time 30 keeps the retry loop meaningful.

P2-3. HTTP-status parsing is not defensive
reusable-check-sprint.yml:142-144 does int(lines[1]) with no try/except. If curl ever writes a non-numeric trailer (network teardown mid-write, -w template breakage during a future edit), this raises an uncaught ValueError with a confusing traceback instead of the formatted ::error:: message the rest of the function emits. A small try: http_code = int(lines[1]) except ValueError: http_code = 0 (treated as a retryable failure) is enough.

P2-4. Misleading error text when the Sprint field is absent
reusable-check-sprint.yml:167-178 catches (KeyError, TypeError) from data["organization"]["projectV2"]["field"]["configuration"]["iterations"] and always blames token scopes. But the same exception fires when the Sprint field doesn't exist on the project (e.g. it was renamed, or someone changed its type to single-select). Suggest distinguishing the two: explicitly check field is None first and surface a "Sprint field not found on project #X" message before falling back to the scope hint.

P2-5. projectItems(first: 20) is unpaginated
reusable-check-sprint.yml:200, 234 query at most 20 project items per PR. Today no PR has anywhere near 20 boards attached, but if that ever changes the loop at 285-289 silently fails to find Octo Board → "Sprint not set" error → developer confusion. Either bump to a safe upper bound (e.g. first: 100, the GraphQL hard max) or walk pageInfo. Worth at least a TODO comment acknowledging the cap.

P2-6. Usage example over-triggers
The header example at reusable-check-sprint.yml:44-46 lists [opened, synchronize, reopened, labeled, unlabeled]. Sprint is a Projects field, not a label, so labeled/unlabeled events can't change the answer — they just spend Actions minutes and clutter the checks UI. Recommend trimming the example to [opened, synchronize, reopened].

3. Recommendations

  1. (P2-1) Replace curl with urllib.request in graphql() to remove the cmdline secret exposure and the inline import subprocess. ~25 LOC delta, no behavior change.
  2. (P2-2) Add --connect-timeout 10 --max-time 30 to curl (only if P2-1 isn't taken).
  3. (P2-3) Wrap int(lines[1]) in try/except ValueError and treat as retryable.
  4. (P2-4) When field is None, emit a Sprint-field-missing error before the generic scope hint.
  5. (P2-5) Bump projectItems(first: 20) to first: 100 (current GraphQL ceiling) or paginate.
  6. (P2-6) Drop labeled, unlabeled from the documented trigger set.

None of these block merge. Address them in a follow-up if cycle time is tight; otherwise rolling them into this PR is cheap.

4. Additional Observations

  • pull_request_target is the right call here — the reusable workflow does not invoke actions/checkout, does not run PR-supplied code, and does not consume any PR-controlled inputs other than pr-number/repo-name from the trusted event context. The classic pull_request_target foot-gun (running fork code with secrets) does not apply.
  • enforce-freshness design is sound — comparing the committed date (UTC, normalised to Shanghai date) against sprint_start (Shanghai date) at reusable-check-sprint.yml:269-281 correctly closes the documented stale-check window when the caller opts in. Worth mentioning in the rollout doc that branch-protection's "Require branches to be up to date before merging" combined with enforce-freshness: true is the belt-and-suspenders configuration.
  • fieldValueByName(name: "Sprint") at reusable-check-sprint.yml:208, 219 is case-sensitive. If a future project rename to "sprint" / "SPRINT" happens, the check silently degrades to "Sprint not set". Not a blocker, but worth one line in the README explaining the field name is exact-match.
  • Status check naming — branch-protection note at reusable-check-sprint.yml:64 says register check-sprint / check-sprint. That matches the caller pattern in the usage example (jobs.check-sprint calling reusable workflow whose job is also check-sprint). Callers who pick a different job name will need a different status-check name; consider calling that out explicitly.
  • Followup ops note: rotation policy for PROJECT_TOKEN (a long-lived PAT with read:project + read:org) belongs in the rollout runbook, not in this PR. Flagging only so it doesn't get lost.

…e docs

- FIX1: replace curl subprocess with urllib.request; token stays
  in-process (no /proc/pid/cmdline exposure); MAX_RETRIES→MAX_ATTEMPTS;
  explicit HTTP 401/403/404 error messages; timeout=30 on urlopen
- FIX2: explicit 'if field is None' check before attribute access;
  Sprint-field-renamed/removed now surfaces a clear error instead of
  a misleading 'check token scopes' hint
- FIX3: projectItems first:20→first:100 (GraphQL hard max); eliminates
  the silent truncation ceiling without needing pagination
- FIX4: remove 'labeled, unlabeled' from trigger example in header;
  label events cannot change a Projects Sprint field
- FIX5: clarify PROJECT_TOKEN scope docs — read:project+read:org for
  public repos; also add read:repo for private repos
@lml2468
Copy link
Copy Markdown
Contributor Author

lml2468 commented May 23, 2026

Round 3 fixes pushed (6ae2402) — addressing remaining P2 items from yujiawei and responding to Jerry-Xin's latest blocking concern:


Re: Jerry-Xin 🔴 blocking — project number vs ID matching

Design decision from 梦林 (project owner): Mininglamp-OSS has exactly one project board and will always have exactly one — project #2 (Octo Board). This is a stated design constraint, not an assumption. In this context, matching by project.number == PROJECT_NUMBER is both correct and unambiguous. There is no other project numbered 2 in our org, and PRs are only auto-added to this one project. The previous attempt to switch to ID-matching + pagination was reverted specifically because it introduced complexity without correctness benefit for our setup.


yujiawei P2 fixes (6ae2402):

FIX 1 — curl → urllib.request (P2-1)
Token is now set via Request(headers={"Authorization": ...}) — never appears in /proc/<pid>/cmdline. subprocess import removed entirely. MAX_RETRIES renamed to MAX_ATTEMPTS with docstring clarifying "3 attempts (2 retries)". HTTP 401/403/404 each produce specific error messages. urlopen(timeout=30) replaces the missing curl timeouts.

FIX 2 — Explicit null Sprint field guard (P2-4)
Added if field is None: check immediately after fetching field. When the Sprint field is renamed/removed on the board, users now see "Sprint field not found on project #2. Has the 'Sprint' field been renamed or removed?" — not the misleading "check token scopes" message.

FIX 3 — projectItems first:20 → first:100 (P2-5)
100 is the GraphQL hard max. Eliminates the silent ceiling without pagination complexity.

FIX 4 — Remove labeled/unlabeled from trigger example (P2-6)
Label events cannot change a GitHub Projects field. Trimmed to [opened, synchronize, reopened].

FIX 5 — Clarify PROJECT_TOKEN scope docs
Header and secret description now document: read:project + read:org for public repos; also add read:repo for private repos.

yujiawei
yujiawei previously approved these changes May 23, 2026
Copy link
Copy Markdown

@yujiawei yujiawei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review — PR #42 (.github)

Summary

A focused, well-engineered reusable workflow that blocks merges when a PR's Sprint field on Project V2 #2 is missing or does not match the current iteration. Single file (.github/workflows/reusable-check-sprint.yml, +403 LOC, no other changes). The security model is the right one for a gate that needs Project access — pull_request_target + zero actions/checkout + permissions: {} + dedicated read-only PROJECT_TOKEN PAT + env-var indirection into a single-quoted Python heredoc. Sprint window math, timezone handling, retry policy, and the optional freshness guard all look correct.

This commit visibly resolves the earlier review rounds' blocking items (stale-check disclosure + opt-in enforce-freshness, GraphQL variables, explicit field-not-found error, urllib over curl, bounded retries with backoff, projectItems(first: 100) ceiling). No P0 / P1 blockers found. Approving with a small set of P2 hardening suggestions worth picking up either inline or as a follow-up.

1. Verification

Check Result Evidence
pull_request_target model documented; PR code never executed reusable-check-sprint.yml:27-39 (security comment), no actions/checkout step anywhere
permissions: {} zeroes the inherited GITHUB_TOKEN reusable-check-sprint.yml:84
Secret declared required: true on the reusable boundary reusable-check-sprint.yml:75-78
All API calls go through PROJECT_TOKEN; no shell interpolation of PR-controlled fields reusable-check-sprint.yml:91-99 (env-var passthrough) + <<'PYTHON_SCRIPT' single-quoted heredoc at :101
Sprint window is half-open [start, start+duration) reusable-check-sprint.yml:185-188 (if start <= today < end)
today and committed_date both normalised to Asia/Shanghai reusable-check-sprint.yml:171-173 and 268-270
Retry policy is bounded, only on 5xx / network errors, no retry on 4xx or GraphQL errors reusable-check-sprint.yml:118-160 (MAX_ATTEMPTS = 3, exponential 1·2^attempt)
Explicit field is None check before generic scope-related error reusable-check-sprint.yml:178-181
Distinct error messages for "Sprint not set" vs "Sprint mismatch" reusable-check-sprint.yml:294-300 vs 307-313
Stale-check residual risk documented up front reusable-check-sprint.yml:3-22 (header banner)
30s urllib timeout caps each attempt reusable-check-sprint.yml:128 (urlopen(req, timeout=30))

2. Findings

P0 — Blockers

None.

P1 — Functional bugs / merge blockers

None.

P2 — Hardening (non-blocking)

P2-1. Project matching by number rather than idreusable-check-sprint.yml:289-292

for item in items:
    if item["project"]["number"] == PROJECT_NUMBER:
        board_item = item
        break

Project numbers are scoped per owner, so within Mininglamp-OSS this is unambiguous in practice today. But projectItems on a PR can return items from any org/user project the PR is attached to. If a PR ever gets added to another owner's project that happens to be numbered 2, the wrong project's Sprint value would be inspected. Defence-in-depth fix: extend the Step-1 query to also select projectV2(number: $projectNumber) { id }, then compare item.project.id == target_project_id here.

P2-2. projectItems(first: 100) is a silent ceilingreusable-check-sprint.yml:225, 253

Raising from 20 → 100 covers any realistic case today, but if the configured project ever falls onto a later page the check incorrectly reports "Sprint not set" with no signal to the developer about what actually happened. Either paginate via pageInfo.hasNextPage / endCursor until the target project is found, or at minimum emit print("::warning::projectItems page cap reached; result may be incomplete") when len(items) == 100 so truncation is visible in the Actions UI.

P2-3. Two near-duplicate GraphQL query strings for the PR fetchreusable-check-sprint.yml:219-251

When ENFORCE_FRESHNESS is true, pr_query is reassigned to a copy that adds commits(last: 1) { commit { committedDate } }. Functionally fine, but two query bodies are easy to drift out of sync on future edits. Always fetch commits(last: 1) (the cost is negligible) and only consume the field when ENFORCE_FRESHNESS is true.

P2-4. enforce-freshness defaults to falsereusable-check-sprint.yml:68-72

The file's own header (lines 3-22) calls out the stale-check rollover risk, and enforce-freshness: true is the documented mitigation. Defaulting to false means callers adopt the workflow without that mitigation by default. Two options:

  • Flip the default to true so callers opt out, not in.
  • Or, leave the default but explicitly note in the usage example which classes of repos should set it (e.g. repos with long-lived branches that can survive a sprint boundary).

P2-5. committedDate is committer-controlledreusable-check-sprint.yml:267-280

committedDate reflects GIT_COMMITTER_DATE, which the PR author can stamp arbitrarily on a rebase (git rebase with --committer-date-is-author-date etc.) or via --date. As a UX guard against accidental cross-sprint merges this is fine; it should not be described as an adversarial control. Worth one line in the header to make the threat model explicit. (pushedDate from the REST API is the harder-to-spoof alternative, but for the documented use case committedDate is acceptable.)

P2-6. Hardcoded fieldValueByName(name: "Sprint") is exact-matchreusable-check-sprint.yml:208, 240

If the project field is ever renamed (Sprintsprint / SPRINT / localised), the workflow silently degrades to "Sprint not set" for every PR. Either lift the field name to a workflow_call.inputs parameter, or add a one-line note in the header that the name is case-sensitive and changing it on the board requires updating this workflow.

Nit — MAX_ATTEMPTS = 3 is well-named, retry log message could be clearer

reusable-check-sprint.yml:131-134

print(f"::warning::GraphQL returned HTTP {http_code}, "
      f"retrying in {delay}s...")

Consider adding the attempt number (e.g. attempt 1/3) so flakes are easier to correlate across runs. Minor.

3. Items for human verification (security_sensitive)

  • PROJECT_TOKEN provisioning — confirm the PAT is a service-account token (not a personal token tied to a real maintainer) with exactly read:project + read:org (and read:repo for private business repos, as the header notes). The workflow can't enforce least-privilege at the secret level; that's an org-settings audit.
  • Branch-protection wiring on each consuming repo — the documented status-check name is check-sprint / check-sprint. This only matches when the caller's job is also named check-sprint. Worth one quick pass over the first couple of business-repo callers to confirm the literal string lines up before the gate is enforced in branch protection. Also ensure "Require branches to be up to date before merging" is set — it's the documented mitigation for the rollover stale-check window.
  • Octo Board iteration coverage — verify the Octo Board has iterations defined contiguously through at least the end of the current quarter, so "No active sprint covers today" is not the first signal that someone forgot to roll iterations forward.
  • Caller-side trigger types — the in-file usage example documents pull_request_target: [opened, synchronize, reopened]. Callers that adopt a different trigger set (e.g. only opened) will not re-evaluate after a developer sets Sprint and pushes a new commit; recommend pinning the trigger types in the rollout runbook.

4. Additional observations

  • The workflow's own CI (actionlint, No tabs in workflow files) is green at this SHA.
  • File placement (.github/workflows/reusable-check-sprint.yml) matches the existing reusable-*.yml convention in this repo.
  • ubuntu-latest runners ship Python ≥3.9, so zoneinfo.ZoneInfo("Asia/Shanghai") is guaranteed to resolve without tzdata fallback.
  • The earlier review round's subprocess / curl cmdline secret-exposure concern is fully resolved — urllib.request with header dict keeps PROJECT_TOKEN inside the Python process.

Verdict

APPROVED. Security model is right (no checkout, permissions: {}, scoped PAT, env-var-isolated single-quoted heredoc), Sprint logic and freshness guard are correct, retry policy is bounded and discriminating, and all prior round concerns are visibly resolved. The remaining items are P2 polish / defense-in-depth — none of them justify holding this PR.

Copy link
Copy Markdown

@Jerry-Xin Jerry-Xin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The workflow is relevant to this .github repo, but one project-selection bug should be fixed before merge.

🔴 Blocking

  • 🔴 Critical: .github/workflows/reusable-check-sprint.yml fetches each PR project item with only project { number }, then .github/workflows/reusable-check-sprint.yml selects the first item whose project number equals PROJECT_NUMBER. GitHub ProjectV2 numbers are scoped to the project owner, not globally unique, so a PR that is also in a user/repo/org project with the same number can cause this workflow to validate the wrong project item. That can incorrectly fail protected merges even when the PR is correctly configured on the Octo Board. Fix by querying the configured org project id in the first GraphQL query and comparing item["project"]["id"] against that ID.

💬 Non-blocking

✅ Highlights

  • Good security posture: permissions: {} and no checkout of PR code.
  • The error messages distinguish missing Sprint from mismatched Sprint and are actionable.
  • Retrying transient GraphQL/network failures is a useful robustness improvement.

CHANGE 1: lightweight project-ID matching (Jerry-Xin blocking resolved)
  - Step 1 fetches projectV2 { id }; Step 2 matches by project.id
  - No pagination loops; first:100 covers all realistic cases
  - Warning emitted if 100-item cap reached without finding the project

CHANGE 2: merge duplicate PR query strings (yujiawei P2-3)
  - Single query always fetches commits(last:1); only consumed when
    ENFORCE_FRESHNESS=true; eliminates future drift between two branches

CHANGE 3: PAT scope docs classic vs fine-grained (Jerry-Xin 🟡)
  - Header and secret description now document both classic and
    fine-grained PAT scope requirements, plus private repo addendum

CHANGE 4: tighten enforce-freshness header note (Jerry-Xin 🟡)
  - Explicitly states guard only runs on re-trigger, does NOT
    retroactively invalidate stale green checks from previous SHAs

CHANGE 5: committedDate is committer-controlled (yujiawei P2-5)
  - Added code comment: UX guard, not adversarial control; pushedDate
    is the spoofing-resistant alternative

CHANGE 6: fieldValueByName case-sensitivity (yujiawei P2-6)
  - Added comment warning field name is exact-match, case-sensitive

CHANGE 7: enforce-freshness usage guidance (yujiawei P2-4)
  - Usage example includes comment about when to set enforce-freshness
    to true (long-lived branches that may survive sprint rollover)

CHANGE 8: attempt number in all retry/failure messages (yujiawei Nit)
  - Retry warnings: '(attempt N/3)'; final errors: '(3/3)'
  - 403 error message now covers both classic and fine-grained PATs

CHANGE 9: replace ASCII box with plain comments (yujiawei Nit)
@lml2468
Copy link
Copy Markdown
Contributor Author

lml2468 commented May 23, 2026

Final comprehensive fix pushed (bf87470). All 9 review items addressed at once:

CHANGE 1 — Jerry-Xin 🔴 BLOCKING resolved: lightweight project-ID matching
Step 1 now fetches projectV2 { id }. Step 2 matches by item["project"]["id"] == project_id. No pagination loops — first: 100 covers all realistic cases. If the 100-item cap is ever hit without finding the project, a ::warning:: is emitted rather than silently failing.

Design note: Mininglamp-OSS has exactly one project board (#2) by design. The ID-based match is the technically correct fix that closes the theoretical ambiguity without adding pagination complexity.

CHANGE 2 — yujiawei P2-3: single PR query string
Always fetches commits(last: 1), only consumes it when ENFORCE_FRESHNESS=true. No more near-duplicate query bodies.

CHANGE 3 — Jerry-Xin 🟡: PAT scope docs (classic vs fine-grained)
Header and secret description now document both classic PAT (read:project + read:org) and fine-grained PAT ("Projects: Read" + "Organization custom roles: Read") requirements, plus private repo addendum.

CHANGE 4 — Jerry-Xin 🟡: tighten enforce-freshness note
Header explicitly states the guard only runs on re-trigger and does NOT retroactively invalidate stale green checks from previous SHAs.

CHANGE 5 — yujiawei P2-5: committedDate caveat
Code comment: UX guard, not adversarial control; GIT_COMMITTER_DATE is author-controlled; pushedDate is the spoofing-resistant alternative.

CHANGE 6 — yujiawei P2-6: fieldValueByName case-sensitivity
Comment warns the field name is exact-match; renaming on the board requires updating the query.

CHANGE 7 — yujiawei P2-4: enforce-freshness usage guidance
Usage example includes comment about when to set enforce-freshness: true.

CHANGE 8 — yujiawei Nit: attempt numbers in all messages
Retry warnings show (attempt N/3); final errors show (3/3). 403 error message covers both PAT types.

CHANGE 9 — yujiawei Nit: ASCII box → plain comments
Header is now plain # paragraphs.

Copy link
Copy Markdown

@Jerry-Xin Jerry-Xin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR is relevant to this repository, but the workflow does not reliably enforce the stated merge policy.

🔴 Blocking

🔴 Critical — .github/workflows/reusable-check-sprint.yml:41-58, .github/workflows/reusable-check-sprint.yml:379-397: the check only re-evaluates on the caller’s pull_request_target events, so the required status can become stale whenever the Project Sprint field changes after a green run. A PR can pass with Sprint = current, then have Sprint removed or changed to a different iteration, and branch protection will still allow merge because no new commit/check is triggered by GitHub Projects field changes. This is a time-of-check/time-of-use bypass of the PR’s stated guarantee that every PR merged to main must belong to the active sprint. The existing rollover note covers date changes, but not board-field mutation, which is the primary state being enforced here. This needs either a design that re-checks closer to merge time or explicit documentation/configuration that this is a best-effort check, not a merge-time enforcement control.

💬 Non-blocking

🟡 Warning — .github/workflows/reusable-check-sprint.yml:302-375: projectItems(first: 100) avoids most practical truncation, but it still has no pagination. If a PR is attached to more than 100 projects and the Octo Board item is not in the first page, the workflow fails as “Sprint not set.” That is fail-closed, so it is not a bypass, but pagination would make the failure mode cleaner.

✅ Highlights

The workflow avoids checking out PR code under pull_request_target, uses a dedicated PAT instead of broadening GITHUB_TOKEN, parameterizes GraphQL variables, handles missing Sprint field distinctly, and provides actionable Type A / Type B error messages. The embedded Python compiles, and the PR’s workflow sanity checks passed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants