[CI] Add parity auto-trigger workflow #3231

Open
ethanwee1 wants to merge 1 commit into develop from ethanwee/parity-auto-every-commit

@ethanwee1 ethanwee1 commented May 18, 2026

Summary

  • Add a scheduled parity auto-trigger that scans completed trunk.yml push runs on pytorch/pytorch main and dispatches parity.yml once per ready upstream SHA.
  • Gate dispatch on the ROCm arch workflows that actually ran for a SHA, plus the CUDA jobs consumed by parity, so partial reports are avoided.
  • Add a pull_request dry-run path with a smaller scan window to validate the scanner without creating parity reports from PR CI.

How it works

  • The workflow runs every 10 minutes and queries recent completed pytorch/pytorch trunk.yml push runs on main. Those trunk runs provide the candidate upstream SHAs to evaluate.
  • For each candidate SHA, it first checks recent ROCm/pytorch parity.yml run titles. If any existing parity run already contains that SHA, the SHA is skipped so we keep one report per upstream commit.
  • Each run dispatches parity.yml at most 50 times, which is comfortably above the maximum number of commits that land on the pytorch/pytorch main branch in any 10-minute interval.
  • It then lists all upstream workflow runs for that SHA and determines which ROCm arches actually ran. Missing periodic arch workflows are not treated as pending work; only arches with matching workflow files are expected in that report.
  • For the arches that did run, it lists upstream check-runs and waits for the matching ROCm test shards to reach status=completed. It also waits for the CUDA default, distributed, and inductor check-runs consumed by parity.
  • Auxiliary shards such as mem_leak_check and rerun_disabled_tests are ignored because the parity report does not consume them.
  • Once all relevant ROCm and CUDA check-runs are complete, it dispatches parity.yml with the ready arch list and a CSV prefix containing the upstream SHA, for example autoparity-YYYYMMDD-<sha>.
  • Pull request runs are forced to dry_run=true, so they exercise the scanner and log would-be dispatches without creating reports. Scheduled and manually dispatched runs can create real parity reports.
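The readiness gate described in the steps above can be sketched in a few lines of bash. This is a minimal illustration, not the workflow's actual code: the function name `arch_ready`, the tab-separated `status<TAB>name` input format, and the sample check-run names are all assumptions made for the example. The mi355 regex is copied from the workflow's default map. The sketch also assumes that an arch with zero matching check-runs is not ready, which the PR description does not state.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the per-arch readiness gate (not the PR's code).

# Auxiliary shards that the parity report does not consume.
IGNORED_RE='mem_leak_check|rerun_disabled_tests'

# arch_ready REGEX
# Reads one check-run per line on stdin as "status<TAB>name".
# Succeeds only if at least one relevant check-run matches REGEX
# (an assumption of this sketch) and every relevant match is completed.
arch_ready() {
  local regex="$1" status name matched=0
  while IFS=$'\t' read -r status name; do
    [[ "$name" =~ $IGNORED_RE ]] && continue    # auxiliary shard, ignore
    [[ "$name" =~ $regex ]] || continue         # not a shard of this arch
    matched=1
    [[ "$status" == "completed" ]] || return 1  # still queued or running
  done
  (( matched == 1 ))
}

# Regex taken from the workflow's default map for mi355.
MI355_RE='rocm.*mi355.*/ test [(](default|distributed|inductor),'

# Illustrative check-run names; real upstream names will differ.
checks=$'completed\trocm-mi355 / test (default, 1, 2, runner)\nin_progress\trocm-mi355 / test (default, 2, 2, runner, mem_leak_check)'

if arch_ready "$MI355_RE" <<<"$checks"; then
  echo "mi355: ready"
else
  echo "mi355: pending"
fi
```

In the sample input the only unfinished check-run is a mem_leak_check shard, so it does not hold the arch back. A real scanner would build the `status<TAB>name` lines from the upstream check-runs listing for the SHA.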

Test plan

Dispatch cadence note

  • The full validation run used max_dispatches=5 only to avoid flooding ROCm/pytorch during manual testing.
  • The production scheduled workflow runs every 10 minutes and defaults to max_dispatches=50, max_commits=200, and max_age_hours=72 unless manually overridden.
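As a rough illustration of how a max_dispatches cap interacts with the one-report-per-SHA rule, the sketch below walks a list of candidate SHAs, skips those already dispatched, and stops once the cap is reached. The function name `plan_dispatches` and its argument layout are hypothetical, and the sketch only prints what it would dispatch, in the spirit of the dry-run path.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a dispatch cap (not the PR's code).

# plan_dispatches MAX_DISPATCHES "SHA1 SHA2 ..."
# Reads candidate SHAs on stdin, one per line, and prints the SHAs that
# would be dispatched: already-dispatched SHAs (second argument, space
# separated) are skipped, and at most MAX_DISPATCHES SHAs are printed.
plan_dispatches() {
  local max="$1" already="$2" sha count=0
  while read -r sha; do
    [[ -n "$sha" ]] || continue                   # skip blank lines
    (( count < max )) || break                    # cap reached, stop
    [[ " $already " == *" $sha "* ]] && continue  # report already exists
    echo "$sha"
    count=$((count + 1))
  done
}

# Example with short placeholder SHAs and a cap of 2.
printf '%s\n' a1 b2 c3 d4 | plan_dispatches 2 "b2"
```

Skipped SHAs do not count against the cap in this sketch; whether the real workflow counts them is not stated in the PR description.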

Add a scheduled scanner that dispatches one parity report per ready upstream PyTorch main commit, with PR dry-runs to validate readiness without creating reports.

rocm-repo-management-api Bot commented May 18, 2026

Jenkins build for f4dfbd8845f2d05dd28225ca78af48c1926d9e31 commit finished as FAILURE
Links: Pipeline Overview / Build artifacts / Test Results

Copilot AI left a comment

Pull request overview

Adds a new GitHub Actions workflow to automatically scan recent completed trunk.yml push runs on main, determine when the relevant ROCm + CUDA check-runs for a given upstream SHA are fully complete, and then dispatch parity.yml (with a PR-only dry-run mode).

Changes:

  • Introduces a scheduled (every 10 minutes) + manual + PR dry-run “parity auto-trigger” workflow.
  • Implements SHA deduplication by checking existing parity.yml run titles in the current repo.
  • Gates dispatch on completion of ROCm arch shard check-runs (only for arch workflows detected as having run) plus specific CUDA check-runs used by parity.
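The title-based deduplication works because each dispatched run carries the upstream SHA in its CSV prefix, which then appears in the run title. A small sketch of that relationship follows, under the assumption that the full SHA is embedded in the prefix. The helper names `csv_prefix` and `title_has_sha`, the date, and the SHA shown are placeholders, not values from this PR.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of SHA deduplication by run title (not the PR's code).

# csv_prefix YYYYMMDD SHA  ->  autoparity-YYYYMMDD-<sha>
csv_prefix() {
  printf 'autoparity-%s-%s\n' "$1" "$2"
}

# title_has_sha TITLES SHA
# TITLES is a newline-separated list of existing parity run titles.
# Succeeds if any title contains SHA as a substring.
title_has_sha() {
  local titles="$1" sha="$2" title
  while IFS= read -r title; do
    [[ "$title" == *"$sha"* ]] && return 0
  done <<<"$titles"
  return 1
}

# Placeholder SHA and date, for illustration only.
sha="1111111111111111111111111111111111111111"
existing="$(csv_prefix 20260518 "$sha") · mi355, mi300, mi200"

if title_has_sha "$existing" "$sha"; then
  echo "skip $sha: parity run already exists"
fi
```

Because the check is a plain substring match, it relies on run titles always containing the same form of the SHA that the scanner looks up.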

Comment on lines +130 to +140
# Note: the GitHub REST API caps per_page at 100, so MAX_COMMITS values
# above 100 need pagination to return more than 100 runs.
COMMITS=$(gh api \
  "repos/$UPSTREAM/actions/workflows/trunk.yml/runs?branch=$BRANCH&event=push&status=completed&per_page=$MAX_COMMITS" \
  --jq '
    reduce .workflow_runs[] as $run ({seen:{}, rows:[]};
      if .seen[$run.head_sha] then .
      else .seen[$run.head_sha] = true | .rows += [$run]
      end
    )
    | .rows[]
    | "\(.head_sha) \(.created_at)"
  ')
Comment on lines +55 to +63
  description: 'JSON: arch -> PCRE regex that matches the check-run names of that arch''s ROCm test shards on pytorch/pytorch. An arch is considered "ready" only when every check-run whose name matches has status=completed (so we wait for all test shards, not just workflow completion).'
  required: false
  default: '{"mi355":"rocm.*mi355.*/ test [(](default|distributed|inductor),","mi300":"rocm.*mi300.*/ test [(](default|distributed|inductor),","mi200":"(rocm.*(mi200|mi210).*/ test [(](default|distributed|inductor),|linux-jammy-rocm-py3[.]10 / test [(](default|distributed|inductor),)","navi31":"rocm.*navi31.*/ test [(]default,","nightly":"rocm-nightly.*/ test [(](default|distributed|inductor),"}'
  type: string
arch_workflow_regex_map:
  description: 'JSON: arch -> PCRE regex that matches workflow file paths for upstream ROCm workflows that mean this arch ran on the SHA. Missing workflows mean the arch is not expected for that commit.'
  required: false
  default: '{"mi355":"(^|/)(trunk|rocm-mi355|periodic-rocm-mi355|inductor-rocm-mi355)[.]yml$","mi300":"(^|/)(rocm-mi300|periodic-rocm-mi300|inductor-rocm-mi300)[.]yml$","mi200":"(^|/)(trunk-rocm-sandbox|rocm-mi200|periodic-rocm-mi200|inductor-rocm-mi200)[.]yml$","navi31":"(^|/)(rocm-navi31|periodic-rocm-navi31|inductor-rocm-navi31)[.]yml$","nightly":"(^|/)rocm-nightly[.]yml$"}'
  type: string
Comment on lines +148 to +162
# Pull recent parity runs. Run titles look like:
#   "<csv_name or SHA> · mi355, mi300, mi200"
# Once any parity run exists for a SHA, we do not dispatch another
# report for that SHA. This keeps the dashboard to one report per
# upstream commit.
EXISTING=$(gh run list \
  --repo "$GITHUB_REPOSITORY" \
  --workflow parity.yml \
  --limit 1000 \
  --json displayTitle 2>/dev/null || echo '[]')

sha_already_dispatched() {
  local sha="$1"
  echo "$EXISTING" | jq -e --arg sha "$sha" \
    'any(.[]; .displayTitle | contains($sha))' >/dev/null