Optimize CodeQL scanning with sharding and job separation #1367
Open
5an7y-Microsoft wants to merge 6 commits into microsoft:main from
Split samples alphabetically into 4 equal shards using ListAllSamples.ps1 and run each on a separate machine in parallel. ThrottleLimit stays 1 per shard for accurate CodeQL tracing. Each shard uploads SARIF with a distinct category (shard-0..shard-3) so results merge in the Security tab. On PRs the existing Build-ChangedSamples behavior is preserved (no sharding needed since only changed files are built).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
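The alphabetical split described in this commit can be sketched in Python as follows. This is only an illustration of the idea, not the logic of ListAllSamples.ps1: the function names `split_into_shards` and `sarif_category`, the case-insensitive sort, and the rule that leftover samples go to the earliest shards are all assumptions.

```python
def split_into_shards(samples, shard_count=4):
    """Sort sample names and cut them into contiguous, near-equal shards.

    Hypothetical helper: the real split lives in the PowerShell scripts.
    """
    ordered = sorted(samples, key=str.lower)
    base, extra = divmod(len(ordered), shard_count)
    shards = []
    start = 0
    for index in range(shard_count):
        # The first `extra` shards take one additional sample each.
        size = base + (1 if index < extra else 0)
        shards.append(ordered[start:start + size])
        start += size
    return shards


def sarif_category(shard_index):
    """Build the per-shard SARIF category name (shard-0..shard-3)."""
    return f"shard-{shard_index}"


if __name__ == "__main__":
    demo = ["usb", "audio", "network", "print", "video", "storage"]
    for i, shard in enumerate(split_into_shards(demo)):
        print(sarif_category(i), shard)
```

Because the shards are contiguous slices of one sorted list, every sample lands in exactly one shard, which is what lets the per-shard SARIF uploads add up to a complete result set.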
- Job name and step name now show '1 of 4' through '4 of 4' (shard+1) instead of '0' through '3 of 4', which looked like an off-by-one bug
- Move build-mode back into the matrix (was there before sharding) so it remains explicit and consistent with the original workflow structure

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
GitHub Actions expressions don't support arithmetic operators, so 'matrix.shard + 1' was invalid. Switch the matrix to [1, 2, 3, 4] so the job name, step name, and log output all read '1 of 4' through '4 of 4'. The PowerShell script subtracts 1 locally for the 0-based slice math.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
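The 1-based matrix value and the 0-based slice math can be sketched in Python like this. The PR does the equivalent in PowerShell; the names `shard_label` and `shard_slice` and the exact slice boundaries here are assumptions made for illustration.

```python
def shard_label(shard_number, total=4):
    """Human-readable label built from the 1-based matrix value."""
    return f"{shard_number} of {total}"


def shard_slice(ordered_samples, shard_number, total=4):
    """Return the samples for one shard, given a 1-based shard number.

    Hypothetical helper: subtracts 1 locally, as the commit describes,
    then computes a contiguous slice of the already-sorted list.
    """
    if not 1 <= shard_number <= total:
        raise ValueError(f"shard_number must be in 1..{total}")
    index = shard_number - 1  # convert to 0-based for the slice math
    base, extra = divmod(len(ordered_samples), total)
    # Shards before this one contribute `base` items each, plus one
    # extra item for every earlier shard that received a leftover.
    start = index * base + min(index, extra)
    size = base + (1 if index < extra else 0)
    return ordered_samples[start:start + size]


if __name__ == "__main__":
    ordered = list("abcdefghij")
    for number in range(1, 5):
        print(shard_label(number), shard_slice(ordered, number))
```

Keeping the subtraction inside the script means the workflow expressions only ever see the values 1 through 4, so no arithmetic is needed on the Actions side.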
The previous single-job approach with a 4-shard matrix caused PRs to spin up 4 identical runners all building the same changed files. Split into two jobs:
- analyze-pr: runs only on pull_request, single runner, builds changed samples via Build-ChangedSamples.ps1. No shard matrix overhead.
- analyze: runs on push/schedule, 4-shard matrix, each shard builds its slice of all samples.

Also drops the now-unnecessary 'get-changed-files' step from the push path.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
ci: split CodeQL into separate PR and push/schedule jobs
This pull request updates the CodeQL Analysis GitHub Actions workflow to improve efficiency and reduce analysis time, especially for large pushes. The workflow now distinguishes between pull request and push/scheduled events: PRs analyze only changed samples in a single job, while pushes and scheduled runs use sharding to build and analyze all samples in parallel across four jobs. Several steps and dependencies have also been updated.

Workflow structure improvements:
- The workflow is split into two jobs: analyze-pr (for pull requests, builds only changed samples in a single job) and analyze (for pushes/schedules, splits all samples across 4 parallel shards for faster analysis).

Performance and efficiency:
- Each shard builds with ThrottleLimit 1 to maintain accurate CodeQL tracing.

Dependency and version updates:
- github/codeql-action/init and github/codeql-action/analyze now use version v4 instead of v3 for improved features and support.

Documentation: