Add GPU E2E stage to Linux VHD builder pipeline by ganeshkumarashok · Pull Request #8138 · Azure/AgentBaker

ganeshkumarashok · 2026-03-19T23:26:05Z

Summary

Adds a gpu_e2e stage to the Linux VHD builder pipeline (.vsts-vhd-builder.yaml) that runs GPU E2E tests after the VHD build completes.
With standalone GPU E2E pipeline triggers disabled in #8135 (fix: disable automatic e2e pipeline triggers), this ensures GPU E2E tests continue to run as part of the VHD build flow.
The new stage runs in parallel with the existing e2e and scriptless_cse_cmd_e2e stages; all three depend on the build stage.

Test plan

Verify the VHD builder pipeline runs the new gpu_e2e stage after build succeeds
Confirm GPU-tagged E2E scenarios execute with the freshly built VHDs
Confirm non-GPU E2E stages are unaffected

With the standalone GPU E2E pipeline triggers disabled in #8135, GPU E2E tests need to run as part of the VHD build pipeline. This adds a gpu_e2e stage that runs after the build stage, matching the configuration from the standalone e2e-gpu.yaml.

github-actions · 2026-03-19T23:26:47Z

PR Title Lint Failed ❌

Current Title: Add GPU E2E stage to Linux VHD builder pipeline

Your PR title doesn't follow the expected format. Please update your PR title to follow one of these patterns:

Conventional Commits Format:

feat: add new feature - for new features
fix: resolve bug in component - for bug fixes
docs: update README - for documentation changes
refactor: improve code structure - for refactoring
test: add unit tests - for test additions
chore: remove dead code - for maintenance tasks
chore(deps): update dependencies - for updating dependencies
ci: update build pipeline - for CI/CD changes

Guidelines:

Use lowercase for the type and description
Keep the description concise but descriptive
Use imperative mood (e.g., "add" not "adds" or "added")
Don't end with a period

Examples:

✅ feat(windows): add secure TLS bootstrapping for Windows nodes
✅ fix: resolve kubelet certificate rotation issue
✅ docs: update installation guide
❌ Added new feature
❌ Fix bug.
❌ Update docs

Please update your PR title and the lint check will run again automatically.

Copilot

Pull request overview

Adds GPU E2E coverage back into the Linux VHD build pipeline by introducing a dedicated gpu_e2e stage that runs after the build stage, in parallel with existing E2E stages—ensuring GPU scenarios continue to execute even with standalone GPU E2E pipeline triggers disabled.

Changes:

Add a new gpu_e2e stage to .pipelines/.vsts-vhd-builder.yaml.
Configure the stage to run GPU-tagged Linux scenarios (TAGS_TO_RUN=gpu=true) with a custom timeout and capacity-skip behavior.
Run the stage in parallel with existing e2e and scriptless_cse_cmd_e2e stages (all depend on build).

Copilot · 2026-03-19T23:29:51Z

.pipelines/.vsts-vhd-builder.yaml

+  - stage: gpu_e2e
+    dependsOn: build
+    condition: and(succeeded(), ne(variables.SKIP_E2E_TESTS, 'true'))
+    variables:
+      VHD_BUILD_ID: $(Build.BuildId)
+      TAGS_TO_RUN: "gpu=true"
+      TAGS_TO_SKIP: "os=windows"
+      SKIP_TESTS_WITH_SKU_CAPACITY_ISSUE: "true"
+      E2E_GO_TEST_TIMEOUT: "75m"
+    jobs:
+      - template: ./templates/e2e-template.yaml
+        parameters:
+          name: Linux GPU Tests
+          IgnoreScenariosWithMissingVhd: true
+


Adding gpu_e2e makes a 3rd parallel stage using templates/e2e-template.yaml. That template publishes an artifact named $(LOGGING_DIR), and .pipelines/scripts/e2e_run.sh sets LOGGING_DIR using date +%s (seconds resolution). Parallel jobs that start within the same second can end up with the same artifact name and cause intermittent pipeline failures when publishing artifacts. Consider making the log/artifact name deterministically unique per job/stage (e.g., include stage/job ID or use higher-resolution time) so concurrent E2E stages can’t collide.
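
One way the collision described above could be avoided is sketched below. This is a minimal sketch, not the contents of e2e_run.sh: the helper name `unique_logging_dir` and the `scenario-logs` prefix are assumptions made for illustration. It relies on Azure Pipelines exposing the predefined variables System.StageName and System.JobId to scripts as the environment variables SYSTEM_STAGENAME and SYSTEM_JOBID.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical helper (not taken from e2e_run.sh): builds a log/artifact
# name that stays unique across parallel stages by combining the stage
# name and job ID with the seconds-resolution timestamp.
unique_logging_dir() {
  # SYSTEM_STAGENAME and SYSTEM_JOBID are set by Azure Pipelines;
  # fall back to placeholders when running outside a pipeline.
  local stage="${SYSTEM_STAGENAME:-local}"
  local job="${SYSTEM_JOBID:-$$}"
  # "scenario-logs" is an assumed prefix, used here for illustration only.
  local raw="scenario-logs-${stage}-${job}-$(date +%s)"
  # Replace every character outside a conservative set with "_" so the
  # result is usable as both a directory name and an artifact name.
  printf '%s' "$raw" | tr -c 'A-Za-z0-9_-' '_'
}

LOGGING_DIR="$(unique_logging_dir)"
echo "LOGGING_DIR=${LOGGING_DIR}"
```

With this shape, two stages that start within the same second still produce different names, because the stage name and job ID differ even when the timestamp does not.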

awesomenix · 2026-03-20T01:23:23Z

.pipelines/.vsts-vhd-builder.yaml

+    condition: and(succeeded(), ne(variables.SKIP_E2E_TESTS, 'true'))
+    variables:
+      VHD_BUILD_ID: $(Build.BuildId)
+      TAGS_TO_RUN: "gpu=true"


Let's pick a few GPU cases to run, tag them with GPU_Basic or something like that, and run them as part of the PR.

Copilot AI review requested due to automatic review settings March 19, 2026 23:26

ganeshkumarashok requested review from AbelHu, Devinwong, awesomenix, calvin197, cameronmeissner, djsly, junjiezhang1997, lilypan26, mxj220, pdamianov-dev, phealy, r2k1, sulixu, surajssd, timmy-wright and zachary-bailey as code owners March 19, 2026 23:26

ganeshkumarashok temporarily deployed to test March 19, 2026 23:26 — with GitHub Actions Inactive

Copilot started reviewing on behalf of ganeshkumarashok March 19, 2026 23:27

Copilot AI reviewed Mar 19, 2026

awesomenix reviewed Mar 20, 2026

ganeshkumarashok wants to merge 1 commit into main from aganeshkumar/gpu-e2e-post-vhd-build
