Skip to content

ci: experiment with parallel Windows shard setup + build#13828

Draft
midleman wants to merge 27 commits into
mainfrom
mi/windows-parallel-setup
Draft

ci: experiment with parallel Windows shard setup + build#13828
midleman wants to merge 27 commits into
mainfrom
mi/windows-parallel-setup

Conversation

@midleman
Copy link
Copy Markdown
Contributor

Summary

Experimental fan-out: drop needs: [setup-windows] on e2e-windows shards so their ~15 min of artifact-independent setup (Python, R, Quarto, Graphviz, secrets) runs in parallel with the build job. Shards poll for the build artifact internally before downloading + extracting.

  • Build runs as before (~45 min)
  • Shards start at workflow trigger and do their setup in parallel
  • Shards wait for build, then download + extract, then run tests

QA Notes

  • Goal: ~15 min off Windows critical path on test-merge
  • Built on top of mi/cache-e2e-test-deps (ci: Windows build-once + Linux PR worker isolation #13615); validate that branch first
  • The test-merge.yml temporary pull_request trigger from the parent branch is still active, so this PR will fire test-merge.yml on push for validation

midleman and others added 27 commits May 27, 2026 08:45
…EMFILE errors

Retries EMFILE (too many open files) errors during gulp vscode-win32-x64 packaging.
Same fix VS Code uses in their build tooling.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
fromLocalWebpack was constructing File objects for all extension files at
once and passing them via es.readArray(), which opened thousands of
ReadStreams simultaneously. On Windows, this hit the OS EMFILE (too many
open files) limit when packaging positron-python's 11,784 python_files.

Switch to createSequentialFileStream (already used by fromLocalNormal)
which opens files one at a time via a pump queue, matching the pattern
that fromLocalNormal already relied on.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The EMFILE fix in build/lib/extensions.ts (sequential file opening via
createSequentialFileStream) validated successfully on Windows CI. The
spike workflow and the original graceful-fs patch are no longer needed.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add test-e2e-windows-build.yml that compiles Positron once and uploads
the full workspace (node_modules + compiled output) as a tar artifact.
Strip the build steps from test-e2e-windows-run.yml so each shard just
downloads and extracts that artifact instead of compiling independently.

Update test-merge.yml (3 shards), test-full-suite.yml (4 shards), and
test-pull-request.yml to call the build workflow first, then pass the
artifact to the run shards.

Also fix a pre-existing bug: test-merge.yml's slack-notify-win-extensions
was checking needs.e2e-windows-electron.outputs.extensions_failed but
the job is named e2e-windows.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Each Playwright worker now gets its own Xvfb display (:10 + parallelIndex)
so workers don't race on a shared X server. The setup-xvfb action accepts
a workers input and starts that many displays at :10..:(10+N-1).

Linux PR e2e-electron workers bumped from 2 to 3, expected to drop the
job from ~31m to ~25m on cold cache / ~19m on warm cache.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
3 workers saved ~2m off Linux PR e2e. Trying 4 to see if there's another
1-2m available before scaling hits a ceiling.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The spike confirmed zstd is available on windows-latest-8x runners
(191ms compress vs gzip's 1228ms, 320ms decompress vs 416ms). The
production change landed in test-e2e-windows-build.yml; this validator
is no longer needed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Fail fast if WORKERS is not a positive integer so an invalid value can't
silently start zero Xvfb displays. Also verify zstd is on PATH before
extracting the .tar.zst Windows build artifact so a missing tool fails
loudly instead of with a cryptic decompression error.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
3 and 4 workers both surface monaco-workbench readiness timeouts on
ubuntu-latest-8x (4 Electrons cold-starting at once exceeds the 30s
checkPositronReady window). The savings (~2m vs 2 workers) is too small
to justify the added flake risk.

Keeping the per-worker Xvfb infrastructure so future runners with more
cores can dial workers up without code changes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… to 4)

Linux electron merge now runs 3 shards (was 2). Windows merge now runs
4 shards (was 3), benefiting from the build-once pattern shipped in
this branch where setup-windows produces one artifact that all shards
consume. Adjust max_failures to keep the per-shard ratio comparable
(20 total / N shards rounded up).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
REVERT THIS COMMIT BEFORE MERGING.

- Add pull_request trigger to test-merge.yml, gated on changes to that file
- Comment out slack-notify and slack-notify-win-extensions so the PR run
  doesn't spam #positron-rstats-test-results

Goal: prove the new shard counts (Linux 3, Windows 4) actually run as
expected before main starts using them.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…setup

Two Windows shard wins, both targeting the 8-15min of setup overhead
before tests even start:

1. Add Defender exclusions for the workspace and temp paths before
   extracting the build artifact. Defender's real-time scanning of the
   ~50k small files in node_modules trees is the dominant cost of the
   extract step (currently ~8 min). Excluding these paths typically
   cuts extract time 3-5x on Windows GitHub-hosted runners.

2. Drop the 'Set up Quarto' action. Positron bundles its own Quarto
   1.9.38 into the build artifact (build/lib/quarto.ts) and uses it
   at runtime. The CI-installed Quarto was never used by Positron or
   exercised by e2e tests, which interact through the UI rather than
   shelling out to the quarto CLI. Saves ~2 min per shard.

Both changes are reversible — Defender exclusions only apply to the
ephemeral runner, and Quarto can be re-added if tests turn out to
need a system Quarto we haven't found yet.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…hromium

Push each suite to 6 parallel shards on test-merge to see if we hit
new bottlenecks (build-once artifact, predictive sharding distribution,
or runner availability) before settling on a permanent ceiling.

max_failures: 4 across all suites (20 / 6 = 3.33, rounded up).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Apply the same Defender path exclusion to setup-windows/build-windows
as we added to the e2e shards. This covers npm install (100k+ files,
currently ~22 min), cache restore, compile (writes thousands of .js),
and Playwright browser install (~600 MB). Realistic upside is 5-10
minutes off the 43 min build job.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
6 chromium shards was a copy of the Linux/Windows bump but chromium
has a smaller test footprint that doesn't justify the extra shard.
5 keeps it under the artifact-contention concern raised by the
multi-shard download throttling.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The earlier Defender exclusion only covered github.workspace
(D:\a\positron\positron) but the extract step writes to dirname() of
that — D:\a\positron — which Defender was still scanning. That's why
extract only dropped from 8m to 6m instead of the projected 2-3m.

Now also exclude:
- Split-Path -Parent of github.workspace (actual extract destination)
- D:\a (GitHub Actions Windows runner workspace root)
- Keep C:\a for safety

Also log Get-MpPreference.ExclusionPath after applying to verify each
path actually got registered.

Applies to both build-windows and the e2e shard run.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Previous commit (c02396e) removed this on the assumption that Positron's
bundled Quarto 1.9.38 was sufficient for e2e tests. That was wrong:
the e2e tests under test/e2e/tests/quarto/ failed without a system
Quarto installed. Restoring the action as-is.

The cost is real (~2m/shard × 6 = ~12 runner-min/merge) but tests pass
is non-negotiable. Worth revisiting later as a separate effort, possibly
via cache or runner image, after we have more data on what specifically
needs system Quarto vs the bundled binary.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
EXPERIMENT branched from mi/cache-e2e-test-deps.

Today the e2e-windows shards wait for setup-windows to complete before
doing anything, then spend ~15 min on Python/R/Quarto/Graphviz setup
that doesn't actually need the build artifact. That 15 min sits idle on
the critical path.

This change overlaps that 15 min of setup with the ~45 min build:

1. Drop `needs: [setup-windows]` on e2e-windows in test-merge.yml so
   shards start at workflow trigger time, in parallel with build.

2. Move the artifact-dependent steps (download, Defender exclusion,
   extract) to AFTER all the artifact-independent setup. Insert a new
   polling step that waits for the build job to complete before
   downloading.

Expected wallclock saving: ~15 min off the Windows critical path on
test-merge runs, since shard setup now happens during the build window
instead of after it.

Caveats:
- Build job failure: the polling step detects this and fails early
- More runner-min consumed (shard waits idle if its setup finishes
  before the build does) vs the current model where the shard hadn't
  even started yet
- Brittle to upstream job name changes (currently matches "build-windows")

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant