perf: reduce e2e test suite time by ~20% #3139

Draft
benjaminleonard wants to merge 2 commits into main from benjaminleonard/e2e-perf-analysis

benjaminleonard (Contributor) commented Mar 19, 2026

Reduce e2e test suite wall time by ~20% locally and by roughly 3x on CI through five targeted optimizations:

  1. Shard e2e tests 3-way per browser: Playwright distributes tests across machines with --shard N/M. This reduces wall time per browser by ~66%, since parallelism now comes from CI jobs in addition to cores.
  2. Add FAST_MOCK env var: skip or reduce artificial API delays in mock handlers. Global delay: 50-150ms (was 200-400ms). Disk import/stop (1000ms, was 2000ms) and metrics queries (400ms, was 1000ms) stay higher for tests that observe transient UI states.
  3. Reduce closeToast sleep: 1000ms → 500ms. Saves ~12s across the test suite.
  4. Lower expect timeout: 10s → 7s for faster failure detection.
  5. Reduce scroll-restore sleeps: 1000ms → 500ms per test.
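
To make item 2 concrete, here is a minimal sketch of how a mock handler might pick its delay from a `FAST_MOCK` flag. Only the flag name and the millisecond ranges come from this description; `delayRange`, `pickDelayMs`, `mockDelay`, and the rule that any non-empty value enables fast mode are assumptions for illustration, not the PR's actual code.

```typescript
// Hypothetical sketch: helper names are illustrative, not taken from the PR.
type DelayRange = { minMs: number; maxMs: number }

// Global request delay ranges quoted in the PR description.
const NORMAL_DELAY: DelayRange = { minMs: 200, maxMs: 400 }
const FAST_DELAY: DelayRange = { minMs: 50, maxMs: 150 }

/** Choose the delay range from an environment-like object. */
function delayRange(env: Record<string, string | undefined>): DelayRange {
  // Assumption: any non-empty FAST_MOCK value turns fast mode on.
  return env.FAST_MOCK ? FAST_DELAY : NORMAL_DELAY
}

/** Pick an integer delay within the range, inclusive of both ends. */
function pickDelayMs(range: DelayRange, random: () => number = Math.random): number {
  return Math.floor(range.minMs + random() * (range.maxMs - range.minMs + 1))
}

/** Wait for a randomized delay before a mock handler responds. */
async function mockDelay(env: Record<string, string | undefined>): Promise<void> {
  const ms = pickDelayMs(delayRange(env))
  await new Promise<void>((resolve) => setTimeout(resolve, ms))
}
```

Passing the environment in as an argument (for example `mockDelay(process.env)` under Node) keeps the helper easy to test without touching global state.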

Local wall-clock: 2:13 → 1:48 (19% faster). All 267 core tests pass. The pre-existing flaky tests (pagination, action-menu) are still flaky under full contention.
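
As a quick check of the 19% figure, the arithmetic can be reproduced from the two wall-clock times above. The helper name `toSeconds` is just for illustration.

```typescript
/** Convert an "m:ss" wall-clock string to seconds. */
function toSeconds(clock: string): number {
  const parts = clock.split(':').map(Number)
  return (parts[0] ?? 0) * 60 + (parts[1] ?? 0)
}

const beforeSeconds = toSeconds('2:13') // 133 s
const afterSeconds = toSeconds('1:48') // 108 s

// Relative reduction in wall time, as a percentage of the original.
const reductionPct = ((beforeSeconds - afterSeconds) / beforeSeconds) * 100
console.log(reductionPct.toFixed(1)) // prints "18.8", which rounds to 19%
```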


Running on GitHub to verify that (a) it passes and (b) the improvement is meaningful and does not add flake.

Implement 5 optimizations to cut e2e test suite time significantly:

1. Shard e2e tests 3-way in CI (3 shards × 3 browsers = 9 jobs instead of 3).
   Playwright distributes tests across machines with --shard N/M, reducing
   wall time by ~66% per browser since parallelism is now across CI jobs.

2. Add FAST_MOCK env var to skip/reduce artificial API delays in mock handlers.
   Global request delay: 50-150ms (was 200-400ms).
   Disk import/stop: 1000ms (was 2000ms) — kept higher for transient state tests.
   Metrics queries: 400ms (was 1000ms) — kept for loading indicator visibility.

3. Reduce closeToast sleep from 1000ms → 500ms (saves ~12s across test suite).

4. Lower expect timeout from 10s → 7s for faster failure detection.

5. Reduce scroll-restore test sleeps from 1000ms → 500ms (saves 2s per run).
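
The job count in item 1 can be sketched as a small matrix expansion. This is illustrative only: the `firefox` and `safari` browser names and the `shardJobs` helper are assumptions, while the `N/M` shard argument and the "Playwright (chrome, shard 1/3)" name format follow this PR's text.

```typescript
// Assumed browser list: only "chrome" is named in the PR itself.
const browsers = ['chrome', 'firefox', 'safari'] as const
const SHARD_COUNT = 3

type ShardJob = { browser: string; shardArg: string; name: string }

/** Expand browsers × shards into one CI job per combination. */
function shardJobs(browserList: readonly string[], shardCount: number): ShardJob[] {
  const result: ShardJob[] = []
  for (const browser of browserList) {
    // Playwright shard indices are 1-based: 1/3, 2/3, 3/3.
    for (let index = 1; index <= shardCount; index++) {
      const shardArg = `${index}/${shardCount}` // value passed to --shard
      result.push({ browser, shardArg, name: `Playwright (${browser}, shard ${shardArg})` })
    }
  }
  return result
}

const shardJobList = shardJobs(browsers, SHARD_COUNT)
console.log(shardJobList.length) // prints 9
```

Each job runs only its own slice of the suite, so per-browser wall time is bounded by the slowest shard rather than the whole suite.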

Local wall-clock: 2:13 → 1:48 (19% faster). CI should see similar reductions
plus 3x gain from sharding, totaling ~3x wall-time reduction per browser.

All 267 core e2e tests pass; some pre-existing flaky tests remain flaky under
full contention (pagination, action-menu) but pass in isolation—unrelated.

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
vercel bot commented Mar 19, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| console | Ready | Preview | Mar 19, 2026 11:12am |

The sharded matrix jobs have dynamic names like "Playwright (chrome,
shard 1/3)" which don't match the old required check name. This adds a
single "Playwright" job that aggregates all shard results so branch
protection can point at one stable name.
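
The aggregation rule can be sketched as a pure function. This is not the workflow code from the commit; `aggregate` is a hypothetical name, the result strings mirror the job results GitHub Actions reports, and treating skipped or cancelled shards as failures is an assumption.

```typescript
// Job result values as reported by GitHub Actions for a needed job.
type ShardResult = 'success' | 'failure' | 'cancelled' | 'skipped'

/** Collapse all shard results into one status for the stable "Playwright" check. */
function aggregate(results: readonly ShardResult[]): 'success' | 'failure' {
  // An empty list counts as failure so a misconfigured matrix cannot pass silently.
  if (results.length === 0) return 'failure'
  // Assumption: anything other than success (including skipped) fails the check.
  return results.every((result) => result === 'success') ? 'success' : 'failure'
}

console.log(aggregate(['success', 'success', 'success'])) // prints "success"
console.log(aggregate(['success', 'cancelled', 'success'])) // prints "failure"
```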

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
david-crespo (Collaborator) commented

Successful in 2 minutes is certainly appealing, though 7 minutes was not so bad. It looks like at least one of the sleeps was reduced too much. That's my one real concern: making things flakier. I'll try fixing that.
