
cicd: move lint to ubuntu-latest with caching, tune pytest #924

Draft
thomashebrard wants to merge 1 commit into dev from fix/gha


thomashebrard commented May 20, 2026

Why

CI is currently ~€2000/month and 10–13 min per PR. The lint workflow runs on self-hosted Azure runners which are paid-by-the-minute even though ruff / pyright / mypy are single-threaded and don't benefit from the multi-core machines. The mypy step alone is ~2 min per matrix entry because the .mypy_cache is thrown away every run.

Public OSS projects (NumPy, pandas, Pydantic, FastAPI, mypy itself) all do this: lint on ubuntu-latest with .mypy_cache persisted via actions/cache, tests where the cores actually help. We're not doing anything novel here, just applying the standard pattern.

What changed

lint-check.yml — all jobs to ubuntu-latest, with caching

  • runs-on: [aca-runner-lint] → runs-on: ubuntu-latest for all three lint jobs (lint-ruff-plxt, lint-typecheck, lint-config-sync)
  • Adds astral-sh/setup-uv@v3 with enable-cache: true and cache-dependency-glob: uv.lock. uv's ~/.cache/uv is restored from prior runs, keyed on uv.lock. Install: ~2 min → ~8s on warm cache.
  • Adds actions/cache@v4 for .mypy_cache in the typecheck job. Key includes matrix python-version + uv.lock + pyproject.toml. Three-tier restore-keys for graceful fallback when those change. mypy: ~2 min → ~25–30s on incremental runs.
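For reviewers who have not opened the diff, the caching steps in the typecheck job might look roughly like the sketch below. Only the actions, their versions, and the inputs named above come from this PR. The step names, the `mypy-` key prefix, the exact key layout, and the single-entry matrix are assumptions for illustration and may differ from the actual workflow.

```yaml
# Sketch only: key layout, step names, and matrix value are illustrative.
jobs:
  lint-typecheck:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: ["3.12"]  # placeholder; the PR keeps the full 5-version matrix
    steps:
      - uses: actions/checkout@v4

      - name: Set up uv with dependency cache
        uses: astral-sh/setup-uv@v3
        with:
          enable-cache: true
          cache-dependency-glob: uv.lock

      - name: Restore .mypy_cache
        uses: actions/cache@v4
        with:
          path: .mypy_cache
          # Exact key: Python version + lockfile hash + pyproject hash.
          key: mypy-${{ runner.os }}-py${{ matrix.python-version }}-${{ hashFiles('uv.lock') }}-${{ hashFiles('pyproject.toml') }}
          # Three fallback tiers, tried in order as key prefixes.
          restore-keys: |
            mypy-${{ runner.os }}-py${{ matrix.python-version }}-${{ hashFiles('uv.lock') }}-
            mypy-${{ runner.os }}-py${{ matrix.python-version }}-
            mypy-${{ runner.os }}-
```

When only a restore-keys prefix matches, actions/cache restores that older entry and still saves a new one under the exact key at the end of a successful job, so the next run with the same inputs gets a full hit.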

lint-fresh-check.yml — new safety-net workflow

mypy's incremental mode is designed to give the same result as a fresh run, but has historically had soundness bugs around plugin state, complex generics, and stub-version drift. To address that risk:

  • Triggers on push to dev / main and a weekly cron (Monday 04:00 UTC)
  • Full 5-version typecheck matrix
  • No mypy cache restore — fresh from-scratch run every time
  • If incremental mypy ever drifts from fresh, this catches it within minutes of merge (or once a week on the cron) instead of accumulating bad state
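The triggers described above could be expressed as in the following fragment. This is a sketch of the `on:` block only: the workflow name is an assumption, and the job definitions (the 5-version matrix, with no actions/cache step for .mypy_cache) are omitted.

```yaml
# Sketch only: trigger block for lint-fresh-check.yml; jobs omitted.
name: lint-fresh-check  # hypothetical name
on:
  push:
    branches: [dev, main]
  schedule:
    # minute hour day-of-month month day-of-week; GitHub evaluates cron in UTC.
    - cron: "0 4 * * 1"  # Mondays at 04:00 UTC
```

One detail worth keeping in mind: GitHub runs scheduled workflows from the repository's default branch only, so the weekly run checks whichever of dev / main is the default.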

Pattern: cache on PR (fast, probably right), fresh on main (slow, definitely right). Standard for serious OSS Python.

Makefile — pytest tuning in gha-tests target

Pure performance tuning, no behavioral change to which tests run or pass:

  • --dist=worksteal — better xdist load balancing for uneven test durations
  • --tb=line — tighter failure output, less stdout I/O across workers
  • -p no:cacheprovider — skip .pytest_cache writes (each runner is ephemeral)
  • --no-header — cosmetic, skip banner

Expected: 5–15% pytest wall-time reduction depending on test duration variance.

What didn't change (and why)

  • Tests workflow still on Azure. tests-check.yml keeps runs-on: [aca-runner-test]. pytest with -n auto is genuinely CPU-bound and benefits from 8 cores — that's the one place self-hosting pays for itself. Azure-side cost cuts ship as a separate infra change (ACA Consumption profile + Azure Files cache mount; ~€1200/mo → ~€150/mo).
  • Boot test smoke step kept in tests-check.yml. Adds ~27s but catches install-broken cases fast.
  • 5-version Python matrix kept on PRs in both lint and tests.

Expected impact

| | Today | After this PR | After + infra change |
|---|---|---|---|
| Lint wall time (per matrix entry) | ~5m 30s | ~1m 50s | unchanged |
| Tests wall time (per Python version) | ~7 min | ~6m 40s (worksteal) | ~6 min (Azure Files cache) |
| Lint cost | ~€800/mo of Azure spend | €0 (free for public repo) | €0 |
| Test cost | ~€1200/mo of Azure spend | unchanged | ~€150/mo |
| Total | ~€2000/mo | ~€1200/mo | ~€150/mo |

Not in scope (future work)

  • ci-test extra in pyproject.toml: .venv is currently 1.4 GB because make install does uv sync --all-extras, which pulls in torch (355 MB), opencv (119 MB), transformers (54 MB) for the docling extra. None of that is exercised on PRs, because the tests that need it (the extract/img_gen/search/pipelex_api markers) are excluded from PR runs. A targeted ci-test extra could cut install ~3×. Needs a careful audit of which extras have imports that load on PR. Separate PR.
  • pytest-split sharding — would get tests to ~2m45s wall time. Requires pytest-split dep + committed .test_durations file + 5×4 matrix expansion. Separate PR if/when wanted.
  • Pipelex.make() fixture scope — the autouse module-scoped reset_pipelex_config_fixture runs per test module. If Pipelex.make() is expensive (worth profiling), promoting to session scope where safe could shave wall time.

How to verify

  • This PR's own CI: lint jobs should run on ubuntu-latest, finish in ~2 min, show cache hit/miss steps in logs.
  • Tests workflow should run on Azure as before (no change there).
  • After merge to dev: lint-fresh-check.yml fires and runs the full fresh matrix in ~5 min — that's the bulletproof check.

🤖 Generated with Claude Code


Summary by cubic

Move lint to GitHub-hosted ubuntu-latest with caching and add a fresh typecheck safety net; also tune pytest flags. This cuts lint time ~3× and drops Azure lint cost to €0, with tests unchanged.

  • Refactors

    • Lint jobs now run on ubuntu-latest with astral-sh/setup-uv@v3 cache (keyed on uv.lock) and actions/cache@v4 for .mypy_cache.
    • Expected: install ~8s on warm cache; mypy ~25–30s incremental; lint ~1m50s per matrix entry.
    • Makefile: pytest tuned with --dist=worksteal, --tb=line, -p no:cacheprovider, --no-header.
  • New Features

    • Added lint-fresh-check.yml: runs on push to dev/main and weekly cron.
    • Full matrix typecheck without restoring .mypy_cache to verify incremental results against a fresh run.

Written for commit f9964ed. Summary will update on new commits.

Lint workflow runs on GitHub-hosted ubuntu-latest (free for public repo)
instead of self-hosted Azure runners. ruff/pyright/mypy are single-threaded
so the 2-core ubuntu-latest matches what the Azure D8 lint runner was
giving anyway, and the install + mypy caches make this 3× faster.

Changes:

* .github/workflows/lint-check.yml — all jobs to runs-on: ubuntu-latest.
  Adds astral-sh/setup-uv@v3 with uv-cache enabled (keyed on uv.lock) so
  the previously ~2-minute install drops to ~8s on warm cache. Adds
  actions/cache@v4 for .mypy_cache, keyed on uv.lock + pyproject.toml +
  matrix python-version, with three-tier restore-keys for graceful
  fallback. Expected: mypy 2 min → ~25-30s on incremental runs.

* .github/workflows/lint-fresh-check.yml — new workflow. Runs the full
  typecheck matrix on push to dev/main and weekly cron, with NO mypy
  cache restore. This is the safety net for the cached PR checks: if
  mypy's incremental mode ever drifts from a fresh run, this catches
  it within minutes of merge instead of letting bad state accumulate.

* Makefile gha-tests target — tightens the pytest invocation:
  --dist=worksteal (newer xdist load balancing for uneven test
  durations, 5-15% wall-time win), --tb=line (less stdout I/O on
  failures across workers), -p no:cacheprovider (no .pytest_cache
  writes since each runner is ephemeral), --no-header (cosmetic).
  No behavioral change to which tests run or pass.

Tests workflow itself (tests-check.yml) is unchanged — tests stay on
self-hosted Azure runners where multi-core pytest-xdist parallelism
matters. Azure-side improvements (Consumption profile + Azure Files
cache mount for uv) ship in a separate infra repo change.