Prototype: cache CMake FetchContent populated dir with actions/cache #1685

Draft
bghgary wants to merge 7 commits into BabylonJS:master from bghgary:prototype-fetchcontent-cache

bghgary commented May 4, 2026

Goal

Measure CI configure-step savings from caching CMake's FetchContent populated dir via actions/cache. Discussion in [internal session]; recap below.

Why

Configure step today (last green master run 25109347097):

| Job | Configure step |
| --- | --- |
| Win32_x64_D3D11 | 142s |
| UWP_x64 | 189s |
| MacOS / build | 158s |
| iOS_iOS180 | 180s |
| Linux/Android (configure inside Build step) | est. 60-100s |

A warm-cache CMake configure (no network, deps already on disk) typically runs ~20-40s. Estimated savings:

  • Per-job typical: ~100-160s on Win32/Mac/iOS/UWP, ~50-80s on Linux/Android.
  • Per-pipeline wall time: ~2-3 min off the longest job (Win32_x64_D3D11_Sanitizers at ~16 min).
  • Aggregate compute per CI run: ~45-70 CI minutes saved across the 28-job matrix.
  • Variance: removes the 30-min UWP timeout flake we recently hit, where configure alone took 16:35 on a contended Windows runner.

Approach

Each reusable build-*.yml workflow:

  1. Adds an actions/cache@v4 step right after checkout:
    • path: ${{ github.workspace }}/.fc-cache
    • key: fc-${{ runner.os }}-${{ hashFiles('**/CMakeLists.txt') }}
    • restore-keys: fc-${{ runner.os }}- (partial-match fallback)
  2. Passes two new flags to the cmake configure invocation:
    • -D FETCHCONTENT_BASE_DIR=${{ github.workspace }}/.fc-cache — redirects FetchContent's populated dir to the cache location.
    • -D FETCHCONTENT_UPDATES_DISCONNECTED=ON — once a dep is in cache, don't git fetch to re-validate it.
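
A minimal sketch of how the two pieces above might fit together in one of the reusable workflows. The step names and the `build` binary directory are illustrative assumptions, not copied from the actual build-*.yml files; the `path`, `key`, and `restore-keys` values and the two `-D` flags are the ones listed above.

```yaml
steps:
  - uses: actions/checkout@v4

  # Restores .fc-cache now; the action's post step saves it at the end of the job.
  - name: Cache FetchContent directory
    uses: actions/cache@v4
    with:
      path: ${{ github.workspace }}/.fc-cache
      key: fc-${{ runner.os }}-${{ hashFiles('**/CMakeLists.txt') }}
      # Partial-match fallback: reuse the most recent cache for this OS.
      restore-keys: |
        fc-${{ runner.os }}-

  # "build" is a placeholder binary dir; the real workflows pass more arguments.
  - name: Configure
    run: >
      cmake -B build
      -D FETCHCONTENT_BASE_DIR=${{ github.workspace }}/.fc-cache
      -D FETCHCONTENT_UPDATES_DISCONNECTED=ON
```

Because actions/cache saves in its post step, the directory is written back only when the job succeeds and the exact key was not already present.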

Cache key

Per-OS (line endings + file timestamps differ across runners) and hashes every CMakeLists.txt. This invalidates the cache on any GIT_TAG bump, new FetchContent_Declare, or PATCH_COMMAND change while leaving it intact across unrelated source-file changes.

Scope

6 workflows: build-{linux,macos,ios,uwp,win32,win32-shader}.yml. Android skipped — it goes through gradle which dispatches cmake; needs a separate edit to the gradle build script. Will add as a follow-up if results are good here.

Validation plan

  1. First CI run on this PR: cold cache for every job (no key match yet). Configure timings should be similar to master baseline.
  2. Push an empty commit (or force-push an amended commit) so the PR re-triggers; the second run should hit a warm cache. Measure the delta.
  3. Compare warm-cache step timings vs master baseline; report.
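
To make step 2 easy to verify from the logs, the cache step can be given an id and its `cache-hit` output echoed. This is only a sketch: the `fc-cache` id and the step names are hypothetical. Note that `cache-hit` is `true` only for an exact `key` match, so a restore that came through `restore-keys` will not report `true` even though files were restored.

```yaml
- name: Cache FetchContent directory
  id: fc-cache   # hypothetical step id, used only to read the output below
  uses: actions/cache@v4
  with:
    path: ${{ github.workspace }}/.fc-cache
    key: fc-${{ runner.os }}-${{ hashFiles('**/CMakeLists.txt') }}
    restore-keys: |
      fc-${{ runner.os }}-

# Prints 'true' only when the exact key matched.
- name: Report FetchContent cache state
  shell: bash
  run: echo "exact cache hit = '${{ steps.fc-cache.outputs.cache-hit }}'"
```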

Risks / things to verify

  • Cache size per OS (need to stay under GHA's 10 GB total quota per repo).
  • FETCHCONTENT_UPDATES_DISCONNECTED=ON behavior when a dep is partially populated (e.g., previous run failed mid-fetch). Should work because it only skips remote re-validation; missing deps still get fetched.
  • PATCH_COMMAND outcomes get cached as part of the source-tree state. If a patch script is changed, the CMakeLists hash should change to invalidate. Verify by intentionally bumping a patch.
  • Cross-job cache collision: multiple jobs sharing the same key may all populate from scratch on the first run. Subsequent runs share. Acceptable.

[Created by Copilot on behalf of @bghgary]

bghgary and others added 7 commits May 4, 2026 16:49
Prototype: redirect CMake's FETCHCONTENT_BASE_DIR to a stable
workspace-relative path (.fc-cache) and have actions/cache restore /
save it across runs. Pass FETCHCONTENT_UPDATES_DISCONNECTED=ON so
warm-cache runs don't validate sources against remotes.

Cache key is per-OS plus a hash of every CMakeLists.txt, so any
GIT_TAG bump, new dep, or patch change invalidates the cache while
unrelated repo changes don't.

Applied to all reusable build workflows except build-android.yml
(Android goes through gradle which calls cmake indirectly; will be a
follow-up).

Goal of this PR: measure cache-hit configure-step savings vs the
master baseline. Master configure step is ~140-190s across Win32 /
UWP / macOS / iOS today; warm-cache target is ~20-40s. Worst
observed master configure step was 16+ minutes on a contended Windows
runner (UWP_x64 timed out at 30 min job limit), which a hot cache
should eliminate.

[Created by Copilot on behalf of @bghgary]

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The first run produced 4 macOS / iOS failures with:
  CMake Error: Error: generator : Xcode
  Either remove the CMakeCache.txt file and CMakeFiles directory
  or choose a different binary directory.

Cause: parallel macOS-runner jobs (MacOS / build with Xcode generator,
MacOS_Ninja / build with Ninja generator, iOS_iOS180 with Xcode +
IOS=ON) all shared the same cache key `fc-macOS-<hash>`. The first
job to complete saved a cache containing
`<dep>-subbuild/CMakeCache.txt` recording its generator;
parallel/subsequent jobs restored that cache and CMake refused to
reuse a subbuild dir produced under a different generator.

Fix: scope cache keys per workflow purpose so jobs whose subbuild
state would differ get separate caches. Workflow-level prefixes:

  fc-linux-<hash>
  fc-macos-<generator>-<hash>
  fc-ios-<hash>
  fc-uwp-<platform>-<hash>
  fc-win32-<platform>-<hash>
  fc-win32shader-<platform>-<hash>

This keeps cross-job sharing within a workflow + platform + generator
combination, which is the largest legal sharing scope: e.g. all
Win32_x64_* jobs share one cache, but Win32_arm64 gets its own.

[Created by Copilot on behalf of @bghgary]

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Cold-cache run was 28/28 green; warm-cache run on the same SHAs got 3 failures with 'precompiled header file is from a different version of the compiler' (C1853) on Windows. Cause: the cache included <dep>-build/ subdirs which contain compilation artifacts (PCH, .obj, .lib) that aren't safe to share across runner instances. Only the source-clone (<dep>-src) and download-orchestration (<dep>-subbuild) directories are safe to cache.

Use actions/cache path negation (!path) to exclude *-build subdirs while still caching the rest of FETCHCONTENT_BASE_DIR.

[Created by Copilot on behalf of @bghgary]

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
First warm-cache run hit the same C1853 PCH error: actions/cache restore happens BEFORE the path negation rules apply at save time, so the old (broken) cache from the pre-exclude commit was restored despite the new exclude rules. Bump 'fc-' prefix to 'fc-v2-' so existing cache entries don't match and a fresh population happens with the exclude rules in effect from the start.

[Created by Copilot on behalf of @bghgary]

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

bghgary commented May 5, 2026

#1685 FetchContent Cache Prototype — Status Report

TL;DR

The prototype works on a cold cache: the overall pipeline is green when caches don't hit. Warm-cache runs fail reliably with MSVC C1853 "precompiled header file is from a different version of the compiler". As far as I can tell, actions/cache@v4 is not honoring my !*-build/** exclude pattern, so per-job build artifacts (PCH files) are saved into the shared cache and break subsequent runs.

Net: the simple "cache ${FETCHCONTENT_BASE_DIR} with negation excludes" approach doesn't work as I designed it. A more invasive approach (FETCHCONTENT_SOURCE_DIR_<name> per-dep overrides) is needed before this can be merged. Recommendation: don't merge this prototype as-is, revisit with a different caching strategy.

What I tried (chronologically)

Attempt 1: simple actions/cache on ${FETCHCONTENT_BASE_DIR}

  • Cache key: fc-${{ runner.os }}-${{ hashFiles('**/CMakeLists.txt') }}
  • Path: whole ${{ github.workspace }}/.fc-cache
  • Cmake invocation: -D FETCHCONTENT_BASE_DIR=... -D FETCHCONTENT_UPDATES_DISCONNECTED=ON

Result: 4 macOS/iOS jobs failed with CMake Error: Error: generator : Xcode / Either remove the CMakeCache.txt file.... Cause: runner.os is macOS for both real-macOS and iOS jobs, and macOS itself uses the Xcode and Ninja generators in different jobs. The first Apple job to finish saved a cache with <dep>-subbuild/CMakeCache.txt recording its generator; parallel jobs using a different generator restored that cache and CMake refused to reuse the subbuild dir.

Attempt 2: per-workflow / per-platform / per-generator cache keys

Scoped each workflow's cache key with its own prefix:

  • fc-linux-...
  • fc-macos-${{ inputs.generator }}-...
  • fc-ios-...
  • fc-uwp-${{ inputs.platform }}-...
  • fc-win32-${{ inputs.platform }}-...
  • fc-win32shader-${{ inputs.platform }}-...
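
As an illustration of the scoped keys above, the macOS workflow's cache step might look roughly like this. It is a sketch: the step name is hypothetical, and the `restore-keys` value is an assumption (the list above only gives the key prefixes). The fallback prefix has to include the generator as well, because a bare `fc-macos-` fallback could restore a cache written under the other generator and bring back the attempt 1 failure.

```yaml
# build-macos.yml (reusable workflow); inputs.generator is the input used in the key list above.
- name: Cache FetchContent directory
  uses: actions/cache@v4
  with:
    path: ${{ github.workspace }}/.fc-cache
    key: fc-macos-${{ inputs.generator }}-${{ hashFiles('**/CMakeLists.txt') }}
    # Assumed fallback: stay inside the same generator scope.
    restore-keys: |
      fc-macos-${{ inputs.generator }}-
```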

Result: cold-cache run passed 28/28 ✅. But the warm-cache run hit a NEW failure: 3 Windows jobs failed with error C1853: precompiled header file is from a different version of the compiler on glslang-build/.../cmake_pch.pch. Cause: the cache was saving the entire ${FETCHCONTENT_BASE_DIR} including <dep>-build/ directories that contain build artifacts (.obj, .pch, .lib). When restored to a different runner, the PCH was incompatible.

Attempt 3: exclude *-build/** from cache path

path: |
  ${{ github.workspace }}/.fc-cache
  !${{ github.workspace }}/.fc-cache/*-build
  !${{ github.workspace }}/.fc-cache/*-build/**

Result: cache size stayed at 623 MB (same as before exclude was added) and the C1853 error fired again. The exclude pattern is silently ignored by actions/cache@v4, or the syntax is subtly wrong. This is a known-tricky area of actions/cache.

Attempt 4: bumped cache keys to v2 to invalidate stale caches

Suspected the failures might have been from restoring pre-exclude caches. Bumped all keys from fc-X-... to fc-v2-X-... to force fresh population.

Result: cold-cache passed 28/28 ✅. The warm-cache run still hit C1853, with Cache Size: ~623 MB. So the exclude pattern does not appear to be working: even with a fresh cache populated while the exclude pattern was in effect, the saved cache still contains the build dirs. This strongly suggests that actions/cache@v4 is not honoring !*-build/** for nested directories, though I have not confirmed the root cause.

Timing data — what I actually got

Master baseline (run 25109347097, last green push) vs prototype cold-cache (run 25352089195, after v2 keys + excludes nominally in effect):

| Job | Master total | Cold-cache total | Δ |
| --- | --- | --- | --- |
| Win32_x64_D3D11 | 633s | 657s | +24s |
| Win32_x64_D3D12 | 499s | 382s | -117s |
| Win32_x64_D3D11_Sanitizers | 983s | 981s | ±0 |
| Win32_x64_V8_D3D11 | 580s | 579s | ±0 |
| Win32_x64_JSI_D3D11 | 530s | 578s | +48s |
| UWP_x64 | 434s | 379s | -55s |
| UWP_arm64 | 483s | 402s | -81s |
| MacOS / build | 311s | 433s | +122s |
| MacOS_Xcode26 | 272s | 291s | +19s |
| MacOS_Sanitizers | 485s | 327s | -158s |
| MacOS_Ninja | 147s | 226s | +79s |
| iOS_iOS180 | 322s | 254s | -68s |
| iOS_iOS175 | 402s | 273s | -129s |
| Ubuntu_Clang_JSC | 350s | 394s | +44s |
| Ubuntu_GCC_JSC | 362s | 363s | ±0 |

This is cold-cache vs master, single sample each. All deltas are within plausible runner-variance bands (similar to what we saw for #1680). With the warm-cache run failing, I never got the actual warm-cache savings measurement.

Why the exclude pattern doesn't work (hypothesis)

actions/cache@v4 uses @actions/glob for path matching. Looking at the documented behavior, !path/** should exclude. But the cache size before/after the exclude was added is identical, suggesting the cache action's tar-creation step is including the directories anyway.

Possible reasons:

  1. The exclude pattern needs to be a sibling to the include with specific formatting that I got wrong.
  2. actions/cache@v4 has a known limitation where directory excludes don't propagate to the underlying tar invocation.
  3. The cache was already saved before the excludes took effect, and subsequent restores keep the old content (this would explain it on the v1 keys, but the v2 keys should have produced a fresh save).

I didn't fully nail down which it is — the cache size being unchanged after v2-key-bump is the strongest signal that excludes genuinely aren't working for me.
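
One way to narrow down the three possibilities is to look at what is actually on disk right before the save. The step below is a hypothetical diagnostic, not part of this PR. It assumes the job's default working directory is the workspace root and that `du` and `sort` are available in the runner's bash (an assumption for Git Bash on the Windows runners).

```yaml
# Hypothetical diagnostic step; place it after the build so it runs before
# the actions/cache post step archives the directory.
- name: Show FetchContent cache contents
  if: always()
  shell: bash
  run: |
    # Size in KB of each top-level entry (<dep>-src, <dep>-subbuild, <dep>-build), smallest first.
    du -sk .fc-cache/* | sort -n || true
```

Comparing the summed size of the `*-build` entries with the Cache Size reported by the save step would show whether those directories are what keeps the archive at ~623 MB.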

Recommended next steps (if revisiting)

The simple "cache ${FETCHCONTENT_BASE_DIR}" approach has a fundamental issue: ${FETCHCONTENT_BASE_DIR} contains both source clones (cache-friendly) and build artifacts (cache-hostile). Need to separate them.

Option A: per-dep FETCHCONTENT_SOURCE_DIR_<name> overrides

Cache only source clones in a directory that has no build output in it. Then for each dep, pass -DFETCHCONTENT_SOURCE_DIR_<NAME>=... (the suffix is the upper-cased dep name) to point CMake at the cached source.

# Need to enumerate each dep; the variable suffix must be the upper-cased dep name.
# CACHE_DIR is a placeholder for the directory that holds the cached source clones.
set(FETCHCONTENT_SOURCE_DIR_ARCANA.CPP ${CACHE_DIR}/arcana.cpp)
set(FETCHCONTENT_SOURCE_DIR_JSRUNTIMEHOST ${CACHE_DIR}/jsruntimehost)
# ... 6+ more

CMake skips the download and update steps for any dep with the override set. The dep's build dir still defaults to a location under ${CMAKE_BINARY_DIR}/_deps, so build output stays per-build (correct) and no subbuild or build state is shared through the cache.

This is the most robust approach but requires enumerating each dep — manageable but invasive.
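
A rough sketch of what Option A could look like on the workflow side, under several assumptions: the `.fc-src` directory, the `fc-src-` key prefix, the step names, and the `build` binary directory are all hypothetical, and only two of the deps are shown. CMake expects the upper-cased dep name in the variable suffix (`FETCHCONTENT_SOURCE_DIR_<uppercaseName>`). The sketch deliberately leaves out how `.fc-src` gets populated on a cache miss; pointing an override at an empty directory would fail, so a real version needs a populate step (or must apply the overrides only on a hit).

```yaml
- name: Cache dependency sources only
  uses: actions/cache@v4
  with:
    # Hypothetical directory that holds source clones and nothing else.
    path: ${{ github.workspace }}/.fc-src
    key: fc-src-${{ runner.os }}-${{ hashFiles('**/CMakeLists.txt') }}

# Assumes .fc-src/<dep> already contains each dep at its pinned revision.
- name: Configure with per-dep source overrides
  run: >
    cmake -B build
    -D FETCHCONTENT_SOURCE_DIR_ARCANA.CPP=${{ github.workspace }}/.fc-src/arcana.cpp
    -D FETCHCONTENT_SOURCE_DIR_JSRUNTIMEHOST=${{ github.workspace }}/.fc-src/jsruntimehost
```

Since the cached directory would contain no subbuild or build output, the generator and compiler mismatches from attempts 1 and 2 should not arise, which in principle would let jobs on the same OS share one source cache.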

Option B: cache only *-src subpaths explicitly

path: ${{ github.workspace }}/.fc-cache/*-src

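Caches only source dirs, no subbuild, no build. This needs a CMake-side mechanism to detect "source already exists, skip download", and that isn't automatic. FETCHCONTENT_TRY_FIND_PACKAGE_MODE (CMake 3.24+) does not provide it: it only controls whether FetchContent_MakeAvailable tries find_package() first. Without the <dep>-subbuild state, FetchContent would likely treat each dep as unpopulated and download it again, so this option still ends up needing something like the per-dep FETCHCONTENT_SOURCE_DIR_<NAME> overrides from Option A.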

Option C: separate CACHE_DIR for sources, leave _deps default

Set FETCHCONTENT_BASE_DIR to the default (under build dir) so subbuild and build go to the build dir as normal. Use a separate dir for cached sources, and pass FETCHCONTENT_SOURCE_DIR_<name> overrides to that dir per dep. Same as Option A but with the source dir lifted out of _deps/.

My recommendation: Option A. It's the most explicit, gives the best cache hit rates (per-dep keys can be even more granular if a single dep changes its GIT_TAG), and avoids the actions/cache exclude-pattern issue entirely.

Cost vs benefit summary

  • Cold-cache configure (the worst-case): 16 min on bad-runner days, ~3 min normal.
  • Hot-cache configure: ~30 sec target.
  • Estimated savings if warm-cache worked: ~100-160s per Win/Mac/iOS/UWP job.
  • Pipeline wall time: bounded by Win32_x64_D3D11_Sanitizers (~16 min) — savings would shave ~2 min off the longest job.
  • Aggregate compute: ~45-70 min CI per run, ~10 hr/day across all PRs and master.
  • Most valuable benefit: removes the "30-min UWP timeout" flake from configure-step variance.

This is enough win to justify ~1-2 days of further work on Option A. Recommendation: hold #1685 as draft; revisit when there's time, with Option A.

Files in this PR (current state)

  • bghgary:prototype-fetchcontent-cache @ 41efc730 — current head with v2 keys + (broken) exclude pattern.
  • 6 reusable workflows modified: build-{linux,macos,ios,uwp,win32,win32-shader}.yml.
  • Each adds an actions/cache@v4 step + 2 cmake flags.
