perf(pm): handle warm registry cache hits in scheduler by elrrrrrrr · Pull Request #2966 · utooland/utoo

elrrrrrrr · 2026-05-18T05:43:14Z

What

A/B experiment targeting context switches (ctx) in the p4 warm-link phase: let the install scheduler handle registry cache hits synchronously in its main loop, before any download work is enqueued.

Why

In p4 the lockfile and registry cache are already warm, yet every registry package still enters a worker task just to check <cache>/<name>/<version>/_resolved. Handling the hit in the scheduler keeps scheduler state centralized while removing one per-package tokio worker task from the all-cache-hit path.

Notes

Cache miss behavior is unchanged: misses still enqueue download work and use the existing async downloader.
The sync probe is limited to the cheap _resolved marker check and only updates scheduler-owned state.
Adjusted the scheduler dedupe unit test to avoid depending on whether the local machine happens to have react@18.2.0 in cache.
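
A minimal sketch of what the cheap marker check could look like, assuming the <cache>/<name>/<version>/_resolved layout described above. The function name probe_resolved_marker and the explicit cache_root parameter are hypothetical; the PR's registry_cache_lookup_sync takes only the package name and version, and its body is not shown here.

```rust
use std::fs;
use std::path::{Path, PathBuf};

/// Hypothetical stand-in for the PR's `registry_cache_lookup_sync`.
/// Returns the package's cache directory when its `_resolved` marker exists.
fn probe_resolved_marker(cache_root: &Path, name: &str, version: &str) -> Option<PathBuf> {
    // Layout taken from the PR description: <cache>/<name>/<version>/_resolved
    let package_dir = cache_root.join(name).join(version);
    // `is_file` returns false on any error (missing path, permissions),
    // so every failure mode is treated as a cache miss.
    package_dir.join("_resolved").is_file().then_some(package_dir)
}

fn main() -> std::io::Result<()> {
    // Seed a throwaway cache so the result does not depend on the local machine.
    let cache_root = std::env::temp_dir().join(format!("utoo-probe-sketch-{}", std::process::id()));
    let hit_dir = cache_root.join("react").join("18.2.0");
    fs::create_dir_all(&hit_dir)?;
    fs::write(hit_dir.join("_resolved"), b"")?;

    assert_eq!(probe_resolved_marker(&cache_root, "react", "0.0.0"), None);
    assert_eq!(probe_resolved_marker(&cache_root, "react", "18.2.0"), Some(hit_dir));

    fs::remove_dir_all(&cache_root)?;
    Ok(())
}
```

Passing the cache root explicitly is one way to keep a unit test independent of whether react@18.2.0 happens to be in the real cache; whether the adjusted dedupe test does this is not stated in the PR.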

Validation

cargo fmt
cargo test -p utoo-pm
cargo clippy -p utoo-pm --all-targets -- -D warnings --no-deps

Bench plan

Trigger linux/npmjs phase bench and compare mainly:

phase	expected signal
p3 cold install	should be neutral; cache misses still download through the same queue
p4 warm link	target phase; expect lower ctx from skipping per-package cache-hit worker tasks

gemini-code-assist

Code Review

This pull request introduces a synchronous cache lookup mechanism, registry_cache_lookup_sync, to the installation scheduler to optimize performance for cache hits by bypassing the download queue. The changes include the implementation of the sync lookup function and updates to the scheduler's logic and tests. Feedback suggests that performing synchronous I/O within the main scheduler loop may block the async executor, potentially leading to performance degradation on slow filesystems, and recommends considering spawn_blocking or a batching mechanism.

gemini-code-assist · 2026-05-18T05:45:15Z

+        if let Some(cache_path) = registry_cache_lookup_sync(&package.name, &package.version) {
+            self.download_done.insert(key, cache_path.clone());
+            if let Some(spec) = waiter {
+                self.clone_queue.push_back(ReadyClone { spec, cache_path });
+            }
+            return;
+        }


Performing synchronous I/O (registry_cache_lookup_sync) inside the main scheduler loop can block the async executor. While this is intended as a performance optimization to avoid task spawning overhead for cache hits, it introduces a risk of head-of-line blocking for the entire installation process if the filesystem is slow (e.g., network drives or high I/O wait). Since the scheduler is responsible for pumping all downloads and clones, any delay here affects overall throughput. Consider if this trade-off is acceptable for all supported environments, or if a small batching mechanism or spawn_blocking should be used for the probe.
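
The batching alternative mentioned above might look roughly like the sketch below. It uses only the standard library, with std::thread::spawn standing in for tokio's spawn_blocking, and every name in it (Probe, probe_batch, spawn_probe_batch, the name@version key format) is an assumption for illustration rather than code from this PR.

```rust
use std::path::{Path, PathBuf};
use std::sync::mpsc;
use std::thread;

/// Outcome of probing one package (illustrative type, not from the PR).
#[derive(Debug, PartialEq)]
enum Probe {
    Hit { key: String, cache_path: PathBuf },
    Miss { key: String },
}

/// Probe a whole batch with blocking filesystem calls.
fn probe_batch(cache_root: &Path, batch: &[(String, String)]) -> Vec<Probe> {
    batch
        .iter()
        .map(|(name, version)| {
            let key = format!("{}@{}", name, version);
            let dir = cache_root.join(name).join(version);
            if dir.join("_resolved").is_file() {
                Probe::Hit { key, cache_path: dir }
            } else {
                Probe::Miss { key }
            }
        })
        .collect()
}

/// Run one batch on a helper thread so the scheduler loop never blocks on I/O.
/// The scheduler would drain the receiver and update its own state from the results.
fn spawn_probe_batch(cache_root: PathBuf, batch: Vec<(String, String)>) -> mpsc::Receiver<Vec<Probe>> {
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // A send error only means the receiver was dropped; nothing to do.
        let _ = tx.send(probe_batch(&cache_root, &batch));
    });
    rx
}

fn main() {
    let cache_root = std::env::temp_dir().join(format!("utoo-batch-sketch-{}", std::process::id()));
    let hit_dir = cache_root.join("react").join("18.2.0");
    std::fs::create_dir_all(&hit_dir).unwrap();
    std::fs::write(hit_dir.join("_resolved"), b"").unwrap();

    let batch = vec![
        ("react".to_string(), "18.2.0".to_string()),
        ("left-pad".to_string(), "1.3.0".to_string()),
    ];
    let results = spawn_probe_batch(cache_root.clone(), batch).recv().unwrap();

    assert_eq!(results[0], Probe::Hit { key: "react@18.2.0".to_string(), cache_path: hit_dir });
    assert_eq!(results[1], Probe::Miss { key: "left-pad@1.3.0".to_string() });
    std::fs::remove_dir_all(&cache_root).unwrap();
}
```

One thread hop per batch instead of one task per package keeps the blocking I/O off the scheduler loop while still amortizing the spawn cost; whether that trade-off pays off in p4 would need its own benchmark run.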

github-actions · 2026-05-18T05:58:15Z

📊 pm-bench-phases · `d60b3ff` · linux (`ubuntu-latest`)

Workflow run — ant-design

PMs: utoo (this branch) · utoo-npm (latest published) · bun (latest)

npmjs.org

p0_full_cold

PM	wall	±σ	user	sys	RSS	pgMinor
bun	9.27s	0.15s	10.64s	10.32s	689M	326.7K
utoo-next	8.08s	0.18s	10.94s	12.35s	928M	120.9K
utoo-npm	8.09s	0.22s	11.02s	12.25s	978M	123.4K
utoo	7.90s	0.23s	11.89s	12.52s	943M	150.6K

PM	vCtx	iCtx	netRX	netTX	cache	node_mod	lock
bun	17.3K	19.7K	1.20G	7M	1.88G	1.76G	1M
utoo-next	127.2K	85.6K	1.17G	5M	1.73G	1.72G	2M
utoo-npm	123.5K	90.9K	1.17G	5M	1.73G	1.72G	2M
utoo	123.0K	96.5K	1.17G	6M	1.72G	1.72G	2M

p1_resolve

PM	wall	±σ	user	sys	RSS	pgMinor
bun	2.13s	0.06s	4.29s	1.15s	533M	169.0K
utoo-next	3.15s	0.11s	5.50s	2.28s	625M	84.6K
utoo-npm	3.23s	0.06s	5.54s	2.32s	616M	88.5K
utoo	2.53s	0.07s	6.21s	1.75s	656M	124.9K

PM	vCtx	iCtx	netRX	netTX	cache	node_mod	lock
bun	9.4K	4.9K	203M	3M	108M	-	1M
utoo-next	76.5K	94.1K	201M	3M	7M	3M	2M
utoo-npm	77.2K	93.6K	201M	3M	7M	3M	2M
utoo	15.4K	20.5K	204M	3M	7M	3M	2M

p3_cold_install

PM	wall	±σ	user	sys	RSS	pgMinor
bun	6.93s	0.22s	6.46s	9.98s	626M	210.1K
utoo-next	7.28s	2.23s	5.57s	11.20s	522M	62.1K
utoo-npm	7.36s	2.09s	5.56s	11.06s	460M	61.3K
utoo	6.76s	1.69s	5.28s	10.92s	476M	57.3K

PM	vCtx	iCtx	netRX	netTX	cache	node_mod	lock
bun	5.4K	7.2K	1.00G	4M	1.77G	1.77G	1M
utoo-next	123.9K	50.8K	1001M	3M	1.72G	1.72G	3M
utoo-npm	113.1K	49.8K	1000M	3M	1.72G	1.72G	3M
utoo	123.5K	80.8K	1000M	3M	1.72G	1.72G	3M

p4_warm_link

PM	wall	±σ	user	sys	RSS	pgMinor
bun	3.51s	0.03s	0.19s	2.48s	134M	32.7K
utoo-next	2.56s	0.44s	0.53s	3.94s	79M	18.6K
utoo-npm	2.48s	0.07s	0.53s	3.84s	80M	18.5K
utoo	2.43s	0.15s	0.51s	3.90s	62M	14.3K

PM	vCtx	iCtx	netRX	netTX	cache	node_mod	lock
bun	266	25	5M	9K	1.93G	1.74G	1M
utoo-next	42.1K	19.0K	12K	22K	1.72G	1.72G	2M
utoo-npm	40.1K	19.1K	15K	9K	1.72G	1.72G	2M
utoo	50.6K	24.0K	13K	24K	1.73G	1.72G	2M

npmmirror.com: no output captured.

elrrrrrrr · 2026-05-18T05:59:51Z

GHA run 1 read

Run: https://github.com/utooland/utoo/actions/runs/26015795276

phase	utoo wall	utoo ctx	same-run utoo-next	same-run bun	read
p3_cold_install	6.76s ±1.69	123.5K / 80.8K	7.28s, 123.9K / 50.8K	6.93s, 5.4K / 7.2K	wall ok but iCtx regresses materially
p4_warm_link	2.43s ±0.15	50.6K / 24.0K	2.56s, 42.1K / 19.0K	3.51s, 266 / 25	not positive; ctx is worse than same-run baselines and much worse than #2965

Conclusion: do not fold this as-is. Moving only the registry cache-hit probe into the scheduler does not reduce p4 scheduling cost; it likely just shifts the hot path while seeded-cache probe and clone worker costs remain dominant.
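
For reference, the size of the ctx regression against the same-run utoo-next baseline can be worked out from the table above. The helper below is only an illustrative calculation over those published numbers (in thousands of context switches); pct_change is a hypothetical name.

```rust
/// Relative change of `candidate` against `baseline`, in percent.
fn pct_change(baseline: f64, candidate: f64) -> f64 {
    (candidate - baseline) / baseline * 100.0
}

fn main() {
    // (label, utoo-next baseline, utoo on this branch), values in K from run 1
    let rows = [
        ("p3 vCtx", 123.9, 123.5),
        ("p3 iCtx", 50.8, 80.8),
        ("p4 vCtx", 42.1, 50.6),
        ("p4 iCtx", 19.0, 24.0),
    ];
    for (label, baseline, candidate) in rows {
        println!("{}: {:+.1}%", label, pct_change(baseline, candidate));
    }
}
```

This works out to roughly +20% vCtx and +26% iCtx in p4, and about +59% iCtx in p3 with vCtx essentially flat, which matches the read above. These are single-run figures.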

elrrrrrrr · 2026-05-20T03:59:42Z

Closing this PM performance experiment after the investigation phase. The benchmark data and conclusions are preserved in the PR body/comments; we will split the validated pieces into smaller reviewable PRs for the formal ship path.

perf(pm): handle warm registry cache hits in scheduler

4e772a2

elrrrrrrr added the A-Pkg Manager (Area: Package Manager) and benchmark (Run pm-bench on PR) labels May 18, 2026

gemini-code-assist Bot reviewed May 18, 2026

elrrrrrrr closed this May 20, 2026

elrrrrrrr wants to merge 1 commit into perf/pm-resolver-demand-bfs from exp/pm-install-sync-cache-hit-b33d922
