perf(pm): run install extract workers on rayon#2971
Conversation
There was a problem hiding this comment.
Code Review
This pull request refactors the package extraction process to use synchronous operations executed on a dedicated thread pool via rayon, replacing the previous asynchronous tokio tasks. Key changes include the introduction of extract_to_cache_sync and extract_and_write_sync, along with updates to the InstallScheduler to handle task completion through a new internal channel. Feedback indicates that error handling in extract_to_cache_sync should be improved by propagating I/O errors from try_exists() rather than swallowing them.
| pub fn extract_to_cache_sync(name: &str, version: &str, bytes: Bytes) -> Result<PathBuf> { | ||
| let cache_path = registry_cache_path(name, version); | ||
|
|
||
| if cache_path.join("_resolved").try_exists().unwrap_or(false) { |
There was a problem hiding this comment.
The use of .unwrap_or(false) on the result of try_exists() swallows potential I/O errors. If try_exists() fails for a reason other than the file not existing (e.g., permission issues), the error is ignored, and the code proceeds as if the file doesn't exist. This can lead to redundant work and obscure the root cause of the failure.
It's better to propagate the error using the ? operator, which is consistent with how errors are handled elsewhere in this function and in extract_and_write_sync.
| if cache_path.join("_resolved").try_exists().unwrap_or(false) { | |
| if cache_path.join("_resolved").try_exists()? { |
📊 pm-bench-phases ·
|
| PM | wall | ±σ | user | sys | RSS | pgMinor |
|---|---|---|---|---|---|---|
| bun | 9.31s | 0.31s | 10.66s | 10.29s | 671M | 329.3K |
| utoo-next | 8.02s | 0.24s | 10.55s | 12.19s | 940M | 123.1K |
| utoo-npm | 8.73s | 0.97s | 10.96s | 12.47s | 966M | 121.6K |
| utoo | 7.42s | 0.17s | 11.26s | 11.89s | 946M | 148.4K |
| PM | vCtx | iCtx | netRX | netTX | cache | node_mod | lock |
|---|---|---|---|---|---|---|---|
| bun | 17.7K | 19.3K | 1.20G | 7M | 1.90G | 1.77G | 1M |
| utoo-next | 126.1K | 89.0K | 1.17G | 5M | 1.73G | 1.72G | 2M |
| utoo-npm | 138.1K | 98.5K | 1.17G | 5M | 1.73G | 1.72G | 2M |
| utoo | 100.4K | 66.1K | 1.17G | 6M | 1.72G | 1.72G | 3M |
p1_resolve
| PM | wall | ±σ | user | sys | RSS | pgMinor |
|---|---|---|---|---|---|---|
| bun | 2.10s | 0.08s | 4.08s | 1.15s | 518M | 173.5K |
| utoo-next | 3.03s | 0.06s | 5.38s | 2.28s | 624M | 88.0K |
| utoo-npm | 3.08s | 0.05s | 5.47s | 2.21s | 618M | 79.0K |
| utoo | 2.45s | 0.08s | 6.04s | 1.79s | 656M | 127.2K |
| PM | vCtx | iCtx | netRX | netTX | cache | node_mod | lock |
|---|---|---|---|---|---|---|---|
| bun | 9.5K | 4.7K | 203M | 3M | 108M | - | 1M |
| utoo-next | 76.7K | 96.9K | 201M | 2M | 7M | 3M | 2M |
| utoo-npm | 76.2K | 98.3K | 201M | 3M | 7M | 3M | 2M |
| utoo | 16.1K | 21.8K | 204M | 3M | 7M | 3M | 2M |
p3_cold_install
| PM | wall | ±σ | user | sys | RSS | pgMinor |
|---|---|---|---|---|---|---|
| bun | 6.75s | 0.13s | 6.44s | 9.71s | 637M | 205.5K |
| utoo-next | 6.97s | 1.40s | 5.18s | 10.69s | 490M | 64.1K |
| utoo-npm | 6.16s | 0.28s | 5.32s | 10.62s | 484M | 65.7K |
| utoo | 6.55s | 1.61s | 5.18s | 10.68s | 518M | 64.0K |
| PM | vCtx | iCtx | netRX | netTX | cache | node_mod | lock |
|---|---|---|---|---|---|---|---|
| bun | 5.7K | 7.2K | 1.00G | 4M | 1.77G | 1.77G | 1M |
| utoo-next | 113.0K | 50.2K | 1000M | 3M | 1.72G | 1.72G | 2M |
| utoo-npm | 111.3K | 49.7K | 1000M | 3M | 1.72G | 1.72G | 2M |
| utoo | 101.3K | 56.5K | 1000M | 3M | 1.72G | 1.72G | 2M |
p4_warm_link
| PM | wall | ±σ | user | sys | RSS | pgMinor |
|---|---|---|---|---|---|---|
| bun | 3.57s | 0.04s | 0.17s | 2.51s | 135M | 31.6K |
| utoo-next | 2.41s | 0.28s | 0.53s | 3.88s | 78M | 18.3K |
| utoo-npm | 2.42s | 0.12s | 0.51s | 3.82s | 80M | 18.9K |
| utoo | 2.33s | 0.05s | 0.54s | 3.90s | 62M | 14.3K |
| PM | vCtx | iCtx | netRX | netTX | cache | node_mod | lock |
|---|---|---|---|---|---|---|---|
| bun | 251 | 17 | 47K | 18K | 1.88G | 1.77G | 1M |
| utoo-next | 43.4K | 19.6K | 6K | 26K | 1.72G | 1.72G | 2M |
| utoo-npm | 42.2K | 20.3K | 2K | 6K | 1.72G | 1.72G | 2M |
| utoo | 53.7K | 24.8K | 2K | 4K | 1.73G | 1.72G | 2M |
npmmirror.com: no output captured.
GHA run 1 readRun: https://github.com/utooland/utoo/actions/runs/26017369914
Conclusion: reject as a standalone fold. Removing the outer tokio task around extract work helps part of p3 vCtx, but it does not reduce iCtx enough and hurts p4 ctx. Keep #2965 as the proven scheduling boundary; this extract change needs a different design before it is worth combining. |
Summary
AB experiment for p3 cold install scheduling.
The install scheduler currently spawns a tokio task for extraction, and that task immediately awaits the extractor rayon job. This change removes that outer tokio task for extract work:
extract_and_write_sync/extract_to_cache_syncOpDone::Extractto the scheduler through an internal completion channelHypothesis
p3 cold install spends real time in tarball decompression and package-cache writes. The tokio wrapper around rayon extraction does not do useful async work; it only adds task scheduling and wakeups. Moving extract workers directly onto rayon should reduce ctx and possibly wall in p3 without affecting p1.
Validation
Passed:
cargo fmtcargo test -p utoo-pm service::install_scheduler::testscargo test -p utoo-pm util::downloader::testscargo test -p utoo-pm test_extract_and_write_synccargo clippy -p utoo-pm --all-targets -- -D warnings --no-depsFull workspace clippy is not repeated here; it is currently blocked in this worktree by the known
pack-core/next.jsAPI mismatch observed on #2969 validation.Benchmark Plan
Label-trigger GHA bench. Main read is p3 wall and vCtx/iCtx vs same-run
utoo-next,utoo-npm, andbun; p4 should be neutral unless extract cache probing changes show up.