perf(pm): skip missing node_modules cleanup#2980
Conversation
There was a problem hiding this comment.
Code Review
This pull request introduces significant performance improvements to the package manager by transitioning several file system and extraction operations from asynchronous tasks to synchronous ones, utilizing rayon for parallel execution. It also adds a prefetching mechanism for registry tarballs and optimizes the InstallScheduler to handle cache hits more efficiently. The reviewer identified two areas for code cleanup: the redundant check for io::ErrorKind::AlreadyExists in create_dir_all and a redundant cleanup check in clone_sync that is already handled by its caller.
| && let Err(e) = fs::create_dir_all(dir) | ||
| && e.kind() != io::ErrorKind::AlreadyExists | ||
| { | ||
| return Err(e).with_context(|| err_msg.clone()); |
There was a problem hiding this comment.
The check e.kind() != io::ErrorKind::AlreadyExists is redundant here. std::fs::create_dir_all already handles existing directories and returns Ok(()) if the directory already exists. If it returns an error, it's for other reasons (such as a file existing at that path), which should likely be treated as a failure rather than being ignored.
if created_dirs.insert(dir.clone())
&& let Err(e) = fs::create_dir_all(dir)
{| if dst.try_exists()? | ||
| && let Err(e) = std::fs::remove_dir_all(dst) | ||
| { | ||
| tracing::warn!("Failed to clean target directory {}: {}", dst.display(), e); | ||
| } |
There was a problem hiding this comment.
📊 pm-bench-phases ·
|
| PM | wall | ±σ | user | sys | RSS | pgMinor |
|---|---|---|---|---|---|---|
| bun | 9.11s | 0.23s | 10.54s | 10.27s | 695M | 327.2K |
| utoo-next | 7.90s | 0.37s | 10.57s | 12.13s | 961M | 116.2K |
| utoo-npm | 8.81s | 0.92s | 10.75s | 12.24s | 982M | 125.7K |
| utoo | 7.75s | 0.61s | 11.13s | 11.72s | 956M | 147.1K |
| PM | vCtx | iCtx | netRX | netTX | cache | node_mod | lock |
|---|---|---|---|---|---|---|---|
| bun | 17.7K | 19.5K | 1.20G | 7M | 1.88G | 1.75G | 1M |
| utoo-next | 127.0K | 91.1K | 1.17G | 5M | 1.73G | 1.72G | 2M |
| utoo-npm | 136.0K | 96.6K | 1.17G | 5M | 1.73G | 1.72G | 2M |
| utoo | 87.5K | 53.9K | 1.18G | 6M | 1.73G | 1.72G | 2M |
p1_resolve
| PM | wall | ±σ | user | sys | RSS | pgMinor |
|---|---|---|---|---|---|---|
| bun | 2.10s | 0.08s | 3.99s | 1.18s | 514M | 173.7K |
| utoo-next | 3.12s | 0.07s | 5.31s | 2.23s | 609M | 87.4K |
| utoo-npm | 3.09s | 0.05s | 5.32s | 2.29s | 618M | 80.9K |
| utoo | 2.49s | 0.10s | 5.99s | 1.77s | 660M | 126.0K |
| PM | vCtx | iCtx | netRX | netTX | cache | node_mod | lock |
|---|---|---|---|---|---|---|---|
| bun | 9.6K | 4.5K | 203M | 3M | 108M | - | 1M |
| utoo-next | 76.5K | 93.0K | 201M | 3M | 7M | 3M | 2M |
| utoo-npm | 77.5K | 97.1K | 201M | 3M | 7M | 3M | 2M |
| utoo | 16.2K | 18.4K | 204M | 3M | 7M | 3M | 3M |
p3_cold_install
| PM | wall | ±σ | user | sys | RSS | pgMinor |
|---|---|---|---|---|---|---|
| bun | 6.57s | 0.05s | 6.42s | 9.66s | 616M | 203.8K |
| utoo-next | 5.86s | 0.15s | 5.10s | 10.35s | 470M | 62.0K |
| utoo-npm | 6.94s | 1.41s | 5.25s | 10.58s | 483M | 61.9K |
| utoo | 6.07s | 1.01s | 4.92s | 10.11s | 554M | 63.5K |
| PM | vCtx | iCtx | netRX | netTX | cache | node_mod | lock |
|---|---|---|---|---|---|---|---|
| bun | 5.5K | 7.9K | 1.00G | 4M | 1.77G | 1.77G | 1M |
| utoo-next | 96.7K | 49.2K | 1000M | 2M | 1.72G | 1.72G | 2M |
| utoo-npm | 112.9K | 50.5K | 1000M | 3M | 1.72G | 1.72G | 2M |
| utoo | 73.8K | 44.5K | 1000M | 3M | 1.72G | 1.72G | 2M |
p4_warm_link
| PM | wall | ±σ | user | sys | RSS | pgMinor |
|---|---|---|---|---|---|---|
| bun | 3.35s | 0.00s | 0.20s | 2.45s | 135M | 32.3K |
| utoo-next | 2.34s | 0.10s | 0.48s | 3.72s | 79M | 18.1K |
| utoo-npm | 2.13s | 0.02s | 0.47s | 3.76s | 80M | 18.2K |
| utoo | 1.88s | 0.05s | 0.34s | 3.38s | 49M | 11.0K |
| PM | vCtx | iCtx | netRX | netTX | cache | node_mod | lock |
|---|---|---|---|---|---|---|---|
| bun | 243 | 23 | 5M | 34K | 1.93G | 1.74G | 1M |
| utoo-next | 39.2K | 17.2K | 7K | 10K | 1.72G | 1.72G | 2M |
| utoo-npm | 42.8K | 20.5K | 5K | 6K | 1.72G | 1.72G | 2M |
| utoo | 13.5K | 9.7K | 2K | 6K | 1.73G | 1.72G | 2M |
npmmirror.com: no output captured.
Summary
node_modulesdirectory does not existHypothesis
The phase bench prepare step removes
node_modulesbefore p0, p3, and p4.clean_depsstill enters the unused-package cleanup path and builds glob iterators fornode_modules/*/package.jsonandnode_modules/@*/*/package.json. Returning early for a missing directory avoids those filesystem probes.Expected effect is small; this is mainly a ctx/syscall cleanup experiment.
Validation
cargo fmtCARGO_NET_OFFLINE=true cargo test -p utoo-pm service::clean::tests::test_remove_unused_packages_missing_node_modulesCARGO_NET_OFFLINE=true cargo clippy -p utoo-pm --all-targets -- -D warnings --no-depsBenchmark Result
GHA run: https://github.com/utooland/utoo/actions/runs/26025635458
npmjs phase bench:
Conclusion: no clear ctx/syscall win. #2975 already had p4 around
13.1-13.7K / 9.3-9.9K; this branch is13.5K / 9.7K. Treat as rejected/hold unless a syscall trace shows cleanup glob is material.