perf(pm): prefetch lockfile clones#2976
Conversation
There was a problem hiding this comment.
Code Review
This pull request introduces prefetching capabilities for package downloads and clones to optimize the installation process. It transitions several I/O-intensive tasks, such as extraction and cloning, from asynchronous execution to synchronous implementations managed via a worker pool, including macOS-specific optimizations using clonefile. Review feedback highlights potential thread starvation risks when using rayon for blocking I/O and suggests refactoring duplicated validation logic. Other recommendations include improving error handling during directory cleanup and adhering to standard practices by avoiding panic recovery logic.
| self.ops.push(tokio::spawn(async move { | ||
| let result = extract_to_cache( | ||
| let done_tx = self.done_tx.clone(); | ||
| rayon::spawn(move || { |
There was a problem hiding this comment.
Using rayon::spawn for blocking I/O operations (like extraction and cloning) carries a high risk of thread pool starvation or deadlocks.
Rayon's global pool is designed for CPU-bound work-stealing. When an 'outer' task spawned on Rayon blocks on I/O (e.g., fs::File::create), it occupies a thread. If that task then triggers 'inner' parallel work (like the par_chunks used in extract_tarball_sync), and all other Rayon threads are similarly blocked on I/O, the inner tasks will wait in the queue indefinitely while the threads are blocked waiting for them.
It is highly recommended to use tokio::task::spawn_blocking for these I/O-heavy tasks, as it uses a dedicated, expandable pool designed for blocking operations.
| Some(&path[..index]) | ||
| } | ||
|
|
||
| fn prefetch_lock_clones( |
There was a problem hiding this comment.
The validation logic for determining if a package is prefetchable (checking should_omit_package, link.is_some(), is_prefetchable_registry_tarball, CPU/OS compatibility, etc.) is duplicated between prefetch_lock_downloads and prefetch_lock_clones.
Consider refactoring this into a shared helper function or a closure to improve maintainability and ensure consistency when new filters are added.
| })); | ||
| let done_tx = self.done_tx.clone(); | ||
| rayon::spawn(move || { | ||
| let result = std::panic::catch_unwind(std::panic::AssertUnwindSafe(|| { |
There was a problem hiding this comment.
The use of std::panic::catch_unwind here to handle worker panics as recoverable errors violates the general rule: 'Do not implement recovery logic for panics. Panics should be treated as unrecoverable bugs that need to be fixed, not as transient, recoverable errors.'
References
- Do not implement recovery logic for panics. Panics should be treated as unrecoverable bugs that need to be fixed, not as transient, recoverable errors.
| && let Err(e) = fs::create_dir_all(dir) | ||
| && e.kind() != io::ErrorKind::AlreadyExists |
There was a problem hiding this comment.
Ignoring io::ErrorKind::AlreadyExists from fs::create_dir_all is likely unnecessary and potentially hiding issues.
According to the Rust documentation, create_dir_all already returns Ok(()) if the directory exists. If it returns an error with AlreadyExists, it typically means a file (not a directory) exists at that path. By ignoring this, the code will proceed to try and write files into a non-directory path, which will fail later with a less clear error.
| && let Err(e) = fs::create_dir_all(dir) | |
| && e.kind() != io::ErrorKind::AlreadyExists | |
| && let Err(e) = fs::create_dir_all(dir) |
| if dst.try_exists()? | ||
| && let Err(e) = std::fs::remove_dir_all(dst) | ||
| { | ||
| tracing::warn!("Failed to clean target directory {}: {}", dst.display(), e); | ||
| } |
There was a problem hiding this comment.
If std::fs::remove_dir_all(dst) fails, the subsequent clone_dir_native_sync is almost certain to fail as well (especially on macOS where clonefile requires the destination to not exist).
Instead of just tracing a warning, this should return the error to provide immediate and clear feedback about the failure to prepare the target directory.
if dst.try_exists()? {
std::fs::remove_dir_all(dst)
.with_context(|| format!("Failed to clean target directory {}", dst.display()))?;
}
📊 pm-bench-phases ·
|
| PM | wall | ±σ | user | sys | RSS | pgMinor |
|---|---|---|---|---|---|---|
| bun | 9.40s | 0.09s | 10.56s | 10.53s | 734M | 341.9K |
| utoo-next | 8.41s | 0.07s | 10.82s | 12.70s | 941M | 119.3K |
| utoo-npm | 8.71s | 1.09s | 11.15s | 12.96s | 972M | 127.7K |
| utoo | 7.98s | 0.11s | 11.46s | 12.40s | 928M | 144.2K |
| PM | vCtx | iCtx | netRX | netTX | cache | node_mod | lock |
|---|---|---|---|---|---|---|---|
| bun | 14.3K | 18.3K | 1.20G | 6M | 1.88G | 1.76G | 1M |
| utoo-next | 119.6K | 81.8K | 1.17G | 5M | 1.73G | 1.72G | 2M |
| utoo-npm | 128.2K | 94.9K | 1.17G | 5M | 1.73G | 1.72G | 2M |
| utoo | 81.0K | 46.6K | 1.06G | 5M | 1.72G | 1.72G | 3M |
p1_resolve
| PM | wall | ±σ | user | sys | RSS | pgMinor |
|---|---|---|---|---|---|---|
| bun | 1.96s | 0.04s | 4.19s | 1.06s | 495M | 158.0K |
| utoo-next | 3.10s | 0.02s | 5.60s | 2.12s | 627M | 88.1K |
| utoo-npm | 3.08s | 0.02s | 5.66s | 2.08s | 624M | 79.2K |
| utoo | 2.39s | 0.04s | 6.17s | 1.67s | 648M | 121.4K |
| PM | vCtx | iCtx | netRX | netTX | cache | node_mod | lock |
|---|---|---|---|---|---|---|---|
| bun | 7.9K | 4.8K | 204M | 3M | 110M | - | 1M |
| utoo-next | 71.0K | 94.4K | 201M | 2M | 7M | 3M | 2M |
| utoo-npm | 70.4K | 96.2K | 201M | 2M | 7M | 3M | 2M |
| utoo | 14.3K | 20.2K | 204M | 3M | 7M | 3M | 3M |
p3_cold_install
| PM | wall | ±σ | user | sys | RSS | pgMinor |
|---|---|---|---|---|---|---|
| bun | 6.90s | 0.41s | 6.36s | 10.19s | 646M | 212.9K |
| utoo-next | 8.07s | 1.63s | 5.09s | 11.43s | 468M | 60.9K |
| utoo-npm | 6.18s | 0.23s | 5.11s | 11.14s | 471M | 62.9K |
| utoo | 5.59s | 0.16s | 4.91s | 10.69s | 503M | 64.1K |
| PM | vCtx | iCtx | netRX | netTX | cache | node_mod | lock |
|---|---|---|---|---|---|---|---|
| bun | 4.2K | 6.7K | 1.00G | 3M | 1.77G | 1.77G | 1M |
| utoo-next | 113.2K | 50.1K | 999M | 3M | 1.72G | 1.72G | 2M |
| utoo-npm | 98.2K | 45.6K | 998M | 2M | 1.72G | 1.72G | 2M |
| utoo | 65.7K | 41.1K | 998M | 2M | 1.72G | 1.72G | 2M |
p4_warm_link
| PM | wall | ±σ | user | sys | RSS | pgMinor |
|---|---|---|---|---|---|---|
| bun | 3.29s | 0.03s | 0.21s | 2.38s | 134M | 32.6K |
| utoo-next | 2.24s | 0.05s | 0.52s | 3.91s | 81M | 19.4K |
| utoo-npm | 2.26s | 0.04s | 0.50s | 3.95s | 80M | 18.7K |
| utoo | 2.01s | 0.10s | 0.46s | 3.63s | 54M | 12.1K |
| PM | vCtx | iCtx | netRX | netTX | cache | node_mod | lock |
|---|---|---|---|---|---|---|---|
| bun | 275 | 22 | 5M | 53K | 1.93G | 1.76G | 1M |
| utoo-next | 42.5K | 18.8K | 306K | 12K | 1.72G | 1.72G | 3M |
| utoo-npm | 43.2K | 19.9K | 306K | 11K | 1.72G | 1.72G | 3M |
| utoo | 21.5K | 13.6K | 306K | 34K | 1.73G | 1.72G | 3M |
npmmirror.com: no output captured.
Summary
Hypothesis
p3/p4 currently prefetch registry downloads from lockfile, but clone work still expands only when each depth reaches
ensure_clone. Queueing clone specs afterclean_depslets the scheduler overlap cache lookup/download/extract/clone earlier whileensure_cloneonly attaches to already-known work.Validation
Benchmark
GHA linux/npmjs, run 26020770842:
Conclusion: p3 wall is the best so far, but p4 ctx regresses compared with #2975 and #2974. Treat this as a p3-only idea; do not fold it until #2977 and #2975 N=2 show whether the p3 gain can be obtained without hurting p4.