Skip to content

perf(pm): prefetch lockfile clones#2976

Draft
elrrrrrrr wants to merge 5 commits into
perf/pm-resolver-demand-bfsfrom
exp/pm-install-prefetch-lock-clones-b33d922
Draft

perf(pm): prefetch lockfile clones#2976
elrrrrrrr wants to merge 5 commits into
perf/pm-resolver-demand-bfsfrom
exp/pm-install-prefetch-lock-clones-b33d922

Conversation

@elrrrrrrr
Copy link
Copy Markdown
Contributor

@elrrrrrrr elrrrrrrr commented May 18, 2026

Summary

  • prefetch registry clone work from an existing package-lock after node_modules validation finishes
  • queue lockfile clones by depth and pass parent package paths for nested installs
  • keep link/file/git packages on the existing ensure path; this experiment targets registry p3/p4 hot paths

Hypothesis

p3/p4 currently prefetch registry downloads from lockfile, but clone work still expands only when each depth reaches ensure_clone. Queueing clone specs after clean_deps lets the scheduler overlap cache lookup/download/extract/clone earlier while ensure_clone only attaches to already-known work.

Validation

  • cargo fmt
  • cargo test -p utoo-pm service::install::tests::test_parent_package_path
  • cargo test -p utoo-pm service::install::tests::test_is_prefetchable_registry_tarball
  • cargo test -p utoo-pm service::install_scheduler::tests
  • cargo clippy -p utoo-pm --all-targets -- -D warnings --no-deps

Benchmark

GHA linux/npmjs, run 26020770842:

phase utoo wall utoo ctx #2974 wall #2974 ctx utoo-next wall bun wall
p0_full_cold 7.35s ±0.12 85.5K / 48.6K 7.58s ±0.13 84.9K / 51.9K 7.88s ±0.22 9.09s ±0.24
p1_resolve 2.50s ±0.27 14.2K / 18.8K 2.44s ±0.03 16.8K / 21.4K 2.94s ±0.07 1.89s ±0.06
p3_cold_install 5.22s ±0.32 66.8K / 39.7K 6.34s ±1.90 78.1K / 44.0K 7.27s ±2.62 6.62s ±0.02
p4_warm_link 2.14s ±0.13 21.7K / 13.6K 2.21s ±0.04 18.4K / 12.1K 2.46s ±0.14 3.32s ±0.03

Conclusion: p3 wall is the best so far, but p4 ctx regresses compared with #2975 and #2974. Treat this as a p3-only idea; do not fold it until #2977 and #2975 N=2 show whether the p3 gain can be obtained without hurting p4.

@elrrrrrrr elrrrrrrr added A-Pkg Manager Area: Package Manager benchmark Run pm-bench on PR labels May 18, 2026
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces prefetching capabilities for package downloads and clones to optimize the installation process. It transitions several I/O-intensive tasks, such as extraction and cloning, from asynchronous execution to synchronous implementations managed via a worker pool, including macOS-specific optimizations using clonefile. Review feedback highlights potential thread starvation risks when using rayon for blocking I/O and suggests refactoring duplicated validation logic. Other recommendations include improving error handling during directory cleanup and adhering to standard practices by avoiding panic recovery logic.

self.ops.push(tokio::spawn(async move {
let result = extract_to_cache(
let done_tx = self.done_tx.clone();
rayon::spawn(move || {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Using rayon::spawn for blocking I/O operations (like extraction and cloning) carries a high risk of thread pool starvation or deadlocks.

Rayon's global pool is designed for CPU-bound work-stealing. When an 'outer' task spawned on Rayon blocks on I/O (e.g., fs::File::create), it occupies a thread. If that task then triggers 'inner' parallel work (like the par_chunks used in extract_tarball_sync), and all other Rayon threads are similarly blocked on I/O, the inner tasks will wait in the queue indefinitely while the threads are blocked waiting for them.

It is highly recommended to use tokio::task::spawn_blocking for these I/O-heavy tasks, as it uses a dedicated, expandable pool designed for blocking operations.

Some(&path[..index])
}

fn prefetch_lock_clones(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The validation logic for determining if a package is prefetchable (checking should_omit_package, link.is_some(), is_prefetchable_registry_tarball, CPU/OS compatibility, etc.) is duplicated between prefetch_lock_downloads and prefetch_lock_clones.

Consider refactoring this into a shared helper function or a closure to improve maintainability and ensure consistency when new filters are added.

}));
let done_tx = self.done_tx.clone();
rayon::spawn(move || {
let result = std::panic::catch_unwind(std::panic::AssertUnwindSafe(|| {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The use of std::panic::catch_unwind here to handle worker panics as recoverable errors violates the general rule: 'Do not implement recovery logic for panics. Panics should be treated as unrecoverable bugs that need to be fixed, not as transient, recoverable errors.'

References
  1. Do not implement recovery logic for panics. Panics should be treated as unrecoverable bugs that need to be fixed, not as transient, recoverable errors.

Comment on lines +129 to +130
&& let Err(e) = fs::create_dir_all(dir)
&& e.kind() != io::ErrorKind::AlreadyExists
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Ignoring io::ErrorKind::AlreadyExists from fs::create_dir_all is likely unnecessary and potentially hiding issues.

According to the Rust documentation, create_dir_all already returns Ok(()) if the directory exists. If it returns an error with AlreadyExists, it typically means a file (not a directory) exists at that path. By ignoring this, the code will proceed to try and write files into a non-directory path, which will fail later with a less clear error.

Suggested change
&& let Err(e) = fs::create_dir_all(dir)
&& e.kind() != io::ErrorKind::AlreadyExists
&& let Err(e) = fs::create_dir_all(dir)

Comment on lines +268 to +272
if dst.try_exists()?
&& let Err(e) = std::fs::remove_dir_all(dst)
{
tracing::warn!("Failed to clean target directory {}: {}", dst.display(), e);
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

If std::fs::remove_dir_all(dst) fails, the subsequent clone_dir_native_sync is almost certain to fail as well (especially on macOS where clonefile requires the destination to not exist).

Instead of just tracing a warning, this should return the error to provide immediate and clear feedback about the failure to prepare the target directory.

if dst.try_exists()? {
        std::fs::remove_dir_all(dst)
            .with_context(|| format!("Failed to clean target directory {}", dst.display()))?;
    }

@github-actions
Copy link
Copy Markdown

📊 pm-bench-phases · ddfac40 · linux (ubuntu-latest)

Workflow run — ant-design

PMs: utoo (this branch) · utoo-npm (latest published) · bun (latest)

npmjs.org

p0_full_cold

PM wall ±σ user sys RSS pgMinor
bun 9.40s 0.09s 10.56s 10.53s 734M 341.9K
utoo-next 8.41s 0.07s 10.82s 12.70s 941M 119.3K
utoo-npm 8.71s 1.09s 11.15s 12.96s 972M 127.7K
utoo 7.98s 0.11s 11.46s 12.40s 928M 144.2K
PM vCtx iCtx netRX netTX cache node_mod lock
bun 14.3K 18.3K 1.20G 6M 1.88G 1.76G 1M
utoo-next 119.6K 81.8K 1.17G 5M 1.73G 1.72G 2M
utoo-npm 128.2K 94.9K 1.17G 5M 1.73G 1.72G 2M
utoo 81.0K 46.6K 1.06G 5M 1.72G 1.72G 3M

p1_resolve

PM wall ±σ user sys RSS pgMinor
bun 1.96s 0.04s 4.19s 1.06s 495M 158.0K
utoo-next 3.10s 0.02s 5.60s 2.12s 627M 88.1K
utoo-npm 3.08s 0.02s 5.66s 2.08s 624M 79.2K
utoo 2.39s 0.04s 6.17s 1.67s 648M 121.4K
PM vCtx iCtx netRX netTX cache node_mod lock
bun 7.9K 4.8K 204M 3M 110M - 1M
utoo-next 71.0K 94.4K 201M 2M 7M 3M 2M
utoo-npm 70.4K 96.2K 201M 2M 7M 3M 2M
utoo 14.3K 20.2K 204M 3M 7M 3M 3M

p3_cold_install

PM wall ±σ user sys RSS pgMinor
bun 6.90s 0.41s 6.36s 10.19s 646M 212.9K
utoo-next 8.07s 1.63s 5.09s 11.43s 468M 60.9K
utoo-npm 6.18s 0.23s 5.11s 11.14s 471M 62.9K
utoo 5.59s 0.16s 4.91s 10.69s 503M 64.1K
PM vCtx iCtx netRX netTX cache node_mod lock
bun 4.2K 6.7K 1.00G 3M 1.77G 1.77G 1M
utoo-next 113.2K 50.1K 999M 3M 1.72G 1.72G 2M
utoo-npm 98.2K 45.6K 998M 2M 1.72G 1.72G 2M
utoo 65.7K 41.1K 998M 2M 1.72G 1.72G 2M

p4_warm_link

PM wall ±σ user sys RSS pgMinor
bun 3.29s 0.03s 0.21s 2.38s 134M 32.6K
utoo-next 2.24s 0.05s 0.52s 3.91s 81M 19.4K
utoo-npm 2.26s 0.04s 0.50s 3.95s 80M 18.7K
utoo 2.01s 0.10s 0.46s 3.63s 54M 12.1K
PM vCtx iCtx netRX netTX cache node_mod lock
bun 275 22 5M 53K 1.93G 1.76G 1M
utoo-next 42.5K 18.8K 306K 12K 1.72G 1.72G 3M
utoo-npm 43.2K 19.9K 306K 11K 1.72G 1.72G 3M
utoo 21.5K 13.6K 306K 34K 1.73G 1.72G 3M

npmmirror.com: no output captured.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

A-Pkg Manager Area: Package Manager benchmark Run pm-bench on PR

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant