Skip to content

perf(pm): skip missing node_modules cleanup#2980

Draft
elrrrrrrr wants to merge 7 commits into
perf/pm-resolver-demand-bfsfrom
exp/pm-clean-missing-node-modules-fast-path
Draft

perf(pm): skip missing node_modules cleanup#2980
elrrrrrrr wants to merge 7 commits into
perf/pm-resolver-demand-bfsfrom
exp/pm-clean-missing-node-modules-fast-path

Conversation

@elrrrrrrr
Copy link
Copy Markdown
Contributor

@elrrrrrrr elrrrrrrr commented May 18, 2026

Summary

  • skip unused-package glob cleanup when the target node_modules directory does not exist
  • keep stale-entry cleanup behavior and existing cleanup semantics for present directories
  • add a missing-directory noop test

Hypothesis

The phase bench prepare step removes node_modules before p0, p3, and p4. clean_deps still enters the unused-package cleanup path and builds glob iterators for node_modules/*/package.json and node_modules/@*/*/package.json. Returning early for a missing directory avoids those filesystem probes.

Expected effect is small; this is mainly a ctx/syscall cleanup experiment.

Validation

  • cargo fmt
  • CARGO_NET_OFFLINE=true cargo test -p utoo-pm service::clean::tests::test_remove_unused_packages_missing_node_modules
  • CARGO_NET_OFFLINE=true cargo clippy -p utoo-pm --all-targets -- -D warnings --no-deps

Benchmark Result

GHA run: https://github.com/utooland/utoo/actions/runs/26025635458

npmjs phase bench:

phase wall vCtx iCtx note
p0 7.75s ±0.61 87.5K 53.9K runner differs; not isolated
p1 2.49s ±0.10 16.2K 18.4K expected mostly unchanged
p3 6.07s ±1.01 73.8K 44.5K no improvement vs #2975 best
p4 1.88s ±0.05 13.5K 9.7K within #2975 natural range

Conclusion: no clear ctx/syscall win. #2975 already had p4 around 13.1-13.7K / 9.3-9.9K; this branch is 13.5K / 9.7K. Treat as rejected/hold unless a syscall trace shows cleanup glob is material.

@elrrrrrrr elrrrrrrr added A-Pkg Manager Area: Package Manager benchmark Run pm-bench on PR labels May 18, 2026
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces significant performance improvements to the package manager by transitioning several file system and extraction operations from asynchronous tasks to synchronous ones, utilizing rayon for parallel execution. It also adds a prefetching mechanism for registry tarballs and optimizes the InstallScheduler to handle cache hits more efficiently. The reviewer identified two areas for code cleanup: the redundant check for io::ErrorKind::AlreadyExists in create_dir_all and a redundant cleanup check in clone_sync that is already handled by its caller.

Comment on lines +129 to +132
&& let Err(e) = fs::create_dir_all(dir)
&& e.kind() != io::ErrorKind::AlreadyExists
{
return Err(e).with_context(|| err_msg.clone());
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The check e.kind() != io::ErrorKind::AlreadyExists is redundant here. std::fs::create_dir_all already handles existing directories and returns Ok(()) if the directory already exists. If it returns an error, it's for other reasons (such as a file existing at that path), which should likely be treated as a failure rather than being ignored.

            if created_dirs.insert(dir.clone())
                && let Err(e) = fs::create_dir_all(dir)
            {

Comment on lines +268 to +272
if dst.try_exists()?
&& let Err(e) = std::fs::remove_dir_all(dst)
{
tracing::warn!("Failed to clean target directory {}: {}", dst.display(), e);
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

This cleanup logic is redundant because the only caller, clone_package_sync, already performs the same try_exists check and remove_dir_all call (lines 484-491) before calling clone_sync. Removing this here avoids unnecessary filesystem probes on the hot path.

@github-actions
Copy link
Copy Markdown

📊 pm-bench-phases · d7223e4 · linux (ubuntu-latest)

Workflow run — ant-design

PMs: utoo (this branch) · utoo-npm (latest published) · bun (latest)

npmjs.org

p0_full_cold

PM wall ±σ user sys RSS pgMinor
bun 9.11s 0.23s 10.54s 10.27s 695M 327.2K
utoo-next 7.90s 0.37s 10.57s 12.13s 961M 116.2K
utoo-npm 8.81s 0.92s 10.75s 12.24s 982M 125.7K
utoo 7.75s 0.61s 11.13s 11.72s 956M 147.1K
PM vCtx iCtx netRX netTX cache node_mod lock
bun 17.7K 19.5K 1.20G 7M 1.88G 1.75G 1M
utoo-next 127.0K 91.1K 1.17G 5M 1.73G 1.72G 2M
utoo-npm 136.0K 96.6K 1.17G 5M 1.73G 1.72G 2M
utoo 87.5K 53.9K 1.18G 6M 1.73G 1.72G 2M

p1_resolve

PM wall ±σ user sys RSS pgMinor
bun 2.10s 0.08s 3.99s 1.18s 514M 173.7K
utoo-next 3.12s 0.07s 5.31s 2.23s 609M 87.4K
utoo-npm 3.09s 0.05s 5.32s 2.29s 618M 80.9K
utoo 2.49s 0.10s 5.99s 1.77s 660M 126.0K
PM vCtx iCtx netRX netTX cache node_mod lock
bun 9.6K 4.5K 203M 3M 108M - 1M
utoo-next 76.5K 93.0K 201M 3M 7M 3M 2M
utoo-npm 77.5K 97.1K 201M 3M 7M 3M 2M
utoo 16.2K 18.4K 204M 3M 7M 3M 3M

p3_cold_install

PM wall ±σ user sys RSS pgMinor
bun 6.57s 0.05s 6.42s 9.66s 616M 203.8K
utoo-next 5.86s 0.15s 5.10s 10.35s 470M 62.0K
utoo-npm 6.94s 1.41s 5.25s 10.58s 483M 61.9K
utoo 6.07s 1.01s 4.92s 10.11s 554M 63.5K
PM vCtx iCtx netRX netTX cache node_mod lock
bun 5.5K 7.9K 1.00G 4M 1.77G 1.77G 1M
utoo-next 96.7K 49.2K 1000M 2M 1.72G 1.72G 2M
utoo-npm 112.9K 50.5K 1000M 3M 1.72G 1.72G 2M
utoo 73.8K 44.5K 1000M 3M 1.72G 1.72G 2M

p4_warm_link

PM wall ±σ user sys RSS pgMinor
bun 3.35s 0.00s 0.20s 2.45s 135M 32.3K
utoo-next 2.34s 0.10s 0.48s 3.72s 79M 18.1K
utoo-npm 2.13s 0.02s 0.47s 3.76s 80M 18.2K
utoo 1.88s 0.05s 0.34s 3.38s 49M 11.0K
PM vCtx iCtx netRX netTX cache node_mod lock
bun 243 23 5M 34K 1.93G 1.74G 1M
utoo-next 39.2K 17.2K 7K 10K 1.72G 1.72G 2M
utoo-npm 42.8K 20.5K 5K 6K 1.72G 1.72G 2M
utoo 13.5K 9.7K 2K 6K 1.73G 1.72G 2M

npmmirror.com: no output captured.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

A-Pkg Manager Area: Package Manager benchmark Run pm-bench on PR

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant