Skip to content

perf(pm): run install clone workers on rayon#2965

Draft
elrrrrrrr wants to merge 1 commit into
perf/pm-resolver-demand-bfsfrom
exp/pm-install-link-path-b33d922
Draft

perf(pm): run install clone workers on rayon#2965
elrrrrrrr wants to merge 1 commit into
perf/pm-resolver-demand-bfsfrom
exp/pm-install-link-path-b33d922

Conversation

@elrrrrrrr
Copy link
Copy Markdown
Contributor

What

AB experiment for p3/p4 install/link ctx: move scheduler-owned clone workers from the nested tokio::spawn -> spawn_blocking path to a single rayon worker that reports completion back to the install scheduler main loop.

Why

Current p4 is mostly warm-cache clone/link work. Each package clone currently enters a tokio task and then hops again into the blocking pool for sync hardlink/copy. This tests whether removing that extra async/blocking-pool hop reduces context switches without changing scheduler ownership, clone ordering, cache lookup, or package semantics.

Notes

  • Scheduler state remains in the main loop.
  • Clone workers still only perform atomic filesystem work and send an OpDone::Clone result back.
  • Existing async clone_package remains for non-scheduler call sites.
  • This is an AB PR; fold only if p3/p4 GHA data improves.

Validation

  • cargo fmt
  • cargo test -p utoo-pm
  • cargo clippy -p utoo-pm --all-targets -- -D warnings --no-deps

Bench plan

Trigger linux/npmjs phase bench and compare mainly:

phase baseline #2948 target
p3 cold install 7.62s, ctx 142.2K/68.3K wall neutral/better, ctx lower
p4 warm link noisy 4.83s, ctx 54.6K/23.5K; recent ABs around 2.2s, ctx ~52K/~24K ctx lower without wall regression

@elrrrrrrr elrrrrrrr added A-Pkg Manager Area: Package Manager benchmark Run pm-bench on PR labels May 18, 2026
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request transitions the package cloning process from an asynchronous model using tokio::spawn to a synchronous one utilizing rayon::spawn and a worker pool. It introduces several synchronous utility functions in cloner.rs and updates the install_scheduler to manage task completion through an internal MPSC channel. Feedback includes identifying redundant cleanup and validation logic in clone_package_sync, improving error context for metadata retrieval, and simplifying directory creation and JSON parsing logic to remove unnecessary checks.

Comment on lines +484 to +491
if dst.try_exists()? {
if validate_name_version_sync(dst, name, version) {
return Ok(false);
}
if let Err(e) = std::fs::remove_dir_all(dst) {
tracing::warn!("Failed to clean target directory {}: {}", dst.display(), e);
}
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

There is redundant cleanup logic here. clone_sync (called on line 492) already contains logic to check for the existence of dst and perform remove_dir_all if it exists. It's cleaner to let the internal helper handle the destructive phase. Additionally, per repository rules for linking-like operations, only the package name should be validated, not the version.

    if dst.try_exists()? && validate_name_sync(dst, name) {
        return Ok(false);
    }
References
  1. For package linking operations (similar to 'npm link'), only the package name should be validated as required. Do not enforce a version check if the version is not used in the linking logic.

pub fn clone_dir_sync(src: &Path, dst: &Path) -> Result<()> {
let err_msg = format!("Failed to clone {} to {}", src.display(), dst.display());

if !fs::metadata(src)?.is_dir() {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The error returned here when fs::metadata fails (e.g., if the source directory does not exist) will not have the err_msg context applied, unlike other errors in this function. This makes debugging harder as the specific source/destination paths won't be included in the error message.

Suggested change
if !fs::metadata(src)?.is_dir() {
if !fs::metadata(src).with_context(|| err_msg.clone())?.is_dir() {

Comment on lines +126 to +134
let mut created_dirs = HashSet::new();
for dir in &dirs {
if created_dirs.insert(dir.clone())
&& let Err(e) = fs::create_dir_all(dir)
&& e.kind() != io::ErrorKind::AlreadyExists
{
return Err(e).with_context(|| err_msg.clone());
}
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The HashSet and the AlreadyExists check are redundant here. dirs is populated by collect_entries, which performs a recursive walk of a file system tree, ensuring each destination path is unique. Furthermore, std::fs::create_dir_all is idempotent and returns Ok if the directory already exists. If it returns an error because the path exists but is not a directory, that error should be propagated rather than ignored.

        // Phase 2: Create all directories
        for dir in &dirs {
            if let Err(e) = fs::create_dir_all(dir) {
                return Err(e).with_context(|| err_msg.clone());
            }
        }

Comment on lines +191 to +200
match serde_json::from_str(&content) {
Ok(v) => Ok(v),
Err(original_err) => match serde_json::from_str::<serde_json::Value>(&content) {
Ok(value) => serde_json::from_value(value)
.with_context(|| format!("Failed to deserialize {pkg_path:?}")),
Err(_) => {
Err(original_err).with_context(|| format!("Failed to parse JSON from {pkg_path:?}"))
}
},
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The fallback logic of parsing into serde_json::Value and then using from_value is redundant. serde_json::from_str::<T> already handles the parsing and deserialization in one step. If it fails, it's either a syntax error or a schema mismatch, both of which are captured by the original error. Simplifying this improves readability and slightly reduces overhead.

    serde_json::from_str(&content)
        .with_context(|| format!("Failed to parse JSON from {pkg_path:?}"))

@elrrrrrrr
Copy link
Copy Markdown
Contributor Author

GHA run 1 read

Run: https://github.com/utooland/utoo/actions/runs/26015669415

phase utoo wall utoo ctx same-run utoo-next same-run bun read
p3_cold_install 6.80s ±1.90 96.2K / 55.9K 8.36s, 128.2K / 54.6K 6.89s, 5.7K / 7.9K wall competitive, vCtx lower than utoo-next/npm
p4_warm_link 2.03s ±0.07 21.7K / 13.8K 2.21s, 43.7K / 20.6K 3.34s, 311 / 19 strong positive; clone worker path cuts about half the utoo-next ctx and wins wall

Conclusion: keep this candidate. This is the first p4 result that is both faster than same-run utoo-next/utoo-npm and materially lower in ctx. Next step is to combine only with other positive single-factor results after #2967 finishes.

@github-actions
Copy link
Copy Markdown

📊 pm-bench-phases · 761e4dd · linux (ubuntu-latest)

Workflow run — ant-design

PMs: utoo (this branch) · utoo-npm (latest published) · bun (latest)

npmjs.org

p0_full_cold

PM wall ±σ user sys RSS pgMinor
bun 8.93s 0.73s 10.49s 6.27s 596M 305.3K
utoo-next 8.88s 1.10s 11.01s 8.38s 1019M 124.1K
utoo-npm 10.82s 3.91s 11.36s 8.57s 1003M 112.2K
utoo 7.25s 0.27s 11.64s 8.11s 929M 145.5K
PM vCtx iCtx netRX netTX cache node_mod lock
bun 24.7K 15.7K 1.20G 7M 1.89G 1.76G 1M
utoo-next 170.5K 126.6K 1.17G 6M 1.73G 1.72G 2M
utoo-npm 175.7K 125.3K 1.17G 6M 1.73G 1.72G 2M
utoo 125.7K 79.5K 1.18G 7M 1.73G 1.72G 3M

p1_resolve

PM wall ±σ user sys RSS pgMinor
bun 2.83s 0.16s 4.41s 0.87s 484M 188.8K
utoo-next 3.26s 0.01s 5.86s 1.57s 628M 81.9K
utoo-npm 5.27s 3.19s 5.84s 1.59s 613M 79.8K
utoo 0.00s 0.00s 6.11s 1.17s 643M 122.2K
PM vCtx iCtx netRX netTX cache node_mod lock
bun 14.8K 3.6K 205M 3M 108M - 1M
utoo-next 87.5K 107.3K 202M 3M 7M 3M 2M
utoo-npm 88.3K 112.1K 202M 3M 7M 3M 2M
utoo 19.3K 19.8K 204M 3M 6M - -

p3_cold_install

PM wall ±σ user sys RSS pgMinor
bun 6.99s 2.70s 6.03s 6.31s 574M 207.7K
utoo-next 8.67s 2.91s 5.22s 7.36s 497M 59.3K
utoo-npm 8.44s 3.13s 5.31s 7.30s 503M 62.1K
utoo 9.26s 4.26s 5.14s 7.14s 464M 58.0K
PM vCtx iCtx netRX netTX cache node_mod lock
bun 10.6K 7.4K 1.00G 4M 1.77G 1.77G 1M
utoo-next 125.8K 57.2K 1000M 3M 1.72G 1.72G 2M
utoo-npm 130.5K 56.3K 1000M 3M 1.72G 1.72G 2M
utoo 123.2K 63.8K 1001M 4M 1.72G 1.72G 2M

p4_warm_link

PM wall ±σ user sys RSS pgMinor
bun 2.13s 0.04s 0.14s 1.20s 137M 33.5K
utoo-next 1.78s 0.03s 0.53s 2.40s 80M 18.8K
utoo-npm 1.84s 0.01s 0.52s 2.43s 80M 18.5K
utoo 1.73s 0.07s 0.39s 2.15s 52M 11.6K
PM vCtx iCtx netRX netTX cache node_mod lock
bun 241 15 5M 35K 1.93G 1.74G 1M
utoo-next 43.3K 18.5K 573B 2K 1.72G 1.72G 2M
utoo-npm 44.1K 18.9K 2K 971B 1.72G 1.72G 2M
utoo 22.1K 13.8K 2K 5K 1.73G 1.72G 2M

npmmirror.com: no output captured.

@elrrrrrrr
Copy link
Copy Markdown
Contributor Author

GHA run 2 read

Run: https://github.com/utooland/utoo/actions/runs/26016347609

phase utoo wall utoo ctx same-run utoo-next same-run bun read
p3_cold_install 9.26s ±4.26 123.2K / 63.8K 8.67s, 125.8K / 57.2K 6.99s, 10.6K / 7.4K mixed/noisy; not a p3 win in this run
p4_warm_link 1.73s ±0.07 22.1K / 13.8K 1.78s, 43.3K / 18.5K 2.13s, 241 / 15 stable positive; N=2 repeats lower wall and about half utoo-next/npm vCtx

Conclusion: keep as a p4 candidate. The useful boundary is still clone worker scheduling: run filesystem clone/hardlink work on rayon workers and let the scheduler receive completion events. It should not be used as evidence for p3 yet; p3 remains noisy and needs a separate lever.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

A-Pkg Manager Area: Package Manager benchmark Run pm-bench on PR

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant