Skip to content

perf(pm): run install io workers on rayon#2972

Draft
elrrrrrrr wants to merge 2 commits into
perf/pm-resolver-demand-bfsfrom
exp/pm-install-io-workers-rayon-b33d922
Draft

perf(pm): run install io workers on rayon#2972
elrrrrrrr wants to merge 2 commits into
perf/pm-resolver-demand-bfsfrom
exp/pm-install-io-workers-rayon-b33d922

Conversation

@elrrrrrrr
Copy link
Copy Markdown
Contributor

@elrrrrrrr elrrrrrrr commented May 18, 2026

Summary

  • combine the clone-worker experiment with extract workers on rayon
  • clone and extract completion are returned to the scheduler through the shared done channel
  • goal: keep scheduler as coordinator while CPU/disk-heavy install workers run off the async runtime

Validation

  • cargo fmt
  • cargo test -p utoo-pm service::install_scheduler::tests
  • cargo test -p utoo-pm util::cloner::tests
  • cargo test -p utoo-pm util::downloader::tests
  • cargo test -p utoo-pm test_extract_and_write_sync
  • cargo clippy -p utoo-pm --all-targets -- -D warnings --no-deps

Benchmark

N=1 run 26018170215: p3 positive, p4 ctx positive but wall noisy.

N=2 run 26019249657 (npmjs):

phase utoo same-run utoo-next same-run utoo-npm bun read
p3 wall 5.70s ±0.08 7.46s ±1.00 6.18s ±0.09 6.77s ±0.03 stable positive
p3 ctx 68.3K / 41.7K 119.0K / 55.3K 96.2K / 45.5K 6.3K / 7.7K strong positive
p4 wall 2.02s ±0.08 2.57s ±0.47 2.30s ±0.08 3.44s ±0.03 positive
p4 ctx 21.7K / 14.0K 42.0K / 18.5K 41.2K / 18.8K 267 / 24 positive

Conclusion: stable p3 ctx win; p4 preserves clone-worker gains. Compare against #2973 before folding, because #2973 currently has better p4 and better p3 wall.

@elrrrrrrr elrrrrrrr added A-Pkg Manager Area: Package Manager benchmark Run pm-bench on PR labels May 18, 2026
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request refactors the installation scheduler to offload I/O-heavy tasks like package extraction and cloning to a synchronous worker pool using Rayon. It introduces synchronous versions of several utility functions and adds macOS-specific clonefile support with retry logic. Feedback focuses on removing inappropriate panic recovery logic using catch_unwind, simplifying redundant directory creation in the cloner to avoid logic errors, and optimizing JSON parsing to improve efficiency and error reporting.

Comment on lines +470 to +481
let result = std::panic::catch_unwind(std::panic::AssertUnwindSafe(|| {
clone_package_from_cache_sync(
&job.spec.package.name,
&job.spec.package.version,
&job.spec.package.tarball_url,
&job.cache_path,
&job.spec.target,
)
.map_err(|e| format!("{e:#}"))
}))
.unwrap_or_else(|_| Err("install clone worker panicked".to_string()));
let _ = done_tx.send(OpDone::Clone { target, result });
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The use of std::panic::catch_unwind to handle panics in the worker thread violates the general rule against implementing recovery logic for panics. Panics should be treated as unrecoverable bugs that indicate a logic error, rather than transient errors to be caught and converted into a result.

                let result = clone_package_from_cache_sync(
                    &job.spec.package.name,
                    &job.spec.package.version,
                    &job.spec.package.tarball_url,
                    &job.cache_path,
                    &job.spec.target,
                )
                .map_err(|e| format!("{e:#}"));
                let _ = done_tx.send(OpDone::Clone { target, result });
References
  1. Do not implement recovery logic for panics. Panics should be treated as unrecoverable bugs that need to be fixed, not as transient, recoverable errors.

Comment on lines +126 to +134
let mut created_dirs = HashSet::new();
for dir in &dirs {
if created_dirs.insert(dir.clone())
&& let Err(e) = fs::create_dir_all(dir)
&& e.kind() != io::ErrorKind::AlreadyExists
{
return Err(e).with_context(|| err_msg.clone());
}
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The directory creation logic in Phase 2 is redundant and contains a potential logic error.

  1. Redundancy: std::fs::create_dir_all is idempotent and returns Ok(()) if the directory already exists. Since collect_entries visits each directory exactly once in a top-down order, the HashSet is unnecessary.
  2. Logic Error: The check e.kind() != io::ErrorKind::AlreadyExists is problematic. create_dir_all only returns an AlreadyExists error if a file exists at the path where a directory is expected. In such a case, the operation should fail because subsequent file writes into that path will inevitably fail. By skipping the error, the code allows the process to continue in an invalid state.
        for dir in &dirs {
            fs::create_dir_all(dir).with_context(|| err_msg.clone())?;
        }

Comment on lines +191 to +200
match serde_json::from_str(&content) {
Ok(v) => Ok(v),
Err(original_err) => match serde_json::from_str::<serde_json::Value>(&content) {
Ok(value) => serde_json::from_value(value)
.with_context(|| format!("Failed to deserialize {pkg_path:?}")),
Err(_) => {
Err(original_err).with_context(|| format!("Failed to parse JSON from {pkg_path:?}"))
}
},
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The double-parsing logic in load_package_json_sync is inefficient and potentially degrades error reporting. serde_json::from_str::<T> is already capable of parsing the JSON and validating it against the target type T. If the JSON is valid but doesn't match the schema, it provides a descriptive error including line and column information. Falling back to serde_json::Value and then using from_value performs the same validation again but loses the original source location information in the resulting error.

    serde_json::from_str(&content).with_context(|| format!("Failed to parse JSON from {pkg_path:?}"))

@elrrrrrrr
Copy link
Copy Markdown
Contributor Author

Benchmark readout for run 26018170215 (npmjs, label-triggered):

phase utoo same-run utoo-next same-run utoo-npm bun read
p3 wall 6.47s ±0.30 7.67s ±0.93 8.04s ±2.20 7.45s ±0.04 positive
p3 ctx 93.7K / 33.8K 119.3K / 43.6K 134.7K / 42.2K 7.7K / 5.0K positive vs utoo baselines
p4 wall 3.36s ±0.53 2.88s ±0.04 2.84s ±0.07 3.88s ±0.09 mixed / noisy
p4 ctx 22.4K / 13.3K 44.2K / 18.3K 44.6K / 18.1K 216 / 51 preserves #2965-level ctx

Conclusion: clone+extract-on-rayon is not ready to fold from this single run. p3 is clearly better, but p4 wall regressed in this run even though ctx stayed at the good clone-worker level. I would re-run N=2 before deciding whether the extract worker should join the clone-worker change; otherwise keep only the clone-worker candidate.

@elrrrrrrr elrrrrrrr added benchmark Run pm-bench on PR and removed benchmark Run pm-bench on PR labels May 18, 2026
@elrrrrrrr
Copy link
Copy Markdown
Contributor Author

N=2 readout for run 26019249657 (npmjs, label-triggered):

phase utoo same-run utoo-next same-run utoo-npm bun read
p3 wall 5.70s ±0.08 7.46s ±1.00 6.18s ±0.09 6.77s ±0.03 stable positive
p3 ctx 68.3K / 41.7K 119.0K / 55.3K 96.2K / 45.5K 6.3K / 7.7K strong positive
p4 wall 2.02s ±0.08 2.57s ±0.47 2.30s ±0.08 3.44s ±0.03 positive in N=2
p4 ctx 21.7K / 14.0K 42.0K / 18.5K 41.2K / 18.8K 267 / 24 preserves clone-worker gains

Updated conclusion: clone+extract-on-rayon is stable for p3 and no longer shows the p4 wall regression from N=1. Its main value is p3 ctx. #2973 still has the better p4 and better first-run p3 wall, so I would wait for #2973 N=2 before deciding whether to fold this directly or test a combined branch (#2973 + extract worker).

@github-actions
Copy link
Copy Markdown

📊 pm-bench-phases · 1aab506 · linux (ubuntu-latest)

Workflow run — ant-design

PMs: utoo (this branch) · utoo-npm (latest published) · bun (latest)

npmjs.org

p0_full_cold

PM wall ±σ user sys RSS pgMinor
bun 9.25s 0.20s 10.55s 10.09s 769M 338.7K
utoo-next 7.93s 0.47s 10.57s 12.08s 982M 127.6K
utoo-npm 8.07s 0.21s 10.69s 12.01s 949M 127.4K
utoo 8.12s 1.52s 11.13s 11.56s 934M 134.7K
PM vCtx iCtx netRX netTX cache node_mod lock
bun 15.2K 18.4K 1.20G 6M 1.86G 1.74G 1M
utoo-next 123.7K 84.9K 1.17G 5M 1.73G 1.72G 2M
utoo-npm 124.1K 86.4K 1.17G 5M 1.73G 1.72G 2M
utoo 96.8K 53.5K 1.18G 7M 1.73G 1.72G 3M

p1_resolve

PM wall ±σ user sys RSS pgMinor
bun 1.91s 0.04s 4.07s 1.10s 516M 160.9K
utoo-next 3.02s 0.06s 5.34s 2.22s 625M 91.9K
utoo-npm 3.07s 0.15s 5.38s 2.15s 604M 77.6K
utoo 2.34s 0.01s 6.01s 1.64s 658M 123.0K
PM vCtx iCtx netRX netTX cache node_mod lock
bun 8.1K 4.9K 203M 3M 108M - 1M
utoo-next 73.4K 89.0K 201M 2M 7M 3M 2M
utoo-npm 73.6K 90.6K 201M 2M 7M 3M 2M
utoo 14.1K 19.5K 204M 3M 7M 3M 2M

p3_cold_install

PM wall ±σ user sys RSS pgMinor
bun 6.80s 0.24s 6.31s 9.74s 650M 211.7K
utoo-next 7.00s 1.46s 5.20s 10.70s 519M 62.6K
utoo-npm 6.98s 1.82s 5.25s 10.71s 525M 66.3K
utoo 5.48s 0.40s 4.92s 10.05s 501M 63.3K
PM vCtx iCtx netRX netTX cache node_mod lock
bun 4.0K 7.2K 1.00G 3M 1.77G 1.77G 1M
utoo-next 113.6K 49.1K 999M 3M 1.72G 1.72G 2M
utoo-npm 120.2K 50.5K 999M 3M 1.72G 1.72G 2M
utoo 73.6K 43.2K 998M 2M 1.72G 1.72G 2M

p4_warm_link

PM wall ±σ user sys RSS pgMinor
bun 3.51s 0.00s 0.17s 2.50s 135M 31.7K
utoo-next 2.23s 0.05s 0.50s 3.76s 79M 18.7K
utoo-npm 2.17s 0.03s 0.48s 3.78s 81M 18.5K
utoo 1.86s 0.04s 0.41s 3.47s 53M 11.2K
PM vCtx iCtx netRX netTX cache node_mod lock
bun 194 21 109K 29K 1.88G 1.77G 1M
utoo-next 38.0K 17.4K 405K 28K 1.72G 1.72G 2M
utoo-npm 40.2K 17.3K 404K 14K 1.72G 1.72G 2M
utoo 21.6K 13.7K 400K 24K 1.73G 1.72G 2M

npmmirror.com: no output captured.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

A-Pkg Manager Area: Package Manager benchmark Run pm-bench on PR

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant