perf(pm): batch install clone completions#2989
Conversation
There was a problem hiding this comment.
Code Review
This pull request introduces batching for clone operations in the installation scheduler to improve performance by reducing scheduler wakeups. It replaces individual clone tasks with batches of up to three operations and introduces a worker-based concurrency limit. Feedback indicates that the heuristic used to calculate the worker limit causes total concurrency to scale non-linearly and potentially exceed intended limits. It is suggested to either refine the formula to account for the batch size or document the rationale for the current implementation.
| fn clone_worker_limit(clone_limit: usize) -> usize { | ||
| clone_limit | ||
| .saturating_div(2) | ||
| .saturating_add(2) | ||
| .clamp(1, clone_limit.max(1)) | ||
| } |
There was a problem hiding this comment.
The clone_worker_limit calculation uses a heuristic (limit / 2 + 2) that doesn't explicitly account for CLONE_BATCH_LIMIT. This results in a total potential concurrency (workers * batch size) that scales non-linearly with the original clone_limit. For example, a clone_limit of 4 results in 4 workers (up to 12 concurrent clones), while a clone_limit of 16 results in 10 workers (up to 30 concurrent clones).
If the intention is to maintain a total concurrency close to the original clone_limit while batching, consider a formula like (clone_limit / CLONE_BATCH_LIMIT).max(1). If the increased concurrency is intentional to saturate the Rayon pool, documenting the rationale for this specific heuristic would improve maintainability.
📊 pm-bench-phases ·
|
| PM | wall | ±σ | user | sys | RSS | pgMinor |
|---|---|---|---|---|---|---|
| bun | 9.30s | 0.27s | 10.44s | 10.07s | 751M | 345.7K |
| utoo-next | 8.06s | 0.17s | 10.73s | 12.23s | 973M | 121.7K |
| utoo-npm | 8.32s | 0.10s | 11.00s | 12.48s | 1008M | 123.8K |
| utoo | 7.34s | 0.18s | 11.50s | 11.04s | 976M | 143.6K |
| PM | vCtx | iCtx | netRX | netTX | cache | node_mod | lock |
|---|---|---|---|---|---|---|---|
| bun | 13.9K | 17.1K | 1.19G | 6M | 1.86G | 1.75G | 1M |
| utoo-next | 115.4K | 81.4K | 1.16G | 4M | 1.71G | 1.70G | 2M |
| utoo-npm | 130.4K | 94.8K | 1.16G | 5M | 1.71G | 1.70G | 2M |
| utoo | 65.1K | 45.7K | 1.16G | 6M | 1.71G | 1.70G | 2M |
p1_resolve
| PM | wall | ±σ | user | sys | RSS | pgMinor |
|---|---|---|---|---|---|---|
| bun | 1.97s | 0.10s | 4.10s | 1.01s | 521M | 166.2K |
| utoo-next | 3.05s | 0.04s | 5.53s | 1.99s | 620M | 83.6K |
| utoo-npm | 3.04s | 0.03s | 5.60s | 2.00s | 610M | 83.5K |
| utoo | 2.39s | 0.01s | 6.14s | 1.64s | 631M | 122.5K |
| PM | vCtx | iCtx | netRX | netTX | cache | node_mod | lock |
|---|---|---|---|---|---|---|---|
| bun | 7.8K | 4.5K | 202M | 3M | 108M | - | 1M |
| utoo-next | 68.3K | 85.8K | 199M | 2M | 7M | 3M | 2M |
| utoo-npm | 68.7K | 87.2K | 199M | 2M | 7M | 3M | 2M |
| utoo | 14.9K | 19.3K | 202M | 3M | 7M | 3M | 2M |
p3_cold_install
| PM | wall | ±σ | user | sys | RSS | pgMinor |
|---|---|---|---|---|---|---|
| bun | 6.68s | 0.19s | 6.29s | 9.85s | 606M | 206.3K |
| utoo-next | 9.25s | 2.74s | 5.11s | 11.25s | 458M | 63.1K |
| utoo-npm | 6.95s | 1.69s | 5.24s | 11.13s | 512M | 63.0K |
| utoo | 5.30s | 0.25s | 4.99s | 9.62s | 564M | 60.2K |
| PM | vCtx | iCtx | netRX | netTX | cache | node_mod | lock |
|---|---|---|---|---|---|---|---|
| bun | 4.1K | 6.4K | 1018M | 3M | 1.76G | 1.76G | 1M |
| utoo-next | 110.4K | 47.3K | 988M | 3M | 1.70G | 1.70G | 3M |
| utoo-npm | 106.8K | 50.2K | 988M | 3M | 1.70G | 1.70G | 3M |
| utoo | 52.4K | 33.8K | 988M | 2M | 1.70G | 1.70G | 3M |
p4_warm_link
| PM | wall | ±σ | user | sys | RSS | pgMinor |
|---|---|---|---|---|---|---|
| bun | 3.34s | 0.09s | 0.21s | 2.39s | 134M | 32.8K |
| utoo-next | 2.37s | 0.02s | 0.56s | 3.93s | 79M | 18.2K |
| utoo-npm | 2.37s | 0.05s | 0.54s | 3.91s | 79M | 18.7K |
| utoo | 2.20s | 0.04s | 0.36s | 3.38s | 50M | 11.0K |
| PM | vCtx | iCtx | netRX | netTX | cache | node_mod | lock |
|---|---|---|---|---|---|---|---|
| bun | 278 | 21 | 5M | 22K | 1.91G | 1.73G | 1M |
| utoo-next | 44.8K | 21.1K | 3K | 25K | 1.70G | 1.70G | 2M |
| utoo-npm | 44.3K | 20.3K | 3K | 23K | 1.70G | 1.70G | 2M |
| utoo | 7.9K | 4.6K | 2K | 6K | 1.71G | 1.70G | 2M |
npmmirror.com: no output captured.
Summary
Why
p3/p4 are dominated by materialization pressure after the earlier install scheduler changes. Batching clone completions reduces scheduler wakeups, but parent packages are on the nested placement critical path. If a parent clone completes early inside a batch but its completion is reported only after other leaf clones finish, its children cannot enter the queue. The #2991 follow-up fixes that without disabling batching for ordinary leaf clones.
Validation
cargo fmtcargo test -p utoo-pm install_schedulercargo clippy -p utoo-pm --all-targets -- -D warnings --no-depsNote: full workspace
cargo clippy --all-targets -- -D warnings --no-depsis currently blocked in this worktree by pack-core/next.js submodule API mismatch, unrelated to the PM scheduler change.Benchmark
Latest #2989 reference before #2991 integration, GHA Linux npmjs run
26089858682:#2991 AB before integration, GHA Linux npmjs run
26093544293:Integrated #2989 head (
fd38cfff), GHA Linux npmjs run26094747554:Latest full green integrated run
26095851789:Conclusion: #2991 remains directionally positive for p4/warm materialize (
2.23s -> 2.07s/2.16s, ctx stable). The latest full-green run has p0/p3 noise, so p4 is the clearer signal for this change. p3 is neutral to slightly noisy, which is acceptable because this change targets clone/materialize ordering rather than cold extract/cache population.Rejected / Inconclusive Follow-ups
6.22s,56.8K/36.5K); reject for now.4.84s) but ctx regressed (59.0K/39.2K) and p4 no-op control moved heavily; inconclusive.Additional GHA rerun: p0 noise check
Label-triggered Linux npmjs rerun
26102023925:This confirms the previous p0
10.51sresult was runner noise; p0 returns to the expected fast range. p3 was noisy in the opposite direction on this rerun, so p3 should continue to be judged from multiple runs rather than this single sample.