Add code size tracking workflow and scripts #8113
Adds a Code Size CI job that runs tokei over the workspace and posts a single collapsible PR comment: a one-line total in the summary, with the full per-crate line-count breakdown (and deltas against the base) on expand. Nested crates are attributed to their longest path prefix so they are not double counted.
Signed-off-by: Joe Isaacs <joe.isaacs@live.co.uk>
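The longest-path-prefix rule can be illustrated with a minimal sketch. This is not the code from scripts/crate-loc.py; the function name `attribute_to_crates`, the input shapes, and the example paths are all hypothetical, and the per-file line counts stand in for whatever tokei reports.

```python
from pathlib import PurePosixPath


def attribute_to_crates(crate_dirs, file_lines):
    """Sum line counts per crate, giving each file to its deepest enclosing crate.

    crate_dirs: crate directories relative to the repo root (hypothetical input).
    file_lines: mapping of file path -> line count (hypothetical input).
    """
    crates = {PurePosixPath(d): d for d in crate_dirs}
    totals = {d: 0 for d in crate_dirs}
    for path, lines in file_lines.items():
        ancestors = PurePosixPath(path).parents
        # Every crate directory that contains this file.
        owners = [c for c in crates if c in ancestors]
        if not owners:
            continue  # the file lives outside every known crate
        # Longest prefix == crate directory with the most path components.
        owner = max(owners, key=lambda c: len(c.parts))
        totals[crates[owner]] += lines
    return totals


# Hypothetical layout: "crates/outer/inner" is nested inside "crates/outer".
example = attribute_to_crates(
    ["crates/outer", "crates/outer/inner"],
    {
        "crates/outer/src/lib.rs": 100,
        "crates/outer/inner/src/lib.rs": 40,
        "README.md": 5,
    },
)
print(example)
```

Because each file is counted for exactly one crate, the per-crate totals add up to the total for all files that belong to some crate, which is the property the commit message describes as avoiding double counting.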
Code size: 286,481 lines of Rust across 60 crates
Total: 286,481 → 286,481 (—) |
Merging this PR will improve performance by 13.65%

| | Mode | Benchmark | BASE | HEAD | Efficiency |
|---|---|---|---|---|---|
| ⚡ | Simulation | chunked_bool_canonical_into[(1000, 10)] | 45 µs | 30.8 µs | +46.26% |
| ❌ | Simulation | chunked_varbinview_opt_canonical_into[(1000, 10)] | 187.7 µs | 225.3 µs | -16.71% |
| ⚡ | Simulation | chunked_varbinview_opt_into_canonical[(1000, 10)] | 239.5 µs | 202.1 µs | +18.49% |
| ⚡ | Simulation | new_alp_prim_test_between[f32, 16384] | 118.5 µs | 104 µs | +13.94% |
| ⚡ | Simulation | new_alp_prim_test_between[f32, 32768] | 182.2 µs | 153.3 µs | +18.83% |
| ⚡ | Simulation | new_bp_prim_test_between[i16, 32768] | 132.3 µs | 120 µs | +10.26% |
Tip
Investigate this regression by commenting @codspeedbot fix this regression on this PR, or directly use the CodSpeed MCP with your agent.
Comparing claude/quirky-mccarthy-1MC1W (152f7d3) with develop (ae30d83)
Adds a Crate Binary Size CI job that builds the datafusion-bench binary on stable and runs cargo-bloat to attribute its machine code back to each first-party Vortex crate. Posts a single collapsible PR comment: a one-line Vortex total in the summary, with the full per-crate .text breakdown on expand. Third-party crates (datafusion, arrow, tokio, std) are filtered out using the workspace member set from cargo metadata.
Signed-off-by: Joe Isaacs <joe.isaacs@live.co.uk>
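A rough sketch of the workspace-member filter follows. It relies on the `packages` and `workspace_members` fields of `cargo metadata --format-version 1` output. The per-crate row shape (`name`, `size`) is an assumption about how the cargo-bloat results are parsed, and the function name `first_party_sizes` and the sample data are hypothetical. Hyphens are normalized to underscores on the assumption that symbol-derived crate names use underscores while package names may use hyphens.

```python
def first_party_sizes(metadata, bloat_crates):
    """Keep only the size rows that belong to workspace member crates.

    metadata: parsed JSON from `cargo metadata --format-version 1`.
    bloat_crates: list of {"name": ..., "size": ...} rows (assumed shape).
    """
    member_ids = set(metadata["workspace_members"])
    member_names = {
        pkg["name"].replace("-", "_")
        for pkg in metadata["packages"]
        if pkg["id"] in member_ids
    }
    return {
        row["name"]: row["size"]
        for row in bloat_crates
        if row["name"].replace("-", "_") in member_names
    }


# Hypothetical inputs: one workspace member and two third-party crates.
sample_metadata = {
    "packages": [
        {"name": "my-crate", "id": "pkg-a"},
        {"name": "tokio", "id": "pkg-b"},
    ],
    "workspace_members": ["pkg-a"],
}
sample_rows = [
    {"name": "my_crate", "size": 1024},
    {"name": "tokio", "size": 4096},
    {"name": "std", "size": 2048},
]
print(first_party_sizes(sample_metadata, sample_rows))
```

Filtering by membership rather than by a hard-coded deny list means newly added dependencies are excluded without any change to the script.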
Builds datafusion-bench for both the PR head and develop on the same runner (reusing the target directory) and reports the per-crate .text delta against develop in the sticky PR comment. Removes the tokei lines-of-code report and its workflow, leaving binary size as the sole code-size metric.
Signed-off-by: Joe Isaacs <joe.isaacs@live.co.uk>
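The per-crate delta step might look roughly like the sketch below, assuming the two builds have already been reduced to mappings of crate name to .text bytes. The function name `crate_deltas`, the ordering by largest absolute change, and the sample numbers are illustrative choices, not taken from the PR.

```python
def crate_deltas(base, head):
    """Return (crate, base_bytes, head_bytes, delta) for every crate whose size changed.

    base/head: mapping of crate name -> .text bytes for develop and the PR head.
    A crate missing from one side is treated as 0 bytes on that side.
    """
    rows = []
    for name in sorted(set(base) | set(head)):
        before = base.get(name, 0)
        after = head.get(name, 0)
        if before != after:
            rows.append((name, before, after, after - before))
    # Show the largest absolute changes first (an illustrative choice).
    rows.sort(key=lambda row: abs(row[3]), reverse=True)
    return rows


# Hypothetical sizes in bytes.
deltas = crate_deltas(
    {"crate_a": 1000, "crate_b": 500, "crate_c": 200},
    {"crate_a": 900, "crate_b": 500, "crate_d": 50},
)
for row in deltas:
    print(row)
```

Treating a missing crate as 0 bytes lets added and removed crates appear in the same table as resized ones.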
code size change −5.8% (datafusion-bench)
Vortex total: 14.47 MiB → 13.63 MiB (−860.7 KiB)
When the PR introduces no .text delta against develop, emit just the summary line instead of the full per-crate table to keep the comment quiet.
Signed-off-by: Joe Isaacs <joe.isaacs@live.co.uk>
Collapsed line now reads "no code size change (datafusion-bench)" when unchanged, or "code size change +/-XX% (datafusion-bench)" with the per-crate details table on expand when it changed.
Signed-off-by: Joe Isaacs <joe.isaacs@live.co.uk>
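The two summary forms can be sketched as follows. The function names, the one-decimal rounding, and the use of the total size as the "unchanged" test are assumptions made for illustration; the actual script may decide this differently (for example per crate).

```python
def summary_line(base_bytes, head_bytes, binary="datafusion-bench"):
    """Build the collapsed one-line summary. Assumes base_bytes > 0."""
    if head_bytes == base_bytes:
        return f"no code size change ({binary})"
    pct = (head_bytes - base_bytes) / base_bytes * 100
    # Use a typographic minus sign, matching the comment shown in this PR.
    sign = "+" if pct > 0 else "\u2212"
    return f"code size change {sign}{abs(pct):.1f}% ({binary})"


def render_comment(base_bytes, head_bytes, table_markdown):
    """Return only the summary when unchanged, else a collapsible block."""
    line = summary_line(base_bytes, head_bytes)
    if head_bytes == base_bytes:
        return line
    return (
        f"<details>\n<summary>{line}</summary>\n\n"
        f"{table_markdown}\n</details>"
    )


print(summary_line(1000, 1000))
print(summary_line(1000, 942))
print(render_comment(1000, 942, "| Crate | Delta |\n|---|---|"))
```

As a sanity check against the comment above, 14.47 MiB → 13.63 MiB is a change of about −5.8%, which matches the reported summary line.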
@joseph-isaacs Code size should ignore auto-generated stuff like vortex-duckdb/{cpp.rs,include/vortex.h}
Summary
This PR adds automated code size tracking for pull requests. It introduces two Python scripts and a GitHub Actions workflow to measure and report Rust lines of code per crate, with delta comparisons against the base branch.
Changes
- scripts/crate-loc.py: New script that computes lines of code per workspace crate by:
  - locating crates via Cargo.toml
  - running tokei to analyze the repository
- scripts/compare-loc.py: New script that renders per-crate LOC as a collapsible markdown comment by:
  - emitting a <details> block with a summary line and expandable table
- .github/workflows/code-size.yml: New GitHub Actions workflow that:
  - installs tokei for code analysis

The workflow enables developers and reviewers to quickly see code size changes at a glance, with detailed per-crate breakdowns available by expanding the comment.
Testing
The scripts are straightforward Python utilities with no external dependencies beyond tokei (which is installed by the workflow). The workflow will be tested automatically on the first PR that uses it. Manual verification can be done by:
- running python3 scripts/crate-loc.py . locally to verify LOC computation
- running python3 scripts/compare-loc.py <json-file> to verify markdown rendering

https://claude.ai/code/session_01Df7kNNDHfHnoa9uhTdGH9c