
[Autoloop: tsb-perf-evolve] #272

Merged
mrjf merged 2 commits into main from autoloop/tsb-perf-evolve
May 5, 2026

@github-actions (bot) commented May 5, 2026

tsb-perf-evolve — Performance Optimization: sortValues

🤖 This PR is maintained by Autoloop. Each accepted iteration adds a commit to this branch.

Program Goal: Minimize tsb_mean_ms / pandas_mean_ms ratio for Series.sortValues on n=100k numeric data. Lower is better; < 1.0 means tsb beats pandas.

Current Best Metric: 21.048 (tsb=112.50ms / pandas=5.34ms)

Issue: #189

Iteration 35 (latest)

Three micro-optimizations on the 8-pass LSD radix sort:

  1. Merged histogram: All 8 histogram passes now accumulate inline during the partition/init loop — eliminates a separate O(n) scan over ~300k Uint32 values per call.
  2. Stride counter: si += 3 replaces i * 3 in scatter and gather loops — avoids 800k multiplies per call.
  3. RangeIndex fast path: When the index is a default 0-based RangeIndex, bypass 100k bounds-checked at() calls from index.take(perm) by constructing the output Index directly.
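
As a rough illustration of items 1 and 2, here is a minimal sketch of an 8-pass LSD radix argsort over float64 keys. It is not the code in this PR. It assumes an AoS `Uint32Array` with stride 3 (`[keyLo, keyHi, originalIndex]`, which would account for the ~300k Uint32 values at n=100k), a little-endian platform, and input without NaN. The name `radixArgsort` and the exact layout are hypothetical.

```typescript
// Hypothetical sketch, not the tsb implementation.
// Returns the permutation that sorts `values` ascending (stable).
function radixArgsort(values: Float64Array): Uint32Array {
  const n = values.length;
  // View the float64 bits as pairs of uint32 (assumes little-endian: lo, hi).
  const raw = new Uint32Array(values.buffer, values.byteOffset, n * 2);
  let src = new Uint32Array(n * 3); // AoS: [keyLo, keyHi, originalIndex]
  let dst = new Uint32Array(n * 3);
  const hist = new Uint32Array(8 * 256); // one 256-bin histogram per pass

  // Init loop: build order-preserving keys AND accumulate all 8 histograms,
  // so no separate scan over the AoS buffer is needed afterwards.
  // `si` and `ri` are running stride counters (no i * 3 or i * 2).
  for (let i = 0, si = 0, ri = 0; i < n; i++, si += 3, ri += 2) {
    let lo = raw[ri];
    let hi = raw[ri + 1];
    if (hi >>> 31) {
      // Negative float: flip every bit so larger magnitude sorts first.
      lo = ~lo >>> 0;
      hi = ~hi >>> 0;
    } else {
      // Non-negative float: set the sign bit so it sorts above negatives.
      hi = (hi | 0x80000000) >>> 0;
    }
    src[si] = lo;
    src[si + 1] = hi;
    src[si + 2] = i;
    hist[lo & 255]++;
    hist[256 + ((lo >>> 8) & 255)]++;
    hist[512 + ((lo >>> 16) & 255)]++;
    hist[768 + (lo >>> 24)]++;
    hist[1024 + (hi & 255)]++;
    hist[1280 + ((hi >>> 8) & 255)]++;
    hist[1536 + ((hi >>> 16) & 255)]++;
    hist[1792 + (hi >>> 24)]++;
  }

  for (let pass = 0; pass < 8; pass++) {
    const word = pass < 4 ? 0 : 1; // which key word holds this digit
    const shift = (pass & 3) * 8;
    const base = pass * 256;

    // Exclusive prefix sum, stored pre-multiplied by the stride so the
    // scatter loop needs no multiply for the destination offset either.
    let sum = 0;
    for (let b = 0; b < 256; b++) {
      const count = hist[base + b];
      hist[base + b] = sum;
      sum += count * 3;
    }

    // Scatter: stride counter `si` walks the source buffer.
    for (let i = 0, si = 0; i < n; i++, si += 3) {
      const digit = (src[si + word] >>> shift) & 255;
      const di = hist[base + digit];
      hist[base + digit] = di + 3;
      dst[di] = src[si];
      dst[di + 1] = src[si + 1];
      dst[di + 2] = src[si + 2];
    }
    const tmp = src;
    src = dst;
    dst = tmp;
  }

  // Gather: pull the original indices out of the sorted AoS buffer.
  const perm = new Uint32Array(n);
  for (let i = 0, si = 2; i < n; i++, si += 3) {
    perm[i] = src[si];
  }
  return perm;
}
```

Calling `radixArgsort(new Float64Array([3.5, -1, 0, 2, -7.25, 2]))` yields the permutation `[4, 1, 2, 3, 5, 0]`. Under this sketch, the labels of a default 0-based RangeIndex reordered by `perm` are the values of `perm` itself, which is presumably what lets the fast path in item 3 skip per-element lookups.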

Branch: autoloop/tsb-perf-evolve

Warning

Firewall blocked 1 domain

The following domain was blocked by the firewall during workflow execution:

  • releaseassets.githubusercontent.com

To allow these domains, add them to the network.allowed list in your workflow frontmatter:

network:
  allowed:
    - defaults
    - "releaseassets.githubusercontent.com"

See Network Configuration for more information.

Generated by Autoloop · ● 5M

…Index fast path

- Merge all 8 histogram passes into the partition/init loop, eliminating
  one O(n) scan over the AoS buffer (~300k Uint32 reads saved per call)
- Use accumulated stride counter (si += 3) in scatter and gather loops
  instead of i*3 multiply (800k multiplies saved per call)
- RangeIndex fast path: bypass 100k bounds-checked at() calls from
  index.take(perm) when the index is a default 0-based RangeIndex

Run: https://github.com/githubnext/tsessebe/actions/runs/25363647445

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@mrjf marked this pull request as ready for review May 5, 2026 13:55
@mrjf merged commit eb74a1f into main May 5, 2026
9 checks passed