perf: use stdlib bisect and attrgetter in tablets.py #757

Open
mykaul wants to merge 1 commit into scylladb:master from mykaul:perf/tablets-stdlib-bisect

mykaul commented Mar 20, 2026

Summary

  • Use stdlib bisect.bisect_left on Python >= 3.10 (C implementation) instead of the bundled pure-Python copy. The fallback is retained for Python < 3.10 where key= is not supported.
  • Replace per-call lambda closures with module-level operator.attrgetter for first_token / last_token extraction, avoiding repeated function-object allocation.
  • Fix a bug in the pure-Python bisect_left fallback: the key=None branch executed a bare return (returning None) instead of return lo. The bug is currently latent because all callers pass key=, but it would produce silently incorrect results if bisect_left were ever called without key.
  • Add unit tests for the fallback bisect_left (7 tests) and get_tablet_for_key (3 tests).

Benchmark Results

Measured on Intel i7-1270P, Python 3.14.3, pytest-benchmark.

get_tablet_for_key (hit — the primary hot path)

| Tablets | Before (ns) | After (ns) | Speedup |
|---------|-------------|------------|---------|
| 10      | 517         | 365        | 1.42x   |
| 100     | 616         | 351        | 1.75x   |
| 1,000   | 1,008       | 529        | 1.91x   |
| 10,000  | 1,339       | 610        | 2.20x   |

bisect_left with key= (isolated)

| Size   | Before (ns) | After (ns) | Speedup |
|--------|-------------|------------|---------|
| 10     | 278         | 216        | 1.29x   |
| 100    | 450         | 289        | 1.56x   |
| 1,000  | 697         | 406        | 1.72x   |
| 10,000 | 1,043       | 496        | 2.10x   |

bisect_left without key= (plain ints — bug fix path)

| Size   | Before (ns) | After (ns) | Speedup |
|--------|-------------|------------|---------|
| 10     | 162         | 56         | 2.89x   |
| 100    | 268         | 73         | 3.67x   |
| 1,000  | 495         | 96         | 5.16x   |
| 10,000 | 717         | 117        | 6.13x   |
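
The figures above were collected with pytest-benchmark. For a quick sanity check without that dependency, a rough comparison can be made with `timeit`, as sketched below. The helper names `py_bisect_left` and `ns_per_call` are hypothetical, the pure-Python function is a simplified stand-in for the bundled copy, and absolute numbers will vary by machine and interpreter.

```python
import timeit
from bisect import bisect_left as c_bisect_left


def py_bisect_left(a, x):
    """Simplified pure-Python bisect_left (no key=), for comparison only."""
    lo, hi = 0, len(a)
    while lo < hi:
        mid = (lo + hi) // 2
        if a[mid] < x:
            lo = mid + 1
        else:
            hi = mid
    return lo


def ns_per_call(func, data, needle, number=20_000):
    """Average wall-clock time of func(data, needle) in nanoseconds."""
    total = timeit.timeit(lambda: func(data, needle), number=number)
    return total / number * 1e9


if __name__ == "__main__":
    for size in (10, 100, 1_000, 10_000):
        data = list(range(size))
        needle = size // 2
        before = ns_per_call(py_bisect_left, data, needle)
        after = ns_per_call(c_bisect_left, data, needle)
        print(f"{size:>6}: {before:8.0f} ns -> {after:8.0f} ns")
```

The lambda wrapper adds a constant overhead to both measurements, so the ratio reported by this sketch will understate the speedup relative to a harness that calls the functions directly.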

The full TokenAwarePolicy.make_query_plan with tablet lookup shows minimal change because it is dominated by other costs (policy iteration, host filtering), but the get_tablet_for_key component is substantially faster.

- Use bisect.bisect_left from stdlib on Python >= 3.10 (C implementation)
  instead of the bundled pure-Python copy; 1.3-2.1x faster for key= lookups,
  up to 6x faster for plain int searches
- Replace per-call lambda closures with module-level operator.attrgetter
  for first_token/last_token extraction
- Fix bug in the pure-Python bisect_left fallback (Python < 3.10): the
  key=None branch executed bare 'return' (returning None) instead of
  'return lo'
- Move bisect_left definition before the classes that use it
- Add unit tests for the fallback bisect_left and get_tablet_for_key

Benchmark results (get_tablet_for_key hit):
  10 tablets:    517 ns -> 365 ns (1.42x)
  100 tablets:   616 ns -> 351 ns (1.75x)
  1000 tablets:  1008 ns -> 529 ns (1.91x)
  10000 tablets: 1339 ns -> 610 ns (2.20x)
Lorak-mmk self-requested a review March 20, 2026 19:09