Skip to content

(improvement) (cython only) cache deserializer instances in find_deserializer and m…#741

Draft
mykaul wants to merge 2 commits intoscylladb:masterfrom
mykaul:perf/cache-deserializer-lookup
Draft

(improvement) (cython only) cache deserializer instances in find_deserializer and m…#741
mykaul wants to merge 2 commits intoscylladb:masterfrom
mykaul:perf/cache-deserializer-lookup

Conversation

@mykaul
Copy link

@mykaul mykaul commented Mar 13, 2026

…ake_deserializers

Cache find_deserializer() and make_deserializers() results in Cython cdef dict caches keyed on cqltype objects to avoid repeated class lookups and Deserializer object creation on every result set.

Using cqltype objects (not id()) as cache keys holds strong references, preventing GC/id-reuse correctness issues with parameterized types.

Motivation

On every result set, make_deserializers(coltypes) is called from row_parser.pyx:37, which in turn calls find_deserializer() for each column type. These functions perform class name lookups and issubclass() chains, then create fresh Deserializer objects -- all redundant work when the same column types appear repeatedly (which is always the case for prepared statements).

Benchmark results

Benchmarks compare the original code (Before) against the new cached implementation (After).

find_deserializer (single type lookup):

Variant Min Mean Median Ops/sec
Before (original) 266.0 ns 305.0 ns 292.0 ns 3.3 Mops/s
After (with cache) 44.0 ns 49.0 ns 47.8 ns 20.4 Mops/s

make_deserializers (5 types):

Variant Min Mean Median Ops/sec
Before (original) 1,976 ns 2,438 ns 2,435 ns 410 Kops/s
After (with cache) 74.9 ns 83.5 ns 81.7 ns 12,000 Kops/s

make_deserializers (10 types):

Variant Min Mean Median Ops/sec
Before (original) 3,553 ns 3,812 ns 3,761 ns 262 Kops/s
After (with cache) 89.7 ns 105.1 ns 97.6 ns 9,511 Kops/s

Design notes

  • Caches are cdef dict (C-level, not accessible from Python) for minimal overhead
  • Cache keys are the cqltype objects themselves, not id(cqltype) -- holds strong references preventing GC and id() reuse
  • For prepared statements (the hot path), cache hit rate is effectively 100%
  • Cache is naturally bounded by the number of distinct cqltype objects in use

Tests

All existing unit tests pass (108 passed, 1 skipped).

Pre-review checklist

  • I have split my patch into logically separate commits.
  • All commit messages clearly explain what they change and why.
  • I added relevant tests for new features and bug fixes.
  • All commits compile, pass static checks and pass test.
  • PR description sums up the changes and reasons why they should be introduced.
  • I have provided docstrings for the public items that I want to introduce.
  • I have adjusted the documentation in ./docs/source/.
  • I added appropriate Fixes: annotations to PR description.

@mykaul mykaul changed the title (improvement) cache deserializer instances in find_deserializer and m… (improvement) (cython only) cache deserializer instances in find_deserializer and m… Mar 13, 2026
@mykaul mykaul marked this pull request as draft March 13, 2026 10:13
@mykaul mykaul requested a review from Copilot March 13, 2026 13:10
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR improves performance in the Cython fast-path by caching deserializer lookups and per-column deserializer arrays, avoiding repeated class-resolution work and repeated Deserializer object creation across result sets.

Changes:

  • Add module-level Cython dict caches for find_deserializer() and make_deserializers() keyed by cqltype objects / tuples of cqltype objects.
  • Update find_deserializer() / make_deserializers() to consult and populate these caches.
  • Add a pytest-based benchmark module to compare cached vs uncached behavior.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.

File Description
cassandra/deserializers.pyx Adds caching for deserializer instance resolution and for per-column deserializer arrays.
benchmarks/test_deserializer_cache_benchmark.py Introduces correctness checks + pytest-benchmark-style benchmarks for the new caching behavior.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +443 to +447
# Cache make_deserializers results keyed on the tuple of cqltype objects.
# Using the cqltype objects themselves (rather than id()) as keys ensures
# the dict holds strong references, preventing GC and id() reuse issues
# with non-singleton parameterized types.
cdef dict _make_deserializers_cache = {}
Comment on lines +471 to +474
try:
return <Deserializer>_deserializer_cache[cqltype]
except KeyError:
pass
Comment on lines +451 to +458
cdef tuple key = tuple(cqltypes)
try:
return _make_deserializers_cache[key]
except KeyError:
pass
result = obj_array([find_deserializer(ct) for ct in cqltypes])
_make_deserializers_cache[key] = result
return result
Comment on lines +21 to +27
import pytest

from cassandra import cqltypes
from cassandra.deserializers import (
find_deserializer,
make_deserializers,
)
Comment on lines +23 to +27
from cassandra import cqltypes
from cassandra.deserializers import (
find_deserializer,
make_deserializers,
)
mykaul added 2 commits March 20, 2026 19:15
…ake_deserializers

Cache find_deserializer() and make_deserializers() results in Cython cdef
dict caches keyed on cqltype objects to avoid repeated class lookups and
Deserializer object creation on every result set.

Using cqltype objects (not id()) as cache keys holds strong references,
preventing GC/id-reuse correctness issues with parameterized types.

## Motivation

On every result set, make_deserializers(coltypes) is called from
row_parser.pyx:37, which in turn calls find_deserializer() for each column
type. These functions perform class name lookups and issubclass() chains,
then create fresh Deserializer objects -- all redundant work when the same
column types appear repeatedly (which is always the case for prepared
statements).

## Benchmark results

Benchmarks compare the original code (Before) against the new cached
implementation (After).

find_deserializer (single type lookup):
| Variant | Min | Mean | Median | Ops/sec |
|---|---|---|---|---|
| Before (original) | 266.0 ns | 305.0 ns | 292.0 ns | 3.3 Mops/s |
| After (with cache) | 44.0 ns | 49.0 ns | 47.8 ns | 20.4 Mops/s |

make_deserializers (5 types):
| Variant | Min | Mean | Median | Ops/sec |
|---|---|---|---|---|
| Before (original) | 1,976 ns | 2,438 ns | 2,435 ns | 410 Kops/s |
| After (with cache) | 74.9 ns | 83.5 ns | 81.7 ns | 12,000 Kops/s |

make_deserializers (10 types):
| Variant | Min | Mean | Median | Ops/sec |
|---|---|---|---|---|
| Before (original) | 3,553 ns | 3,812 ns | 3,761 ns | 262 Kops/s |
| After (with cache) | 89.7 ns | 105.1 ns | 97.6 ns | 9,511 Kops/s |

## Design notes

- Caches are cdef dict (C-level, not accessible from Python) for minimal
  overhead
- Cache keys are the cqltype objects themselves, not id(cqltype) -- holds
  strong references preventing GC and id() reuse
- For prepared statements (the hot path), cache hit rate is effectively 100%
- Cache is naturally bounded by the number of distinct cqltype objects in use

## Tests

All existing unit tests pass (108 passed, 1 skipped).
…ports

- Add 256-entry size cap to both _deserializer_cache and
  _make_deserializers_cache to prevent unbounded growth from
  non-interned parameterized types in unprepared queries.
- Add clear_deserializer_caches() public API so that runtime Des*
  class overrides (e.g. DesBytesType = DesBytesTypeByteArray for
  cqlsh) can flush stale cached instances.
- Add get_deserializer_cache_sizes() diagnostic helper.
- Document override/cache interaction in code comments.
- Fix benchmark copyright (DataStax -> ScyllaDB), add
  pytest.importorskip guards for pytest-benchmark and Cython.
- Add 11 unit tests for cache hit/miss, clear, eviction bounds,
  and size reporting.
@mykaul mykaul force-pushed the perf/cache-deserializer-lookup branch from 5894e22 to 44cdd0d Compare March 20, 2026 17:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants