@mykaul mykaul commented Jan 29, 2026

Add __slots__ definitions to multiple classes in the project.
It started with one set of classes (frames); I then added more, each in a separate commit. In theory the commits are independent, but they were tested only stacked one on top of the other.
Example from the frame commit:

- Optimizes memory usage for protocol frame objects
- _Frame is created for every protocol message, making this a high-impact optimization
- Reduces object overhead from ~300+ bytes to ~40-60 bytes per frame
- All existing functionality preserved, comprehensive tests pass

This replaces #647, which had a performance regression. This one seems fine.

Pre-review checklist

  • I have split my patch into logically separate commits.
  • All commit messages clearly explain what they change and why.
  • I added relevant tests for new features and bug fixes.
  • All commits compile, pass static checks and pass tests.
  • PR description sums up the changes and reasons why they should be introduced.
  • I have provided docstrings for the public items that I want to introduce.
  • I have adjusted the documentation in ./docs/source/.
  • I added appropriate Fixes: annotations to PR description.

- Add __slots__ definition to _Frame class in connection.py
- Optimizes memory usage for protocol frame objects
- _Frame is created for every protocol message, making this a high-impact optimization
- Reduces object overhead from ~300+ bytes to ~40-60 bytes per frame
- All existing functionality preserved, comprehensive tests pass
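
To make the per-frame saving concrete, here is a minimal sketch that compares a plain class with a slotted one. The class names and field names are hypothetical stand-ins, not the driver's actual _Frame definition, and the exact byte counts depend on the CPython version and platform.

```python
import sys


class PlainFrame:
    """Frame-like class without __slots__ (hypothetical field names)."""

    def __init__(self, version, flags, stream, opcode, body_offset, end_pos):
        self.version = version
        self.flags = flags
        self.stream = stream
        self.opcode = opcode
        self.body_offset = body_offset
        self.end_pos = end_pos


class SlottedFrame:
    """Same fields, but stored in fixed slots instead of a per-instance dict."""

    __slots__ = ("version", "flags", "stream", "opcode", "body_offset", "end_pos")

    def __init__(self, version, flags, stream, opcode, body_offset, end_pos):
        self.version = version
        self.flags = flags
        self.stream = stream
        self.opcode = opcode
        self.body_offset = body_offset
        self.end_pos = end_pos


def instance_size(obj):
    """Shallow size of the object plus its per-instance __dict__, if any."""
    size = sys.getsizeof(obj)
    if hasattr(obj, "__dict__"):
        size += sys.getsizeof(obj.__dict__)
    return size


plain = PlainFrame(4, 0, 1, 8, 9, 64)
slotted = SlottedFrame(4, 0, 1, 8, 9, 64)
print("plain:  ", instance_size(plain), "bytes")
print("slotted:", instance_size(slotted), "bytes")
```

Attribute reads and writes behave the same on both classes; only the storage layout differs.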

Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>
Add __slots__ to _MessageType base class and all protocol message classes
to reduce memory overhead. Each message instance saves approximately 280-300
bytes by eliminating the per-instance __dict__.

Changes:
- _MessageType: Added __slots__ with 'custom_payload' and 'tracing'
- _MessageType: Added __init__ to initialize attributes with proper defaults
- _DecodableMessageType: Removed duplicate 'custom_payload' from __slots__
- All message subclasses: Added super().__init__() calls for proper initialization

Key attributes must be in __slots__ because:
- custom_payload: Accessed by encode_message() for ALL message types (line 1127)
- tracing: Set on message instances in cluster.py (line 2972)

Without these in __slots__, attempting to set them raises AttributeError,
causing connection failures with: 'OptionsMessage' object has no attribute
'custom_payload'
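
The failure mode described above can be reproduced in isolation. The sketch below uses hypothetical class names rather than the driver's real message classes, and the defaults (None and False) are an assumption about what "proper defaults" means here.

```python
class BrokenBase:
    # Hypothetical slotted base that forgot the shared attributes.
    __slots__ = ()


class BrokenOptions(BrokenBase):
    __slots__ = ()


class MessageBase:
    # Stand-in for a slotted base declaring the attributes every message needs.
    __slots__ = ("custom_payload", "tracing")

    def __init__(self):
        # Assumed defaults, for illustration only.
        self.custom_payload = None
        self.tracing = False


class QueryLike(MessageBase):
    # List only what is new here; repeating 'custom_payload' would
    # allocate a second, redundant slot in the subclass.
    __slots__ = ("query",)

    def __init__(self, query):
        super().__init__()  # initializes custom_payload and tracing
        self.query = query


def read_payload(msg):
    """Mimics an encoder that reads custom_payload on every message."""
    return msg.custom_payload


try:
    read_payload(BrokenOptions())
except AttributeError as exc:
    print("broken:", exc)

msg = QueryLike("SELECT ...")
msg.tracing = True  # works because 'tracing' is a slot on the base
print("fixed:", read_payload(msg), msg.tracing)
```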

Message classes covered:
- Outgoing (_MessageType): StartupMessage, OptionsMessage, QueryMessage,
  ExecuteMessage, PrepareMessage, BatchMessage, RegisterMessage, etc.
- Incoming (_DecodableMessageType): ResultMessage, EventMessage,
  AuthenticateMessage, SupportedMessage, etc.

Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>
Implemented __slots__ for ColumnMetadata and IndexMetadata classes
in cassandra/metadata.py to reduce per-instance memory overhead.

Changes:
- ColumnMetadata: Added __slots__ with 6 attributes (table, name,
  cql_type, is_static, is_reversed, _cass_type)
- IndexMetadata: Added __slots__ with 5 attributes (keyspace_name,
  table_name, name, kind, index_options)
- Removed class-level attribute definitions that conflicted with
  __slots__ declarations
- All attribute initialization moved to __init__ methods
- Preserved attribute documentation in class docstrings
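
The conflict between class-level attributes and __slots__ is enforced by Python itself at class creation time. A minimal sketch, using a hypothetical ColumnLike class with only two of the attributes:

```python
def define_conflicting_class():
    """Defining a class-level default with the same name as a slot fails."""

    class ColumnLike:
        __slots__ = ("name", "is_static")
        is_static = False  # same name as a slot: ValueError at class creation

    return ColumnLike


try:
    define_conflicting_class()
except ValueError as exc:
    print("conflict:", exc)


class ColumnLike:
    """
    Hypothetical slotted metadata class.

    Attributes are documented here because slots cannot carry
    per-attribute class-level docstrings or defaults:
      name      -- column name
      is_static -- whether the column is static (default False)
    """

    __slots__ = ("name", "is_static")

    def __init__(self, name, is_static=False):
        # Defaults that used to live on the class now live in __init__.
        self.name = name
        self.is_static = is_static


col = ColumnLike("id")
print(col.name, col.is_static)
```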

Memory Impact:
Each instance saves approximately 280-300 bytes by eliminating the
per-instance __dict__. These metadata objects are created frequently
when parsing schema information, making this optimization valuable
for clusters with many tables and indexes.

Testing:
- All 648 unit tests pass successfully
- Metadata-specific tests: 53 passed, 2 skipped
- No functionality changes or regressions

Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>
Implemented __slots__ for tablet-related classes in cassandra/tablets.py
to reduce per-instance memory overhead.

Changes:
- Tablet class: Added __slots__ with 3 attributes (first_token, last_token, replicas)
- Tablets class: Added __slots__ with 2 attributes (_tablets, _lock)
- Removed class-level attribute definitions that conflicted with __slots__
- Preserved and enhanced attribute documentation in class docstrings

Memory Impact:
Each Tablet instance saves approximately 280-300 bytes by eliminating the
per-instance __dict__. Tablet objects are created for each token range in
ScyllaDB tablets-enabled clusters. In a cluster with many tablets, this
optimization significantly reduces the driver's memory footprint.

The Tablets class is typically a singleton per cluster, so the memory
savings are minimal there, but the change maintains consistency with the
optimization pattern and prevents future __dict__ usage.

Testing:
- All 7 tablet-specific unit tests pass
- Verified __dict__ is not created (memory optimization confirmed)
- No functionality changes or regressions
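
One way to check the aggregate effect for many tablets is to measure allocations with tracemalloc. This is a sketch with hypothetical PlainTablet and SlottedTablet classes, not the test used in this commit; the absolute numbers vary by CPython version.

```python
import tracemalloc


class PlainTablet:
    def __init__(self, first_token, last_token, replicas):
        self.first_token = first_token
        self.last_token = last_token
        self.replicas = replicas


class SlottedTablet:
    __slots__ = ("first_token", "last_token", "replicas")

    def __init__(self, first_token, last_token, replicas):
        self.first_token = first_token
        self.last_token = last_token
        self.replicas = replicas


def allocated_bytes(cls, count=10_000):
    """Bytes allocated while building `count` instances of cls."""
    tracemalloc.start()
    before, _ = tracemalloc.get_traced_memory()
    items = [cls(i, i + 1, None) for i in range(count)]  # kept alive until measured
    after, _ = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    del items
    return after - before


plain_bytes = allocated_bytes(PlainTablet)
slotted_bytes = allocated_bytes(SlottedTablet)
print("plain:  ", plain_bytes, "bytes")
print("slotted:", slotted_bytes, "bytes")
print("no __dict__ on slotted:", not hasattr(SlottedTablet(0, 1, None), "__dict__"))
```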

Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>
These attributes are defined in ResponseFuture but never used by it:
- default_timeout: Only used in Session class
- _profile_manager: Only used in Session class
- _warned_timeout: Never referenced anywhere

Remove these attributes before adding __slots__ so that the slot
definitions stay clean.

Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>
Implements __slots__ for Statement, SimpleStatement, BoundStatement, and
BatchStatement classes to reduce memory overhead by eliminating per-instance
__dict__.

Measured memory savings (empirically verified):
- Statement: 240 bytes saved (344 → 104 bytes) = 69.8% reduction
- SimpleStatement: 232 bytes saved (344 → 112 bytes) = 67.4% reduction
- BoundStatement: inherits full parent optimization
- BatchStatement: inherits full parent optimization

Classes modified:
- Statement: Added __slots__ with 9 attributes (retry_policy,
  consistency_level, fetch_size, keyspace, table, custom_payload,
  is_idempotent, _serial_consistency_level, _routing_key)
- SimpleStatement: Added __slots__ with 1 additional attribute
  (_query_string)
- BoundStatement: Added __slots__ with 3 additional attributes
  (prepared_statement, values, raw_values)
- BatchStatement: Added __slots__ with 4 additional attributes
  (batch_type, _statements_and_parameters, _session, _is_lwt)
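
The 8-byte gap between the measured 104 and 112 bytes is consistent with one extra pointer-sized slot on 64-bit CPython. The sketch below shows how a subclass that declares only its additional attributes grows by exactly one pointer per new slot. The class names are hypothetical and the base carries only three of the nine attributes for brevity.

```python
import struct
import sys

POINTER_SIZE = struct.calcsize("P")  # 8 on typical 64-bit builds


class StatementLike:
    __slots__ = ("retry_policy", "consistency_level", "fetch_size")

    def __init__(self):
        self.retry_policy = None
        self.consistency_level = None
        self.fetch_size = None


class SimpleLike(StatementLike):
    # Only the attribute that is new in the subclass is listed.
    __slots__ = ("_query_string",)

    def __init__(self, query_string):
        super().__init__()
        self._query_string = query_string


base = StatementLike()
child = SimpleLike("SELECT ...")
extra = sys.getsizeof(child) - sys.getsizeof(base)
print("base:", sys.getsizeof(base), "child:", sys.getsizeof(child), "extra:", extra)
```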

All 562 unit tests pass. No behavior changes.

Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>
This commit introduces __slots__ to the Host and HostConnection classes
in cassandra/pool.py to reduce memory footprint of frequently instantiated
objects in connection pools and cluster metadata.

Changes:
- Added __slots__ tuple to Host class with 18 actively used attributes
- Added __slots__ tuple to HostConnection class with 17 actively used attributes
- Consolidated class documentation from individual attribute docstrings
  to comprehensive class-level docstrings
- Added explicit attribute initialization in __init__ methods to set
  default values for attributes that previously relied on class variables
- Removed unused attributes (listen_address, listen_port) from Host class
  after comprehensive usage analysis showed they were never used
- Fixed trailing whitespace issues in modified code

The __slots__ implementation prevents dynamic attribute creation and
reduces per-instance memory overhead, which is significant for drivers
managing large numbers of hosts and connections. All existing functionality
is preserved and tests continue to pass.

Memory impact: Each Host instance no longer carries the 2 unused attributes
(listen_address, listen_port), and both classes benefit from the compact
__slots__ memory layout.

Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>
Add __slots__ to frequently instantiated token-related classes to reduce
memory overhead by preventing dynamic attribute creation:

* Token base class with __slots__ = ('value',)
* All Token subclasses (HashToken, Murmur3Token, MD5Token, BytesToken)
  with __slots__ = ()
* TokenMap class with comprehensive attribute tuple and consolidated
  docstring documentation

These classes are instantiated frequently during cluster metadata parsing
and query planning. The __slots__ optimization prevents Python from
creating __dict__ for each instance, significantly reducing memory usage
especially in large clusters with many tokens.
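
The empty `__slots__ = ()` on the subclasses matters: a subclass that omits it gets a per-instance __dict__ back even though its base is slotted. A minimal sketch with hypothetical class names:

```python
class TokenLike:
    __slots__ = ("value",)

    def __init__(self, value):
        self.value = value


class SlottedSub(TokenLike):
    # Adds no attributes, but keeps instances dict-free.
    __slots__ = ()


class UnslottedSub(TokenLike):
    # No __slots__ here, so instances regain a __dict__.
    pass


print(hasattr(SlottedSub(1), "__dict__"))    # False
print(hasattr(UnslottedSub(1), "__dict__"))  # True

loose = UnslottedSub(1)
loose.extra = "allowed"  # dynamic attribute creation works again

try:
    SlottedSub(1).extra = "rejected"
except AttributeError as exc:
    print("slotted subclass:", exc)
```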

All existing functionality is preserved and unit tests pass.

Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>