[SPARK-56734][CORE] Optimize RocksDBPersistenceEngine with Column Families and zero-allocation prefix matching by darion-yaphet · Pull Request #55696 · apache/spark

darion-yaphet · 2026-05-06T02:33:26Z

What changes were proposed in this pull request?

This PR refactors RocksDBPersistenceEngine to improve performance and operational flexibility by:

1. Introducing dedicated Column Families (app_, worker_, driver_) for different metadata types.
2. Optimizing the read operation from O(N_total) to O(N_type) by using type-specific Column Family iterators.
3. Replacing expensive string-based prefix matching (new String(iter.key()).startsWith(...)) with a zero-allocation byte-level comparison helper.
4. Implementing an automatic data migration path to move existing records from the default Column Family to their respective new Column Families upon startup.
5. Ensuring proper resource management by overriding close() to release RocksDB handles and the database instance.
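
To illustrate the byte-level comparison mentioned above, here is a minimal sketch in Java. The class name `PrefixMatch` and the method `startsWith` are hypothetical and are not taken from the patch; the actual helper in this PR may differ in name and shape. The sketch assumes the prefix is encoded to bytes once and reused, so no String is created per key.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not the code from this PR.
final class PrefixMatch {
    private PrefixMatch() {}

    /** Returns true if key begins with prefix, comparing raw bytes in place. */
    static boolean startsWith(byte[] key, byte[] prefix) {
        if (key.length < prefix.length) {
            return false; // key is too short to contain the prefix
        }
        for (int i = 0; i < prefix.length; i++) {
            if (key[i] != prefix[i]) {
                return false; // first mismatching byte ends the check
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Encode the prefix once and reuse it for every key that is checked.
        byte[] appPrefix = "app_".getBytes(StandardCharsets.UTF_8);
        byte[] key = "app_20260506-0001".getBytes(StandardCharsets.UTF_8);
        System.out.println(startsWith(key, appPrefix));
    }
}
```

For ASCII prefixes such as app_, worker_ and driver_, comparing bytes gives the same answer as decoding the key to a String first, but the loop itself allocates nothing.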

Why are the changes needed?

Previously, all metadata was stored in the default Column Family. This caused several issues:

- Scan efficiency: Even when reading a specific type of data (e.g., Applications), the iterator had to walk the entire keyspace and filter entries via prefix checks.
- Performance overhead: Every iteration created a new String object from the byte array key for prefix verification, leading to significant GC pressure in metadata-heavy clusters.
- Operational limits: With a single Column Family, different data types could not be configured separately (e.g., memtable size, compression strategy).

Does this PR introduce any user-facing change?

No. The migration logic ensures that existing persisted state is transparently moved to the new structure without data loss.
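
The routing step of such a migration can be sketched with plain in-memory maps. This is a model only: it does not use the RocksDB API, the class `MigrationSketch` and its method are hypothetical, and it assumes keys are kept unchanged (prefix included) when they move. In a real RocksDB store, the put into the new Column Family and the delete from the default one would likely be grouped in a write batch so that each move is atomic.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory model of the migration; not the code from this PR.
final class MigrationSketch {
    // Prefixes named in the PR description; each one stands for a Column Family.
    static final List<String> PREFIXES = List.of("app_", "worker_", "driver_");

    /**
     * Moves every entry whose key starts with a known prefix out of defaultCf
     * and into the map for that prefix. Returns the number of entries moved.
     */
    static int migrate(TreeMap<String, byte[]> defaultCf,
                       Map<String, TreeMap<String, byte[]>> typedCfs) {
        int moved = 0;
        // Snapshot the keys so entries can be removed while looping.
        for (String key : new ArrayList<>(defaultCf.keySet())) {
            for (String prefix : PREFIXES) {
                if (key.startsWith(prefix)) {
                    byte[] value = defaultCf.remove(key);
                    typedCfs.computeIfAbsent(prefix, p -> new TreeMap<>()).put(key, value);
                    moved++;
                    break;
                }
            }
        }
        return moved;
    }

    public static void main(String[] args) {
        TreeMap<String, byte[]> defaultCf = new TreeMap<>();
        defaultCf.put("app_a1", new byte[] {1});
        defaultCf.put("worker_w1", new byte[] {2});
        defaultCf.put("driver_d1", new byte[] {3});
        Map<String, TreeMap<String, byte[]>> typedCfs = new HashMap<>();
        System.out.println("moved: " + migrate(defaultCf, typedCfs));
        System.out.println("left in default: " + defaultCf.size());
    }
}
```

Because migrated entries are removed from the default map, running the routine again on an already migrated store moves nothing, which is the property a run-at-every-startup migration needs.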

How was this patch tested?

- Verified with existing Standalone Master recovery tests.
- Manually verified data migration from legacy single-CF RocksDB instances.

darion-yaphet force-pushed the SPARK-56734 branch from a3ffe9d to fd74ac3 on May 6, 2026 02:35

darion-yaphet wants to merge 1 commit into apache:master from darion-yaphet:SPARK-56734
