New crate: vortex-clickhouse (ClickHouse integration via C FFI) #6420
Draft
fastio wants to merge 2 commits into vortex-data:develop
Contributor
Can you please follow the contribution guidelines? In particular, bigger changes should start with a discussion: https://github.com/vortex-data/vortex/blob/develop/CONTRIBUTING.md#contributing-to-vortex
Author
Thanks for the pointer! I should have started with a discussion first; my apologies for skipping that step. I've opened a discussion here: #6425. Happy to wait for community feedback there before proceeding with the PR. I'll convert this PR to draft in the meantime.
Does this PR close an open issue or discussion?
No existing issue — this introduces a new integration crate.
What changes are included in this PR?
Add a new vortex-clickhouse crate that enables ClickHouse to natively read and write Vortex files through FORMAT Vortex. The crate compiles to a C static library (staticlib) and exposes an opaque-handle-based C FFI that ClickHouse links against via its IInputFormat / IOutputFormat framework.
Crate structure:
Supported ClickHouse types
Int8–Int256, UInt8–UInt256, Float32/64, Decimal32/64/128/256, String, FixedString(N), Bool, Date, Date32, DateTime, DateTime64, Array(T), Tuple(...), Map(K,V), Nullable(T), LowCardinality(T), Enum8/16, IPv4, IPv6, UUID, and Geo types.
Types without native Vortex equivalents are modeled as Vortex extension types with custom metadata, enabling lossless round-trip through the file format.
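To illustrate what "custom metadata enabling lossless round-trip" can mean, here is a minimal sketch of type parameters being serialized to bytes and parsed back. It does not use the Vortex extension-type API; `ExampleMeta`, its byte layout, and `take_u32` are all hypothetical and are not taken from this PR.

```rust
/// Hypothetical metadata for ClickHouse types whose parameters must survive
/// a trip through the file format. Names and layout are illustrative only.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ExampleMeta {
    /// FixedString(N): only the byte length is needed.
    FixedString { len: u32 },
    /// Enum8: the (name, value) pairs are needed.
    Enum8 { values: Vec<(String, i8)> },
}

/// Reads a little-endian u32 from the front of `rest` and advances it.
fn take_u32<'a>(rest: &mut &'a [u8]) -> Option<u32> {
    let current: &'a [u8] = *rest;
    if current.len() < 4 {
        return None;
    }
    let (head, tail) = current.split_at(4);
    *rest = tail;
    Some(u32::from_le_bytes([head[0], head[1], head[2], head[3]]))
}

impl ExampleMeta {
    /// Encodes as: one tag byte, then a tag-specific little-endian payload.
    pub fn to_bytes(&self) -> Vec<u8> {
        let mut out = Vec::new();
        match self {
            ExampleMeta::FixedString { len } => {
                out.push(0);
                out.extend_from_slice(&len.to_le_bytes());
            }
            ExampleMeta::Enum8 { values } => {
                out.push(1);
                out.extend_from_slice(&(values.len() as u32).to_le_bytes());
                for (name, value) in values {
                    out.extend_from_slice(&(name.len() as u32).to_le_bytes());
                    out.extend_from_slice(name.as_bytes());
                    out.push(*value as u8);
                }
            }
        }
        out
    }

    /// Decodes the layout written by `to_bytes`; returns None on malformed
    /// input, including unknown tags and trailing bytes.
    pub fn from_bytes(bytes: &[u8]) -> Option<Self> {
        let (&tag, mut rest) = bytes.split_first()?;
        let meta = match tag {
            0 => ExampleMeta::FixedString { len: take_u32(&mut rest)? },
            1 => {
                let count = take_u32(&mut rest)?;
                let mut values = Vec::new();
                for _ in 0..count {
                    let name_len = take_u32(&mut rest)? as usize;
                    // Need `name_len` name bytes plus one value byte.
                    if rest.len() <= name_len {
                        return None;
                    }
                    let (name, tail) = rest.split_at(name_len);
                    let name = String::from_utf8(name.to_vec()).ok()?;
                    values.push((name, tail[0] as i8));
                    rest = &tail[1..];
                }
                ExampleMeta::Enum8 { values }
            }
            _ => return None,
        };
        if rest.is_empty() { Some(meta) } else { None }
    }
}
```

The property that matters for a lossless round trip is that `from_bytes(&m.to_bytes())` returns `Some(m)` for every value `m`. The actual crate may use a different encoding entirely.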
What is the rationale for this change?
ClickHouse is one of the most widely deployed analytical databases. Adding native Vortex format support allows ClickHouse users to directly query and produce Vortex files, benefiting from Vortex's adaptive encoding and compression without requiring format conversion pipelines.
The C FFI approach was chosen because:
ClickHouse's format system requires implementing C++ interfaces (IInputFormat, IOutputFormat), so a thin C++ shim calling into Rust via FFI is the natural integration point.
This follows the same pattern as other Rust integrations already in ClickHouse (e.g., BLAKE3, skim).
Opaque handles with _new/_free pairs provide a safe, simple ownership model across the language boundary.
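The opaque-handle pattern mentioned above can be sketched roughly as follows. This is an illustration, not the crate's actual API: `ExampleScanner` and the `example_scanner_*` functions are hypothetical names chosen for this sketch.

```rust
use std::ffi::{c_char, CStr};
use std::ptr;

/// Hypothetical opaque type. C/C++ callers only ever hold a pointer to it
/// and never see its fields.
pub struct ExampleScanner {
    path: String,
}

// Note: a real staticlib export would also carry the `no_mangle` attribute
// (spelled `#[unsafe(no_mangle)]` on edition 2024). It is omitted here so
// the sketch compiles unchanged on any edition.

/// Allocates a scanner and transfers ownership to the caller.
/// Returns null if `path` is null or not valid UTF-8.
pub unsafe extern "C" fn example_scanner_new(path: *const c_char) -> *mut ExampleScanner {
    if path.is_null() {
        return ptr::null_mut();
    }
    // Caller must pass a valid NUL-terminated string.
    let path = match unsafe { CStr::from_ptr(path) }.to_str() {
        Ok(s) => s.to_owned(),
        Err(_) => return ptr::null_mut(),
    };
    Box::into_raw(Box::new(ExampleScanner { path }))
}

/// Example accessor: borrows the handle without taking ownership.
/// Returns 0 for a null handle.
pub unsafe extern "C" fn example_scanner_path_len(handle: *const ExampleScanner) -> usize {
    match unsafe { handle.as_ref() } {
        Some(scanner) => scanner.path.len(),
        None => 0,
    }
}

/// Reclaims ownership and drops the scanner. Null is accepted as a no-op;
/// passing the same non-null handle twice is undefined behavior.
pub unsafe extern "C" fn example_scanner_free(handle: *mut ExampleScanner) {
    if !handle.is_null() {
        drop(unsafe { Box::from_raw(handle) });
    }
}
```

In this sketch, ownership crosses the boundary exactly twice: `Box::into_raw` hands it to the C++ side in `_new`, and `Box::from_raw` takes it back in `_free`. Every other function only borrows the handle, which is what keeps the ownership model simple.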
How is this change tested?
225 unit tests covering:
Bidirectional type conversion for all supported ClickHouse types (primitives, strings, decimals, nested, nullable, extension types)
Column data construction and round-trip via VortexColumnBuilder
Extension type registration and metadata serialization
Scanner and writer FFI interface contracts
End-to-end file read/write cycle (e2e_test.rs)
All tests pass: cargo test -p vortex-clickhouse → 225 passed, 0 failed.
Are there any user-facing changes?
No breaking changes to existing APIs. This is a new, additive crate (vortex-clickhouse) with publish = false. It adds the crate to the workspace members and [workspace.dependencies] in the root Cargo.toml.