Skip to content

chore(tesseract): document core types of the SQL planner#10866

Draft
waralexrom wants to merge 7 commits into
masterfrom
tesseract-docs
Draft

chore(tesseract): document core types of the SQL planner#10866
waralexrom wants to merge 7 commits into
masterfrom
tesseract-docs

Conversation

@waralexrom
Copy link
Copy Markdown
Member

Summary

Add rustdoc to the central types of Tesseract's SQL planner

Add module-level rustdoc to the `MemberSymbol` enum describing it as the
first-class atomic unit of query planning that flows through every layer
(logical planning, physical-plan construction, SQL rendering), and short
docs on selected public methods that capture non-obvious invariants
(`is_dimension` covers `TimeDimension`, `apply_recursive` recurses on
the result of `f` rather than the original, `has_member_in_reference_chain`
includes self, etc.).

Inline `is_basic_dimension` into its two call sites in
`multi_stage/query_description.rs`. The method's name claimed the
opposite of what it returned — true meant "has multi-stage members in
the dependency tree" — and both callers relied on a double-negative
match. Replaced with a direct `has_multi_stage_members(&d, true)` call.
Add rustdoc to `DimensionSymbol`, `CalendarDimensionTimeShift`,
`DimensionSymbolFactory`, `DimensionKind` enum and its four variants
(`RegularDimension`, `CaseDimension`, `SwitchDimension`, `GeoDimension`),
`DimensionType` and the `Case` enum in `common/case.rs`.

Docs frame each type in terms of its data-model declaration (e.g.
`type: switch`, `type: geo`, the `case` field) rather than listing
struct fields. Calc group is documented as the derived state of a
switch dimension declared without `sql` — an abstract enumeration
cross-joined into the query as a virtual table of values. The `Case`
enum doc spells out the two forms (classic and switch-style) and the
fact that switch-style `case.switch` may reference another dimension.

Method-level docs are added only where the signature does not already
carry the meaning: `owned_by_cube` (cube on the path is required in
the join), `is_sub_query` / `propagate_filters_to_sub_query` (direct
mappings from the data-model definition), `iter_sql_calls` (mask_sql
is intentionally excluded — compiled against the owning cube of a
view-exposed dimension), the calendar time-shift lookups, and a few
others. `MemberSymbol::full_name` is updated symmetrically.
…ssions

Add rustdoc to the remaining member-symbol kinds:

- `MeasureSymbol` and its measure-kind family (`MeasureKind`,
  `Count` / `Aggregated` / `Calculated` / `Rank`, `AggregateWrap`,
  `CountSql`, `AggregationType`, `CalculatedMeasureType`,
  `MeasureTimeShifts`, `DimensionTimeShift`, `MeasureOrderBy`).
  Method-level docs cover non-obvious behaviour: `new_unrolling`
  (drops the windowing context, multi-stage collapses to a
  `Calculated` kind), `new_patched`, `is_additive` (multi-stage
  measures are never additive), `is_additive_in_multiplied`,
  `iter_sql_calls` (mask, filters and order_by are intentionally
  excluded), `aggregate_wrap` / `pre_aggregate_wrap`,
  `can_replace_type_with`, `supports_additional_filters`.

- `TimeDimensionSymbol`: a base time dimension viewed at a chosen
  granularity and date range, created at query time. Docs explain
  `change_granularity`, `is_reference` (a calendar-SQL granularity
  blocks the reference forwarding), `get_dependencies_as_time_dimensions`,
  `rollup_granularity`, `get_range_for_time_series` and the
  `alias_suffix` (granularity name like `day` / `month`).

- `MemberExpressionSymbol`: a synthetic, query-time member built
  either from a raw SQL expression or from a patched existing member;
  lives in the `expr:` namespace. Docs cover `with_parenthesized`,
  `cube_names_if_dimension_only_expression`, `reference_member`.

Drive-by cleanups:

- Rename typo'd public API: `is_addictive` → `is_additive`,
  `can_used_as_addictive_in_multplied` → `is_additive_in_multiplied`,
  `MeasureMatcher.only_addictive` → `only_additive`. Callsites in
  `multi_stage_query_planner.rs`, `measure_matcher.rs`,
  `optimizer.rs`, `query_properties.rs` and tests updated.
- Remove dead state `MeasureSymbol.is_splitted_source`: the field
  was always initialised to `false`, only propagated through
  `new_unrolling` / `new_patched`, and had no external callers.
Add rustdoc to `SqlCall` and its supporting types: `CubeRef`,
`SqlDependency`, `SqlCallArg`, `SqlCallFilterParamsItem`,
`SqlCallFilterGroupItem`, `SqlCallBuilder`.

Frame `SqlCall` as the Tesseract representation of a SQL-like
function from the data model (member `sql:`, `mask_sql:`, case
branches, measure filters, etc.) that plays two roles at once:
a store of the template's dependencies already resolved to live
member symbols and cube references, and the renderer that turns
those dependencies into the final SQL string via `{arg:N}` /
`{fp:N}` / `{fg:N}` / `{sv:N}` placeholder substitution.

`CubeRef` doc spells out the difference between the two variants:
`Name` resolves to the cube's quoted identifier (rendered from
`{CUBE}` / `{TABLE}`), `Table` resolves to the cube's `sql:`
function result (rendered from `{CUBE.sql()}`, with pre-aggregation
source substitution where registered).

Method-level docs cover the non-obvious bits: `is_owned_by_cube`
(no deps or any cube ref), `is_direct_reference` (exactly
`{arg:0}` with one symbol dep), `eval` vs `eval_vec` (single vs
multi-template; multi-template arises from pre-aggregation
dimension/measure ref lists), `dependencies_count` and
`get_dependencies` (symbol deps only, cube refs excluded),
`apply_recursive`.

Also rewrite `arg_paren_contexts` from `///` to plain `//` line
comments — it is a private field.
Add rustdoc to the filter tree types, their compiler and operators.

- `tree.rs`: `Filter`, `FilterItem` (Group / Item / Segment),
  `FilterGroup`, `FilterGroupOperator`, plus method docs for
  `Filter::all_member_evaluators`, `Filter::to_filter_item`,
  `FilterItem::find_subtree_for_members` (only AND groups are
  traversed for partial matching) and
  `FilterItem::find_value_restriction` (AND intersect, OR union;
  empty Vec encodes a contradiction). Private
  `extract_filter_members` switched from `///` to plain `//`.

- `base_filter.rs`: `FilterType` (drives WHERE vs HAVING placement),
  `BaseFilter` as a thin wrapper over `TypedFilter` with the
  existing collapse TODO preserved, plus docs for
  `member_evaluator` vs `raw_member_evaluator` (TimeDimension
  unwrap),  `filter_operator` (raw enum) vs `operation` (decoded
  `FilterOp`), `is_single_value_equal`, `get_value_restrictions`,
  `time_dimension_symbol`.

- `typed_filter.rs`: `TypedFilter`, `FilterOp`.

- `filter_operator.rs`: `FilterOperator` doc spelling out that
  `RegularRollingWindowDateRange` / `ToDateRollingWindowDateRange`
  are synthetic — never coming from a query.

- `base_segment.rs`: `BaseSegment` as one `segments:` entry from
  the data model, materialised as a synthetic `MemberExpression`.

- `compiler.rs`: `FilterCompiler` and
  `FilterCompiler::add_time_dimension_item` (lifts a
  time-dimension request's `date_range` into an `InDateRange`
  filter), plus `extract_result`.

- `operators/*.rs`: a one-line `///` per `*Op` struct
  (`ComparisonOp`, `EqualityOp`, `InListOp`, `LikeOp`,
  `DateRangeOp`, `DateSingleOp`, `MeasureFilterOp`, `NullabilityOp`,
  `RegularRollingWindowOp`, `ToDateRollingWindowOp`) plus
  `DateRangeOp::formatted_date_range`.
Document the cube_bridge module as the DTO layer between Tesseract
and the JavaScript schema compiler — every type in the module
mirrors the shape of an object delivered by JS, with no business
logic.

Add `///` docs to the non-DTO members:

- `BaseTools` — dialect-independent JS callbacks used during
  compilation and planning (SQL templates, time-series generation,
  allocated params, pre-aggregation lookup, join-tree resolution).
  Dialect-specific helpers live behind `DriverTools`, reachable via
  `driver_tools()`.
- `DriverTools` — dialect-specific JS helpers used only at
  SQL-generation time, once the target dialect is known.
- `SqlTemplatesRender` — Jinja2 template rendering; the native
  implementation keeps the parsed templates locally and renders
  without crossing the JS boundary.
- `SqlUtils` — opaque holder for the JS-side `SQL_UTILS` object
  that Tesseract only forwards into member `sql` functions.
- `SecurityContext` — type-erased reference to the JS-side security
  context object, used to materialise
  `SECURITY_CONTEXT.x.filter(...)` / `unsafeValue()` proxies.
- `MemberSql` / `NativeMemberSql` — a member's `sql:` JS function;
  `compile_template_sql` calls it under proxied
  `{CUBE}` / `FILTER_PARAMS` / `FILTER_GROUP` / `SECURITY_CONTEXT` /
  `SQL_UTILS` arguments that record every access into a shared state.
- Supporting types: `SqlTemplate` (single string or one per
  pre-aggregation member ref), `FilterParamsColumn` (plain column
  name or JS callback), `SqlTemplateArgs` (dependencies collected
  during `compile_template_sql`).

Plain DTO types (cube/dimension/measure/segment/join/pre-aggregation
definitions, case variants, geo, timeshift, member order/expression
etc.) are intentionally left undocumented — the field shape is the
documentation.
- `Compiler`: compilation context for the planner — resolves
  data-model declarations into `MemberSymbol`s, caches them by
  `SymbolPath`, holds the JS-side interfaces (cube evaluator, base
  tools, security context) plus query-level metadata (timezone,
  alias overrides). Method docs added to the resolver surface:
  `add_auto_resolved_member_evaluator`, `add_measure_evaluator`,
  `add_dimension_evaluator` (segment fallback to a parenthesized
  member expression), `add_segment_evaluator`,
  `add_cube_name_evaluator`, `add_cube_table_evaluator`,
  `alias_for_member`, `compile_sql_call`.
- `SymbolFactory`: builds a `MemberSymbol` from a `SymbolPath`
  through the `Compiler`.
- `SymbolPath`: resolved path of a member or cube reference in
  the data model, plus docs for `parse` / `parse_parts`.
- `CubeNameSymbol` and `CubeTableSymbol`: the two cube-reference
  flavours (`{CUBE}` identifier vs `{CUBE.sql()}` table
  expression).

Also remove `Compiler::join_hints` and its no-longer-needed
imports (`JoinHintsCollector`, `JoinHintItem`, `TraversalVisitor`):
no callers anywhere in the workspace.
@github-actions github-actions Bot added the rust Pull requests that update Rust code label May 12, 2026
@claude
Copy link
Copy Markdown
Contributor

claude Bot commented May 12, 2026

Claude finished @waralexrom's task in 2m 15s —— View job


Review of PR #10866 — Document core types of the SQL planner

  • Gather context and review the full diff
  • Review documentation accuracy and completeness
  • Check for code changes (non-doc) and verify correctness
  • Review test changes
  • Post final review

This is a high-quality documentation PR that adds rustdoc to 52 files across Tesseract's SQL planner. It also bundles three welcome housekeeping changes: a typo fix, dead code removal, and an inlining refactor. Overall this is ready to merge — one bug to fix first, then a handful of nits.


Bug: Truncated doc comment

The doc comment on MeasureOrderBy is cut off mid-sentence:

rust/cube/cubesqlplanner/cubesqlplanner/src/planner/symbols/measure_symbol.rs:18

/// Per-measure `order_by` entry from the data-model definition: a
#[derive(Clone)]
pub struct MeasureOrderBy {

The sentence after the colon is missing — likely a copy-paste issue. Please complete it (e.g. "a SQL expression and sort direction pair.").

Fix this →


Code changes (non-doc) — all look correct

  1. Typo fix: is_addictiveis_additive and can_used_as_addictive_in_multpliedis_additive_in_multiplied — good catch. All call sites are updated consistently across measure_matcher.rs, optimizer.rs, multi_stage_query_planner.rs, query_properties.rs, and the tests.

  2. Removal of is_splitted_source field from MeasureSymbol — the field was always false (set in the constructor, never written), and has no remaining callers. Clean removal.

  3. Removal of MemberSymbol::is_basic_dimension and MemberSymbol::is_leaf — both had single call sites. is_basic_dimension is inlined at its only caller in query_description.rs (which now calls has_multi_stage_members directly, also adding the import). is_leaf had no callers at all. Both removals are clean.

  4. Removal of Compiler::join_hints() — the unused method and its associated imports (JoinHintsCollector, TraversalVisitor, JoinHintItem) are removed. The JoinHintsCollector itself and collect_join_hints free function remain in the codebase and are used elsewhere. Clean.

  5. Doc /// → regular // on private items (arg_paren_contexts field in SqlCall, extract_filter_members in tree.rs) — sensible, since doc comments on private items only appear with --document-private-items.

  6. iter_sql_calls comment rewrite (both MeasureSymbol and DimensionSymbol) — the old inline //FIXME / // comments explaining why mask_sql is excluded are promoted to proper doc comments. The explanation is preserved accurately.


Nits (non-blocking)

  • Some doc comments repeat the same phrase across multiple types verbatim (e.g. "Full unique identifier of the symbol: cube path, member name and any suffix..." appears on MeasureSymbol::full_name, DimensionSymbol::full_name, MemberSymbol::full_name, MemberExpressionSymbol::full_name). This is fine for rustdoc — each type's page stands alone — but if you want to reduce maintenance burden you could consider a shared module-level note and shorter per-method comments.

  • The MeasureKind doc says Rank — type: rank. Window-function rank, no sql. — worth noting the sql field is None for rank only because rank() has no inner expression, not because it's forbidden. (No change required — just flagging for doc accuracy.)


Summary

The documentation is thorough, well-written and correctly describes the planner's architecture. The bundled code cleanups (typo fix, dead code removal) are all correct and have no remaining callers. The only actionable item is the truncated MeasureOrderBy doc comment.

@codecov
Copy link
Copy Markdown

codecov Bot commented May 12, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 83.47%. Comparing base (e60ad99) to head (f7331b8).
⚠️ Report is 8 commits behind head on master.

Additional details and impacted files
@@             Coverage Diff             @@
##           master   #10866       +/-   ##
===========================================
+ Coverage   58.36%   83.47%   +25.11%     
===========================================
  Files         216      254       +38     
  Lines       16939    75350    +58411     
  Branches     3435        0     -3435     
===========================================
+ Hits         9887    62901    +53014     
- Misses       6550    12449     +5899     
+ Partials      502        0      -502     
Flag Coverage Δ
cube-backend ?
cubesql 83.47% <ø> (?)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

&self.mask_sql
}

pub fn is_addictive(&self) -> bool {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

rust Pull requests that update Rust code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants