-
Notifications
You must be signed in to change notification settings - Fork 0
reaggregate support for groupBy. #178
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
Pull Request titles must follow the Conventional Commits specification. Details: |
No docs changes detected for 2ca4ae4 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR adds reaggregation support for groupBy operations in rollup tables, enabling hierarchical grouping operations to work correctly when aggregating already-aggregated data.
Key Changes:
- Introduces
GroupByReaggreagateOperatorfor handling re-aggregation of grouped data in rollup operations - Adds
getAggregatedSource()method to theAggregateColumnSourceinterface to expose underlying aggregated sources - Adds comprehensive test coverage for both static and incremental rollup group operations
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 10 comments.
Show a summary per file
| File | Description |
|---|---|
| GroupByReaggreagateOperator.java | New operator implementing reaggregation logic for groupBy operations on already-aggregated data |
| AggregateColumnSource.java | Adds new interface method to expose the underlying aggregated column source |
| BaseAggregateColumnSource.java | Implements the new getAggregatedSource() interface method |
| BaseAggregateSlicedColumnSource.java | Implements the new getAggregatedSource() interface method |
| RangeAggregateColumnSource.java | Implements the new getAggregatedSource() interface method |
| AggregationProcessor.java | Integrates GroupByReaggreagateOperator into rollup aggregation processing |
| TestRollupTable.java | Adds test coverage for static and incremental rollup group operations |
| RollupTableImpl.java | Adds import (appears unused) |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
engine/table/src/main/java/io/deephaven/engine/table/impl/by/GroupByReaggreagateOperator.java
Outdated
Show resolved
Hide resolved
engine/table/src/main/java/io/deephaven/engine/table/impl/by/GroupByReaggreagateOperator.java
Outdated
Show resolved
Hide resolved
engine/table/src/main/java/io/deephaven/engine/table/impl/by/GroupByReaggreagateOperator.java
Outdated
Show resolved
Hide resolved
...le/src/main/java/io/deephaven/engine/table/impl/sources/aggregate/AggregateColumnSource.java
Show resolved
Hide resolved
engine/table/src/main/java/io/deephaven/engine/table/impl/by/GroupByReaggreagateOperator.java
Show resolved
Hide resolved
engine/table/src/main/java/io/deephaven/engine/table/impl/by/GroupByReaggreagateOperator.java
Outdated
Show resolved
Hide resolved
engine/table/src/main/java/io/deephaven/engine/table/impl/by/GroupByReaggreagateOperator.java
Outdated
Show resolved
Hide resolved
engine/table/src/main/java/io/deephaven/engine/table/impl/by/AggregationProcessor.java
Outdated
Show resolved
Hide resolved
engine/table/src/main/java/io/deephaven/engine/table/impl/by/GroupByReaggreagateOperator.java
Outdated
Show resolved
Hide resolved
engine/table/src/main/java/io/deephaven/engine/table/impl/hierarchical/RollupTableImpl.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 7 out of 7 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
engine/table/src/main/java/io/deephaven/engine/table/impl/by/GroupByReaggregateOperator.java
Outdated
Show resolved
Hide resolved
...le/src/main/java/io/deephaven/engine/table/impl/sources/aggregate/AggregateColumnSource.java
Outdated
Show resolved
Hide resolved
| @NotNull final LongChunk<? extends RowKeys> postShiftRowKeys, | ||
| final long destination) { | ||
| // we don't need to deal with these yet | ||
| throw new IllegalStateException(); |
Copilot
AI
Dec 10, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The error message thrown by shiftChunk should be more descriptive. It should match the message from the BucketedContext version on line 159 to maintain consistency and provide clearer information to developers.
| throw new IllegalStateException(); | |
| throw new IllegalStateException("GroupByReaggregateOperator does not support shiftChunk; this operator does not support shifting."); |
…roupByReaggregateOperator.java Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…ces/aggregate/AggregateColumnSource.java Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 12 out of 12 changed files in this pull request and generated 9 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| throw new IllegalArgumentException("AggFormula does not support virtual row variables"); | ||
| } | ||
| } | ||
|
|
Copilot
AI
Dec 11, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The method initializeVectorColumns is missing a JavaDoc comment explaining its purpose, parameters, and behavior. This is a non-trivial method that should be documented.
| /** | |
| * Initializes the {@code vectorColumnDefinitions} map if it has not already been initialized. | |
| * <p> | |
| * For each column in the provided {@code definition}, if the column is a group-by column (its name is in | |
| * {@code groupByColumnSet}), its original {@link ColumnDefinition} is used. Otherwise, a new | |
| * {@link ColumnDefinition} is created with a vector type corresponding to the column's data type. | |
| * <p> | |
| * After processing all columns from the definition, any additional columns from {@code extraColumns} are added. | |
| * | |
| * @param groupByColumnSet the set of column names used for grouping; these columns retain their original definitions | |
| * @param definition the {@link TableDefinition} containing all columns to process | |
| * @param extraColumns a map of additional column definitions to add to the vector column definitions | |
| */ |
| // final GroupByChunkedOperator groupByChunkedOperator = makeGroupByOperatorForFormula(inputNonKeyColumns, | ||
| // table); |
Copilot
AI
Dec 11, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a commented-out code block (lines 1378-1379) that should either be removed or replaced with a more specific TODO comment explaining why it's commented out and when it should be uncommented.
| // final GroupByChunkedOperator groupByChunkedOperator = makeGroupByOperatorForFormula(inputNonKeyColumns, | |
| // table); | |
| // The following code should be restored or implemented when issue #6363 is addressed: | |
| // final GroupByChunkedOperator groupByChunkedOperator = makeGroupByOperatorForFormula(inputNonKeyColumns, table); |
| vectorColumnDefinitions.putAll(extraColumns); | ||
| } | ||
|
|
||
|
|
Copilot
AI
Dec 11, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is an extra blank line that should be removed to maintain consistent code formatting.
| * already filled | ||
| * @param aggregations The aggregations | ||
| * @param groupByColumns The group-by columns | ||
| * @param source |
Copilot
AI
Dec 11, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The @param source JavaDoc is missing a description. Please add a description explaining what the source parameter represents (e.g., "the source table used for reaggregation").
| * @param source | |
| * @param source The source table used for reaggregation. |
|
|
||
| /** | ||
| * Given that there have been modified input columns, should we propagate changes? | ||
| * |
Copilot
AI
Dec 11, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The JavaDoc for the hasModifications method has a trailing space after "Given that there have been modified input columns, should we propagate changes?". This should be removed.
| * | |
| * |
| vectorColumnDefinitions.putAll(extraColumns); | ||
| } | ||
|
|
||
|
|
Copilot
AI
Dec 11, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The method makeGroupByOperatorForFormula is missing a JavaDoc comment explaining its purpose, parameters, and return value. This is a non-trivial factory method that should be documented.
| /** | |
| * Creates a {@link GroupByChunkedOperator} for use with formula aggregations. | |
| * | |
| * @param inputNonKeyColumns The array of input column names (non-key columns) to be grouped. | |
| * @param table The {@link QueryTable} on which the group by operation will be performed. | |
| * @param exposedRowsets The name of the row set column to be exposed in the result. | |
| * @return A {@link GroupByChunkedOperator} configured for the specified columns and table. | |
| */ |
|
|
||
| private static class WithSource extends AggregationProcessor { | ||
| private final @NotNull Table source; | ||
|
|
Copilot
AI
Dec 11, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The WithSource class is missing JavaDoc documentation explaining its purpose and why it's needed. Since it's extending AggregationProcessor with additional state, it should be properly documented.
| private static class WithSource extends AggregationProcessor { | |
| private final @NotNull Table source; | |
| /** | |
| * An {@link AggregationProcessor} that additionally retains a reference to the source {@link Table}. | |
| * <p> | |
| * This class is used when it is necessary to associate the aggregation processor with its originating table, | |
| * for example, to provide context or enable operations that require access to the original data. | |
| * | |
| * @see AggregationProcessor | |
| */ | |
| private static class WithSource extends AggregationProcessor { | |
| private final @NotNull Table source; | |
| /** | |
| * Constructs a WithSource instance. | |
| * | |
| * @param aggregations the collection of aggregations to process | |
| * @param type the type of aggregation | |
| * @param source the source table associated with this processor | |
| */ |
| addBasicOperators((t, n) -> makeVarOrStdOperator(t, n, false, false)); | ||
| } | ||
| } | ||
|
|
Copilot
AI
Dec 11, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The validateSelectColumnForFormula method is missing a JavaDoc comment explaining its purpose and when it should be used. This validation method should be documented to clarify what constraints it enforces.
| /** | |
| * Validates that the provided {@link SelectColumn} is suitable for use in an aggregation formula. | |
| * <p> | |
| * This method enforces the following constraints: | |
| * <ul> | |
| * <li>The {@code SelectColumn} must not reference column arrays.</li> | |
| * <li>The {@code SelectColumn} must not use virtual row variables.</li> | |
| * </ul> | |
| * <p> | |
| * This validation should be used when preparing a {@code SelectColumn} for use with {@code AggSpecFormula} | |
| * or similar aggregation operations that do not support these features. | |
| * | |
| * @param selectColumn the {@code SelectColumn} to validate | |
| * @throws IllegalArgumentException if the {@code SelectColumn} uses column arrays or virtual row variables | |
| */ |
| /* TODO: FIX THIS. */ | ||
| return true; |
Copilot
AI
Dec 11, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a TODO comment indicating that the hasModifications method needs to be fixed. This should either be addressed before merging or tracked with a more specific issue reference.
| /* TODO: FIX THIS. */ | |
| return true; | |
| return columnsModified; |
No description provided.