[controller] Reduce IdealState footprint in LLC segment commit path#17707
Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #17707 +/- ##
============================================
+ Coverage 63.16% 63.23% +0.06%
- Complexity 1499 1502 +3
============================================
Files 3181 3181
Lines 190816 190854 +38
Branches 29164 29172 +8
============================================
+ Hits 120533 120683 +150
+ Misses 60890 60787 -103
+ Partials 9393 9384 -9
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Pull request overview
This PR optimizes memory usage in the LLC (Low-Level Consumer) segment commit path by removing an eager IdealState fetch that was performed before each commit. The optimization introduces lazy loading via Supplier pattern and moves validation into the IdealState updater to work with the same snapshot used for mutations.
Changes:
- Removed eager IdealState fetch before Step-3, replacing it with lazy Supplier-based loading for partition-id fallback
- Moved CONSUMING state validation from pre-commit check to inside Step-3 IdealState updater
- Added best-effort cleanup for orphaned segment metadata when Step-3 fails or doesn't add the segment to IdealState
- Removed pause state checks from Step-2, relying on Step-3 to handle paused tables/topics
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| PinotLLCRealtimeSegmentManager.java | Core implementation changes: removed eager IdealState fetch, moved validation to Step-3, added cleanup logic, and implemented lazy partition-id loading |
| PinotLLCRealtimeSegmentManagerTest.java | Added test to verify IdealState is not fetched when partition IDs are available from stream |
| boolean newConsumingSegmentInIdealState = false; | ||
| if (newConsumingSegmentName != null) { | ||
| newConsumingSegmentInIdealState = idealState.getRecord().getMapFields().containsKey(newConsumingSegmentName); | ||
| if (!newConsumingSegmentInIdealState) { | ||
| removeSegmentZKMetadataBestEffort(realtimeTableName, newConsumingSegmentName); | ||
| newConsumingSegmentName = null; | ||
| } | ||
| } |
There was a problem hiding this comment.
When the table/topic is paused, new segment metadata is created in Step-2 but not added to IdealState in Step-3 (line 1469 passes null when paused). The cleanup logic then removes this orphaned metadata. However, there's a race condition: if the RealtimeSegmentValidationManager or another component reads segment metadata between Step-2 completion and the cleanup (lines 708-709), it will see a segment with IN_PROGRESS status that's not in IdealState. This could trigger unnecessary validation/repair actions or cause confusion in monitoring. While the cleanup mitigates this, consider checking pause state before creating metadata (possibly via the lazy IdealState supplier used in getPartitionIds) to avoid creating metadata that will be immediately discarded.
|
@xiangfu0 given group commit is already enabled by default, this PR itself doesn't reduce the memory allocation from roughly O(concurrency * IS) to O(batches * IS) right ? This PR reduces the memory allocation by half in the happy path right ? This is also a good improvement but just confirming that the description probably needs to be updated. |
krishan1390
left a comment
There was a problem hiding this comment.
LGTM. We need to add more tests that verify the fallback path, the cleanup logic, and the paused-table scenario.
The group commit logic takes the updater function, so only in the group commit thread, IS will be cloned and processed. |
|
Formatting correction for the previous note (shell escaped function names). Addressed follow-up comments in
I intentionally left the two broader design-tradeoff threads unresolved in this patch (re-introducing pre-Step2 pause checks / early pre-Step1 CONSUMING check), because both approaches require an additional large IdealState read in the hot path and would partially negate the memory-footprint objective of this PR. Happy to follow up with a dedicated tradeoff change if desired. |
| idealState = | ||
| updateIdealStateForSegments(tableConfig, committingSegmentName, newConsumingSegmentName, instancePartitions); | ||
| } catch (RuntimeException e) { | ||
| if (newConsumingSegmentName != null) { |
There was a problem hiding this comment.
(MAJOR) This logic doesn't handle the table paused scenario, where there will be a dangling consuming segment ZK metadata. Please add a test to guard this
899b092 to
5baefa3
Compare
5baefa3 to
f4a0b39
Compare
[controller] Add LLC commit-path edge-case coverage
f4a0b39 to
bb667b1
Compare
What
commitSegmentMetadataInternalSupplierso normal commits skip large IS fetchMemory Impact Analysis (100 Concurrent Commits)
controller.segment.completion.group.commit.enabled=true).O(concurrency * IS)toO(batches * IS).Lower-bound example if IdealState is 160MB serialized:
100 * 160MB(eager reads) + Step-3 clone/read overhead.Caveat:
Correctness
Test
timeout 600 ./mvnw -pl pinot-controller -DskipTests compiletimeout 600 ./mvnw -pl pinot-controller -Dtest=PinotLLCRealtimeSegmentManagerTest -Dsurefire.failIfNoSpecifiedTests=false testtestCommitSegmentMetadataSkipsIdealStateFetchWhenPartitionIdsAvailableWhy
For very large realtime tables, loading full IdealState during each commit can amplify controller heap pressure. This change keeps correctness checks while reducing unnecessary large-object materialization on the common commit path.