HDDS-14400. Avoid collecting keys in memory during parallel OM table processing#9624
navinko wants to merge 22 commits into apache:master
Hi @swamirishi,
This PR has been marked as stale due to 21 days of inactivity. Please comment or remove the stale label to keep it open. Otherwise, it will be automatically closed in 7 days.
rnblough left a comment
This looks good to me! Dropping the keys accumulating in memory has good simplification effects; thumbs up on additional tests.
@ArafatKhan2198 please review
Hi @ArafatKhan2198, please review once and help me with the next steps if possible.
@navinko please check the compile error
…iction race appropriately. (apache#9869) Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
…processing
# Conflicts:
# hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/tasks/OmTableInsightTask.java
# hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/tasks/util/ParallelTableIteratorOperation.java
# Conflicts:
# hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/tasks/util/ParallelTableIteratorOperation.java
Thanks @adoroszlai for reviewing. After aligning with the latest upstream changes, I see that the current implementation already introduces batching with maxKeysInMemory and worker threads, which addresses the original concern around in-memory accumulation.
Thanks @navinko for updating the patch. Since no changes are left in
What changes were proposed in this pull request?
Avoid collecting keys in memory during parallel OM table processing.
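To illustrate the general idea only, here is a minimal sketch of streaming keys to a handler from several worker threads rather than gathering them into a list first. It is not the code in this patch: `StreamingTableScan`, `scan`, and `countKeys` are hypothetical names, and the synthetic `"key-" + i` strings stand in for real table entries. It does not use the actual `ParallelTableIteratorOperation` API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.LongAdder;
import java.util.function.Consumer;

// Hypothetical illustration; names and structure are assumptions, not Ozone code.
public class StreamingTableScan {

  /**
   * Splits [0, totalKeys) into contiguous ranges, one per worker, and hands
   * each key to the handler as soon as it is produced. Keys are never stored
   * in a collection, so memory use does not grow with the table size.
   */
  static void scan(long totalKeys, int workers, Consumer<String> handler)
      throws Exception {
    ExecutorService pool = Executors.newFixedThreadPool(workers);
    try {
      List<Future<?>> futures = new ArrayList<>();
      long chunk = (totalKeys + workers - 1) / workers; // ceiling division
      for (int w = 0; w < workers; w++) {
        final long start = Math.min(totalKeys, w * chunk);
        final long end = Math.min(totalKeys, start + chunk);
        futures.add(pool.submit(() -> {
          for (long i = start; i < end; i++) {
            handler.accept("key-" + i); // stand-in for reading a table entry
          }
        }));
      }
      for (Future<?> f : futures) {
        f.get(); // wait for the worker and propagate any failure
      }
    } finally {
      pool.shutdown();
    }
  }

  /** Counts keys with a thread-safe accumulator instead of a shared list. */
  static long countKeys(long totalKeys, int workers) throws Exception {
    LongAdder count = new LongAdder();
    scan(totalKeys, workers, key -> count.increment());
    return count.sum();
  }

  public static void main(String[] args) throws Exception {
    System.out.println(countKeys(23916, 4));
  }
}
```

The handler must be thread-safe because several workers call it concurrently; `LongAdder` is used here for that reason.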
Please describe your PR in detail:
https://issues.apache.org/jira/browse/HDDS-14155
What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-14400
How was this patch tested?
CI:
https://github.com/navinko/ozone/actions/runs/20884674236
Validated with JUnit tests. Also tested the flow by populating data into fileTable and validating the parallel processing for the individual table in both debug and normal modes.
bash-5.1$ ozone debug ldb --db=/data/metadata/om.db scan --column_family=fileTable --count
23916
Recon log