HDDS-14400. Avoid collecting keys in memory during parallel OM table processing#9624

Closed
navinko wants to merge 22 commits intoapache:masterfrom
navinko:HDDS-14400

@navinko navinko commented Jan 11, 2026

What changes were proposed in this pull request?

Avoid collecting keys in memory during parallel OM table processing.

Please describe your PR in detail:

  • The new implementation keeps the iterator thread pool but removes the value-executor pool and in-memory batching.
  • Each table iterator is now owned by a single worker thread and scans only its assigned key range.
  • Each table iterator now runs on a single thread; I validated that it works as is with ByteArrayCodec.
  • There will be another PR for replacing ByteArrayCodec with CodecBufferCodec under ParallelTableOperation.
    https://issues.apache.org/jira/browse/HDDS-14155
  • Added a unit test case to validate the new flow.
  • Fixed findbugs comments.
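
The ownership model in the list above (one worker thread per key range, entries handled inline, no second executor and no batch buffer) can be illustrated with a minimal sketch. It is not the code from this PR: `RangeScanSketch`, `scan`, and the `bounds` list are hypothetical names, and an in-memory `TreeMap` stands in for the RocksDB-backed OM table.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.LongAdder;
import java.util.function.BiConsumer;

/** Hypothetical sketch; NOT the ParallelTableIteratorOperation code from this PR. */
public class RangeScanSketch {

  /**
   * Scans each range [bounds[i], bounds[i+1]) on its own worker thread and hands
   * every entry straight to the handler. Returns the number of entries visited.
   * The table must not be modified while the scan runs.
   */
  public static long scan(NavigableMap<String, String> table, List<String> bounds,
      BiConsumer<String, String> handler) throws Exception {
    ExecutorService iteratorPool =
        Executors.newFixedThreadPool(Math.max(1, bounds.size() - 1));
    LongAdder processed = new LongAdder();
    List<Future<?>> workers = new ArrayList<>();
    try {
      for (int i = 0; i + 1 < bounds.size(); i++) {
        final String start = bounds.get(i);
        final String end = bounds.get(i + 1);
        workers.add(iteratorPool.submit(() -> {
          // The range view and its iterator are created and used on this thread only.
          table.subMap(start, true, end, false).forEach((key, value) -> {
            handler.accept(key, value); // handled inline; nothing is buffered
            processed.increment();
          });
        }));
      }
      for (Future<?> worker : workers) {
        worker.get(); // wait for completion and surface any worker failure
      }
    } finally {
      iteratorPool.shutdown();
    }
    return processed.sum();
  }

  public static void main(String[] args) throws Exception {
    NavigableMap<String, String> table = new TreeMap<>();
    for (int i = 0; i < 1000; i++) {
      table.put(String.format("key-%04d", i), "v" + i);
    }
    // First key, two split points, and an upper bound past the last key.
    List<String> bounds = List.of("key-0000", "key-0333", "key-0666", "key-9999");
    LongAdder seen = new LongAdder();
    long total = scan(table, bounds, (key, value) -> seen.increment());
    System.out.println(total + " " + seen.sum()); // prints "1000 1000"
  }
}
```

Because the handler runs on several worker threads at once, it has to be thread-safe; the sketch uses `LongAdder` for that reason.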

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-14400

How was this patch tested?

CI:
https://github.com/navinko/ozone/actions/runs/20884674236
Validated with JUnit tests. Also tested the flow by populating fileTable with data and verifying parallel processing of individual tables in both debug and normal mode.
bash-5.1$ ozone debug ldb --db=/data/metadata/om.db scan --column_family=fileTable --count
23916

[Screenshot, 2026-01-10 at 5:46:55 PM]

Recon log

Run mode with 21627 keys uploaded to fileTable, followed by reprocessing.

2026-01-10T15:42:37.870472510Z 2026-01-10 15:42:37,870 [ReconTaskThread-0] INFO tasks.ReconTaskControllerImpl: Task OmTableInsightTask started execution on thread ReconTaskThread-0
2026-01-10T15:42:37.870724094Z 2026-01-10 15:42:37,870 [ReconTaskThread-0] INFO tasks.OmTableInsightTask: OmTableInsightTask: Starting reprocess
2026-01-10T15:42:37.878627094Z 2026-01-10 15:42:37,878 [ReconTaskThread-0] INFO tasks.OmTableInsightTask: OmTableInsightTask: Processing table dTokenTable sequentially (non-String keys)
2026-01-10T15:42:37.888022094Z 2026-01-10 15:42:37,887 [ReconTaskThread-0] INFO util.ParallelTableIteratorOperation: OmTableInsightTask: Parallel iteration completed - Total keys processed: 2
2026-01-10T15:42:37.888184677Z 2026-01-10 15:42:37,888 [ReconTaskThread-0] INFO tasks.OmTableInsightTask: OmTableInsightTask: Processing table s3SecretTable sequentially (non-String keys)
2026-01-10T15:42:37.899993135Z 2026-01-10 15:42:37,899 [ReconTaskThread-0] INFO util.ParallelTableIteratorOperation: OmTableInsightTask: Parallel iteration completed - Total keys processed: 3
2026-01-10T15:42:37.944590219Z 2026-01-10 15:42:37,944 [ReconTaskThread-0] INFO util.ParallelTableIteratorOperation: OmTableInsightTask: Parallel iteration completed - Total keys processed: 21627
2026-01-10T15:42:37.947238802Z 2026-01-10 15:42:37,947 [ReconTaskThread-0] INFO tasks.OmTableInsightTask: OmTableInsightTask: Reprocess completed in 76 ms
2026-01-10T15:42:37.947249094Z 2026-01-10 15:42:37,947 [ReconTaskThread-0] INFO tasks.ReconTaskControllerImpl: Task OmTableInsightTask completed execution

@navinko
Contributor Author

navinko commented Jan 11, 2026

Hi @swamirishi,
As suggested, I created a new PR: Avoid collecting keys in memory during parallel OM table processing #9624
Kindly review.

@jojochuang
Contributor

@rnblough

@github-actions

github-actions Bot commented Feb 6, 2026

This PR has been marked as stale due to 21 days of inactivity. Please comment or remove the stale label to keep it open. Otherwise, it will be automatically closed in 7 days.

@github-actions github-actions Bot added the stale label Feb 6, 2026
Contributor

@rnblough rnblough left a comment

This looks good to me! Dropping the keys accumulating in memory has good simplification effects; thumbs up on additional tests.

@github-actions github-actions Bot removed the stale label Feb 7, 2026
@adoroszlai
Contributor

@ArafatKhan2198 please review

@navinko
Contributor Author

navinko commented Mar 2, 2026

Hi @ArafatKhan2198, please review and help me with the next steps if possible.

@adoroszlai
Contributor

@navinko please check compile error

navinko added 3 commits March 19, 2026 16:42
# Conflicts:
#	hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/tasks/util/ParallelTableIteratorOperation.java
@navinko
Contributor Author

navinko commented Mar 19, 2026

> @navinko please check compile error

Thanks @adoroszlai for reviewing.
To resolve the conflicts with the latest upstream changes, I had to accept most of the upstream changes.
I rebased the branch with the latest master and resolved all conflicts. Also updated the ParallelTableIteratorOperation constructor usage across all callers to match the current API and fix the compilation errors. The build is now passing locally and CI is successful.

After aligning with the latest upstream changes, I see that the current implementation already introduces batching with maxKeysInMemory and worker threads, which addresses the original concern around in-memory accumulation.
The PR now contains some other changes that are unintentional and result from the rebase.
Could you please advise if any additional improvements are still expected as part of HDDS-14400, or if the current upstream changes already cover the intended behavior?

@adoroszlai
Contributor

Thanks @navinko for updating the patch. Since no changes are left in ParallelTableIteratorOperation, I guess we can close this.

@adoroszlai adoroszlai closed this Mar 19, 2026

6 participants