Feature Request / Improvement
Description
RewriteTablePathUtil.rewriteDVFile currently rewrites deletion vector (DV) Puffin files by collecting every rewritten blob in an in-memory list before writing any of them out. Holding all blobs at once can create unnecessary memory pressure when rewriting large DV files or files with many blobs.
Suggested improvement
Refactor RewriteTablePathUtil.rewriteDVFile to stream rewritten blobs directly to the output PuffinWriter while iterating through the source blobs, instead of collecting all rewritten blobs into a temporary list first.
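To illustrate the difference, here is a minimal, self-contained sketch of the two shapes. It does not use the Iceberg API: `DvBlob`, `BlobSink`, `rewriteBuffered`, and `rewriteStreaming` are hypothetical stand-ins (the sink plays the role of the output PuffinWriter), and the assumption that the rewrite only changes a path carried by each blob is for illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

public class StreamingRewriteSketch {

  /** Hypothetical stand-in for a DV blob; not the Iceberg Blob class. */
  public static final class DvBlob {
    public final String referencedDataFile;
    public final byte[] payload;

    public DvBlob(String referencedDataFile, byte[] payload) {
      this.referencedDataFile = referencedDataFile;
      this.payload = payload;
    }
  }

  /** Hypothetical stand-in for the output writer; not the real PuffinWriter API. */
  public interface BlobSink {
    void write(DvBlob blob);
  }

  // Assumed rewrite step: only the referenced path changes, the payload is reused.
  static DvBlob rewrite(DvBlob blob, UnaryOperator<String> rewritePath) {
    return new DvBlob(rewritePath.apply(blob.referencedDataFile), blob.payload);
  }

  /** Current shape: every rewritten blob stays referenced until the first loop ends. */
  public static void rewriteBuffered(
      Iterable<DvBlob> source, UnaryOperator<String> rewritePath, BlobSink sink) {
    List<DvBlob> rewritten = new ArrayList<>();
    for (DvBlob blob : source) {
      rewritten.add(rewrite(blob, rewritePath));
    }
    for (DvBlob blob : rewritten) {
      sink.write(blob);
    }
  }

  /** Proposed shape: each blob is written as soon as it is rewritten, so no list is kept. */
  public static void rewriteStreaming(
      Iterable<DvBlob> source, UnaryOperator<String> rewritePath, BlobSink sink) {
    for (DvBlob blob : source) {
      sink.write(rewrite(blob, rewritePath));
    }
  }
}
```

Both methods hand the same blobs to the sink in the same order; the streaming variant simply never holds more than one rewritten blob of its own at a time. Whether the real writer buffers blobs internally is a separate question that this sketch does not address.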
Query engine
None
Willingness to contribute