ParquetMetaDataPushDecoder API to clear all buffered ranges #9673
nathanb9 wants to merge 3 commits into apache:main
    }

    /// Clear all buffered ranges and their corresponding data
    #[cfg(feature = "arrow")]
Not useful, I think.
Also, how does the CI work? Does it use the pr label to figure out which feature flags to use?
Well, the feature gate is necessary without the changes in this PR 😉.
IIRC the CI will do builds with various sets of features enabled. The actual unit tests are then run with default features and all features.
I added that flag in a previous PR. The reason I want to remove it now is that the flag causes failures in the dev PR checks, because we do not change any file in the arrow directory. The dev PR tests figure out which flags to use based on the directory edited: https://github.com/apache/arrow-rs/blob/main/.github/workflows/dev_pr/labeler.yml
etseidl
left a comment
Thanks for the submission @nathanb9. Seems logical to me.
If it's not too much trouble, could you also add an issue with a motivating use case? That aids in documentation for releases.
I also think it would be nice to add an example of the use of this API to the documentation at the head of push_decoder.rs.
@etseidl tysm. I added the motivating ticket, which was created by @alamb, in the PR description.
This PR is a follow-up for this ticket. It implements the same API, but for the metadata decoder.
See also #9624 (comment)
Rationale for this change
ParquetMetaDataPushDecoder clears exact requested ranges, but larger speculative pushed ranges can remain buffered in PushBuffers. This adds a way for callers to explicitly release non-exact ranges.
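A minimal sketch of the situation described above, for illustration only. MockPushBuffers and its methods (push_range, clear_exact, buffered_bytes) are hypothetical stand-ins and not the parquet crate's PushBuffers, whose internals may differ. The sketch assumes that exact-match clearing compares ranges for equality, so a larger pushed range that merely covers the requested one stays buffered until everything is cleared.

```rust
use std::ops::Range;

/// Hypothetical stand-in for the decoder's internal buffer store.
/// This is NOT the parquet crate's `PushBuffers`; it only models the
/// behavior described in the rationale.
#[derive(Default)]
struct MockPushBuffers {
    // Each entry pairs a byte range in the file with the bytes pushed for it.
    entries: Vec<(Range<u64>, Vec<u8>)>,
}

impl MockPushBuffers {
    /// Stage `data` as the contents of `range`.
    fn push_range(&mut self, range: Range<u64>, data: Vec<u8>) {
        assert_eq!(range.end - range.start, data.len() as u64);
        self.entries.push((range, data));
    }

    /// Drop only the entries whose range exactly equals one of `requested`.
    fn clear_exact(&mut self, requested: &[Range<u64>]) {
        self.entries.retain(|(range, _)| !requested.contains(range));
    }

    /// Drop every staged entry, exact match or not.
    fn clear_all_ranges(&mut self) {
        self.entries.clear();
    }

    /// Total number of bytes still held.
    fn buffered_bytes(&self) -> usize {
        self.entries.iter().map(|(_, data)| data.len()).sum()
    }
}

fn main() {
    let mut buffers = MockPushBuffers::default();

    // The decoder asked for bytes 900..1000, but the caller speculatively
    // pushed a larger range that covers it.
    let requested = 900..1000;
    buffers.push_range(800..1000, vec![0u8; 200]);

    // Exact-match clearing leaves the larger range in place.
    buffers.clear_exact(&[requested]);
    println!("after clear_exact: {} bytes", buffers.buffered_bytes());

    // Clearing everything releases it.
    buffers.clear_all_ranges();
    println!("after clear_all_ranges: {} bytes", buffers.buffered_bytes());
}
```

In this model the 200 speculative bytes survive the exact-match clear and are only released by clear_all_ranges(), which is the gap the new method is meant to close.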
What changes are included in this PR?
This adds clear_all_ranges(), which clears all byte ranges still staged in the decoder's internal PushBuffers.
Are these changes tested?
yes
Are there any user-facing changes?
Yes, this adds a new public clear_all_ranges() API on ParquetMetaDataPushDecoder.