perf(reader): Avoid second create_parquet_record_batch_stream_builder() call for migrated tables#2176

Open
mbutrovich wants to merge 7 commits into apache:main from mbutrovich:double_open_fix
@mbutrovich (Collaborator) commented Feb 24, 2026

Which issue does this PR close?

What changes are included in this PR?

  • Load Parquet metadata once via ArrowReaderMetadata::load_async(), inspect it in-memory for field IDs, then pass it to a single ParquetRecordBatchStreamBuilder::new_with_metadata() call. This eliminates the second create_parquet_record_batch_stream_builder() call for migrated tables.
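The load-once pattern described above can be sketched in isolation. This is a minimal, std-only model, not the code from this PR: `MockFile`, `Metadata`, `Builder`, `open_twice`, `open_once`, and `needs_fallback_ids` are all hypothetical stand-ins, and it assumes that the earlier code path re-read the footer whenever a file turned out to have no embedded field IDs.

```rust
use std::cell::Cell;

/// Hypothetical stand-in for a Parquet file in object storage.
/// `footer_reads` counts how many times the footer metadata is fetched.
struct MockFile {
    has_field_ids: bool,
    footer_reads: Cell<u32>,
}

/// Hypothetical stand-in for the parsed footer metadata.
struct Metadata {
    has_field_ids: bool,
}

/// Hypothetical stand-in for a record batch stream builder.
struct Builder {
    metadata: Metadata,
    /// True when the file has no embedded field IDs (assumed to be the
    /// migrated-table case), so IDs must be supplied some other way.
    needs_fallback_ids: bool,
}

impl MockFile {
    /// Models one metadata load: every call costs one footer read.
    fn load_metadata(&self) -> Metadata {
        self.footer_reads.set(self.footer_reads.get() + 1);
        Metadata { has_field_ids: self.has_field_ids }
    }
}

/// Assumed earlier shape: build, inspect, and rebuild when field IDs are missing.
/// A file without field IDs costs two footer reads.
fn open_twice(file: &MockFile) -> Builder {
    let first = Builder { metadata: file.load_metadata(), needs_fallback_ids: false };
    if first.metadata.has_field_ids {
        return first;
    }
    Builder { metadata: file.load_metadata(), needs_fallback_ids: true }
}

/// Shape described in this PR: load once, inspect in memory, build once.
fn open_once(file: &MockFile) -> Builder {
    let metadata = file.load_metadata();
    let needs_fallback_ids = !metadata.has_field_ids;
    Builder { metadata, needs_fallback_ids }
}
```

Calling `open_twice` on a `MockFile` with `has_field_ids: false` leaves `footer_reads` at 2, while `open_once` on an identical file leaves it at 1. Both return a builder with `needs_fallback_ids` set, so only the number of metadata loads differs.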

Are these changes tested?

Existing tests.

@mbutrovich changed the title from "Double open fix" to "perf(reader): Avoid second create_parquet_record_batch_stream_builder() call for migrated tables" on Feb 24, 2026
@mbutrovich self-assigned this on Feb 24, 2026
@mbutrovich marked this pull request as ready for review on February 26, 2026 02:17
@mbutrovich requested a review from blackmwk on February 26, 2026 02:17