-
Notifications
You must be signed in to change notification settings - Fork 4k
GH-49114: [C++][Parquet] Fix converting schema failure with deep nested two-level encoding list structure #49125
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
@pitrou @emkornfield @wgtmac Please take a look. |
b494f79 to
7176212
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Fixes Parquet-to-Arrow schema conversion for deep nested two-level encoding LIST structures (and related MAP-in-LIST cases) that previously errored as illegal.
Changes:
- Refactors list element resolution into a new
ResolveList()helper to better handle nested two-level LIST encodings. - Relaxes repetition constraints to allow LIST/MAP nodes to be
REPEATEDwhen they are elements of an enclosing LIST. - Adds/updates unit tests to cover deep nested two-level LISTs and two-level LIST-of-MAP schemas, and adjusts expected error text.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
cpp/src/parquet/arrow/schema.cc |
Adjusts LIST/MAP schema conversion logic and introduces ResolveList() for nested/two-level handling. |
cpp/src/parquet/arrow/arrow_schema_test.cc |
Adds coverage for deep nested two-level LIST and allows previously-failing LIST-of-MAP case; updates an error expectation. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 2 out of 2 changed files in this pull request and generated no new comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Rationale for this change
Fix the failure when converting parquet schema with deep nested two-level encoding list structure to arrow schema.
What changes are included in this PR?
Modified the
ListToSchemaFieldandMapToSchemaFieldmethods.Are these changes tested?
Yes.
Are there any user-facing changes?
It allows some legal schemas that were once considered illegal.