Core, Orc, Data: Implementation of ORCFormatModel #15255
Conversation
stevenzwu
left a comment
I left the Void.class schema class comment on the Parquet PR
file,
conf,
schema,
// This is a behavioral change. Previously there were an error if metadata columns were
Can we do this behavior change in a separate PR? Or is it required to get the build to pass?
This is needed so that ORC handles partitioned reads the same way as the other readers.
It is intuitive why we remove the constant and metadata field ids from the read schema, but I need some help understanding how they are related to partitioned reads.
Also, can you point me to the code showing how the Parquet readers handle this situation?
Also, we should probably remove the "This is a behavioral change" part from the code comment; that belongs in the PR discussion. The code comment should just explain why we are stripping the constant and metadata fields from the schema.
Removed the comment.
In a nutshell, the VectorizedSparkOrcReaders, GenericOrcReader.buildReader, etc. functions need the full schema to create the "readers" for the constant columns, but the physical reader doesn't need them. Every caller currently makes sure that the unnecessary columns are removed when the physical reader is created. This will not work when the generic parametrization is used. We could do it in the ORCFormatModel, but in my opinion this change is just fixing a bug.
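For context, here is a minimal sketch of the kind of schema pruning described above, assuming the usual `idToConstant` map that Iceberg readers build for constant and metadata values. `pruneConstantColumns` is a hypothetical helper for illustration, not code from this PR:

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

import org.apache.iceberg.Schema;
import org.apache.iceberg.types.Types;

class ReadSchemaPruning {
  private ReadSchemaPruning() {
  }

  // Hypothetical helper: drop top-level fields whose values come from constants
  // (partition values, metadata columns). The full read schema is still used to
  // build the constant-column "readers", but only the remaining fields need to
  // be requested from the physical ORC reader.
  static Schema pruneConstantColumns(Schema readSchema, Map<Integer, ?> idToConstant) {
    List<Types.NestedField> physicalColumns = readSchema.columns().stream()
        .filter(field -> !idToConstant.containsKey(field.fieldId()))
        .collect(Collectors.toList());
    return new Schema(physicalColumns);
  }
}
```

Doing this pruning in one place (e.g. inside the format model) instead of in every caller is the gist of the change being discussed.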
Rebased on top of main.
singhpk234
left a comment
LGTM too, thanks @pvary!
Merged to main.
Part of: #12298
Implementation of the new API: #12774
OrcFormatModel and related tests