Scan performance regression after DataFusion 53 #3926

@comphead

Describe the bug

There is an observable performance degradation in native_datafusion scan operations after #3574.
That change upgraded a number of crates that are directly involved in the scan: datafusion, arrow, parquet, opendal, and object_store.

So far I can see the degradation on HDFS only: the scan in a typical test went from 3 minutes per task to 5 minutes per task.
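
For scale, here is a rough sketch of what those two numbers imply. It assumes the 3 and 5 minute figures are representative per-task wall-clock times; the `slowdown` helper is a hypothetical name used only for this illustration, not anything from the codebase.

```python
def slowdown(before_min: float, after_min: float) -> tuple[float, float]:
    """Return (ratio, percent increase) for a task-time regression."""
    ratio = after_min / before_min      # how many times longer the task now takes
    pct = (ratio - 1.0) * 100.0         # relative increase in percent
    return ratio, pct


# Figures reported above: 3 minutes per task before, 5 minutes after.
ratio, pct = slowdown(3.0, 5.0)
print(f"{ratio:.2f}x slower, +{pct:.0f}% task time")  # 1.67x slower, +67% task time
```

In other words, the scan task takes roughly two thirds longer than before the upgrade.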

Steps to reproduce

No response

Expected behavior

No response

Additional context

No response

Labels

area:scan (Parquet scan / data reading)
native_datafusion (Specific to native_datafusion scan type)
performance
priority:medium (Functional bugs, performance regressions, broken features)