[SPARK-56756][SQL] Add error class for recursiveFileLookup conflict with partitioned data source #55721

Open
markj-db wants to merge 2 commits into apache:master from markj-db:recursive-file-lookup-error-class

Contributor

markj-db commented May 6, 2026

What changes were proposed in this pull request?

PartitioningAwareFileIndex.listFiles rejects the combination of recursiveFileLookup=true and a non-empty partitionSpec().partitionColumns by throwing a raw java.lang.IllegalArgumentException with the message "Datasource with partition do not allow recursive file loading."

This PR replaces that with a tagged AnalysisException using a new error class:

  • New error class RECURSIVE_FILE_LOOKUP_NOT_SUPPORTED_FOR_PARTITIONED_DATA_SOURCE (sqlState 0A000) in error-conditions.json.
  • New helper QueryCompilationErrors.recursiveFileLookupNotSupportedForPartitionedDataSourceError().
  • Throw site in PartitioningAwareFileIndex.scala updated to use the helper.
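
The three changes above can be pictured with a small standalone model. This is only a sketch: `TaggedAnalysisError`, `ModelErrors`, and `ModelFileIndex` are hypothetical stand-ins invented for illustration, not Spark's actual `AnalysisException`, `QueryCompilationErrors`, or `PartitioningAwareFileIndex`, and the real helper's signature and return type may differ.

```scala
// Standalone model of the pattern; none of these types are Spark's real classes.

// Hypothetical stand-in for a tagged, user-facing error (models AnalysisException).
final class TaggedAnalysisError(
    val condition: String,
    val sqlState: String,
    message: String)
  extends Exception(s"[$condition] $message")

// Hypothetical stand-in for the QueryCompilationErrors helper.
object ModelErrors {
  val Condition: String =
    "RECURSIVE_FILE_LOOKUP_NOT_SUPPORTED_FOR_PARTITIONED_DATA_SOURCE"

  def recursiveFileLookupNotSupportedForPartitionedDataSourceError(): Throwable =
    new TaggedAnalysisError(
      condition = Condition,
      sqlState = "0A000",
      message = "Recursive file loading is not supported when the data source " +
        "has explicit partition columns.")
}

// Hypothetical stand-in for the guard described for listFiles.
object ModelFileIndex {
  def checkListFiles(
      recursiveFileLookup: Boolean,
      partitionColumns: Seq[String]): Unit = {
    if (recursiveFileLookup && partitionColumns.nonEmpty) {
      // Before this PR, the equivalent spot threw:
      //   new IllegalArgumentException(
      //     "Datasource with partition do not allow recursive file loading.")
      throw ModelErrors.recursiveFileLookupNotSupportedForPartitionedDataSourceError()
    }
  }
}
```

In this model, a caller can match on `condition` instead of parsing the message text, which is the practical benefit of moving from an untagged exception to an error class.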

Why are the changes needed?

The raw IllegalArgumentException is unclassified and does not surface as a user-facing error with a clear message. Replacing it with an AnalysisException using a proper error class ensures it is correctly classified as a user error with an actionable message.

Does this PR introduce any user-facing change?

Yes. Users who hit this error will now see a clearer message:

Recursive file loading is not supported when the data source has explicit partition columns. Either remove the option "recursiveFileLookup", or read the data without supplying partition columns (for example, do not read a partitioned table or set partition-column options such as "cloudFiles.partitionColumns").

Previously the error was a raw IllegalArgumentException with the message "Datasource with partition do not allow recursive file loading."

How was this patch tested?

Added "recursiveFileLookup with a partitioned catalog table is rejected" in FileBasedDataSourceSuite, which creates a partitioned Parquet catalog table, then asserts that reading it with recursiveFileLookup=true throws an AnalysisException with condition RECURSIVE_FILE_LOOKUP_NOT_SUPPORTED_FOR_PARTITIONED_DATA_SOURCE.

Was this patch authored or co-authored using generative AI tooling?

Generated-by: Claude (claude-sonnet-4-6)

…ith partitioned data source

Replace the raw IllegalArgumentException thrown when recursiveFileLookup=true
is combined with a partitioned data source with a tagged AnalysisException
using the new RECURSIVE_FILE_LOOKUP_NOT_SUPPORTED_FOR_PARTITIONED_DATA_SOURCE
error class, so the failure is correctly classified as a user error.
},
"RECURSIVE_FILE_LOOKUP_NOT_SUPPORTED_FOR_PARTITIONED_DATA_SOURCE" : {
"message" : [
"Recursive file loading is not supported when the data source has explicit partition columns. Either remove the option \"recursiveFileLookup\", or read the data without supplying partition columns (for example, do not read a partitioned table or set partition-column options such as \"cloudFiles.partitionColumns\")."
Contributor

Suggested change
- "Recursive file loading is not supported when the data source has explicit partition columns. Either remove the option \"recursiveFileLookup\", or read the data without supplying partition columns (for example, do not read a partitioned table or set partition-column options such as \"cloudFiles.partitionColumns\")."
+ "Recursive file loading is not supported when the data source has explicit partition columns. Either remove the option \"recursiveFileLookup\", or read the data without supplying partition columns (for example, do not read a partitioned table)."

Strip the internal cloudFiles.partitionColumns example from the error
message for RECURSIVE_FILE_LOOKUP_NOT_SUPPORTED_FOR_PARTITIONED_DATA_SOURCE
as this is OSS code.