Aliyun/Dell: Classify missing object reads as NotFoundException #16300

Open
atovk wants to merge 1 commit into apache:main from atovk:codex/classify-missing-object-reads

@atovk atovk commented May 12, 2026

Summary

  • translate Aliyun OSS NoSuchKey / NoSuchBucket stream-open failures to Iceberg NotFoundException
  • translate Dell ECS HTTP 404 readObjectStream failures to Iceberg NotFoundException
  • add regression coverage for both missing-object input stream paths

Root cause

BaseMetastoreTableOperations.refreshFromMetadataLocation stops metadata read retries on Iceberg NotFoundException. Aliyun OSS and Dell ECS already treat missing objects as absent in exists(), but their stream-read paths leaked provider SDK exceptions. A read against a stale catalog metadata location could therefore be retried even though the missing-object failure is permanent, instead of failing fast.
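
To illustrate why the exception type matters here, the following is a minimal sketch of a retry loop that gives up immediately on a not-found error and keeps retrying anything else. It is not Iceberg's actual retry code: StopRetrySketch, readWithRetries, and the nested NotFoundException are hypothetical stand-ins defined only for this example.

```java
import java.util.function.Supplier;

final class StopRetrySketch {

  // Hypothetical stand-in for Iceberg's NotFoundException, declared here
  // only so the sketch compiles without external dependencies.
  static class NotFoundException extends RuntimeException {
    NotFoundException(String message) {
      super(message);
    }
  }

  // Hypothetical helper: runs the read up to maxAttempts times.
  // A NotFoundException is treated as permanent and rethrown at once;
  // any other RuntimeException is retried until attempts run out.
  static <T> T readWithRetries(Supplier<T> read, int maxAttempts) {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be at least 1");
    }
    RuntimeException last = null;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        return read.get();
      } catch (NotFoundException e) {
        // Permanent condition: retrying cannot make the object appear.
        throw e;
      } catch (RuntimeException e) {
        // Possibly transient: remember it and try again.
        last = e;
      }
    }
    throw last;
  }
}
```

Under this model, a provider SDK exception that is not translated falls into the second catch branch, so a missing object consumes every attempt before the caller sees the failure.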

Closes #16299.

Testing

  • JAVA_HOME=/Users/nullwo/.gradle/jdks/amazon_com_inc_-17-aarch64-os_x/amazon-corretto-17.jdk/Contents/Home ./gradlew --no-daemon --max-workers=1 :iceberg-aliyun:check :iceberg-dell:check

Aliyun OSS and Dell ECS stream reads were leaking provider-specific missing-object exceptions, unlike S3, ADLS, GCS, and Hadoop. The metastore refresh path stops retries on Iceberg NotFoundException, so a read of a stale metadata location could keep retrying a permanent missing-object condition until the retry budget is exhausted.

This keeps the storage-specific exists behavior and translates missing-object errors at the read boundary to NotFoundException. Non-not-found provider errors are preserved.
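
The translation at the read boundary can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: ProviderException, its errorCode and httpStatus fields, and the open helper are hypothetical stand-ins for the real Aliyun OSS and Dell ECS SDK types, and the single isMissingObject predicate merges the two providers' checks (error codes NoSuchKey / NoSuchBucket, HTTP 404) purely for brevity.

```java
import java.io.InputStream;
import java.util.function.Supplier;

final class MissingObjectTranslationSketch {

  // Hypothetical stand-in for a provider SDK exception; the real SDK
  // exception types and accessors differ.
  static class ProviderException extends RuntimeException {
    final String errorCode;
    final int httpStatus;

    ProviderException(String errorCode, int httpStatus) {
      super(errorCode + " (HTTP " + httpStatus + ")");
      this.errorCode = errorCode;
      this.httpStatus = httpStatus;
    }
  }

  // Hypothetical stand-in for Iceberg's NotFoundException.
  static class NotFoundException extends RuntimeException {
    NotFoundException(String message, Throwable cause) {
      super(message, cause);
    }
  }

  // Assumption: a missing object surfaces either as one of these error
  // codes or as an HTTP 404, depending on the provider.
  static boolean isMissingObject(ProviderException e) {
    return "NoSuchKey".equals(e.errorCode)
        || "NoSuchBucket".equals(e.errorCode)
        || e.httpStatus == 404;
  }

  // Opens a stream and translates only missing-object failures;
  // every other provider error is rethrown unchanged.
  static InputStream open(String location, Supplier<InputStream> opener) {
    try {
      return opener.get();
    } catch (ProviderException e) {
      if (isMissingObject(e)) {
        throw new NotFoundException("Location does not exist: " + location, e);
      }
      throw e;
    }
  }
}
```

Keeping the original exception as the cause preserves the provider's diagnostic detail, while errors such as an access-denied response pass through untouched and remain subject to whatever retry policy the caller applies.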

Constraint: BaseMetastoreTableOperations stops metadata refresh retries on NotFoundException
Rejected: Add an exists probe before every metadata read | adds an extra object-store request and duplicates behavior expected of FileIO read paths
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: New FileIO read paths should classify permanent not-found conditions as NotFoundException
Tested: JAVA_HOME=/Users/nullwo/.gradle/jdks/amazon_com_inc_-17-aarch64-os_x/amazon-corretto-17.jdk/Contents/Home ./gradlew --no-daemon --max-workers=1 :iceberg-aliyun:check :iceberg-dell:check
Related: apache#16299
@atovk atovk changed the title from "Aliyun, Dell: Classify missing object reads as NotFoundException" to "Aliyun/Dell: Classify missing object reads as NotFoundException" May 12, 2026
@atovk atovk marked this pull request as ready for review May 15, 2026 05:31