[ams] Force ReachableFileCleanup when snapshots exist outside main ancestry#4231
Open
j1wonpark wants to merge 1 commit into
Open
[ams] Force ReachableFileCleanup when snapshots exist outside main ancestry#4231j1wonpark wants to merge 1 commit into
j1wonpark wants to merge 1 commit into
Conversation
… ancestry Iceberg's auto-selected IncrementalFileCleanup can silently truncate its ancestor walk when a parent snapshot is missing, deleting data files the current snapshot still references. Force the safe ReachableFileCleanup only when snapshots exist outside the main ancestry; healthy tables are unchanged. Signed-off-by: Jiwon Park <jpark92@outlook.kr>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Why are the changes needed?
When AMS expires snapshots on a single-ref Iceberg table, Iceberg auto-selects
IncrementalFileCleanup. That strategy walks the current snapshot's ancestor chain viaSnapshotUtil.ancestorIds, and if a parent snapshot is missing from metadata the walk stops silently. Snapshots below the break are then treated as non-ancestors, so theADDEDentries of the manifests they wrote get reverted during cleanup. If such a manifest was superseded by a merge but its data file is still carried over (asEXISTING) in a manifest the current snapshot references, cleanup deletes a file the live table still points at, and reads fail withNotFoundException.This is the same root cause as apache/iceberg#13568, fixed by apache/iceberg#13614 in Iceberg 1.10.0, where the strategy selector now falls back to
ReachableFileCleanupwhen snapshots exist outside the main ancestry. Iceberg1.6.x ~ 1.9.xlack that guard and unconditionally pick the incremental strategy for a single ref. Since bumping Iceberg core isn't always possible for existing deployments, this PR backports an equivalent guard at the Amoro layer.Brief change log
IcebergTableMaintainer: before committingexpireSnapshots, checkhasSnapshotsOutsideMainAncestry()— the sameSnapshotUtil.ancestorIdswalk the cleanup uses — and forceReachableFileCleanuponly in that at-risk state. Healthy tables keep the automatic incremental selection, so there is no cost impact.ReachableFileCleanupBridge(new, in packageorg.apache.iceberg):RemoveSnapshots#withIncrementalCleanup(boolean)is package-private, so the bridge calls it without reflection.How was this patch tested?
TestExpireSnapshotsKeepReferencedFiles) that reproduces the carry-over + broken-chain state and asserts referenced data files survive expiration.Documentation