feat: add manifest.json to eliminate S3 Walk during restore #1405
Open
minguyen9988 wants to merge 1 commit into
During upload, record every file (path, size, last-modified) in a manifest.json stored alongside metadata.json. During restore, download the manifest first and use it to enumerate files directly via GetObject, skipping the expensive recursive Walk (ListObjectsV2).

For a backup with 50k files across 500 parts, this eliminates ~50 ListObjectsV2 pages of 1000 keys each, replacing them with a single GetObject for the ~2MB manifest.

Key design decisions:
- Manifest is built incrementally during upload (thread-safe via mutex)
- Graceful fallback: older backups without manifest.json transparently fall back to Walk
- Manifest upload failure is non-fatal (logged as warning)
- ManifestEntry implements RemoteFile interface for seamless integration with existing download code paths
- No changes to compressed (tar.gz) backup format; the manifest optimization applies to DirectoryFormat, where Walk is the bottleneck

Closes Altinity#1375
CI failures are pre-existing infrastructure issues, not related to this PR. Both failing jobs failed before reaching any test code.
Could a maintainer please re-run the failed jobs? (I don't have admin rights to trigger re-runs on this repo.)
Updated CI analysis after all jobs completed. Result: 0 failures are related to the manifest changes in this PR. Full breakdown of all 14 failed Test jobs:
Positive signal: The manifest code IS running successfully in integration tests that got far enough to execute upload paths. No manifest-related errors appear in any job's logs.
Summary
Implements #1375: upload-time file manifest to eliminate the expensive S3 `ListObjectsV2` Walk during restore.

Problem
During `restore_remote`, each part directory triggers a `Walk` (recursive `ListObjectsV2`) to enumerate files before downloading. For large backups with thousands of parts, this produces hundreds of paginated API calls (1000 keys per page), adding significant latency, often minutes, before any actual data transfer begins.

Solution
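The approach in this section, manifest first with Walk as the fallback for older backups, can be sketched in Go roughly as follows. This is a minimal illustration and not the PR's code: `objectStore`, `errNotFound`, `listBackupFiles`, and `memStore` are hypothetical names, and treating a corrupt manifest as a hard error is an assumption the PR text does not confirm.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"sort"
	"strings"
)

// errNotFound is a hypothetical sentinel meaning "no such object".
var errNotFound = errors.New("object not found")

// objectStore is a hypothetical, minimal view of remote storage.
type objectStore interface {
	Get(key string) ([]byte, error)       // stands in for one GetObject call
	Walk(prefix string) ([]string, error) // stands in for the recursive listing
}

// manifestFiles decodes only the files[].path part of manifest.json.
type manifestFiles struct {
	Files []struct {
		Path string `json:"path"`
	} `json:"files"`
}

// listBackupFiles returns paths relative to backupRoot and reports
// whether the manifest was used instead of a Walk.
func listBackupFiles(s objectStore, backupRoot string) ([]string, bool, error) {
	raw, err := s.Get(backupRoot + "/manifest.json")
	if errors.Is(err, errNotFound) {
		// Older backup without a manifest: fall back to Walk.
		paths, walkErr := s.Walk(backupRoot + "/")
		return paths, false, walkErr
	}
	if err != nil {
		return nil, false, err
	}
	var m manifestFiles
	if err := json.Unmarshal(raw, &m); err != nil {
		// Assumption: a corrupt manifest is reported, not silently ignored.
		return nil, false, fmt.Errorf("decode manifest: %w", err)
	}
	paths := make([]string, 0, len(m.Files))
	for _, f := range m.Files {
		paths = append(paths, f.Path)
	}
	return paths, true, nil
}

// memStore is an in-memory fake used only to exercise the sketch.
type memStore map[string][]byte

func (m memStore) Get(key string) ([]byte, error) {
	b, ok := m[key]
	if !ok {
		return nil, errNotFound
	}
	return b, nil
}

func (m memStore) Walk(prefix string) ([]string, error) {
	var out []string
	for k := range m {
		if strings.HasPrefix(k, prefix) {
			out = append(out, strings.TrimPrefix(k, prefix))
		}
	}
	sort.Strings(out)
	return out, nil
}

func main() {
	withManifest := memStore{
		"new/manifest.json": []byte(`{"version":1,"files":[{"path":"shadow/db/t/p1/data.bin"}]}`),
	}
	legacy := memStore{"old/shadow/db/t/p1/data.bin": []byte("x")}

	paths, used, err := listBackupFiles(withManifest, "new")
	fmt.Println(paths, used, err)
	paths, used, err = listBackupFiles(legacy, "old")
	fmt.Println(paths, used, err)
}
```

Both branches return the same relative paths, so the download code that consumes the list does not need to know which source produced it.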
During `upload`, record every file's relative path, size, and last-modified time in a `manifest.json` file stored alongside `metadata.json` in the backup root. During `restore_remote`, download this single manifest file first, use it to enumerate files, and fetch each one directly via `GetObject`, completely skipping the Walk.

Performance Impact
For a backup with 50,000 files across 500 parts:

- Without the manifest: `ListObjectsV2` calls (1 per part × ~1 page each) + pagination overhead = ~50 paginated requests at the S3 level
- With the manifest: a single `GetObject` for the ~2MB `manifest.json`

Design Decisions
- Older backups without `manifest.json` transparently fall back to the existing Walk behavior, preserving full backward compatibility
- `ManifestEntry` implements `RemoteFile` interface: manifest entries integrate seamlessly with existing download code paths
- `manifest.json` includes a version field for future format evolution

Files Changed
- `pkg/storage/manifest.go`: `DownloadPathWithManifest`
- `pkg/storage/manifest_test.go`
- `pkg/backup/backuper.go`: `fileManifest` field + thread-safe recording helpers
- `pkg/backup/upload.go`
- `pkg/backup/download.go`

manifest.json Format
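The JSON document in this section could be decoded with Go types along these lines. The JSON keys come from the PR's example; the struct layout, the `parseManifest` helper, and the rejection of unknown versions are assumptions made for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// Manifest mirrors the top-level keys of the example manifest.json.
type Manifest struct {
	Version    int             `json:"version"`
	BackupName string          `json:"backup_name"`
	CreatedAt  time.Time       `json:"created_at"`
	TotalSize  int64           `json:"total_size"`
	TotalFiles int             `json:"total_files"`
	Files      []ManifestEntry `json:"files"`
}

// ManifestEntry mirrors one element of the "files" array.
type ManifestEntry struct {
	Path         string    `json:"path"`
	Size         int64     `json:"size"`
	LastModified time.Time `json:"last_modified"`
}

// parseManifest decodes raw JSON and, as an assumption, accepts only
// version 1 so that future formats are not misread.
func parseManifest(raw []byte) (*Manifest, error) {
	var m Manifest
	if err := json.Unmarshal(raw, &m); err != nil {
		return nil, fmt.Errorf("decode manifest: %w", err)
	}
	if m.Version != 1 {
		return nil, fmt.Errorf("unsupported manifest version %d", m.Version)
	}
	return &m, nil
}

// sampleManifest is the example from the PR description.
const sampleManifest = `{
  "version": 1,
  "backup_name": "my_backup_2025",
  "created_at": "2025-06-01T12:00:00Z",
  "total_size": 1073741824,
  "total_files": 5000,
  "files": [
    {
      "path": "shadow/default/orders/default/20250101_1_1_0/data.bin",
      "size": 1048576,
      "last_modified": "2025-06-01T12:00:00Z"
    }
  ]
}`

func main() {
	m, err := parseManifest([]byte(sampleManifest))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(m.BackupName, m.TotalFiles, m.Files[0].Path)
}
```

`time.Time` decodes RFC 3339 timestamps such as `2025-06-01T12:00:00Z` without a custom unmarshaller, which is why the timestamp fields need no extra code.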
    {
      "version": 1,
      "backup_name": "my_backup_2025",
      "created_at": "2025-06-01T12:00:00Z",
      "total_size": 1073741824,
      "total_files": 5000,
      "files": [
        {
          "path": "shadow/default/orders/default/20250101_1_1_0/data.bin",
          "size": 1048576,
          "last_modified": "2025-06-01T12:00:00Z"
        }
      ]
    }

Testing
- `RemoteFile` interface conversion, capacity pre-allocation, large file counts (10k/50k)
- `go vet` clean on both `pkg/storage/` and `pkg/backup/`

Closes #1375