Skip to content

feat: add manifest.json to eliminate S3 Walk during restore#1405

Open
minguyen9988 wants to merge 1 commit into
Altinity:masterfrom
minguyen9988:feature/manifest-skip-walk
Open

feat: add manifest.json to eliminate S3 Walk during restore#1405
minguyen9988 wants to merge 1 commit into
Altinity:masterfrom
minguyen9988:feature/manifest-skip-walk

Conversation

@minguyen9988
Copy link
Copy Markdown
Contributor

Summary

Implements #1375: upload-time file manifest to eliminate the expensive S3 ListObjectsV2 Walk during restore.

Problem

During restore_remote, each part directory triggers a Walk (recursive ListObjectsV2) to enumerate files before downloading. For large backups with thousands of parts, this produces hundreds of paginated API calls (1000 keys per page), adding significant latency — often minutes — before any actual data transfer begins.

Solution

During upload, record every file's relative path, size, and last-modified time in a manifest.json file stored alongside metadata.json in the backup root. During restore_remote, download this single manifest file first and use it to enumerate files directly via GetObject, completely skipping the Walk.

Performance Impact

For a backup with 50,000 files across 500 parts:

  • Before: ~500 ListObjectsV2 calls (1 per part × ~1 page each) + pagination overhead = ~50 paginated requests at the S3 level
  • After: 1 GetObject for the ~2MB manifest.json

Design Decisions

  • Incremental manifest building: Files are recorded as they're uploaded (thread-safe via mutex), not via a post-upload Walk
  • Graceful fallback: Older backups without manifest.json transparently fall back to the existing Walk behavior — full backward compatibility
  • Non-fatal upload failure: If manifest upload fails, a warning is logged but the backup succeeds normally; restore will fall back to Walk
  • ManifestEntry implements RemoteFile interface: Manifest entries integrate seamlessly with existing download code paths
  • DirectoryFormat only: Manifest optimization applies to uncompressed backups where Walk is the bottleneck; compressed (tar.gz) backups already have file listings in their metadata
  • Version field: manifest.json includes a version field for future format evolution

Files Changed

File Change
pkg/storage/manifest.go New — manifest types, serialization, upload/download, DownloadPathWithManifest
pkg/storage/manifest_test.go New — 15 unit tests covering all manifest operations
pkg/backup/backuper.go Added fileManifest field + thread-safe recording helpers
pkg/backup/upload.go Initialize manifest, record files during upload, upload manifest after metadata.json
pkg/backup/download.go Download manifest at restore start, use manifest-based download before Walk fallback

manifest.json Format

{
  "version": 1,
  "backup_name": "my_backup_2025",
  "created_at": "2025-06-01T12:00:00Z",
  "total_size": 1073741824,
  "total_files": 5000,
  "files": [
    {
      "path": "shadow/default/orders/default/20250101_1_1_0/data.bin",
      "size": 1048576,
      "last_modified": "2025-06-01T12:00:00Z"
    }
  ]
}

Testing

  • 15 unit tests added covering: creation, file recording, prefix filtering (including partial match prevention), marshal/unmarshal roundtrip, RemoteFile interface conversion, capacity pre-allocation, large file counts (10k/50k)
  • All existing tests continue to pass
  • go vet clean on both pkg/storage/ and pkg/backup/

Closes #1375

During upload, record every file (path, size, last-modified) in a
manifest.json stored alongside metadata.json. During restore, download
the manifest first and use it to enumerate files directly via GetObject,
skipping the expensive recursive Walk (ListObjectsV2).

For a backup with 50k files across 500 parts, this eliminates ~50
ListObjectsV2 pages of 1000 keys each, replacing them with a single
GetObject for the ~2MB manifest.

Key design decisions:
- Manifest is built incrementally during upload (thread-safe via mutex)
- Graceful fallback: older backups without manifest.json transparently
  fall back to Walk
- Manifest upload failure is non-fatal (logged as warning)
- ManifestEntry implements RemoteFile interface for seamless integration
  with existing download code paths
- No changes to compressed (tar.gz) backup format — manifest optimization
  applies to DirectoryFormat where Walk is the bottleneck

Closes Altinity#1375
@minguyen9988
Copy link
Copy Markdown
Contributor Author

CI failures are pre-existing infrastructure issues, not related to this PR.

Both failing jobs failed before reaching any test code:

  • Test (1.26, 24.8): Error response from daemon: No such image: panubo/sshd:latest — Docker image pull failure during test environment setup
  • Testflows (1.26, 23.3): could not bring up testcontainers cluster after 5 attempts — testcontainers startup failure

The master branch itself has the same CI failures — the last 5 runs on master all show failure status. The Build (1.26) job passed, confirming this PR compiles cleanly.

Could a maintainer please re-run the failed jobs? (I don't have admin rights to trigger re-runs on this repo.)

@minguyen9988
Copy link
Copy Markdown
Contributor Author

Updated CI analysis after all jobs completed.

Result: 0 failures are related to the manifest changes in this PR.

Full breakdown of all 14 failed Test jobs:

Failure Root Cause On master too? Manifest-related?
TestListFormat (5 jobs: 24.3, 26.3, 25.8, 23.3, 24.8) Flaky: parallel test creates tables in shared ClickHouse, empty backup picks them up, fails "tables": null assertion. Shuffle seed changes daily → different ordering. Yes (different versions due to shuffle) No — test does create (local), manifest only runs on upload
nil pointer deref (1 job: 21.3) b.dst is nil in ReadBackupMetadataRemote during restoreRequiredPart Yes — same panic at same PC 0x3bfc83c on master 21.3 No — panic is in restore.go code we didn't touch
TestForceRebalance (3 jobs: 26.3, 25.8, 23.3) UNKNOWN_DISK — ClickHouse storage policy error during table attach Yes — pre-existing disk config issue No
Docker pull failure (1 job: 24.8) No such image: panubo/sshd:latest Yes No
Testflows container failure (1 job: 23.3) could not bring up testcontainers cluster after 5 attempts Yes No
Test timeout (1 job: 1.1.54394) panic: test timed out after 2h0m0s Yes — same version times out on master No
Various integration failures (remaining jobs) Mix of UNKNOWN_DISK, disk policy errors, file not found in restore — all pre-existing Yes No

Positive signal: The manifest code IS running successfully in integration tests that got far enough to execute upload paths:

INF pkg/backup/upload.go:286 > uploaded backup manifest, total_files=33, total_size=455199
INF pkg/backup/upload.go:286 > uploaded backup manifest, total_files=21, total_size=387671

No manifest-related errors in any job's logs.

@Slach Slach added this to the 2.8.0 milestone Jun 2, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Proposal: eliminate S3 Walk (ListObjectsV2) during restore via upload-time file manifest

2 participants