backup: DynamoDB encoder for tables and items (Phase 0a) #716
Conversation
📝 Walkthrough

This PR enhances the DynamoDB backup encoder and introduces a complete S3 backup encoder implementation. The DynamoDB encoder now buffers items by generation, handles migration-source generations, canonicalizes numeric keys, produces deterministic schemas, and strictly validates key syntax. The new S3 encoder assembles multipart objects from chunked blobs, emits bucket metadata and sidecars, suppresses stale generations, handles collision renaming with a KEYMAP, and validates object key safety.

Changes: DynamoDB & S3 Backup Encoding
Sequence Diagram(s)

sequenceDiagram
participant User
participant DDBEncoder as DynamoDB Encoder
participant Buffer as Per-Generation Buffer
participant Finalize as Flush & Emit
User->>DDBEncoder: HandleTableMeta(meta_key, schema_proto)
DDBEncoder->>DDBEncoder: Decode schema, validate
User->>DDBEncoder: HandleItem(item_key, item_proto)
DDBEncoder->>DDBEncoder: Parse key → extract generation
DDBEncoder->>Buffer: Append item under its generation
Note over User,DDBEncoder: (possibly multiple calls with mixed generations)
User->>Finalize: Finalize()
Finalize->>Finalize: Compute emitOrder: [migration_source_gen, active_gen]
Finalize->>Finalize: Filter stale generations, warn
loop For each generation in emitOrder
Finalize->>Finalize: Canonicalize numeric keys
Finalize->>Finalize: Write items to per-gen .json file
end
Finalize-->>User: ✓ Deterministic output
sequenceDiagram
participant User
participant S3Encoder as S3 Encoder
participant Scratch as Scratch Storage
participant Manifest as Manifest/Metadata
participant Assemble as Assembly & Finalize
User->>S3Encoder: HandleBucketMeta(key, bucket_meta_json)
S3Encoder->>Manifest: Store bucket state & active generation
User->>S3Encoder: HandleObjectManifest(key, manifest_json)
S3Encoder->>Manifest: Decode manifest, extract uploadID & parts
User->>S3Encoder: HandleBlob(blob_key, chunk_bytes)
S3Encoder->>S3Encoder: Parse blob key → (bucket, object, partNo, chunkNo)
S3Encoder->>Scratch: Write chunk atomically, validate path safety
Note over User,S3Encoder: (possibly interleaved across buckets/objects)
User->>Assemble: Finalize()
Assemble->>Assemble: Filter chunks by manifest uploadID & declared parts
Assemble->>Assemble: Verify all (partNo, partVersion, chunkNo) indexes present
Assemble->>Scratch: Read chunks in sorted (partNo, partVersion, chunkNo) order
Assemble->>Assemble: Assemble into final body, emit .elastickv-meta.json sidecar
Assemble->>Assemble: Detect/handle collisions (reserved suffix, leaf-data dir)
Assemble->>Assemble: Record renames in KEYMAP.jsonl (if enabled)
Assemble->>Assemble: Suppress stale-generation objects, warn
Assemble->>Scratch: Delete scratch subtree
Assemble-->>User: ✓ Assembled objects + sidecars
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
Code Review
This pull request introduces a DynamoDB encoder designed to export table schemas and items into a structured filesystem layout for logical backups. The implementation handles the conversion of internal protobuf representations into AWS-compatible JSON formats, supporting various DynamoDB attribute types and secondary indexes. Feedback suggests adopting a fail-fast error handling strategy during the finalization phase when flushing tables and refining the serialization of set attributes to ensure consistent JSON output for empty or nil collections.
var firstErr error
for _, st := range d.tables {
    if st.schema == nil {
        d.emitWarn("ddb_orphan_items",
            "encoded_table", st.encoded,
            "buffered_items", len(st.items))
        continue
    }
    if err := d.flushTable(st); err != nil && firstErr == nil {
        firstErr = err
    }
}
return firstErr
The current error handling in Finalize correctly implements a lenient path for incomplete entries (orphan items) by emitting a warning instead of failing or dropping them. This aligns with the repository rule to avoid silently dropping entries during serialization to ensure data preservation on disk. For actual flush errors, failing fast is appropriate.
Suggested change:

for _, st := range d.tables {
    if st.schema == nil {
        d.emitWarn("ddb_orphan_items",
            "encoded_table", st.encoded,
            "buffered_items", len(st.items))
        continue
    }
    if err := d.flushTable(st); err != nil {
        return err
    }
}
return nil
References
- Avoid silently dropping entries during serialization or normalization. If placeholder or incomplete entries are valid in some contexts, provide a specific, more lenient handling path for them to ensure they are preserved on disk and avoid state inconsistencies.
func setAttributeValueToPublic(av *pb.DynamoAttributeValue) map[string]any {
    switch v := av.GetValue().(type) {
    case *pb.DynamoAttributeValue_Ss:
        return map[string]any{"SS": append([]string{}, v.Ss.GetValues()...)}
    case *pb.DynamoAttributeValue_Ns:
        return map[string]any{"NS": append([]string{}, v.Ns.GetValues()...)}
    case *pb.DynamoAttributeValue_Bs:
        return map[string]any{"BS": append([][]byte{}, v.Bs.GetValues()...)}
    }
    return nil
}
The current implementation of setAttributeValueToPublic can produce null in the JSON output for nil set slices. To avoid state inconsistencies during normalization and ensure entries are correctly preserved, handle nil set slices to produce an empty JSON array [] instead of null.
Suggested change:

func setAttributeValueToPublic(av *pb.DynamoAttributeValue) map[string]any {
    switch v := av.GetValue().(type) {
    case *pb.DynamoAttributeValue_Ss:
        return map[string]any{"SS": append([]string(nil), v.Ss.GetValues()...)}
    case *pb.DynamoAttributeValue_Ns:
        return map[string]any{"NS": append([]string(nil), v.Ns.GetValues()...)}
    case *pb.DynamoAttributeValue_Bs:
        return map[string]any{"BS": append([][]byte(nil), v.Bs.GetValues()...)}
    }
    return nil
}
References
- Avoid silently dropping entries during serialization or normalization to ensure they are preserved on disk and avoid state inconsistencies.
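As general Go background on why the choice of base slice matters here (this is standard encoding/json behaviour, not code from this PR):

```go
package main

import (
	"encoding/json"
	"fmt"
)

func main() {
	var values []string // a nil slice, as a proto getter can return for an empty set

	// A nil base slice stays nil when nothing is appended, and a nil slice
	// marshals to JSON null.
	fromNil, _ := json.Marshal(map[string]any{"SS": append([]string(nil), values...)})

	// An empty (non-nil) base slice stays non-nil, and marshals to [].
	fromEmpty, _ := json.Marshal(map[string]any{"SS": append([]string{}, values...)})

	fmt.Println(string(fromNil))   // {"SS":null}
	fmt.Println(string(fromEmpty)) // {"SS":[]}
}
```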
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 63bfe8db9b
enc := rest[:idx]
if _, err := base64.RawURLEncoding.DecodeString(enc); err != nil {
    return "", errors.Wrap(ErrDDBMalformedKey, err.Error())
}
return enc, nil
Preserve item generation when grouping DynamoDB rows
Item keys encode !ddb|item|<table>|<generation>|..., but this parser drops everything after the first separator and the encoder buffers rows only by table name. In real clusters, old-generation rows can coexist temporarily (e.g., delete/recreate before async cleanup finishes), and those stale rows will be emitted under the current schema; this can silently restore deleted data or make Finalize fail when primary-key names changed across generations. The backup path should retain generation and filter rows against the active schema generation (and any explicitly supported migration generation) instead of merging all generations.
Builds on PR #716 (DynamoDB) and PR #717 (s3keys.ParseBlobKey). Adds the S3 encoder for the Phase 0 logical-backup decoder.

Snapshot prefixes handled:
- !s3|bucket|meta|<bucket> -> s3/<bucket>/_bucket.json (live s3BucketMeta projected into the dump-format s3PublicBucket; cluster-internal fields like CreatedAtHLC and Generation stripped).
- !s3|obj|head|<bucket><gen><object> -> s3/<bucket>/<object>.elastickv-meta.json (live s3ObjectManifest projected into the dump-format s3PublicManifest; UploadID and per-part chunk arrays stripped).
- !s3|blob|<bucket><gen><object><uploadID><partNo><chunkNo>[<partVersion>]: spilled to a per-(bucket, object) scratch directory as it arrives; assembled at Finalize by walking sortChunkKeys output (partNo, partVersion, chunkNo) and concatenating into the final body file via writeFileAtomic-style tmp+rename. Memory: O(num_objects); body bytes never held in memory.
- !s3|upload|meta|, !s3|upload|part|: excluded by default; --include-incomplete-uploads emits them as opaque {prefix, key, value} JSONL records under s3/<bucket>/_incomplete_uploads/.
- !s3|bucket|gen|, !s3|gc|upload|, !s3route|: HandleIgnored no-op so the master pipeline can dispatch all !s3|* prefixes uniformly.

Reserved-suffix collision:
- A user object key ending in .elastickv-meta.json is rejected with ErrS3MetaSuffixCollision by default; --rename-collisions appends .user-data and records the rename in s3/<bucket>/KEYMAP.jsonl with KindMetaCollision so the on-disk reverse map is sound.
- Per-bucket KEYMAP.jsonl is opened lazily and dropped if Finalize finds zero records (the spec's "omit empty file" rule).

Body assembly:
- Each blob chunk is written atomically to scratch as it arrives; the chunk path is registered in the object's chunkPaths map keyed by (uploadID, partNo, chunkNo, partVersion).
- Finalize sorts the chunk keys by (partNo, partVersion, chunkNo) and concatenates the matching scratch files into a single <bodyPath> via tmp+rename.
- The scratch tree is rm -rf'd on Finalize via deferred RemoveAll, so a successful run leaves nothing behind.

Tests: bucket meta + single-chunk object round-trip, multipart ordering verification (chunks emitted out of order, body assembled in part/chunk order), orphan-chunks warning, .elastickv-meta.json collision rejection (default) and rename (with flag) including KEYMAP entry, malformed JSON rejection on both manifest and bucket paths, ignored-prefix no-op, --include-incomplete-uploads round-trip, default no _incomplete_uploads dir, versioned blob assembly.
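To make the Finalize-time assembly concrete, here is a minimal sketch of the sort-and-concatenate step under the ordering described above. Names such as chunkKey and assembleBody are illustrative, not the PR's actual identifiers, and the uploadID filtering step is omitted for brevity:

```go
package backupsketch

import (
	"io"
	"os"
	"sort"
)

// chunkKey mirrors the per-chunk routing key described above.
type chunkKey struct {
	partNo, partVersion, chunkNo uint64
}

// assembleBody concatenates the scratch files for one object into dst in
// (partNo, partVersion, chunkNo) order, writing a tmp file first and
// renaming it into place so readers never observe a partial body.
func assembleBody(dst string, chunks map[chunkKey]string) error {
	keys := make([]chunkKey, 0, len(chunks))
	for k := range chunks {
		keys = append(keys, k)
	}
	sort.Slice(keys, func(i, j int) bool {
		a, b := keys[i], keys[j]
		if a.partNo != b.partNo {
			return a.partNo < b.partNo
		}
		if a.partVersion != b.partVersion {
			return a.partVersion < b.partVersion
		}
		return a.chunkNo < b.chunkNo
	})

	tmp := dst + ".tmp"
	out, err := os.Create(tmp)
	if err != nil {
		return err
	}
	for _, k := range keys {
		in, err := os.Open(chunks[k])
		if err != nil {
			out.Close()
			return err
		}
		// io.Copy streams chunk bytes; the body is never held in memory.
		if _, err := io.Copy(out, in); err != nil {
			in.Close()
			out.Close()
			return err
		}
		in.Close()
	}
	if err := out.Close(); err != nil {
		return err
	}
	return os.Rename(tmp, dst)
}
```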
Three issues, all valid.

#237 Codex P1 -- preserve item generation when grouping rows. Item keys carry !ddb|item|<table>|<generation>|... but the prior parser dropped the generation and Finalize merged all generations under the active schema. In real clusters, in-flight delete/recreate cleanup leaves stale-gen rows visible in the snapshot for a window; those rows would silently restore as items under the new schema (potentially resurrecting deleted data, or failing Finalize when PK names changed across generations). Fix:
- parseDDBItemKey now returns (encodedTable, generation).
- ddbTableState.items -> itemsByGen map[uint64][]*pb.DynamoItem.
- flushTable filters by st.schema.GetGeneration() and emits a ddb_stale_generation_items warning carrying the count and active gen so the operator can correlate the orphan window.

#182 Gemini -- fail-fast on flush errors. Finalize previously deferred the first error and continued through remaining tables. Real flush errors (out of disk, bad permissions) should surface immediately, not be misattributed to a later table whose flush also fails. Orphan-table warnings still continue (those are informational, not errors).

#442 Gemini -- empty set serializes as [] not null. setAttributeValueToPublic now uses make+append so nil/empty SS/NS/BS become "{\"SS\":[]}" rather than "{\"SS\":null}". Downstream tools that distinguish "present-but-empty" from "missing" no longer see a misleading null.

Tests: TestDDB_StaleGenerationItemsExcludedAndWarned, TestDDB_EmptyStringSetSerializesAsEmptyArrayNotNull, TestDDB_ParseItemKeyExtractsGeneration. Existing fixtures updated to set Generation=1 on schemas (matching the item key gen).
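A minimal sketch of the generation-aware key parsing and buffering described above, assuming the `!ddb|item|<encodedTable>|<generation>|<rest>` layout. Function and field names here are illustrative, not the PR's exact identifiers:

```go
package backupsketch

import (
	"fmt"
	"strconv"
	"strings"
)

// parseItemKey splits an item key into its encoded-table segment and its
// generation, rejecting keys with an empty table segment or missing payload.
func parseItemKey(key string) (encodedTable string, gen uint64, err error) {
	const prefix = "!ddb|item|"
	rest, ok := strings.CutPrefix(key, prefix)
	if !ok {
		return "", 0, fmt.Errorf("malformed item key %q", key)
	}
	parts := strings.SplitN(rest, "|", 3)
	if len(parts) != 3 || parts[0] == "" || parts[2] == "" {
		return "", 0, fmt.Errorf("malformed item key %q", key)
	}
	gen, err = strconv.ParseUint(parts[1], 10, 64)
	if err != nil {
		return "", 0, fmt.Errorf("malformed generation in %q: %w", key, err)
	}
	return parts[0], gen, nil
}

// tableState buffers items per generation so Finalize can decide which
// generations to emit and which to count into a stale-generation warning.
type tableState struct {
	itemsByGen map[uint64][][]byte // generation -> raw item records
}

func (t *tableState) add(gen uint64, item []byte) {
	if t.itemsByGen == nil {
		t.itemsByGen = map[uint64][][]byte{}
	}
	t.itemsByGen[gen] = append(t.itemsByGen[gen], item)
}
```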
Three issues addressed: Codex P1 #237 — preserve item generation when grouping rows; Gemini MEDIUM #182 — fail fast on flush errors; Gemini MEDIUM #442 — empty set serializes as `[]` rather than `null`. Tests for all three are listed in the commit message above. @claude review
Claude finished @bootjp's task in 3m 30s — Review:

| Finding | Severity |
|---|---|
| Codex P1: generation filtering | ✅ Correctly fixed |
| Gemini #182: fail-fast flush errors | ✅ Correctly fixed |
| Gemini #442: empty set → `[]` | ✅ Correctly fixed |
| `MigratingFromGeneration` excluded without distinct warning | Low — conservative but silent |
| Non-deterministic `_schema.json` ordering | Low — readability / diffability |
| Single-stale-gen test only | Low — coverage gap |
The core correctness fixes are solid. The two "Low" findings would improve operator experience but don't affect backup safety. Happy to address any of them if you want them in this PR.
Three correctness bugs and several quality issues, all valid.

CRITICAL: path traversal (Codex P1 #425). S3 object keys are user-controlled. A key like "../../../etc/passwd" would, under the prior filepath.Join-and-write code, escape the bucket directory and overwrite host files. Added safeJoinUnderRoot which Cleans the joined path and asserts it stays rooted under the bucket dir; ".." traversal is rejected with ErrS3MalformedKey. Absolute paths (leading "/") are normalised under the bucket dir by filepath.Join, which is the safest outcome. Tests: TestS3_PathTraversalAttemptRejected, TestS3_AbsolutePathObjectKeyConfinedUnderBucket.

CRITICAL: stale upload-id chunks merged into body (Codex P1 #500, Gemini HIGH #106/#476/#504). A snapshot mid-delete-and-recreate or mid-retry can carry blob chunks for multiple upload attempts under the same (bucket, gen, object). The prior assembleObjectBody concatenated every chunk regardless of upload_id, producing corrupted bytes. Now:
- s3ObjectState gains uploadID; HandleObjectManifest sets it.
- New filterChunksForManifest takes the chunkPaths map and the manifest's uploadID, returns only matching chunks sorted by (partNo, partVersion, chunkNo). Stale-uploadID chunks never enter the assembled body.
Test: TestS3_StaleUploadIDChunksFilteredFromAssembledBody.

CRITICAL: incomplete-uploads file truncated per record (Codex P2 #318, Gemini HIGH+MEDIUM #318). HandleIncompleteUpload re-opened records.jsonl on every call; openJSONL uses os.Create which truncates. Only the last record survived per bucket. Now:
- s3BucketState carries an incompleteUploadsJL *jsonlFile lazily opened on the first record and cached.
- flushBucket closes it and surfaces the error (was silently ignored).
Test: TestS3_IncompleteUploadsAppendsAcrossCalls (3 records, 3 lines on disk).

QUALITY:
- Gemini MEDIUM #285 (MkdirAll per blob): s3ObjectState gains a scratchDirCreated bool; HandleBlob runs MkdirAll once.
- Gemini MEDIUM #318 (closeJSONL error ignored): keymap and incomplete-uploads writers now surface close errors via closeBucketKeymap / explicit closeJSONL return-check.
- Gemini MEDIUM #386 (includeOrphans flag ignored): orphan chunks for objects without manifests now write to <bucket>/_orphans/<encoded-object>/<chunk>.bin under WithIncludeOrphans(true). Test: TestS3_OrphanChunksWrittenWhenIncludeOrphans.

Also removed the now-unused attachManifestParts placeholder and sortChunkKeys helper; their logic moved into filterChunksForManifest which combines uploadID filtering with the sort.
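A minimal sketch of the confinement check described for safeJoinUnderRoot. The name and exact semantics are illustrative; per later review rounds the real implementation also rejects dot and empty segments outright rather than relying on cleaning alone:

```go
package backupsketch

import (
	"fmt"
	"path/filepath"
	"strings"
)

// joinUnderRoot joins objectKey beneath root and rejects any result that
// would escape root after cleaning, so "../../../etc/passwd" cannot leave
// the bucket directory.
func joinUnderRoot(root, objectKey string) (string, error) {
	joined := filepath.Clean(filepath.Join(root, objectKey))
	rel, err := filepath.Rel(root, joined)
	if err != nil || rel == ".." || strings.HasPrefix(rel, ".."+string(filepath.Separator)) {
		return "", fmt.Errorf("object key %q escapes root %q", objectKey, root)
	}
	return joined, nil
}
```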
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: fdc9b36b6b
| "stale_count", stale, | ||
| "hint", "stale-gen rows are excluded from the dump; restore would otherwise emit them under the new schema") | ||
| } | ||
| active := st.itemsByGen[activeGen] |
There was a problem hiding this comment.
Include migration source generation when emitting DynamoDB items
flushTable now emits only st.itemsByGen[activeGen], but DynamoDB tables can be in a live migration state where schema.generation is new and schema.migrating_from_generation still contains readable data; the read path explicitly falls back to that source generation during migration (see readLogicalItemAt in adapter/dynamodb.go). In that state, this filter silently drops unmigrated rows from backups, so restoring the dump loses logically visible items if a snapshot is taken mid-migration. Fresh evidence in this revision is the hard filter at this line with no migrating_from_generation inclusion path.
… round 2)

Codex P1 #227: during a generation migration, schema.Generation is the new gen and schema.MigratingFromGeneration carries the source. The live read path (adapter/dynamodb.go readLogicalItemAt) falls back to the source for items not yet copied. The previous flushTable filtered to only schema.GetGeneration() and silently dropped unmigrated rows from the dump.

flushTable now emits items from BOTH generations. Order matters: the source is emitted FIRST, then the active gen, so when both generations carry the same primary key the active-gen content is the one that survives writeFileAtomic's tmp+rename — matching the live code's "new gen wins on conflict" rule.

The stale-generation warning was generalised to take an emit-set (totalStaleItemsExcluding); rows in any other generation are counted into the warning so the operator still sees the orphan window.

Tests: TestDDB_MigrationSourceGenerationItemsAreEmitted, TestDDB_NewGenerationWinsOverMigrationSourceForSameKey.
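A sketch of the two-generation emit order described above. The function name and plain uint64 signature are illustrative stand-ins for the schema's generation and migrating-from-generation fields:

```go
package backupsketch

// emitGenerations returns the generations to flush, in emit order: the
// migration-source generation (if any) first, then the active generation,
// so that on a shared primary key the active-gen file is written last and
// therefore wins.
func emitGenerations(activeGen, migratingFrom uint64) []uint64 {
	order := make([]uint64, 0, 2)
	if migratingFrom != 0 && migratingFrom != activeGen {
		order = append(order, migratingFrom)
	}
	return append(order, activeGen)
}
```

Any buffered generation not in the returned slice would be counted into the stale-generation warning rather than emitted.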
Codex P1 #227 addressed. The stale-generation warning was generalised (`totalStaleItemsExcluding`); tests are listed in the commit message above. @claude review
Claude finished @bootjp's task in 4m 21s — Review:

| Finding | Severity |
|---|---|
| Codex P1 #227: include migration-source-gen items | ✅ Correctly fixed |
| Emit order matches live `readLogicalItemAt` preference | ✅ Verified |
| `parseDDBItemKey` correct for ordered encoding | ✅ Verified |
| Same-PK collision resolves to active gen via tmp+rename | ✅ Verified |
| Three-generation scenario (stale + source + active) untested | Low |
| Non-deterministic `_schema.json` output (carried from prior round) | Low |
Core correctness is solid. The two Low findings are improvements for operator experience/testability, not correctness gaps.
Three follow-up issues from Codex on top of round 1.

#619 Codex P1 -- assemble only manifest-declared parts. filterChunksForManifest previously matched on uploadID only; the manifest declares specific (partNo, partVersion) tuples, and S3's overwrite-then-async-cleanup window can leave older partVersion chunks present under the same (bucket, generation, object, uploadID). Mixing them produces corrupted bytes. s3ObjectState gains declaredParts map[s3PartKey]struct{}, populated by HandleObjectManifest. filterChunksForManifest takes the set and restricts emission to declared (partNo, partVersion) tuples. Test TestS3_StalePartVersionExcludedFromAssembledBody asserts a stale partVersion=7 cannot leak when the manifest declares partVersion=9.

#497 Codex P2 -- preserve dot segments. safeJoinUnderRoot used filepath.Clean which collapses "a/../b" to "b". S3 treats those bytes literally; "a/../b" and "b" are distinct keys that would have silently merged into one output file. Now explicitly rejects any object key whose segments are "." or ".." with ErrS3MalformedKey. NUL bytes also rejected. Test TestS3_DotSegmentObjectKeyRejected covers the four forms ("a/../b", "a/./b", "..", ".").

#521 Codex P2 -- cross-generation collision. s3BucketState gains activeGen captured from the bucket-meta record. flushBucket suppresses objects whose generation differs from activeGen (under --include-orphans, those flow to _orphans/; by default they're dropped with an s3_stale_generation_objects warning). Tests TestS3_StaleGenerationObjectExcluded. flushBucket cyclomatic complexity stayed under the cap by extracting flushBucketObjects.
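A sketch of the declared-parts admission rule described above. Names are illustrative; the real filterChunksForManifest also sorts the surviving chunks before assembly:

```go
package backupsketch

type partKey struct{ partNo, partVersion uint64 }

type chunkRef struct {
	uploadID                     string
	partNo, partVersion, chunkNo uint64
}

// admit reports whether a recorded chunk may contribute to the assembled
// body: its uploadID must match the manifest, and its (partNo, partVersion)
// tuple must be one the manifest actually declares.
func admit(c chunkRef, manifestUploadID string, declared map[partKey]struct{}) bool {
	if c.uploadID != manifestUploadID {
		return false // chunk from a stale upload attempt
	}
	_, ok := declared[partKey{c.partNo, c.partVersion}]
	return ok // undeclared (stale) part versions are dropped
}
```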
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: ce5b7dac74
rawName, err := base64.RawURLEncoding.DecodeString(encoded)
if err != nil {
Reject empty table segment in DynamoDB meta keys
HandleTableMeta accepts !ddb|meta|table| as valid because base64.RawURLEncoding.DecodeString("") succeeds, so a truncated/malformed key is treated as table name "" instead of returning ErrDDBMalformedKey. In corrupted snapshots this can route schema output into the wrong scope and mask key corruption that Phase 0 is supposed to surface loudly.
Useful? React with 👍 / 👎.
genEnd := strings.IndexByte(afterTable, '|')
if genEnd <= 0 {
    return "", 0, errors.Wrapf(ErrDDBMalformedKey,
        "item key missing gen/rest separator: %q", key)
}
Reject item keys with no payload after generation
parseDDBItemKey only checks that a | exists after the generation, but it does not require bytes after that separator, so keys like !ddb|item|<table>|7| are accepted. This lets truncated item keys bypass malformed-key detection and be emitted using only value-side attributes, which can hide snapshot corruption instead of failing fast.
Two Codex P2 strict-validation follow-ups.

#117 -- empty table-meta segment. HandleTableMeta accepted `!ddb|meta|table|` (no encoded segment) because base64.RawURLEncoding.DecodeString("") returns empty bytes without error, so the schema would route under the empty table name. Now rejected with ErrDDBMalformedKey before the JSON decode. TestDDB_RejectsTableMetaKeyWithEmptySegment.

#303 -- truncated item key. parseDDBItemKey accepted `!ddb|item|<table>|7|` (gen separator present, no primary-key payload). The gen-end check was "genEnd > 0" which a trailing `|` satisfies. Added a follow-up check that genEnd+1 != len(afterTable) so a payload-less key surfaces as ErrDDBMalformedKey rather than emit under value-side attributes only. TestDDB_RejectsItemKeyWithEmptyPrimaryKeyPayload.
Two Codex P2 strict-validation follow-ups addressed: #117 — empty table-meta segment; #303 — truncated item key. @claude review
Claude finished @bootjp's task in 2m 51s — Review:

| Finding | Severity |
|---|---|
| Codex P2 #117: empty table-meta segment rejected | ✅ Correctly fixed |
| Codex P2 #303: empty item-key payload rejected | ✅ Correctly fixed |
| Both guards trace correctly for normal + truncated inputs | ✅ Verified |
| `tableEnd <= 0` + new `genEnd+1` guards cover all truncation points | ✅ Verified |
| Three-generation scenario (stale + source + active) untested | Low — carried from round 2 |
| Non-deterministic `_schema.json` output | Low — carried from round 2 |
This PR is ready to merge from a correctness standpoint. The two Low items are quality-of-life improvements (testability and backup diff stability) that can land in a follow-up.
Codex P1 #615: when a bucket contains both `path/to` and `path/to/sub`, the natural-path mapping puts the shorter key at a file path and the longer key requires that path to be a parent directory. POSIX cannot represent both; the prior code would fail with EEXIST/ENOTDIR at MkdirAll/Rename during finalize, breaking the documented leaf-data-suffix rename strategy.

Pre-flush computeDirPrefixes() walks every active-gen object key and accumulates each parent prefix (a key "a/b/c" contributes "a" and "a/b"). flushBucketObjects checks whether each object's key is itself a prefix consumed by another active key; if so, flushObjectWithCollision is called with needsLeafDataRename=true, which routes resolveObjectFilename onto the existing S3LeafDataSuffix path. KEYMAP.jsonl records the rename with KindS3LeafData so restore can reverse it.

flushObject (the no-collision wrapper) was inlined into flushObjectWithCollision since it was unused after the call sites moved to the explicit-flag form.

Test TestS3_FileVsDirectoryKeyCollisionGetsLeafDataRename emits "path/to" and "path/to/sub", asserts:
- "path/to/sub" lands at its natural path with body "CHILD"
- "path/to" is renamed to "path/to.elastickv-leaf-data" with body "LEAF"
- KEYMAP.jsonl carries one record with KindS3LeafData and original "path/to"
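A sketch of the prefix computation behind the collision check described above. The helper names are illustrative; the real code keys on the encoder's own state types:

```go
package backupsketch

import "strings"

// dirPrefixes returns every parent prefix implied by the given object keys:
// a key "a/b/c" contributes "a" and "a/b". An object whose full key appears
// in this set cannot be written at its natural path and needs the
// leaf-data-suffix rename.
func dirPrefixes(keys []string) map[string]struct{} {
	out := make(map[string]struct{})
	for _, k := range keys {
		segs := strings.Split(k, "/")
		for i := 1; i < len(segs); i++ {
			out[strings.Join(segs[:i], "/")] = struct{}{}
		}
	}
	return out
}

func needsLeafDataRename(key string, prefixes map[string]struct{}) bool {
	_, collides := prefixes[key]
	return collides
}
```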
…718, round 4)

Two Codex P1 follow-ups.

#729 -- chunk completeness check. assembleObjectBody filtered + sorted whatever chunk files happened to exist and wrote them, but never verified the manifest's declared chunk_count was actually present. A partial / racy / corrupted snapshot would silently emit a truncated body. Added verifyChunkCompleteness which counts chunks per (partNo, partVersion) and asserts the count matches manifest.parts[].chunk_count AND the highest chunkNo equals chunk_count-1. Mismatch surfaces as ErrS3IncompleteBlobChunks before any bytes are written. The declaredParts map's value type changed from struct{} to s3DeclaredPart{chunkCount} to carry the contract.

#614 -- empty slash segments. safeJoinUnderRoot rejected `.` and `..` but allowed empty segments ("a//b", "a/", trailing "/"). filepath.Join collapses these, so distinct S3 keys would silently overwrite each other at finalize. The dot-segment guard now also rejects "" segments anywhere except the leading position (where a leading "/" produces an initial empty segment that filepath.Join handles safely under the already-tested absolute-path-confined-under-bucket behaviour).

Tests: TestS3_IncompleteChunksRejected, TestS3_EmptySlashSegmentsRejected (covers a//b, a/, /a//, x/).
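A sketch of the completeness contract described above, written as a presence-set check per declared part. This is illustrative; later test rounds suggest the shipped validator is set-based, and its names and error values may differ:

```go
package backupsketch

import "fmt"

type partKey struct{ partNo, partVersion uint64 }

// verifyComplete checks that for every part the manifest declares, exactly
// the chunk indexes 0..chunkCount-1 were seen in the snapshot, so a partial
// or corrupted snapshot fails before any body bytes are written.
func verifyComplete(declared map[partKey]uint64, seen map[partKey]map[uint64]bool) error {
	for pk, want := range declared {
		got := seen[pk]
		if uint64(len(got)) != want {
			return fmt.Errorf("part %+v: %d chunks present, manifest declares %d", pk, len(got), want)
		}
		for i := uint64(0); i < want; i++ {
			if !got[i] {
				return fmt.Errorf("part %+v: chunk %d missing", pk, i)
			}
		}
	}
	return nil
}
```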
…a) (#718)

## Summary

Stacked on top of #716 (DynamoDB) and #717 (s3keys.ParseBlobKey). Adds the S3 encoder for the Phase 0 logical-backup decoder — the most complex per-adapter piece because it must reassemble multipart object bodies from independent blob chunks.

**Snapshot prefixes handled:**
- `!s3|bucket|meta|<bucket>` → `s3/<bucket>/_bucket.json`
- `!s3|obj|head|<bucket><gen><object>` → `s3/<bucket>/<object>.elastickv-meta.json` (sidecar)
- `!s3|blob|<bucket><gen><object><uploadID><partNo><chunkNo>[<partVersion>]` → spilled to scratch on arrival; concatenated into `s3/<bucket>/<object>` at Finalize in `(partNo, partVersion, chunkNo)` order
- `!s3|upload|meta|`/`!s3|upload|part|`: excluded by default; opt in via `--include-incomplete-uploads` to emit under `_incomplete_uploads/records.jsonl`
- `!s3|bucket|gen|`, `!s3|gc|upload|`, `!s3route|`: ignored

## How body assembly works

Each blob chunk is written atomically to a per-(bucket, object) scratch directory as it arrives, registered under the `(uploadID, partNo, chunkNo, partVersion)` routing key. At Finalize, chunks for each object are sorted and concatenated into a single body file via tmp+rename. **Body bytes are never held in memory** — only the scratch path map. The scratch tree is removed on Finalize.

## Reserved-suffix collision handling

A user object key ending in `.elastickv-meta.json` is rejected with `ErrS3MetaSuffixCollision` by default. `WithRenameCollisions(true)` appends `.user-data` to the body file and records the rename in `s3/<bucket>/KEYMAP.jsonl` with `KindMetaCollision` so the dump remains reversible. Per-bucket `KEYMAP.jsonl` is opened lazily and dropped on Finalize if no records were written.

## Test plan

- [x] `go test -race ./internal/backup/...` — pass.
- [x] `golangci-lint run ./internal/backup/...` — clean.
- [x] 11 tests covering bucket-meta round-trip, multipart out-of-order chunk reassembly, orphan-chunks warning, sidecar collision (rejected/renamed), JSON-decode rejection on both paths, ignored-prefix no-op, `--include-incomplete-uploads` round-trip + default skip, versioned blob assembly.

## Stacking

Base: `feat/backup-phase0a-dynamodb` (PR #716). Also depends on `feat/s3keys-parse-blob-key` (PR #717).
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 1ccd68091a
}

func (s *S3Encoder) flushBucket(b *s3BucketState) error {
    bucketDir := filepath.Join(s.outRoot, "s3", EncodeSegment([]byte(b.name)))
Reject dot-segment bucket names before filesystem join
When a snapshot contains a syntactically valid but invalid bucket segment like "..", HandleBucketMeta accepts it and flushBucket builds bucketDir with filepath.Join(s.outRoot, "s3", EncodeSegment([]byte(b.name))); because EncodeSegment("..") == "..", writes escape the s3/ subtree (e.g., _bucket.json lands under outRoot). This is a path traversal/corruption risk for malformed or untrusted snapshots and should be blocked by validating bucket names or rejecting ./.. before joining.
Codex P1 round 13 (commit 1ccd680): The PR #718 round-9 fix (`scratchDirForBlob`) rejected dot-component bucket/object names at the SCRATCH path, but the per-bucket FINAL output construction in flushBucket went through `filepath.Join(s.outRoot, "s3", EncodeSegment(b.name))` without validating the bucket name. EncodeSegment preserves `.` and `..` (both RFC3986-unreserved), so:
- bucket name "." → bucketDir collapses to `<outRoot>/s3/` and `_bucket.json` lands at the s3-root level instead of in a per-bucket subdir.
- bucket name ".." → bucketDir resolves to `<outRoot>` itself, letting `_bucket.json` clobber dump-root files like MANIFEST.json.

flushBucket now refuses sole-dot and empty bucket names with ErrS3MalformedKey.

Caller audit (per loop's audit-semantics rule for fail-closed changes):
- Single caller in Finalize loop, which already accumulates into firstErr — surfacing ErrS3MalformedKey there is consistent with existing malformed-key handling (dot-segment object keys, NUL bytes, leading slash, etc. all surface through the same error class).

Test: TestS3_FlushBucketRejectsDotSegmentBucketName covers ".", "..", and empty.
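A sketch of the fail-closed bucket-name check described above. The helper name is illustrative, and the real guard wraps the repo's ErrS3MalformedKey rather than a plain error:

```go
package backupsketch

import "fmt"

// validateBucketSegment rejects bucket names that EncodeSegment would pass
// through unchanged but that collapse or escape under filepath.Join.
func validateBucketSegment(name string) error {
	switch name {
	case "", ".", "..":
		return fmt.Errorf("malformed bucket name %q", name)
	}
	return nil
}
```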
@claude review. Round 6 fix for Codex P1 round 13 on commit 1ccd680. The PR #718 round-9 fix (`scratchDirForBlob`) guarded only the scratch path; `flushBucket` now applies the same fail-closed bucket-name check to the final per-bucket output path. Test: `TestS3_FlushBucketRejectsDotSegmentBucketName`.
Claude finished @bootjp's task in 4m 4s — Review:

| Case | `b.name` | Guard arm | Expected |
|---|---|---|---|
| `"."` | `"."` | first arm | ErrS3MalformedKey ✓ |
| `".."` | `".."` | first arm | ErrS3MalformedKey ✓ |
| `""` | `""` | second arm | ErrS3MalformedKey ✓ |
The test calls HandleBucketMeta (succeeds — correct, because ParseBucketMetaKey accepts all three) then Finalize and asserts errors.Is(err, ErrS3MalformedKey). ✓
Interaction with TestS3_HandleBlobRejectsScratchEscape: That test covers the scratch-path guard for HandleBlob with bucket = "." / "..". The new test covers the final-path guard for flushBucket. Together they leave no gap. ✓
Carried from prior rounds (DynamoDB encoder only — both Low):
- Three-generation scenario untested. `totalStaleItemsExcluding` in the DynamoDB encoder is correct by inspection, but no test exercises `stale_count` when more than one generation is stale (schema gen=7, migrationSource=6, debris at gen=5).
- No Finalize-level N-key migration-conflict test. `TestDDB_CanonicalNumberKeySegment` unit-tests canonicalization; `TestDDB_NewGenerationWinsOverMigrationSourceForSameKey` uses S-type keys. An end-to-end Finalize test with N-type hash keys in both generations would directly reproduce the round 9 bug scenario.
Summary
| Finding | Severity |
|---|---|
| Codex P1 round 13: dot-segment + empty bucket name rejected in `flushBucket` | ✅ Correctly fixed |
| `EncodeSegment('.')` and `('..')` confirmed to pass through unmodified | ✅ Verified |
| Guard fires before `filepath.Join`, `os.MkdirAll`, and `writeFileAtomic` | ✅ Verified |
| `scratchDirForBlob` (Handle-time) + `flushBucket` (Finalize-time) guards cover all paths | ✅ Verified |
| `""` bucket name: accepted by `ParseBucketMetaKey`, caught by `flushBucket` guard | ✅ Verified |
| `"..."` and longer pure-dot names: safe from path traversal (not collapsed by `filepath.Clean`) | ✅ Verified safe |
| Finalize accumulates `firstErr` (valid buckets flush even when one is malformed) | Intentional pre-existing behavior |
| Three-generation scenario (DynamoDB encoder) untested | Low — carried from round 3 |
| No Finalize-level N-key migration conflict test (DynamoDB encoder) | Low — carried from round 4 |
The fix is solid. The two Low findings are DynamoDB-side test coverage gaps carried from prior rounds; they do not affect S3 encoder correctness.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: f710f5e496
dir := filepath.Join(s.outRoot, "s3", EncodeSegment([]byte(bucket)), "_incomplete_uploads")
if err := os.MkdirAll(dir, 0o755); err != nil { //nolint:mnd // 0755 == standard dir mode
Validate bucket segment before incomplete-upload path join
When WithIncludeIncompleteUploads(true) is enabled, this path is built directly from the parsed bucket name without rejecting ".", "..", or empty segments. A malformed snapshot key such as an upload record for bucket ".." will make filepath.Join escape the s3/ subtree and create/write records.jsonl outside the intended output directory before Finalize’s bucket-name guard runs, which is a filesystem traversal/corruption risk for untrusted or damaged dumps.
… round 7)

Codex P1 round 13 follow-up (commit f710f5e): The round-6 fix added a dot-segment guard to flushBucket, but that runs at FINALIZE. HandleIncompleteUpload writes to disk EARLIER — every upload-family record opens a per-bucket records.jsonl on first arrival, and that path was built via `filepath.Join(outRoot, "s3", EncodeSegment(bucket), "_incomplete_uploads")` without validating the bucket segment. EncodeSegment preserves "." and "..", so a malformed snapshot upload record with bucket="." or ".." would escape the s3/ subtree at HandleIncompleteUpload time, well before flushBucket's guard runs.

The same fail-closed bucket-name check now runs at the HandleIncompleteUpload entry.

Caller audit per loop's audit-semantics rule: HandleIncompleteUpload is invoked only by test code; the surfaced ErrS3MalformedKey class matches the existing parseUploadFamily failure path.

Test: TestS3_HandleIncompleteUploadRejectsDotSegmentBucket covers ".", "..", and empty bucket names.
@claude review. Round 7 fix for Codex P1 round 13 follow-up on commit f710f5e. The round-6 guard only runs at Finalize; `HandleIncompleteUpload` writes to disk earlier, so the same fail-closed bucket-name check now runs at the `HandleIncompleteUpload` entry. Test: `TestS3_HandleIncompleteUploadRejectsDotSegmentBucket`.
Claude finished @bootjp's task in 3m 17s — Review:

| Case | bucket | `parseUploadFamily` returns | Guard arm | Expected |
|---|---|---|---|---|
| `"."` | `"."` | bucket=".", ok=true | first arm | ErrS3MalformedKey ✓ |
| `".."` | `".."` | bucket="..", ok=true | first arm | ErrS3MalformedKey ✓ |
| `""` | `""` | bucket="", ok=true | second arm | ErrS3MalformedKey ✓ |
All assert errors.Is(err, ErrS3MalformedKey). ✓
One new gap (Low) — S3UploadPartPrefix path untested. parseUploadFamily dispatches on two prefixes: S3UploadMetaPrefix (routed through decodeBucketSegmentForRouting, tested here) and S3UploadPartPrefix (routed through s3keys.ParseUploadPartKey, not tested). For a crafted UploadPartKey(".", ...), ParseUploadPartKey would extract bucket = "." and the same guard at line 501 would fire. The guard location is prefix-agnostic — both paths reach the same switch — so correctness is guaranteed by code inspection. A test using S3UploadPartPrefix would be a belt-and-suspenders coverage improvement.
Carried from prior rounds (DynamoDB encoder — both Low):
- Three-generation scenario untested. No test exercises `totalStaleItemsExcluding` with more than one stale generation (schema gen=7, migrationSource=6, debris at gen=5).
- No Finalize-level N-key migration-conflict test. `TestDDB_CanonicalNumberKeySegment` unit-tests N canonicalization; no end-to-end test with N-type hash keys in both source and active gens reproduces the round 9 bug scenario at the file-system level.
Summary
| Finding | Severity |
|---|---|
| Codex P1 round 13 follow-up: dot-segment guard in `HandleIncompleteUpload` | ✅ Correctly fixed |
| Empty-bucket ok=true from `parseUploadFamily` caught by `bucket == ""` arm | ✅ Verified |
| Guard fires before bucketState, MkdirAll, openJSONL — no state pollution | ✅ Verified |
| `flushBucket` (Finalize) + `HandleIncompleteUpload` (Handle-time) guards complement each other | ✅ Verified |
| `S3UploadPartPrefix` dispatch branch not covered by new test | Low — correctness guaranteed by guard placement; coverage gap only |
| Three-generation scenario (DynamoDB encoder) untested | Low — carried from round 3 |
| No Finalize-level N-key migration conflict test (DynamoDB encoder) | Low — carried from round 4 |
The fix is solid. The three Low findings are test coverage improvements; none affect backup safety or correctness.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: edfc944c78
    return errors.Wrapf(ErrS3MalformedKey,
        "orphan object key %q is a dot segment (would escape orphan dir)", obj.object)
}
dir := filepath.Join(bucketDir, "_orphans", EncodeSegment([]byte(obj.object)))
Include generation in orphan chunk output paths
When WithIncludeOrphans(true) is enabled, stale objects from different generations are all written under the same _orphans/<encoded-object>/ directory, and chunk filenames are derived only from (uploadID, partNo, chunkNo, partVersion). If two generations share those values (which is possible during delete/recreate cleanup windows), later writes overwrite earlier orphan chunks, silently dropping forensic data instead of preserving all stale generations.
Codex P2 round 13 (commit edfc944): `flushOrphanObject` wrote stale chunks to `<bucket>/_orphans/<encoded-object>/<scratchName>` where scratchName is `(uploadID, partNo, chunkNo, partVersion)`. Two stale generations sharing those four identifiers — possible during delete/recreate cleanup windows where the live system recycled the upload identifier — would overwrite one another, silently dropping forensic data the operator opted in to preserve via --include-orphans.

Embed the object's generation in the path: `<bucket>/_orphans/<encoded-object>/gen-<N>/`. Each stale gen lands in its own subdir; restore tools can iterate per-gen.

Caller audit (per loop's audit-semantics rule for output-shape change): single internal caller (flushBucketObjects); no public API change, only the on-disk layout under _orphans/ becomes one level deeper. The existing TestS3_OrphanChunksWrittenWhenIncludeOrphans test was updated to match the new path.

Tests:
- TestS3_OrphanChunksWrittenWhenIncludeOrphans updated for per-gen path.
- NEW TestS3_OrphanChunksPreservedAcrossGenerations drives two stale gens sharing (uploadID, partNo, chunkNo, partVersion) and asserts both survive distinctly under their own gen-<N>/ dirs.
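A sketch of the per-generation orphan path derivation described above, with an illustrative helper name:

```go
package backupsketch

import (
	"fmt"
	"path/filepath"
)

// orphanChunkDir places each stale generation's chunks in their own
// subdirectory, so chunks that share (uploadID, partNo, chunkNo, partVersion)
// across generations can no longer overwrite one another.
func orphanChunkDir(bucketDir, encodedObject string, generation uint64) string {
	return filepath.Join(bucketDir, "_orphans", encodedObject, fmt.Sprintf("gen-%d", generation))
}
```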
@claude review. Round 8 fix for Codex P2 round 13 on commit edfc944. Embedded the object's generation in the orphan path (`_orphans/<encoded-object>/gen-<N>/`). Caller audit (per loop's audit-semantics rule for output-shape change): single internal caller, `flushBucketObjects`.
Claude finished @bootjp's task in 3m 16s — Review:

| Finding | Severity |
|---|---|
| Codex P2 round 13: per-generation orphan chunk paths | ✅ Correctly fixed |
| `obj.generation` source is correct (from ParseBlobKey via objectState) | ✅ Verified |
| Collision scenario: same blobScratchName in two gens now safely separated | ✅ Verified |
| Both callers of `flushOrphanObject` (stale path + no-manifest path) handled correctly | ✅ Verified |
| `genDir = "gen-<N>"` is always a safe, non-dot path component | ✅ Verified |
| Caller audit in commit message omits second caller (`flushObjectWithCollision`) | Minor doc inaccuracy — no correctness impact |
| Regression test asserts dirs exist but not chunk content | Low — test proves directory isolation; content verification would add robustness |
| Three-generation scenario (DynamoDB encoder) untested | Low — carried from round 3 |
| No Finalize-level N-key migration conflict test (DynamoDB encoder) | Low — carried from round 4 |
The fix is solid. This PR is correct and ready from a backup-encoder safety standpoint.
💡 Codex Review: elastickv/internal/backup/sqs.go, lines 291 to 299 in cdc62e3
## Summary

- First of the Phase 0a wide-column encoders. Translates `!hs|meta|` + `!hs|fld|` records on the live store into one `hashes/<encoded>.json` file per Redis hash, per the design at `docs/design/2026_04_29_proposed_snapshot_logical_decoder.md:327`.
- Stacked on top of #713 (the strings/HLL/TTL encoder); reuses `RedisDB`, the SHA-fallback KEYMAP path, and the dirsCreated cache.
- TTL is folded inline into `expire_at_ms` (not a sidecar) so a restorer can replay each hash in one shot. `HandleTTL` adds a `redisKindHash` route.
- Field values reuse the UTF-8-or-base64 envelope from PR #714's SQS body for binary safety. Field names percent-encode (`EncodeSegment`) when not valid UTF-8 to keep JSON object keys as strings.
- Empty hashes still emit a file (HLEN==0 is observable). Length mismatches between `meta.Len` and observed `!hs|fld|` keys surface as `redis_hash_length_mismatch` warnings.
- Output is sorted by user key + field name for deterministic dumps (matches the round-9 sort pattern from #716).

## Self-review (per CLAUDE.md "five lenses")

1. **Data loss** — empty hashes preserved; declared/observed length mismatch warned; field value bytes preserved verbatim via base64 envelope when not UTF-8; SHA-fallback paths feed KEYMAP.
2. **Concurrency** — `RedisDB` is single-writer per the existing contract; the new `hashes` map follows the same per-key state model used by `kindByKey`/`inlineTTLEmitted`.
3. **Performance** — fields buffered per key (Redis hashes are typically small); `flushHashes` runs once at Finalize. Sort is `O(n log n)` over user keys + per-hash field names — acceptable at dump cadence.
4. **Data consistency** — `kindByKey` registration on both meta and field arrival prevents TTL mis-routing across snapshot record-ordering scenarios.
5. **Test coverage** — 8 sub-tests: round-trip, empty hash, inline TTL routing, length mismatch warning, binary value envelope, malformed meta, truncated field key, SHA-fallback keymap.

## Test plan

- [x] `go test -race ./internal/backup/...` — clean.
- [x] `GOOS=windows GOARCH=amd64 go build ./internal/backup/...` — clean.
- [x] `GOOS=js GOARCH=wasm go build ./internal/backup/...` — clean.
- [x] `golangci-lint run` — clean.
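A sketch of the UTF-8-or-base64 value envelope mentioned above. The JSON field names here are illustrative assumptions; the actual dump shape is defined by the design doc and the earlier PRs:

```go
package backupsketch

import (
	"encoding/base64"
	"unicode/utf8"
)

// envelopeValue preserves field bytes verbatim: valid UTF-8 is stored as a
// plain string, anything else is base64-encoded so binary values survive the
// JSON round trip.
func envelopeValue(raw []byte) map[string]any {
	if utf8.Valid(raw) {
		return map[string]any{"utf8": string(raw)}
	}
	return map[string]any{"base64": base64.StdEncoding.EncodeToString(raw)}
}
```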
Actionable comments posted: 2
🧹 Nitpick comments (1)
internal/backup/dynamodb.go (1)
313-313: ⚡ Quick win: Remove the new `//nolint:mnd` by replacing literals with named constants. Line 313 can satisfy `mnd` without suppression.

Proposed refactor:

+const (
+    parseUintBase10 = 10
+    parseUintBits64 = 64
+)
+
 func parseDDBItemKey(key []byte) (string, uint64, error) {
@@
-    gen, err := strconv.ParseUint(afterTable[:genEnd], 10, 64) //nolint:mnd // 10 == decimal radix; 64 == uint64 width
+    gen, err := strconv.ParseUint(afterTable[:genEnd], parseUintBase10, parseUintBits64)

As per coding guidelines, avoid `//nolint` annotations — refactor instead.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/backup/dynamodb.go` at line 313, The ParseUint call uses magic numbers with a nolint suppression; remove the nolint and replace the literals in strconv.ParseUint(afterTable[:genEnd], 10, 64) with clearly named constants (e.g., decimalRadix and uint64BitSize) declared near the top of the file or in the same package, then use those constants in the call so the code reads strconv.ParseUint(afterTable[:genEnd], decimalRadix, uint64BitSize) and the mnd linter is satisfied.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@internal/backup/s3_test.go`:
- Around line 891-959: Add a new regression test (e.g.
TestS3_ZeroChunkCountWithStrayChunkRejected) that creates an encoder via
newS3Encoder, writes BucketMeta with HandleBucketMeta, writes an ObjectManifest
via enc.HandleObjectManifest that declares a part with "chunk_count": 0, then
emit a stray blob for that part using enc.HandleBlob with chunk_no 0, call
enc.Finalize(), and assert the returned error is ErrS3IncompleteBlobChunks; use
the same patterns and helper functions from the existing tests
(HandleBucketMeta, HandleObjectManifest, HandleBlob, Finalize,
ErrS3IncompleteBlobChunks) to locate where to add it.
In `@internal/backup/s3.go`:
- Around line 89-91: The code currently skips parts where want.chunkCount == 0
but later still accepts chunks for those same parts, allowing bytes to be
assembled for parts declared empty; update the chunk-processing path to detect a
manifest part with want.chunkCount == 0 and reject any incoming chunk for that
part (return an error instead of appending/accepting), and make the skip logic
consistent by marking the part handled as empty so no further chunk handlers
will consider it; look for uses of want.chunkCount and the chunk append/accept
logic (the loop that processes incoming chunks and where bytes are
appended/accepted) and add an explicit check that returns an error when a chunk
targets a part whose want.chunkCount == 0.
---
Nitpick comments:
In `@internal/backup/dynamodb.go`:
- Line 313: The ParseUint call uses magic numbers with a nolint suppression;
remove the nolint and replace the literals in
strconv.ParseUint(afterTable[:genEnd], 10, 64) with clearly named constants
(e.g., decimalRadix and uint64BitSize) declared near the top of the file or in
the same package, then use those constants in the call so the code reads
strconv.ParseUint(afterTable[:genEnd], decimalRadix, uint64BitSize) and the mnd
linter is satisfied.
📒 Files selected for processing (4): internal/backup/dynamodb.go, internal/backup/dynamodb_test.go, internal/backup/s3.go, internal/backup/s3_test.go
func TestS3_DuplicateChunksWithMissingIndexRejected(t *testing.T) {
    t.Parallel()
    enc, _ := newS3Encoder(t)
    bucket := "b"
    gen := uint64(1)
    object := "obj"
    uploadID := "u"
    if err := enc.HandleBucketMeta(s3keys.BucketMetaKey(bucket), encodeS3BucketMetaValue(t, map[string]any{
        "bucket_name": bucket, "generation": gen,
    })); err != nil {
        t.Fatal(err)
    }
    if err := enc.HandleObjectManifest(s3keys.ObjectManifestKey(bucket, gen, object), encodeS3ManifestValue(t, map[string]any{
        "upload_id": uploadID, "size_bytes": 9, "parts": []map[string]any{
            {"part_no": 1, "size_bytes": 9, "chunk_count": 3},
        },
    })); err != nil {
        t.Fatal(err)
    }
    // Drive `{0, 2, 2}`: count satisfies, maxIndex satisfies, but
    // chunk_no=1 is missing. The set-based validator must reject.
    if err := enc.HandleBlob(s3keys.BlobKey(bucket, gen, object, uploadID, 1, 0), []byte("AAA")); err != nil {
        t.Fatal(err)
    }
    if err := enc.HandleBlob(s3keys.BlobKey(bucket, gen, object, uploadID, 1, 2), []byte("CC1")); err != nil {
        t.Fatal(err)
    }
    if err := enc.HandleBlob(s3keys.BlobKey(bucket, gen, object, uploadID, 1, 2), []byte("CC2")); err != nil {
        t.Fatal(err)
    }
    err := enc.Finalize()
    if !errors.Is(err, ErrS3IncompleteBlobChunks) {
        t.Fatalf("err=%v want ErrS3IncompleteBlobChunks for chunk-set with duplicate+missing", err)
    }
}

func TestS3_IncompleteChunksRejected(t *testing.T) {
    t.Parallel()
    enc, _ := newS3Encoder(t)
    bucket := "b"
    gen := uint64(1)
    object := "obj"
    uploadID := "u"
    if err := enc.HandleBucketMeta(s3keys.BucketMetaKey(bucket), encodeS3BucketMetaValue(t, map[string]any{
        "bucket_name": bucket, "generation": gen,
    })); err != nil {
        t.Fatal(err)
    }
    // Manifest declares 3 chunks for partNo=1 but the snapshot only
    // has 2 (chunkNo=0 and chunkNo=2; chunkNo=1 missing). Codex P1
    // #729.
    if err := enc.HandleObjectManifest(s3keys.ObjectManifestKey(bucket, gen, object), encodeS3ManifestValue(t, map[string]any{
        "upload_id": uploadID, "size_bytes": 9, "parts": []map[string]any{
            {"part_no": 1, "size_bytes": 9, "chunk_count": 3},
        },
    })); err != nil {
        t.Fatal(err)
    }
    if err := enc.HandleBlob(s3keys.BlobKey(bucket, gen, object, uploadID, 1, 0), []byte("AAA")); err != nil {
        t.Fatal(err)
    }
    if err := enc.HandleBlob(s3keys.BlobKey(bucket, gen, object, uploadID, 1, 2), []byte("CCC")); err != nil {
        t.Fatal(err)
    }
    err := enc.Finalize()
    if !errors.Is(err, ErrS3IncompleteBlobChunks) {
        t.Fatalf("err=%v want ErrS3IncompleteBlobChunks for missing chunk", err)
    }
}
Add a regression test for chunk_count=0 with stray chunks.
Current completeness tests don’t cover the zero-count-part boundary. A case declaring chunk_count: 0 and emitting chunk_no: 0 should fail with ErrS3IncompleteBlobChunks.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@internal/backup/s3_test.go` around lines 891 - 959, Add a new regression test
(e.g. TestS3_ZeroChunkCountWithStrayChunkRejected) that creates an encoder via
newS3Encoder, writes BucketMeta with HandleBucketMeta, writes an ObjectManifest
via enc.HandleObjectManifest that declares a part with "chunk_count": 0, then
emit a stray blob for that part using enc.HandleBlob with chunk_no 0, call
enc.Finalize(), and assert the returned error is ErrS3IncompleteBlobChunks; use
the same patterns and helper functions from the existing tests
(HandleBucketMeta, HandleObjectManifest, HandleBlob, Finalize,
ErrS3IncompleteBlobChunks) to locate where to add it.
…nd 9)

CodeRabbit Major round 13 (commit 389c4f7): `verifyChunkCompleteness` skipped parts where `want.chunkCount == 0`, but `filterChunksForManifest` still included any chunks recorded for those (partNo, partVersion) keys. The asymmetry let stray chunks for a manifest-declared zero-count part bleed into the assembled body while the completeness check passed — a corrupted body that violated the declared-part contract without raising any error.

Make `filterChunksForManifest` symmetric: skip chunks whose declared part has `chunkCount == 0`. The two paths now agree on the "zero-count part contributes no bytes" invariant.

Caller audit (per loop's audit-semantics rule for behavior change): single caller `assembleObjectBody`. The change only narrows what chunks reach the assembler; downstream completeness check already skips zero-count parts so the consistency is restored, not broken. No public API change.

Test:
- TestS3_ZeroChunkCountPartIgnoresStrayChunks drives a two-part manifest where part 1 has chunk_count=1 (real content "hello") and part 2 has chunk_count=0 with a stray "STRAY" chunk recorded. Asserts the assembled body is exactly "hello" — no stray bytes leak.
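A sketch of the symmetric zero-count rule described above, with illustrative names; chunkCount here stands for the manifest's declared per-part value:

```go
package backupsketch

type partKey struct{ partNo, partVersion uint64 }

// admitChunk mirrors the invariant that a part declared with chunk_count 0
// contributes no bytes: chunks for undeclared or zero-count parts never
// reach the assembler, matching what the completeness check skips.
func admitChunk(declared map[partKey]uint64, partNo, partVersion uint64) bool {
	count, ok := declared[partKey{partNo, partVersion}]
	return ok && count > 0
}
```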
@claude review. Round 9 fix for CodeRabbit Major round 13 on commit 389c4f7. Made `filterChunksForManifest` symmetric with `verifyChunkCompleteness` on zero-count parts. Caller audit (per loop's audit-semantics rule for behavior change): single caller, `assembleObjectBody`. Test: `TestS3_ZeroChunkCountPartIgnoresStrayChunks`.
Claude finished @bootjp's task in 2m 34s — Review:

| Input | Expected |
|---|---|
| Part 1: chunk_count=1, real chunk "hello" | Present in body |
| Part 2: chunk_count=0, stray chunk "STRAY" | Absent from body |
| Finalize() | Succeeds (no error) |
| Assembled body | Exactly "hello" |
The test drives the full stack: HandleBucketMeta → HandleObjectManifest → HandleBlob (×2) → Finalize → file read. It directly reproduces the pre-fix bug scenario and verifies the post-fix invariant. ✓
One Low gap: No test exercises the edge case where the manifest contains only zero-count parts with stray chunks (no real content). In that case filterChunksForManifest returns an empty slice, verifyChunkCompleteness passes, and assembleObjectBody produces an empty file. Correct by inspection; untested.
Carried //nolint:mnd (CodeRabbit Nitpick — carried from prior round):
dynamodb.go:313 still uses //nolint:mnd for strconv.ParseUint(..., 10, 64). CLAUDE.md explicitly says "Avoid //nolint — refactor instead." The fix is two named constants:
Carried from prior rounds (DynamoDB encoder — both Low):
- Three-generation scenario untested. No test exercises `totalStaleItemsExcluding` with more than one stale generation (schema gen=7, migrationSource=6, debris at gen=5).
- No Finalize-level N-key migration conflict test. `TestDDB_CanonicalNumberKeySegment` unit-tests N canonicalization; no end-to-end test with N-type hash keys in both source and active gens at the filesystem level.
Summary
| Finding | Severity |
|---|---|
| CodeRabbit Major round 13: `filterChunksForManifest` skips zero-count-part chunks | ✅ Correctly fixed |
| Fix placement: after `!ok` guard, `declared` is always initialized | ✅ Verified |
| `verifyChunkCompleteness` + `filterChunksForManifest` now symmetric on zero-count parts | ✅ Verified |
| Lenient (discard stray) vs. strict (error) — lenient is correct for backup encoder | ✅ Verified design choice |
| `partVersion` scoping: zero-count filter is per {partNo, partVersion} key only | ✅ Verified |
| No test for all-zero-count-parts scenario (empty body result) | Low |
| `//nolint:mnd` on dynamodb.go:313 (CLAUDE.md says refactor, not suppress) | Low — nitpick |
| Three-generation scenario (DynamoDB encoder) untested | Low — carried from round 3 |
| No Finalize-level N-key migration conflict test (DynamoDB encoder) | Low — carried from round 4 |
The core fix is correct and directly addresses the CodeRabbit Major finding. The PR is sound from a correctness standpoint.
Summary

Stacked on top of #714. Adds the DynamoDB encoder for the Phase 0 logical-backup decoder.

Snapshot prefixes handled:

- `!ddb|meta|table|<base64url(table)>` → `dynamodb/<table>/_schema.json` (DescribeTable-shaped JSON)
- `!ddb|item|<base64url(table)>|<gen>|<rest>` → per-item JSON under `dynamodb/<table>/items/`: `items/<pk>.json`, `items/<pk>/<sk>.json`, or `items/b64.<base64url>[/...].json` (no collision with hex-shaped string keys)
- `!ddb|gsi|*` ignored (derivable; replaying would conflict with destination index maintenance)
- `!ddb|meta|gen|*` ignored (operational counter)

Why buffer + emit at Finalize
Lex order is `'i' < 'm'`, so items arrive before the schema. Encoder buffers per encoded-table-segment and emits once the schema is known. Item filename derivation reads the hash/range key NAMES from the schema, then looks them up in the item's attributes map.
All 10 attribute kinds covered

S / N / B / BOOL / NULL / SS / NS / BS / L / M translated to their AWS-DynamoDB-JSON shapes. Empty oneof (malformed proto value) surfaces as `NULL=true` so the dump remains deserialisable. `attributeValueToPublic` is split into scalar/set/composite helpers to keep cyclomatic complexity under the package cap.

Bundle mode is stubbed

`WithBundleJSONL(true)` makes Finalize return a clear "not implemented in this PR" error. Per-item layout (the design's documented default) is what this PR delivers.

Test plan

- `go test -race ./internal/backup/...` — pass.
- `golangci-lint run ./internal/backup/...` — clean.

Stacking

Base: `feat/backup-phase0a-sqs` (PR #714).

Summary by CodeRabbit

Release Notes: New Features, Improvements, Tests