
backup: DynamoDB encoder for tables and items (Phase 0a)#716

Merged
bootjp merged 68 commits into main from feat/backup-phase0a-dynamodb
May 4, 2026

Conversation

@bootjp
Owner

@bootjp bootjp commented Apr 30, 2026

Summary

Stacked on top of #714. Adds the DynamoDB encoder for the Phase 0 logical-backup decoder.

Snapshot prefixes handled:

  • !ddb|meta|table|<base64url(table)> → dynamodb/<table>/_schema.json (DescribeTable-shaped JSON)
  • !ddb|item|<base64url(table)>|<gen>|<rest> → per-item JSON under dynamodb/<table>/items/ (path derivation sketched below)
    • Hash-only: items/<pk>.json
    • Composite: items/<pk>/<sk>.json
    • Binary key: items/b64.<base64url>[/...].json (no collision with hex-shaped string keys)
  • !ddb|gsi|* ignored (derivable; replaying would conflict with destination index maintenance)
  • !ddb|meta|gen|* ignored (operational counter)
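
A minimal sketch of the path derivation those rules imply; the helper names (keySegment, itemRelPath) are illustrative, not the PR's actual identifiers:

```go
package backup

import (
	"encoding/base64"
	"errors"
	"path"
)

// keySegment renders one primary-key attribute as a filename segment.
// Binary keys get a "b64." prefix so they can never collide with a
// string key that happens to look hex- or base64-shaped.
func keySegment(s, n string, b []byte) (string, error) {
	switch {
	case b != nil:
		return "b64." + base64.RawURLEncoding.EncodeToString(b), nil
	case s != "":
		return s, nil
	case n != "":
		return n, nil
	}
	return "", errors.New("key attribute has no usable value")
}

// itemRelPath applies the hash-only vs composite layout rule.
func itemRelPath(pkSeg, skSeg string) string {
	if skSeg == "" {
		return path.Join("items", pkSeg+".json") // items/<pk>.json
	}
	return path.Join("items", pkSeg, skSeg+".json") // items/<pk>/<sk>.json
}
```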

Why buffer + emit at Finalize

Lex order is 'i' < 'm', so items arrive before the schema. Encoder buffers per encoded-table-segment and emits once the schema is known. Item filename derivation reads the hash/range key NAMES from the schema, then looks them up in the item's attributes map.
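
A minimal sketch of that buffer-then-flush shape, using the ddbTableState and flushTable names that appear later in this thread; the proto types here are stand-ins:

```go
package backup

// Stand-ins for the real proto types.
type TableSchema struct{ HashKey, RangeKey string }
type DynamoItem struct{ Attributes map[string]string }

// ddbTableState buffers items until the table's schema record arrives.
type ddbTableState struct {
	encoded string       // base64url table segment from the snapshot key
	schema  *TableSchema // nil until the !ddb|meta|table|... record is seen
	items   []*DynamoItem
}

type DDBEncoder struct{ tables map[string]*ddbTableState }

// handleItem can only buffer: items sort before schemas ('i' < 'm'),
// so the key names needed for filename derivation are not yet known.
func (d *DDBEncoder) handleItem(encodedTable string, it *DynamoItem) {
	if d.tables == nil {
		d.tables = make(map[string]*ddbTableState)
	}
	st := d.tables[encodedTable]
	if st == nil {
		st = &ddbTableState{encoded: encodedTable}
		d.tables[encodedTable] = st
	}
	st.items = append(st.items, it)
}

// finalize flushes every table whose schema arrived; the real Finalize
// also warns about orphan items (buffered items with no schema).
func (d *DDBEncoder) finalize(flushTable func(*ddbTableState) error) error {
	for _, st := range d.tables {
		if st.schema == nil {
			continue // orphan items: warned, not flushed
		}
		if err := flushTable(st); err != nil {
			return err
		}
	}
	return nil
}
```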

All 10 attribute kinds covered

S / N / B / BOOL / NULL / SS / NS / BS / L / M translated to their AWS-DynamoDB-JSON shapes. Empty oneof (malformed proto value) surfaces as NULL=true so the dump remains deserialisable. attributeValueToPublic is split into scalar/set/composite helpers to keep cyclomatic complexity under the package cap.
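
A hedged sketch of the scalar arm of that translation; the attr struct is a local stand-in for the pb.DynamoAttributeValue oneof:

```go
package backup

// attr is a stand-in for the proto oneof; the real code type-switches on
// *pb.DynamoAttributeValue variants.
type attr struct {
	S    *string
	N    *string // DynamoDB numbers travel as strings
	B    []byte
	Bool *bool
}

// scalarToPublic maps each kind to its AWS-DynamoDB-JSON shape. An empty
// value (malformed proto oneof) surfaces as NULL=true so the dump remains
// deserialisable rather than being dropped.
func scalarToPublic(a attr) map[string]any {
	switch {
	case a.S != nil:
		return map[string]any{"S": *a.S}
	case a.N != nil:
		return map[string]any{"N": *a.N}
	case a.B != nil:
		return map[string]any{"B": a.B} // encoding/json base64-encodes []byte
	case a.Bool != nil:
		return map[string]any{"BOOL": *a.Bool}
	}
	return map[string]any{"NULL": true}
}
```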

Bundle mode is stubbed

WithBundleJSONL(true) makes Finalize return a clear "not implemented in this PR" error. Per-item layout (the design's documented default) is what this PR delivers.

Test plan

  • go test -race ./internal/backup/... — pass.
  • golangci-lint run ./internal/backup/... — clean.
  • 11 tests covering hash-only / composite round-trip, binary primary key, orphan items, magic-prefix rejection on schema and item paths, missing-key-attribute rejection, GSI noop, all 10 attribute kinds round-trip, empty-oneof behavior, bundle-mode stub, malformed-key rejection.

Stacking

Base: feat/backup-phase0a-sqs (PR #714).

Summary by CodeRabbit

Release Notes

  • New Features

    • Added S3 backup encoder for processing S3 snapshots with bucket metadata, object assembly, and multipart upload support.
    • Enhanced DynamoDB backup handling with improved generation management and migration support.
  • Improvements

    • Stricter validation of backup data to prevent path traversal and collision issues.
    • Deterministic schema output for consistent backup reproducibility.
    • Better handling of incomplete uploads and orphaned data across backup generations.
  • Tests

    • Comprehensive test coverage for DynamoDB and S3 backup encoders including edge cases and error scenarios.

@coderabbitai

coderabbitai Bot commented Apr 30, 2026

Warning

Rate limit exceeded

@bootjp has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 50 minutes and 19 seconds before requesting another review.

To keep reviews running without waiting, you can enable usage-based add-on for your organization. This allows additional reviews beyond the hourly cap. Account admins can enable it under billing.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 212055e0-a7f7-4bb0-90c9-160e1ce69cc6

📥 Commits

Reviewing files that changed from the base of the PR and between 389c4f7 and 6a1bef1.

📒 Files selected for processing (2)
  • internal/backup/s3.go
  • internal/backup/s3_test.go
📝 Walkthrough


This PR enhances the DynamoDB backup encoder and introduces a complete S3 backup encoder implementation. The DynamoDB encoder now buffers items by generation, handles migration-source generations, canonicalizes numeric keys, produces deterministic schemas, and strictly validates key syntax. The new S3 encoder assembles multipart objects from chunked blobs, emits bucket metadata and sidecars, suppresses stale generations, handles collision renaming with a KEYMAP, and validates object key safety.

Changes

DynamoDB & S3 Backup Encoding

  • Data Structure & Buffering (internal/backup/dynamodb.go, lines 68–81): ddbTableState replaces the single items list with itemsByGen map[uint64][]*pb.DynamoItem to track items per generation.
  • Generation-Aware Item Handling (internal/backup/dynamodb.go, lines 143–169): HandleItem now extracts the generation from the encoded key and appends items under their generation; parseDDBItemKey parses the generation and validates payload presence.
  • Generation Migration & Deterministic Emission (internal/backup/dynamodb.go, lines 223–256): flushTable rewritten to emit items from the migration-source generation first, then the active generation, filtering stale generations with warnings.
  • Numeric Key Canonicalization (internal/backup/dynamodb.go, lines 396–442): DynamoDB number attributes canonicalized via math/big.Rat to a stable decimal form before filesystem encoding; dot-segment filenames rejected.
  • Deterministic Schema Output (internal/backup/dynamodb.go, lines 464–503): schemaToPublic sorts GSI names and attribute definitions to guarantee consistent JSON ordering; empty sets serialize as [], not null.
  • Key Validation Hardening (internal/backup/dynamodb.go, lines 118–125, 288–328): table-meta rejects empty segments; item keys require both table/gen separators and a non-empty primary-key payload.
  • S3 Encoder Core (internal/backup/s3.go, lines 109–528): S3Encoder processes bucket-meta and object-manifest records, spills blob chunks to scratch, handles incomplete uploads, and prepares state for assembly.
  • Object Assembly & Blob Completeness (internal/backup/s3.go, lines 947–1034, 60–107): assembleObjectBody deterministically concatenates scratch chunks filtered by manifest constraints; verifyChunkCompleteness ensures all declared (partNo, partVersion, chunkNo) indexes are present.
  • Generation & Stale Data Filtering (internal/backup/s3.go, lines 536–640): Finalize and flushBucketObjects suppress stale-generation objects, optionally emit orphans, and warn on exclusions.
  • Collision Handling & Renames (internal/backup/s3.go, lines 705–780, 782–901): reserved .elastickv-meta.json suffix and leaf-data directory collisions are detected; renames recorded in KEYMAP.jsonl when the flag is enabled; unresolvable collisions rejected.
  • Filesystem Path Safety (internal/backup/s3.go, lines 782–901, 418–528): safeJoinUnderRoot rejects NUL, backslashes, empty/./.. segments, leading slashes, and path traversal; bucket names and scratch paths validated similarly.
  • DynamoDB Tests (internal/backup/dynamodb_test.go, lines 338–678): tests cover migration-generation item emission, dot-segment rejection, numeric key canonicalization, deterministic schema ordering, empty-set JSON serialization, generation parsing, and malformed-key rejection.
  • S3 Tests (internal/backup/s3_test.go, lines 41–1009): comprehensive tests verify single/multipart assembly, orphan warnings, collision handling, incomplete uploads, key safety (path traversal, backslash, leading slash, dot segments), stale-generation filtering, chunk completeness, and cross-generation isolation.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant DDBEncoder as DynamoDB Encoder
    participant Buffer as Per-Generation Buffer
    participant Finalize as Flush & Emit

    User->>DDBEncoder: HandleTableMeta(meta_key, schema_proto)
    DDBEncoder->>DDBEncoder: Decode schema, validate
    
    User->>DDBEncoder: HandleItem(item_key, item_proto)
    DDBEncoder->>DDBEncoder: Parse key → extract generation
    DDBEncoder->>Buffer: Append item under its generation
    
    Note over User,DDBEncoder: (possibly multiple calls with mixed generations)
    
    User->>Finalize: Finalize()
    Finalize->>Finalize: Compute emitOrder: [migration_source_gen, active_gen]
    Finalize->>Finalize: Filter stale generations, warn
    loop For each generation in emitOrder
        Finalize->>Finalize: Canonicalize numeric keys
        Finalize->>Finalize: Write items to per-gen .json file
    end
    Finalize-->>User: ✓ Deterministic output
sequenceDiagram
    participant User
    participant S3Encoder as S3 Encoder
    participant Scratch as Scratch Storage
    participant Manifest as Manifest/Metadata
    participant Assemble as Assembly & Finalize

    User->>S3Encoder: HandleBucketMeta(key, bucket_meta_json)
    S3Encoder->>Manifest: Store bucket state & active generation

    User->>S3Encoder: HandleObjectManifest(key, manifest_json)
    S3Encoder->>Manifest: Decode manifest, extract uploadID & parts

    User->>S3Encoder: HandleBlob(blob_key, chunk_bytes)
    S3Encoder->>S3Encoder: Parse blob key → (bucket, object, partNo, chunkNo)
    S3Encoder->>Scratch: Write chunk atomically, validate path safety

    Note over User,S3Encoder: (possibly interleaved across buckets/objects)

    User->>Assemble: Finalize()
    Assemble->>Assemble: Filter chunks by manifest uploadID & declared parts
    Assemble->>Assemble: Verify all (partNo, partVersion, chunkNo) indexes present
    Assemble->>Scratch: Read chunks in sorted (partNo, partVersion, chunkNo) order
    Assemble->>Assemble: Assemble into final body, emit .elastickv-meta.json sidecar
    Assemble->>Assemble: Detect/handle collisions (reserved suffix, leaf-data dir)
    Assemble->>Assemble: Record renames in KEYMAP.jsonl (if enabled)
    Assemble->>Assemble: Suppress stale-generation objects, warn
    Assemble->>Scratch: Delete scratch subtree
    Assemble-->>User: ✓ Assembled objects + sidecars

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

  • elastickv#706: Establishes the logical backup format design, including per-adapter filename encoding rules, generation handling, collision-rename behavior, and reserved-suffix/orphan semantics that this PR operationalizes for DynamoDB and S3.

Poem

🐰 A rabbit hops through generations bold,
Shuffles items, keeps them sorted, true to told.
Blobs dance in parts, chunks in place aligned,
Collisions dodged, sidecars signed.
Backup fields bloom—deterministic, sound, refined!

🚥 Pre-merge checks: 4 passed, 1 warning

❌ Failed checks (1 warning)

  • Docstring Coverage: ⚠️ Warning. Docstring coverage is 36.14%, below the required 80.00% threshold. Resolution: write docstrings for the functions missing them.

✅ Passed checks (4)

  • Description Check: ✅ Passed. Check skipped; CodeRabbit's high-level summary is enabled.
  • Title check: ✅ Passed. The title accurately describes the main change, implementing a DynamoDB encoder for backup Phase 0a that handles tables and items.
  • Linked Issues check: ✅ Passed. Check skipped because no linked issues were found for this pull request.
  • Out of Scope Changes check: ✅ Passed. Check skipped because no linked issues were found for this pull request.


Contributor

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request introduces a DynamoDB encoder designed to export table schemas and items into a structured filesystem layout for logical backups. The implementation handles the conversion of internal protobuf representations into AWS-compatible JSON formats, supporting various DynamoDB attribute types and secondary indexes. Feedback suggests adopting a fail-fast error handling strategy during the finalization phase when flushing tables and refining the serialization of set attributes to ensure consistent JSON output for empty or nil collections.

Comment thread internal/backup/dynamodb.go Outdated
Comment on lines +170 to +182
```go
var firstErr error
for _, st := range d.tables {
	if st.schema == nil {
		d.emitWarn("ddb_orphan_items",
			"encoded_table", st.encoded,
			"buffered_items", len(st.items))
		continue
	}
	if err := d.flushTable(st); err != nil && firstErr == nil {
		firstErr = err
	}
}
return firstErr
```


medium

The current error handling in Finalize correctly implements a lenient path for incomplete entries (orphan items) by emitting a warning instead of failing or dropping them. This aligns with the repository rule to avoid silently dropping entries during serialization to ensure data preservation on disk. For actual flush errors, failing fast is appropriate.

Suggested change (fail fast instead of deferring the first error):

```go
for _, st := range d.tables {
	if st.schema == nil {
		d.emitWarn("ddb_orphan_items",
			"encoded_table", st.encoded,
			"buffered_items", len(st.items))
		continue
	}
	if err := d.flushTable(st); err != nil {
		return err
	}
}
return nil
```
References
  1. Avoid silently dropping entries during serialization or normalization. If placeholder or incomplete entries are valid in some contexts, provide a specific, more lenient handling path for them to ensure they are preserved on disk and avoid state inconsistencies.

Comment on lines +432 to +442
```go
func setAttributeValueToPublic(av *pb.DynamoAttributeValue) map[string]any {
	switch v := av.GetValue().(type) {
	case *pb.DynamoAttributeValue_Ss:
		return map[string]any{"SS": append([]string{}, v.Ss.GetValues()...)}
	case *pb.DynamoAttributeValue_Ns:
		return map[string]any{"NS": append([]string{}, v.Ns.GetValues()...)}
	case *pb.DynamoAttributeValue_Bs:
		return map[string]any{"BS": append([][]byte{}, v.Bs.GetValues()...)}
	}
	return nil
}
```


medium

The current implementation of setAttributeValueToPublic can produce null in the JSON output for nil set slices. To avoid state inconsistencies during normalization and ensure entries are correctly preserved, handle nil set slices to produce an empty JSON array [] instead of null.

Suggested change:

```go
func setAttributeValueToPublic(av *pb.DynamoAttributeValue) map[string]any {
	switch v := av.GetValue().(type) {
	case *pb.DynamoAttributeValue_Ss:
		return map[string]any{"SS": append([]string(nil), v.Ss.GetValues()...)}
	case *pb.DynamoAttributeValue_Ns:
		return map[string]any{"NS": append([]string(nil), v.Ns.GetValues()...)}
	case *pb.DynamoAttributeValue_Bs:
		return map[string]any{"BS": append([][]byte(nil), v.Bs.GetValues()...)}
	}
	return nil
}
```
References
  1. Avoid silently dropping entries during serialization or normalization to ensure they are preserved on disk and avoid state inconsistencies.


@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 63bfe8db9b

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread internal/backup/dynamodb.go Outdated
Comment on lines +233 to +237
```go
enc := rest[:idx]
if _, err := base64.RawURLEncoding.DecodeString(enc); err != nil {
	return "", errors.Wrap(ErrDDBMalformedKey, err.Error())
}
return enc, nil
```


P1: Preserve item generation when grouping DynamoDB rows

Item keys encode !ddb|item|<table>|<generation>|..., but this parser drops everything after the first separator and the encoder buffers rows only by table name. In real clusters, old-generation rows can coexist temporarily (e.g., delete/recreate before async cleanup finishes), and those stale rows will be emitted under the current schema; this can silently restore deleted data or make Finalize fail when primary-key names changed across generations. The backup path should retain generation and filter rows against the active schema generation (and any explicitly supported migration generation) instead of merging all generations.


Builds on PR #716 (DynamoDB) and PR #717 (s3keys.ParseBlobKey).
Adds the S3 encoder for the Phase 0 logical-backup decoder.

Snapshot prefixes handled:
- !s3|bucket|meta|<bucket> -> s3/<bucket>/_bucket.json (live
  s3BucketMeta projected into the dump-format s3PublicBucket; cluster-
  internal fields like CreatedAtHLC and Generation stripped).
- !s3|obj|head|<bucket><gen><object> -> s3/<bucket>/<object>.elastickv-meta.json
  (live s3ObjectManifest projected into the dump-format
  s3PublicManifest; UploadID and per-part chunk arrays stripped).
- !s3|blob|<bucket><gen><object><uploadID><partNo><chunkNo>[<partVersion>]:
  spilled to a per-(bucket, object) scratch directory as it arrives;
  assembled at Finalize by walking sortChunkKeys output (partNo,
  partVersion, chunkNo) and concatenating into the final body file
  via writeFileAtomic-style tmp+rename. Memory: O(num_objects); body
  bytes never held in memory.
- !s3|upload|meta|, !s3|upload|part|: excluded by default;
  --include-incomplete-uploads emits them as opaque {prefix, key,
  value} JSONL records under s3/<bucket>/_incomplete_uploads/.
- !s3|bucket|gen|, !s3|gc|upload|, !s3route|: HandleIgnored no-op so
  the master pipeline can dispatch all !s3|* prefixes uniformly.

Reserved-suffix collision:
- A user object key ending in .elastickv-meta.json is rejected with
  ErrS3MetaSuffixCollision by default; --rename-collisions appends
  .user-data and records the rename in s3/<bucket>/KEYMAP.jsonl with
  KindMetaCollision so the on-disk reverse map is sound.
- Per-bucket KEYMAP.jsonl is opened lazily and dropped if Finalize
  finds zero records (the spec's "omit empty file" rule).

Body assembly:
- Each blob chunk is written atomically to scratch as it arrives;
  the chunk path is registered in the object's chunkPaths map keyed
  by (uploadID, partNo, chunkNo, partVersion).
- Finalize sorts the chunk keys by (partNo, partVersion, chunkNo)
  and concatenates the matching scratch files into a single
  <bodyPath> via tmp+rename (see the sketch after this list).
- The scratch tree is rm -rf'd on Finalize via deferred RemoveAll,
  so a successful run leaves nothing behind.
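
A minimal sketch of that sort-and-concatenate step under assumed names (chunkKey, chunkPaths); the real code writes through a tmp file and renames it into place:

```go
package backup

import (
	"io"
	"os"
	"sort"
)

// chunkKey is the routing key described above; partVersion breaks ties
// between rewrites of the same part.
type chunkKey struct{ partNo, partVersion, chunkNo uint64 }

// concatChunks streams scratch chunks into w in (partNo, partVersion,
// chunkNo) order; body bytes are never held in memory.
func concatChunks(w io.Writer, chunkPaths map[chunkKey]string) error {
	keys := make([]chunkKey, 0, len(chunkPaths))
	for k := range chunkPaths {
		keys = append(keys, k)
	}
	sort.Slice(keys, func(i, j int) bool {
		a, b := keys[i], keys[j]
		if a.partNo != b.partNo {
			return a.partNo < b.partNo
		}
		if a.partVersion != b.partVersion {
			return a.partVersion < b.partVersion
		}
		return a.chunkNo < b.chunkNo
	})
	for _, k := range keys {
		f, err := os.Open(chunkPaths[k])
		if err != nil {
			return err
		}
		_, err = io.Copy(w, f)
		if cerr := f.Close(); err == nil {
			err = cerr
		}
		if err != nil {
			return err
		}
	}
	return nil
}
```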

Tests: bucket meta + single-chunk object round-trip, multipart
ordering verification (chunks emitted out of order, body assembled
in part/chunk order), orphan-chunks warning, .elastickv-meta.json
collision rejection (default) and rename (with flag) including
KEYMAP entry, malformed JSON rejection on both manifest and bucket
paths, ignored-prefix no-op, --include-incomplete-uploads round-
trip, default no _incomplete_uploads dir, versioned blob assembly.
bootjp added 2 commits April 30, 2026 19:33
Three issues, all valid.

#237 Codex P1 -- preserve item generation when grouping rows.
Item keys carry !ddb|item|<table>|<generation>|... but the prior
parser dropped the generation and Finalize merged all generations
under the active schema. In real clusters, in-flight delete/recreate
cleanup leaves stale-gen rows visible in the snapshot for a window;
those rows would silently restore as items under the new schema
(potentially resurrecting deleted data, or failing Finalize when
PK names changed across generations).

Fix:
- parseDDBItemKey now returns (encodedTable, generation).
- ddbTableState.items -> itemsByGen map[uint64][]*pb.DynamoItem.
- flushTable filters by st.schema.GetGeneration() and emits a
  ddb_stale_generation_items warning carrying the count and active
  gen so the operator can correlate the orphan window.

#182 Gemini -- fail-fast on flush errors.
Finalize previously deferred the first error and continued through
remaining tables. Real flush errors (out of disk, bad permissions)
should surface immediately, not be misattributed to a later table
whose flush also fails. Orphan-table warnings still continue (those
are informational, not errors).

#442 Gemini -- empty set serializes as [] not null.
setAttributeValueToPublic now uses make+append so nil/empty SS/NS/BS
become "{\"SS\":[]}" rather than "{\"SS\":null}". Downstream tools
that distinguish "present-but-empty" from "missing" no longer see
a misleading null.

Tests: TestDDB_StaleGenerationItemsExcludedAndWarned,
TestDDB_EmptyStringSetSerializesAsEmptyArrayNotNull,
TestDDB_ParseItemKeyExtractsGeneration. Existing fixtures updated
to set Generation=1 on schemas (matching the item key gen).
@bootjp
Owner Author

bootjp commented Apr 30, 2026

3 issues addressed in fdc9b36b:

Codex P1 #237 — Preserve item generation when grouping rows
The most important one. parseDDBItemKey now returns (encodedTable, generation); ddbTableState.items → itemsByGen map[uint64][]*pb.DynamoItem. flushTable filters by st.schema.GetGeneration() and emits a ddb_stale_generation_items warning for stale rows. Stale-gen rows from in-flight delete/recreate cleanup no longer silently restore under the new schema.

Gemini MEDIUM #182 — Fail fast on flush errors
Finalize previously deferred the first error. Now real flush errors (no-disk, bad permissions) surface immediately, not misattributed to a later table. Orphan-table warnings still continue (informational, not errors).

Gemini MEDIUM #442 — Empty set → [] not null
setAttributeValueToPublic uses make + append so nil/empty SS/NS/BS render as {"SS":[]} rather than {"SS":null}. Downstream tools that distinguish "present-but-empty" from "missing" no longer see a misleading null.

Tests: TestDDB_StaleGenerationItemsExcludedAndWarned, TestDDB_EmptyStringSetSerializesAsEmptyArrayNotNull, TestDDB_ParseItemKeyExtractsGeneration.

@claude review

@claude

claude Bot commented Apr 30, 2026

Claude finished @bootjp's task in 3m 30s —— View job


Review: fdc9b36 — backup: address review on DynamoDB encoder (PR #716)

  • Read PR context and understand request
  • Read internal/backup/dynamodb.go
  • Read internal/backup/dynamodb_test.go
  • Diff against base branch (origin/feat/backup-phase0a-sqs)
  • Review: Data loss lens
  • Review: Concurrency/distributed failures lens
  • Review: Performance lens
  • Review: Data consistency lens
  • Review: Test coverage lens

All three review comments are correctly addressed. Details per lens below.


Lens 1 — Data loss

All file writes go through writeFileAtomic (tmp+rename), so no partial-file risk. os.MkdirAll errors propagate correctly. The active := st.itemsByGen[activeGen] map lookup returns nil for a table with no items at activeGen, so range nil is a no-op — a schema-only table emits _schema.json with zero item files, which is correct.

No data loss issues.


Lens 2 — Concurrency / distributed failures

DDBEncoder has no locking. The Handle-then-Finalize lifecycle is documented as sequential, which is the right contract for an offline backup pipeline. No issues.


Lens 3 — Performance

totalItemsAcrossGens and totalStaleItems are O(items) and only called at Finalize time. ensureItemsByGen (line 73) is called on every HandleItem invocation, returning the existing map unchanged after the first call. This is a redundant nil-check on the hot path for high-item-count tables — an inline if st.itemsByGen == nil { st.itemsByGen = make(...) } in HandleItem is slightly cleaner, but this is a nit with no practical impact at the item counts a backup pipeline would see.

No performance issues.


Lens 4 — Data consistency

Codex P1 fix is correct. parseDDBItemKey extracts (encodedTable, generation) unambiguously — base64url doesn't contain |, so the two |-splits after the prefix are reliable separators. flushTable correctly restricts to st.itemsByGen[activeGen] and warns on any gen != activeGen.

One unaddressed edge case worth documenting: MigratingFromGeneration.

DynamoTableSchema has a migrating_from_generation field (proto field 6). During a live key-encoding migration — not a delete/recreate — items may legitimately coexist at migration_from_generation while the migration converges. A backup taken mid-migration would count those items as stale and emit a ddb_stale_generation_items warning but exclude them from the dump. This is conservative (no resurrection risk), but an operator restoring from such a backup would silently lose the in-flight items.

Phase 0a is an offline snapshot decoder and the risk is low (migrations are presumably rare and the operator can re-snapshot after convergence). However, the current comment in flushTable only describes the delete/recreate scenario. It would be worth either:

  • Adding a note that items at migratingFromGeneration are also excluded and will appear in the warning count, or
  • Emitting a separate ddb_migration_in_progress warning when schema.GetMigratingFromGeneration() != 0 so the operator knows to re-snapshot.


Non-deterministic _schema.json output. schemaToPublic iterates s.GetAttributeDefinitions() and s.GetGlobalSecondaryIndexes() (both Go maps) to build the attribute_definitions and global_secondary_indexes slices. The resulting JSON arrays have non-deterministic element order across runs. This doesn't affect correctness (JSON arrays of objects are compared by content, not order), but it makes diff-based backup comparison noisy. A sort.Slice by Name would make output stable and reproducible with zero cost.
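
A sketch of the suggested stabilization, assuming a slice-of-struct shape with a Name field; the actual proto getters may differ:

```go
package backup

import "sort"

type attrDef struct{ Name, Type string }

// Sorting before marshalling makes _schema.json reproducible: slices built
// from Go map iteration otherwise come out in random order across runs.
func sortAttrDefs(defs []attrDef) {
	sort.Slice(defs, func(i, j int) bool { return defs[i].Name < defs[j].Name })
}
```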



Lens 5 — Test coverage

Three new tests look good:

  • TestDDB_StaleGenerationItemsExcludedAndWarned — covers the primary Codex fix path, confirms stale item not written and warning fires.
  • TestDDB_EmptyStringSetSerializesAsEmptyArrayNotNull — directly exercises the Gemini #442 fix; uses json.Marshal to verify wire output.
  • TestDDB_ParseItemKeyExtractsGeneration — unit tests the new return value of parseDDBItemKey.

One gap: TestDDB_StaleGenerationItemsExcludedAndWarned has one stale generation (gen=4) and one active (gen=5). A follow-up test with multiple stale generations (e.g., gen=3 and gen=4 both present, active gen=5) would confirm totalStaleItems aggregates across all non-active gens and that stale_count in the warning reflects the combined count.


Summary

Findings and severity:

  • Codex P1 (generation filtering): ✅ Correctly fixed
  • Gemini #182 (fail-fast flush errors): ✅ Correctly fixed
  • Gemini #442 (empty set → []): ✅ Correctly fixed
  • MigratingFromGeneration excluded without distinct warning: Low (conservative but silent)
  • Non-deterministic _schema.json ordering: Low (readability / diffability)
  • Single-stale-gen test only: Low (coverage gap)

The core correctness fixes are solid. The two "Low" findings would improve operator experience but don't affect backup safety. Happy to address any of them if you want them in this PR.

bootjp added 2 commits April 30, 2026 19:38
Three correctness bugs and several quality issues, all valid.

CRITICAL: path traversal (Codex P1 #425).
S3 object keys are user-controlled. A key like "../../../etc/passwd"
would, under the prior filepath.Join-and-write code, escape the
bucket directory and overwrite host files. Added safeJoinUnderRoot
which Cleans the joined path and asserts it stays rooted under the
bucket dir; ".." traversal is rejected with ErrS3MalformedKey.
Absolute paths (leading "/") are normalised under the bucket dir by
filepath.Join, which is the safest outcome. Tests:
TestS3_PathTraversalAttemptRejected,
TestS3_AbsolutePathObjectKeyConfinedUnderBucket.
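
A minimal sketch of the Clean-then-assert shape described here; note that a later round tightens this to reject dot segments outright rather than letting Clean collapse them:

```go
package backup

import (
	"errors"
	"path/filepath"
	"strings"
)

var ErrS3MalformedKey = errors.New("s3: malformed key")

// safeJoinUnderRoot joins key under root and rejects any result that
// escapes root after Clean (e.g. "../../../etc/passwd"). A leading "/"
// is harmless: filepath.Join keeps the result under root.
func safeJoinUnderRoot(root, key string) (string, error) {
	joined := filepath.Clean(filepath.Join(root, key))
	if joined != root && !strings.HasPrefix(joined, root+string(filepath.Separator)) {
		return "", ErrS3MalformedKey
	}
	return joined, nil
}
```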

CRITICAL: stale upload-id chunks merged into body (Codex P1 #500,
Gemini HIGH #106/#476/#504).
A snapshot mid-delete-and-recreate or mid-retry can carry blob
chunks for multiple upload attempts under the same (bucket, gen,
object). The prior assembleObjectBody concatenated every chunk
regardless of upload_id, producing corrupted bytes. Now:
- s3ObjectState gains uploadID; HandleObjectManifest sets it.
- New filterChunksForManifest takes the chunkPaths map and the
  manifest's uploadID, returns only matching chunks sorted by
  (partNo, partVersion, chunkNo). Stale-uploadID chunks never
  enter the assembled body.
Test: TestS3_StaleUploadIDChunksFilteredFromAssembledBody.

CRITICAL: incomplete-uploads file truncated per record (Codex P2
#318, Gemini HIGH+MEDIUM #318).
HandleIncompleteUpload re-opened records.jsonl on every call;
openJSONL uses os.Create which truncates. Only the last record
survived per bucket. Now:
- s3BucketState carries an incompleteUploadsJL *jsonlFile lazily
  opened on the first record and cached.
- flushBucket closes it and surfaces the error (was silently
  ignored).
Test: TestS3_IncompleteUploadsAppendsAcrossCalls (3 records, 3
lines on disk).
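
A sketch of the lazy-open-once pattern that replaces the truncating open-per-record bug; jsonlFile and the field name follow the commit text, the rest is assumption:

```go
package backup

import (
	"encoding/json"
	"os"
)

type jsonlFile struct {
	f   *os.File
	enc *json.Encoder
}

type s3BucketState struct {
	incompleteUploadsJL *jsonlFile // opened on first record, then cached
}

// appendIncompleteUpload opens records.jsonl once in append mode (os.Create
// would truncate, keeping only the last record) and reuses the handle.
func (b *s3BucketState) appendIncompleteUpload(path string, rec any) error {
	if b.incompleteUploadsJL == nil {
		f, err := os.OpenFile(path, os.O_CREATE|os.O_WRONLY|os.O_APPEND, 0o644)
		if err != nil {
			return err
		}
		b.incompleteUploadsJL = &jsonlFile{f: f, enc: json.NewEncoder(f)}
	}
	return b.incompleteUploadsJL.enc.Encode(rec)
}
```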

QUALITY:
- Gemini MEDIUM #285 (MkdirAll per blob): s3ObjectState gains a
  scratchDirCreated bool; HandleBlob runs MkdirAll once.
- Gemini MEDIUM #318 (closeJSONL error ignored): keymap and
  incomplete-uploads writers now surface close errors via
  closeBucketKeymap / explicit closeJSONL return-check.
- Gemini MEDIUM #386 (includeOrphans flag ignored): orphan chunks
  for objects without manifests now write to
  <bucket>/_orphans/<encoded-object>/<chunk>.bin under
  WithIncludeOrphans(true). Test:
  TestS3_OrphanChunksWrittenWhenIncludeOrphans.

Also removed the now-unused attachManifestParts placeholder and
sortChunkKeys helper; their logic moved into
filterChunksForManifest which combines uploadID filtering with the
sort.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: fdc9b36b6b

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread internal/backup/dynamodb.go Outdated
"stale_count", stale,
"hint", "stale-gen rows are excluded from the dump; restore would otherwise emit them under the new schema")
}
active := st.itemsByGen[activeGen]


P1: Include migration source generation when emitting DynamoDB items

flushTable now emits only st.itemsByGen[activeGen], but DynamoDB tables can be in a live migration state where schema.generation is new and schema.migrating_from_generation still contains readable data; the read path explicitly falls back to that source generation during migration (see readLogicalItemAt in adapter/dynamodb.go). In that state, this filter silently drops unmigrated rows from backups, so restoring the dump loses logically visible items if a snapshot is taken mid-migration. Fresh evidence in this revision is the hard filter at this line with no migrating_from_generation inclusion path.


bootjp added 2 commits April 30, 2026 19:59
… round 2)

Codex P1 #227: during a generation migration, schema.Generation
is the new gen and schema.MigratingFromGeneration carries the
source. The live read path (adapter/dynamodb.go readLogicalItemAt)
falls back to the source for items not yet copied. The previous
flushTable filtered to only schema.GetGeneration() and silently
dropped unmigrated rows from the dump.

flushTable now emits items from BOTH generations. Order matters:
the source is emitted FIRST, then the active gen, so when both
generations carry the same primary key the active-gen content is
the one that survives writeFileAtomic's tmp+rename — matching the
live code's "new gen wins on conflict" rule.

The stale-generation warning was generalised to take an emit-set
(totalStaleItemsExcluding); rows in any other generation are
counted into the warning so the operator still sees the orphan
window.

Tests: TestDDB_MigrationSourceGenerationItemsAreEmitted,
TestDDB_NewGenerationWinsOverMigrationSourceForSameKey.
@bootjp
Owner Author

bootjp commented Apr 30, 2026

Codex P1 #227 addressed in ce5b7dac. During a generation migration, schema.Generation is the new gen and schema.MigratingFromGeneration carries the source. The live read path (adapter/dynamodb.go readLogicalItemAt) falls back to the source for items not yet copied; my previous filter to activeGen only would silently drop unmigrated rows.

flushTable now emits from both generations. The source is emitted first, then the active gen, so when both carry the same primary key the active-gen content is the one that survives writeFileAtomic's tmp+rename — matching the live code's "new gen wins on conflict" rule.

The stale-generation warning was generalised (totalStaleItemsExcluding) so rows in any OTHER generation are still counted and surfaced.

Tests: TestDDB_MigrationSourceGenerationItemsAreEmitted (both gens emitted), TestDDB_NewGenerationWinsOverMigrationSourceForSameKey (collision resolves to new-gen content).

@claude review

@claude

claude Bot commented Apr 30, 2026

Claude finished @bootjp's task in 4m 21s —— View job




Review: ce5b7da — backup: emit migrating-from-generation items in DynamoDB dump (PR #716, round 2)

  • Read PR context and understand request
  • Read internal/backup/dynamodb.go
  • Read internal/backup/dynamodb_test.go
  • Read adapter/dynamodb.go (readLogicalItemAt) for alignment check
  • Diff against base branch (origin/feat/backup-phase0a-sqs)
  • Review: Data loss lens
  • Review: Concurrency/distributed failures lens
  • Review: Performance lens
  • Review: Data consistency lens
  • Review: Test coverage lens

Codex P1 #227 is correctly addressed. Details per lens below.


Lens 1 — Data loss

writeFileAtomic is tmp+rename, so two writes to the same path are race-free and the last writer wins on disk. The emit order (source gen first, active gen last) ensures the active gen content is the survivor on any key collision — matching the live read preference exactly.

One subtlety verified: parseDDBItemKey is correct for both legacy and ordered key encodings. The ordered encoding (dynamoItemKey) puts binary key data after the second |, but parseDDBItemKey stops at the second | — it never touches the binary suffix. The generation is decimal digits, so the second | in the key is unambiguous.

No data loss issues.


Lens 2 — Concurrency / distributed failures

Sequential Handle-then-Finalize contract, no locking needed. No issues.


Lens 3 — Performance

totalStaleItemsExcluding allocates a small map[uint64]struct{} with at most 2 entries (activeGen + migrationSourceGen) per flushTable call. Cost is O(distinct generations), not O(items). No issues.


Lens 4 — Data consistency

The emit order correctly mirrors readLogicalItemAt:

readLogicalItemAt:  try activeGen → found? return it. else try sourceGen.
flushTable:         emit sourceGen first → emit activeGen last → activeGen wins via rename.

Both paths agree: new gen wins on conflict.

migrationSourceSchema() alignment check. The live migrationSourceSchema() sets KeyEncodingVersion: 0 for the source gen, but copies PrimaryKey and AttributeDefinitions unchanged from the active schema. Since flushTable uses hashKey/rangeKey from the active schema (st.schema.GetPrimaryKey()), and both gens share the same PK attribute names, writeDDBItem correctly derives file paths for items from either gen.

emitOrder edge cases correct:

  • migrationSourceGen == 0 → no-migration state → emitOrder = [activeGen] only ✅
  • migrationSourceGen == activeGen → guarded by migrationSourceGen != activeGen → no duplicate ✅

No data consistency issues.


Lens 5 — Test coverage

Two new tests cover the core fix correctly:

  • TestDDB_MigrationSourceGenerationItemsAreEmitted — distinct PKs in each gen; verifies both files land on disk ✅
  • TestDDB_NewGenerationWinsOverMigrationSourceForSameKey — same PK in both gens; verifies v.S == "new-version" survives ✅

One coverage gap — three-generation scenario:

TestDDB_StaleGenerationItemsExcludedAndWarned uses a schema with MigratingFromGeneration == 0, so emitOrder == [5] and the stale item at gen=4 is excluded. No test exercises the case where all three are present: migrationSourceGen (included), activeGen (included), and an even-older gen (stale + warned). A table in mid-migration that also has cleanup debris from a prior delete/recreate would hit this path. totalStaleItemsExcluding is correct by inspection, but this scenario is untested.


Carried over from prior round — still unaddressed:

schemaToPublic iterates GetAttributeDefinitions() and GetGlobalSecondaryIndexes() (Go maps) to build the output slices. The _schema.json arrays have non-deterministic element order across runs, making diff-based backup comparison noisy.



Summary

Findings and severity:

  • Codex P1 #227 (include migration-source-gen items): ✅ Correctly fixed
  • Emit order matches live readLogicalItemAt preference: ✅ Verified
  • parseDDBItemKey correct for ordered encoding: ✅ Verified
  • Same-PK collision resolves to active gen via tmp+rename: ✅ Verified
  • Three-generation scenario (stale + source + active) untested: Low
  • Non-deterministic _schema.json output (carried from prior round): Low

Core correctness is solid. The two Low findings are improvements for operator experience/testability, not correctness gaps.

Three follow-up issues from Codex on top of round 1.

#619 Codex P1 -- assemble only manifest-declared parts.
filterChunksForManifest previously matched on uploadID only; the
manifest declares specific (partNo, partVersion) tuples, and S3's
overwrite-then-async-cleanup window can leave older partVersion
chunks present under the same (bucket, generation, object,
uploadID). Mixing them produces corrupted bytes.

s3ObjectState gains declaredParts map[s3PartKey]struct{}, populated
by HandleObjectManifest. filterChunksForManifest takes the set and
restricts emission to declared (partNo, partVersion) tuples. Test
TestS3_StalePartVersionExcludedFromAssembledBody asserts a stale
partVersion=7 cannot leak when the manifest declares partVersion=9.
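
A sketch of that restriction; blobRef and s3PartKey here are assumed shapes, but the two filters (uploadID match, declared-tuple membership) are exactly what the commit describes:

```go
package backup

// s3PartKey is a manifest-declared (partNo, partVersion) tuple.
type s3PartKey struct{ partNo, partVersion uint64 }

// blobRef describes one spilled chunk awaiting assembly.
type blobRef struct {
	uploadID                     string
	partNo, partVersion, chunkNo uint64
	scratchPath                  string
}

// filterChunksForManifest keeps only chunks that belong to the manifest's
// uploadID AND a declared (partNo, partVersion) tuple; everything else is
// debris from stale upload attempts or overwritten parts.
func filterChunksForManifest(chunks []blobRef, uploadID string, declared map[s3PartKey]struct{}) []blobRef {
	out := make([]blobRef, 0, len(chunks))
	for _, c := range chunks {
		if c.uploadID != uploadID {
			continue
		}
		if _, ok := declared[s3PartKey{c.partNo, c.partVersion}]; !ok {
			continue
		}
		out = append(out, c)
	}
	return out
}
```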

#497 Codex P2 -- preserve dot segments.
safeJoinUnderRoot used filepath.Clean which collapses "a/../b" to
"b". S3 treats those bytes literally; "a/../b" and "b" are distinct
keys that would have silently merged into one output file. Now
explicitly rejects any object key whose segments are "." or ".."
with ErrS3MalformedKey. NUL bytes also rejected. Test
TestS3_DotSegmentObjectKeyRejected covers the four forms ("a/../b",
"a/./b", "..", ".").

#521 Codex P2 -- cross-generation collision.
s3BucketState gains activeGen captured from the bucket-meta record.
flushBucket suppresses objects whose generation differs from
activeGen (under --include-orphans, those flow to _orphans/; by
default they're dropped with an s3_stale_generation_objects
warning). Tests TestS3_StaleGenerationObjectExcluded.

flushBucket cyclomatic complexity stayed under the cap by extracting
flushBucketObjects.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ce5b7dac74

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +116 to +117
```go
rawName, err := base64.RawURLEncoding.DecodeString(encoded)
if err != nil {
```


P2: Reject empty table segment in DynamoDB meta keys

HandleTableMeta accepts !ddb|meta|table| as valid because base64.RawURLEncoding.DecodeString("") succeeds, so a truncated/malformed key is treated as table name "" instead of returning ErrDDBMalformedKey. In corrupted snapshots this can route schema output into the wrong scope and mask key corruption that Phase 0 is supposed to surface loudly.


Comment on lines +299 to +303
genEnd := strings.IndexByte(afterTable, '|')
if genEnd <= 0 {
return "", 0, errors.Wrapf(ErrDDBMalformedKey,
"item key missing gen/rest separator: %q", key)
}


P2: Reject item keys with no payload after generation

parseDDBItemKey only checks that a | exists after the generation, but it does not require bytes after that separator, so keys like !ddb|item|<table>|7| are accepted. This lets truncated item keys bypass malformed-key detection and be emitted using only value-side attributes, which can hide snapshot corruption instead of failing fast.


bootjp added 2 commits April 30, 2026 20:12
Two Codex P2 strict-validation follow-ups.

#117 -- empty table-meta segment.
HandleTableMeta accepted `!ddb|meta|table|` (no encoded segment)
because base64.RawURLEncoding.DecodeString("") returns empty bytes
without error, so the schema would route under the empty table
name. Now rejected with ErrDDBMalformedKey before the JSON decode.
TestDDB_RejectsTableMetaKeyWithEmptySegment.

#303 -- truncated item key.
parseDDBItemKey accepted `!ddb|item|<table>|7|` (gen separator
present, no primary-key payload). The gen-end check was
"genEnd > 0" which a trailing `|` satisfies. Added a follow-up
check that genEnd+1 != len(afterTable) so a payload-less key
surfaces as ErrDDBMalformedKey rather than emit under value-side
attributes only. TestDDB_RejectsItemKeyWithEmptyPrimaryKeyPayload.
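
A consolidated sketch of the parser with all three truncation guards from these rounds; the guard logic follows the quoted snippets, the surrounding scaffolding is assumed:

```go
package backup

import (
	"encoding/base64"
	"errors"
	"fmt"
	"strconv"
	"strings"
)

var ErrDDBMalformedKey = errors.New("ddb: malformed key")

// parseDDBItemKey parses "!ddb|item|<table>|<gen>|<payload>" and fails fast
// at all three truncation points: empty table segment, missing gen/rest
// separator, and empty primary-key payload.
func parseDDBItemKey(key string) (string, uint64, error) {
	rest, ok := strings.CutPrefix(key, "!ddb|item|")
	if !ok {
		return "", 0, fmt.Errorf("%w: bad prefix %q", ErrDDBMalformedKey, key)
	}
	tableEnd := strings.IndexByte(rest, '|')
	if tableEnd <= 0 { // catches "!ddb|item||7|..."
		return "", 0, fmt.Errorf("%w: empty table segment %q", ErrDDBMalformedKey, key)
	}
	enc := rest[:tableEnd]
	if _, err := base64.RawURLEncoding.DecodeString(enc); err != nil {
		return "", 0, fmt.Errorf("%w: %v", ErrDDBMalformedKey, err)
	}
	afterTable := rest[tableEnd+1:]
	genEnd := strings.IndexByte(afterTable, '|')
	if genEnd <= 0 {
		return "", 0, fmt.Errorf("%w: missing gen/rest separator %q", ErrDDBMalformedKey, key)
	}
	if genEnd+1 == len(afterTable) { // "!ddb|item|<t>|7|": no payload
		return "", 0, fmt.Errorf("%w: empty primary-key payload %q", ErrDDBMalformedKey, key)
	}
	gen, err := strconv.ParseUint(afterTable[:genEnd], 10, 64)
	if err != nil {
		return "", 0, fmt.Errorf("%w: bad generation: %v", ErrDDBMalformedKey, err)
	}
	return enc, gen, nil
}
```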
@chatgpt-codex-connector

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.
To continue using code reviews, you can upgrade your account or add credits to your account and enable them for code reviews in your settings.

@bootjp
Owner Author

bootjp commented Apr 30, 2026

2 Codex P2 strict-validation follow-ups addressed in 9029addb:

#117 — Empty table-meta segment. HandleTableMeta accepted !ddb|meta|table| because base64.RawURLEncoding.DecodeString("") returns []byte{} without error. Now rejected with ErrDDBMalformedKey before the JSON decode. TestDDB_RejectsTableMetaKeyWithEmptySegment.

#303 — Truncated item key. parseDDBItemKey accepted !ddb|item|<table>|7| (gen separator present but no primary-key payload). Added a genEnd+1 != len(afterTable) guard so a payload-less key surfaces as ErrDDBMalformedKey. TestDDB_RejectsItemKeyWithEmptyPrimaryKeyPayload.

@claude review

@claude

claude Bot commented Apr 30, 2026

Claude finished @bootjp's task in 2m 51s —— View job


Review: 9029add — backup: reject truncated DynamoDB keys (PR #716, round 3)

  • Read PR context and understand request
  • Read internal/backup/dynamodb.go
  • Read internal/backup/dynamodb_test.go
  • Diff against base branch (origin/feat/backup-phase0a-sqs)
  • Review: Data loss lens
  • Review: Concurrency/distributed failures lens
  • Review: Performance lens
  • Review: Data consistency lens
  • Review: Test coverage lens

Both P2 fixes are correct. Details per lens below.


Lens 1 — Data loss

Fix #1 — empty table-meta segment (line 116–122):

stripPrefixSegment("!ddb|meta|table|", "!ddb|meta|table|") returns ("", nil) — no error — because the function simply returns string(key[len(prefix):]). Without the new guard, base64.RawURLEncoding.DecodeString("") also returns ([]byte{}, nil), so the table name resolves to "". EncodeSegment([]byte("")) returns "", so the flush path becomes:

filepath.Join(root, "dynamodb", "", "_schema.json")
  → root/dynamodb/_schema.json

That file lands at the dynamodb/ root, silently overwriting any other _schema.json written there by the same path on a different table (impossible for valid tables, but the behavior is ambiguous garbage). The new guard correctly catches this before any decode or unmarshal. Fix is correct.

Fix #2 — empty primary-key payload (line 315–324):

Without the guard, !ddb|item|<t>|7| passes all prior checks, buffers a DynamoItem from the value, and defers to Finalize. At flush time, writeDDBItem looks up the hash-key attribute in the item's attribute map and derives a path from the value — never from the key payload — so the truncated key itself doesn't produce a file at a wrong path. The real risk is the diagnostic one: a corrupted key silently produces output shaped entirely by the value side, masking the key corruption. The guard surfaces it at HandleItem time with a clear ErrDDBMalformedKey. Fix is correct.

No data loss issues.


Lens 2 — Concurrency / distributed failures

Sequential Handle-then-Finalize contract, no concurrency implications. No issues.


Lens 3 — Performance

Both new checks are O(1) comparisons (encoded == "" and genEnd+1 == len(afterTable)). No impact.


Lens 4 — Data consistency

Trace through both test cases:

TestDDB_RejectsTableMetaKeyWithEmptySegment:

  • stripPrefixSegment("!ddb|meta|table|", "!ddb|meta|table|") → ""
  • encoded == "" guard fires → ErrDDBMalformedKey

TestDDB_RejectsItemKeyWithEmptyPrimaryKeyPayload (key = !ddb|item|dA|7|):

  • rest = "dA|7|"tableEnd = 2enc = "dA" → valid base64url
  • afterTable = "7|"genEnd = 1gen = 7
  • genEnd+1 == len(afterTable)2 == 2ErrDDBMalformedKey

One subtle point worth noting: tableEnd <= 0 (existing guard, line 297) catches the case where an item key has an empty encoded-table segment (e.g., !ddb|item||7|...), since IndexByte returns 0 for a leading |. The new guard for empty item-key payload is complementary: it catches the case where the table and gen are present but the payload after the final | is empty. Together they cover all three truncation points in the <table>|<gen>|<payload> structure.

No data consistency issues.


Lens 5 — Test coverage

Both new tests are correct and well-targeted:

  • TestDDB_RejectsTableMetaKeyWithEmptySegment — passes the bare prefix as the key, exercises the encoded == "" guard, asserts errors.Is(err, ErrDDBMalformedKey)
  • TestDDB_RejectsItemKeyWithEmptyPrimaryKeyPayload — manually constructs !ddb|item|dA|7|, exercises the genEnd+1 == len(afterTable) guard, asserts errors.Is(err, ErrDDBMalformedKey)

Carried from prior rounds — still unaddressed (both Low):

  1. Three-generation scenario untested. totalStaleItemsExcluding correctly handles multiple non-included gens by inspection, but no test exercises stale_count when more than one generation is stale (e.g., schema gen=7, source gen=6, stale item at gen=5). TestDDB_StaleGenerationItemsExcludedAndWarned covers only the one-stale-gen path.


  2. Non-deterministic _schema.json output. schemaToPublic iterates Go maps (GetAttributeDefinitions() and GetGlobalSecondaryIndexes()), so attribute_definitions and global_secondary_indexes arrays have random element order across runs. Diff-based backup verification will see spurious diffs.



Summary

Findings and severity:

  • Codex P2 #117 (empty table-meta segment rejected): ✅ Correctly fixed
  • Codex P2 #303 (empty item-key payload rejected): ✅ Correctly fixed
  • Both guards trace correctly for normal + truncated inputs: ✅ Verified
  • tableEnd <= 0 + new genEnd+1 guards cover all truncation points: ✅ Verified
  • Three-generation scenario (stale + source + active) untested: Low (carried from round 2)
  • Non-deterministic _schema.json output: Low (carried from round 2)

This PR is ready to merge from a correctness standpoint. The two Low items are quality-of-life improvements (testability and backup diff stability) that can land in a follow-up.

bootjp added 5 commits April 30, 2026 20:17
Codex P1 #615: when a bucket contains both `path/to` and
`path/to/sub`, the natural-path mapping puts the shorter key at a
file path and the longer key requires that path to be a parent
directory. POSIX cannot represent both; the prior code would fail
with EEXIST/ENOTDIR at MkdirAll/Rename during finalize, breaking
the documented leaf-data-suffix rename strategy.

Pre-flush computeDirPrefixes() walks every active-gen object key
and accumulates each parent prefix (a key "a/b/c" contributes "a"
and "a/b"). flushBucketObjects checks whether each object's key is
itself a prefix consumed by another active key; if so,
flushObjectWithCollision is called with needsLeafDataRename=true,
which routes resolveObjectFilename onto the existing
S3LeafDataSuffix path. KEYMAP.jsonl records the rename with
KindS3LeafData so restore can reverse it.
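
A sketch of the prefix accumulation: every active key contributes each parent prefix, and any object key present in that set needs the leaf-data rename:

```go
package backup

// computeDirPrefixes returns every parent prefix consumed by the given
// active-gen object keys: "a/b/c" contributes "a" and "a/b".
func computeDirPrefixes(keys []string) map[string]struct{} {
	prefixes := make(map[string]struct{})
	for _, k := range keys {
		for i := 0; i < len(k); i++ {
			if k[i] == '/' && i > 0 {
				prefixes[k[:i]] = struct{}{}
			}
		}
	}
	return prefixes
}

// needsLeafDataRename reports whether key is itself a directory prefix of
// another active key ("path/to" vs "path/to/sub"): POSIX cannot store it
// as both a file and a directory, so the file gets the leaf-data suffix.
func needsLeafDataRename(key string, prefixes map[string]struct{}) bool {
	_, ok := prefixes[key]
	return ok
}
```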

flushObject (the no-collision wrapper) was inlined into
flushObjectWithCollision since it was unused after the call sites
moved to the explicit-flag form.

Test TestS3_FileVsDirectoryKeyCollisionGetsLeafDataRename emits
"path/to" and "path/to/sub", asserts:
  - "path/to/sub" lands at its natural path with body "CHILD"
  - "path/to" is renamed to "path/to.elastickv-leaf-data" with
    body "LEAF"
  - KEYMAP.jsonl carries one record with KindS3LeafData and
    original "path/to"
…718, round 4)

Two Codex P1 follow-ups.

#729 -- chunk completeness check.
assembleObjectBody filtered + sorted whatever chunk files happened
to exist and wrote them, but never verified the manifest's
declared chunk_count was actually present. A partial / racy /
corrupted snapshot would silently emit a truncated body. Added
verifyChunkCompleteness which counts chunks per (partNo,
partVersion) and asserts the count matches manifest.parts[].chunk_count
AND the highest chunkNo equals chunk_count-1. Mismatch surfaces as
ErrS3IncompleteBlobChunks before any bytes are written. The
declaredParts map's value type changed from struct{} to
s3DeclaredPart{chunkCount} to carry the contract.
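
A sketch of the completeness contract, using the s3DeclaredPart{chunkCount} shape this commit describes; the have-map aggregation is assumed:

```go
package backup

import (
	"errors"
	"fmt"
)

var ErrS3IncompleteBlobChunks = errors.New("s3: incomplete blob chunks")

type s3PartID struct{ partNo, partVersion uint64 }
type s3DeclaredPart struct{ chunkCount uint64 }

// verifyChunkCompleteness checks, per declared (partNo, partVersion), that
// exactly chunkCount chunks exist and that chunk numbers are dense
// (0..chunkCount-1), so a partial snapshot cannot emit a truncated body.
func verifyChunkCompleteness(declared map[s3PartID]s3DeclaredPart, have map[s3PartID]map[uint64]struct{}) error {
	for id, part := range declared {
		chunks := have[id]
		if uint64(len(chunks)) != part.chunkCount {
			return fmt.Errorf("%w: part %+v has %d of %d chunks",
				ErrS3IncompleteBlobChunks, id, len(chunks), part.chunkCount)
		}
		var maxNo uint64
		for no := range chunks {
			if no > maxNo {
				maxNo = no
			}
		}
		if part.chunkCount > 0 && maxNo != part.chunkCount-1 {
			return fmt.Errorf("%w: part %+v chunk numbers are not dense",
				ErrS3IncompleteBlobChunks, id)
		}
	}
	return nil
}
```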

#614 -- empty slash segments.
safeJoinUnderRoot rejected `.` and `..` but allowed empty segments
("a//b", "a/", trailing "/"). filepath.Join collapses these, so
distinct S3 keys would silently overwrite each other at finalize.
The dot-segment guard now also rejects "" segments anywhere except
the leading position (where a leading "/" produces an initial
empty segment that filepath.Join handles safely under the
already-tested absolute-path-confined-under-bucket behaviour).

Tests: TestS3_IncompleteChunksRejected,
TestS3_EmptySlashSegmentsRejected (covers a//b, a/, /a//, x/).
@bootjp bootjp mentioned this pull request May 1, 2026
bootjp added 2 commits May 2, 2026 03:01
…a) (#718)

## Summary

Stacked on top of #716 (DynamoDB) and #717 (s3keys.ParseBlobKey). Adds
the S3 encoder for the Phase 0 logical-backup decoder — the most complex
per-adapter piece because it must reassemble multipart object bodies
from independent blob chunks.

**Snapshot prefixes handled:**
- `!s3|bucket|meta|<bucket>` → `s3/<bucket>/_bucket.json`
- `!s3|obj|head|<bucket><gen><object>` →
`s3/<bucket>/<object>.elastickv-meta.json` (sidecar)
-
`!s3|blob|<bucket><gen><object><uploadID><partNo><chunkNo>[<partVersion>]`
→ spilled to scratch on arrival; concatenated into
`s3/<bucket>/<object>` at Finalize in `(partNo, partVersion, chunkNo)`
order
- `!s3|upload|meta|`/`!s3|upload|part|`: excluded by default; opt in via
`--include-incomplete-uploads` to emit under
`_incomplete_uploads/records.jsonl`
- `!s3|bucket|gen|`, `!s3|gc|upload|`, `!s3route|`: ignored

## How body assembly works

Each blob chunk is written atomically to a per-(bucket, object) scratch
directory as it arrives, registered under the `(uploadID, partNo,
chunkNo, partVersion)` routing key. At Finalize, chunks for each object
are sorted and concatenated into a single body file via tmp+rename.
**Body bytes are never held in memory** — only the scratch path map. The
scratch tree is removed on Finalize.

## Reserved-suffix collision handling

A user object key ending in `.elastickv-meta.json` is rejected with
`ErrS3MetaSuffixCollision` by default. `WithRenameCollisions(true)`
appends `.user-data` to the body file and records the rename in
`s3/<bucket>/KEYMAP.jsonl` with `KindMetaCollision` so the dump remains
reversible. Per-bucket `KEYMAP.jsonl` is opened lazily and dropped on
Finalize if no records were written.

## Test plan

- [x] `go test -race ./internal/backup/...` — pass.
- [x] `golangci-lint run ./internal/backup/...` — clean.
- [x] 11 tests covering bucket-meta round-trip, multipart out-of-order
chunk reassembly, orphan-chunks warning, sidecar collision
(rejected/renamed), JSON-decode rejection on both paths, ignored-prefix
no-op, `--include-incomplete-uploads` round-trip + default skip,
versioned blob assembly.

## Stacking

Base: `feat/backup-phase0a-dynamodb` (PR #716). Also depends on
`feat/s3keys-parse-blob-key` (PR #717).

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 1ccd68091a

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread internal/backup/s3.go
```go
}

func (s *S3Encoder) flushBucket(b *s3BucketState) error {
	bucketDir := filepath.Join(s.outRoot, "s3", EncodeSegment([]byte(b.name)))
```


P1: Reject dot-segment bucket names before filesystem join

When a snapshot contains a syntactically valid but invalid bucket segment like "..", HandleBucketMeta accepts it and flushBucket builds bucketDir with filepath.Join(s.outRoot, "s3", EncodeSegment([]byte(b.name))); because EncodeSegment("..") == "..", writes escape the s3/ subtree (e.g., _bucket.json lands under outRoot). This is a path traversal/corruption risk for malformed or untrusted snapshots and should be blocked by validating bucket names or rejecting ./.. before joining.


Codex P1 round 13 (commit 1ccd680):

The PR #718 round-9 fix (`scratchDirForBlob`) rejected
dot-component bucket/object names at the SCRATCH path, but the
per-bucket FINAL output construction in flushBucket went through
`filepath.Join(s.outRoot, "s3", EncodeSegment(b.name))` without
validating the bucket name. EncodeSegment preserves `.` and `..`
(both RFC3986-unreserved), so:

  - bucket name "." → bucketDir collapses to `<outRoot>/s3/` and
    `_bucket.json` lands at the s3-root level instead of in a
    per-bucket subdir.
  - bucket name ".." → bucketDir resolves to `<outRoot>` itself,
    letting `_bucket.json` clobber dump-root files like
    MANIFEST.json.

flushBucket now refuses sole-dot and empty bucket names with
ErrS3MalformedKey.

Caller audit (per loop's audit-semantics rule for fail-closed
changes):
  - Single caller in Finalize loop, which already accumulates
    into firstErr — surfacing ErrS3MalformedKey there is
    consistent with existing malformed-key handling
    (dot-segment object keys, NUL bytes, leading slash, etc.
    all surface through the same error class).

Test: TestS3_FlushBucketRejectsDotSegmentBucketName covers
".", "..", and empty.
@bootjp
Owner Author

bootjp commented May 2, 2026

@claude review

Round 6 fix for Codex P1 round 13 on commit 1ccd680 (commit f710f5e4):

The PR #718 round-9 fix (scratchDirForBlob) rejected dot-component bucket/object names at the scratch path, but the per-bucket final output construction in flushBucket went through filepath.Join(s.outRoot, "s3", EncodeSegment(b.name)) without validating b.name. EncodeSegment preserves . and .. (both RFC3986-unreserved), so:

  • bucket name . → bucketDir collapses to <outRoot>/s3/; _bucket.json lands at the s3-root level.
  • bucket name .. → bucketDir resolves to <outRoot> itself; _bucket.json could clobber dump-root files (MANIFEST.json, etc.).

flushBucket now refuses sole-dot and empty bucket names with ErrS3MalformedKey. Caller audit (per loop's audit-semantics rule): single caller in the Finalize loop already accumulates into firstErr and the surfaced error class matches existing malformed-key handling (dot-segment object keys, NUL bytes, leading slash, etc).

Test: TestS3_FlushBucketRejectsDotSegmentBucketName covers ., .., and empty. go test -race, GOOS=windows go build, golangci-lint all clean.
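
As a standalone sketch, the fail-closed rule amounts to the check below; the function name and error text are illustrative (the committed guard lives inline in flushBucket and wraps ErrS3MalformedKey):

```go
package backupsketch

import "fmt"

// validateBucketSegment rejects bucket names that EncodeSegment passes
// through unchanged but that filepath.Join would collapse (".") or
// walk out of the s3/ subtree (".."). Empty names are refused too.
func validateBucketSegment(name string) error {
	switch name {
	case ".", "..":
		return fmt.Errorf("bucket name %q is a dot segment (would escape the s3/ subtree)", name)
	case "":
		return fmt.Errorf("empty bucket name (would collapse the per-bucket directory)")
	}
	return nil
}
```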

@claude

claude Bot commented May 2, 2026

Claude finished @bootjp's task in 4m 4s


Review: f710f5e — backup: refuse dot-segment S3 bucket names (PR #716, round 6)

  • Read PR context and understand request
  • Read internal/backup/s3.go
  • Read internal/backup/s3_test.go
  • Diff against base branch (origin/feat/backup-phase0a-sqs)
  • Review: Data loss lens
  • Review: Concurrency/distributed failures lens
  • Review: Performance lens
  • Review: Data consistency lens
  • Review: Test coverage lens

Codex P1 round 13 is correctly addressed. Details per lens below.


Lens 1 — Data loss

Guard placement is correct. The switch at s3.go:543–547 runs before filepath.Join (line 548), os.MkdirAll (line 549), and writeFileAtomic (line 553). No directory is created and no file is written for a malformed bucket name.

EncodeSegment confirmed to pass . and .. as-is. backup.EncodeSegment (not s3keys.EncodeSegment) is what flushBucket calls at line 548. isUnreserved('.') returns true (filename.go:290 — case c == '-', c == '.', c == '_'), so EncodeSegment([]byte(".")) == "." and EncodeSegment([]byte("..")) == "..". The decoded-name guard is therefore equivalent to guarding the encoded form. ✓

Interaction with scratchDirForBlob is complementary and correct. scratchDirForBlob (s3.go:460) rejects . and .. at Handle time (fail-fast before any disk write). The new flushBucket guard catches bucket-meta-only records (no blobs) at Finalize time. Together they cover both attack surfaces:

  • Snapshot sends HandleBlob for bucket "." → scratchDirForBlob returns ErrS3MalformedKey immediately.
  • Snapshot sends HandleBucketMeta(".") only (no blobs) → state is parked; flushBucket guard fires at Finalize.

Finalize accumulates firstErr rather than failing fast. When flushBucket returns ErrS3MalformedKey for one bucket, the loop continues and remaining valid buckets are flushed normally. firstErr is returned to the caller. This means a snapshot with one malformed bucket and several valid buckets will write all valid bucket data to disk and then surface the error. This is intentional pre-existing behavior (consistent with how firstErr is used throughout) and correct for backup semantics — a caller that cares can inspect firstErr and decide whether to retain the partial output.

No data loss issues.


Lens 2 — Concurrency / distributed failures

Sequential Handle→Finalize contract. No issues.


Lens 3 — Performance

Three string comparisons in the switch, O(1), called once per bucket at Finalize time. No impact.


Lens 4 — Data consistency

Decoded-name vs encoded-form guard is correct. HandleBucketMeta calls s3keys.ParseBucketMetaKey, which decodes the raw bucket name from the key. b.name is the decoded string. The switch compares against ".", "..", and "" — exactly the values that, if encoded by backup.EncodeSegment, would produce ".", "..", and the empty prefix (which filepath.Join would either collapse or leave ambiguous). Checking the decoded form is equivalent to and simpler than checking the encoded form.

Empty-name path is valid at the key level but blocked at Finalize. s3keys.BucketMetaKey("") produces a structurally valid key (empty segment, just escape+terminator bytes). ParseBucketMetaKey decodes it as ("", true). So HandleBucketMeta succeeds with bucketName = "" and parks the state in s.buckets[""]. The b.name == "" arm of the switch correctly catches this at Finalize time.

"..." (three dots) and longer pure-dot names are safe. EncodeSegment([]byte("...")) == "...". filepath.Join(outRoot, "s3", "...") = outRoot/s3/... — three-dot is an ordinary directory name on Linux/macOS/Windows; filepath.Clean does not collapse it. The guard is intentionally limited to the POSIX-special cases . and ...

No data consistency issues.


Lens 5 — Test coverage

TestS3_FlushBucketRejectsDotSegmentBucketName covers all three cases correctly:

| Case | b.name | Guard arm | Expected |
|---|---|---|---|
| "." | "." | first arm | ErrS3MalformedKey |
| ".." | ".." | first arm | ErrS3MalformedKey |
| "" | "" | second arm | ErrS3MalformedKey |

The test calls HandleBucketMeta (succeeds — correct, because ParseBucketMetaKey accepts all three) then Finalize and asserts errors.Is(err, ErrS3MalformedKey). ✓

Interaction with TestS3_HandleBlobRejectsScratchEscape: That test covers the scratch-path guard for HandleBlob with bucket = "." / "..". The new test covers the final-path guard for flushBucket. Together they leave no gap. ✓

Carried from prior rounds (DynamoDB encoder only — both Low):

  1. Three-generation scenario untested. totalStaleItemsExcluding in the DynamoDB encoder is correct by inspection, but no test exercises stale_count when more than one generation is stale (schema gen=7, migrationSource=6, debris at gen=5).


  2. No Finalize-level N-key migration-conflict test. TestDDB_CanonicalNumberKeySegment unit-tests canonicalization; TestDDB_NewGenerationWinsOverMigrationSourceForSameKey uses S-type keys. An end-to-end Finalize test with N-type hash keys in both generations would directly reproduce the round 9 bug scenario.


Summary

| Finding | Severity |
|---|---|
| Codex P1 round 13: dot-segment + empty bucket name rejected in flushBucket | ✅ Correctly fixed |
| EncodeSegment('.') and ('..') confirmed to pass through unmodified | ✅ Verified |
| Guard fires before filepath.Join, os.MkdirAll, and writeFileAtomic | ✅ Verified |
| scratchDirForBlob (Handle-time) + flushBucket (Finalize-time) guards cover all paths | ✅ Verified |
| "" bucket name: accepted by ParseBucketMetaKey, caught by flushBucket guard | ✅ Verified |
| "..." and longer pure-dot names: safe from path traversal (not collapsed by filepath.Clean) | ✅ Verified safe |
| Finalize accumulates firstErr (valid buckets flush even when one is malformed) | Intentional pre-existing behavior |
| Three-generation scenario (DynamoDB encoder) untested | Low — carried from round 3 |
| No Finalize-level N-key migration conflict test (DynamoDB encoder) | Low — carried from round 4 |

The fix is solid. The two Low findings are DynamoDB-side test coverage gaps carried from prior rounds; they do not affect S3 encoder correctness.

@chatgpt-codex-connector Bot left a comment

💡 Codex Review


Reviewed commit: f710f5e496


Comment thread internal/backup/s3.go
Comment on lines +493 to +494
	dir := filepath.Join(s.outRoot, "s3", EncodeSegment([]byte(bucket)), "_incomplete_uploads")
	if err := os.MkdirAll(dir, 0o755); err != nil { //nolint:mnd // 0755 == standard dir mode

P1 — Validate bucket segment before incomplete-upload path join

When WithIncludeIncompleteUploads(true) is enabled, this path is built directly from the parsed bucket name without rejecting ".", "..", or empty segments. A malformed snapshot key such as an upload record for bucket ".." will make filepath.Join escape the s3/ subtree and create/write records.jsonl outside the intended output directory before Finalize’s bucket-name guard runs, which is a filesystem traversal/corruption risk for untrusted or damaged dumps.
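
The escape is easy to reproduce with the standard library alone; this standalone snippet (paths purely illustrative) shows what the unvalidated join does:

```go
package main

import (
	"fmt"
	"path/filepath"
)

func main() {
	outRoot := "/dump"
	// filepath.Join runs Clean, so a ".." bucket segment walks out of s3/:
	fmt.Println(filepath.Join(outRoot, "s3", "..", "_incomplete_uploads"))
	// -> /dump/_incomplete_uploads (beside top-level dump files)

	// and a "." segment collapses, merging records across buckets:
	fmt.Println(filepath.Join(outRoot, "s3", ".", "_incomplete_uploads"))
	// -> /dump/s3/_incomplete_uploads
}
```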


bootjp added 2 commits May 2, 2026 20:08
… round 7)

Codex P1 round 13 follow-up (commit f710f5e):

The round-6 fix added a dot-segment guard to flushBucket, but
that runs at FINALIZE. HandleIncompleteUpload writes to disk
EARLIER — every upload-family record opens a per-bucket
records.jsonl on first arrival, and that path was built via
`filepath.Join(outRoot, "s3", EncodeSegment(bucket),
"_incomplete_uploads")` without validating the bucket segment.
EncodeSegment preserves "." and "..", so a malformed snapshot
upload record with bucket="." or ".." would escape the s3/
subtree at HandleIncompleteUpload time, well before flushBucket's
guard runs.

The same fail-closed bucket-name check now runs at the
HandleIncompleteUpload entry. Caller audit per loop's
audit-semantics rule: HandleIncompleteUpload is invoked only by
test code; the surfaced ErrS3MalformedKey class matches the
existing parseUploadFamily failure path.

Test: TestS3_HandleIncompleteUploadRejectsDotSegmentBucket
covers ".", "..", and empty bucket names.
@bootjp
Owner Author

bootjp commented May 2, 2026

@claude review

Round 7 fix for Codex P1 round 13 follow-up on commit f710f5e (commit 1a875aae):

The round-6 flushBucket dot-segment guard runs at FINALIZE. HandleIncompleteUpload writes to disk EARLIER — every upload-family record opens a per-bucket records.jsonl on first arrival, and that path was built via filepath.Join(outRoot, "s3", EncodeSegment(bucket), "_incomplete_uploads") without validating the bucket segment. EncodeSegment preserves . and .., so a malformed snapshot upload record with bucket=. or .. would escape the s3/ subtree at HandleIncompleteUpload time, well before flushBucket's guard runs.

The same fail-closed bucket-name check now runs at the HandleIncompleteUpload entry. Caller audit (per loop's rule): HandleIncompleteUpload is invoked only by test code; the surfaced ErrS3MalformedKey class matches the existing parseUploadFamily failure path.

Test: TestS3_HandleIncompleteUploadRejectsDotSegmentBucket covers ., .., and empty bucket names. go test -race and golangci-lint clean.

@claude

claude Bot commented May 2, 2026

Claude finished @bootjp's task in 3m 17s


Review: 1a875aa — backup: refuse dot-segment bucket in HandleIncompleteUpload (PR #716, round 7)

  • Read PR context and understand request
  • Read internal/backup/s3.go
  • Read internal/backup/s3_test.go
  • Diff against base branch (origin/feat/backup-phase0a-sqs)
  • Review: Data loss lens
  • Review: Concurrency/distributed failures lens
  • Review: Performance lens
  • Review: Data consistency lens
  • Review: Test coverage lens

Codex P1 round 13 follow-up is correctly addressed. Details per lens below.


Lens 1 — Data loss

Guard placement is correct. Lines 501–504 fire before s.bucketState(bucket) (line 506), os.MkdirAll (line 509), and openJSONL (line 512). No entry is created in s.buckets, no directory is created, and no file is opened for a malformed bucket name. Without the guard:

  • bucket = ".": filepath.Join(outRoot, "s3", ".", "_incomplete_uploads")outRoot/s3/_incomplete_uploads (dot collapses). records.jsonl lands under s3/ root, mixing upload records from different buckets.
  • bucket = "..": filepath.Join(outRoot, "s3", "..", "_incomplete_uploads")outRoot/_incomplete_uploads. records.jsonl escapes the s3/ subtree entirely and writes under outRoot/, potentially adjacent to MANIFEST.json and other top-level dump files.

The guard is at the earliest possible point in the function body — after parseUploadFamily succeeds but before any state mutation. ✓

Interaction with flushBucket guard is complementary. The HandleIncompleteUpload guard (this PR, Handle-time) covers the incomplete-upload disk-write path. The flushBucket guard (round 6, Finalize-time) covers the bucket-meta flush path. Together they seal both code paths where a bucket name is used to construct a filesystem path. No disk write can occur for ".", "..", or "" buckets through either path.

No data loss issues.


Lens 2 — Concurrency / distributed failures

Sequential Handle→Finalize contract. No issues.


Lens 3 — Performance

Three string comparisons in a switch, O(1), called once per upload-family record. No impact.


Lens 4 — Data consistency

parseUploadFamily returns bucket = "" with ok = true for an empty bucket — not ok = false. For UploadMetaKey("", 1, "obj", "u"), the key after stripping the prefix starts with {0x00, 0x01, ...}. decodeBucketSegmentForRouting at i=0 sees rest[0] == 0x00 and rest[1] == 0x01 and returns string(rest[:0]) = "" with ok = true. The !ok guard at line 488–490 does not catch the empty-bucket case — the bucket == "" arm of the new switch at 501–504 is the correct catch. ✓

parseUploadFamily returns bucket = "." with ok = true for a crafted dot-bucket UploadMetaKey. EncodeSegment([]byte(".")) = {'.', 0x00, 0x01}. decodeBucketSegmentForRouting at i=1 sees rest[1] == 0x00, rest[2] == 0x01, returns string(rest[:1]) = "." with ok = true. Guard fires. ✓

bucketState map not polluted for malformed buckets. The guard fires before s.bucketState(bucket), so no entry is created in s.buckets. When Finalize() iterates s.buckets, the malformed bucket is absent — the flushBucket guard never needs to run for it. The two guards are additive defense-in-depth, not redundant. ✓

No data consistency issues.


Lens 5 — Test coverage

TestS3_HandleIncompleteUploadRejectsDotSegmentBucket covers all three cases correctly:

| Case | bucket | parseUploadFamily returns | Guard arm | Expected |
|---|---|---|---|---|
| "." | "." | bucket=".", ok=true | first arm | ErrS3MalformedKey |
| ".." | ".." | bucket="..", ok=true | first arm | ErrS3MalformedKey |
| "" | "" | bucket="", ok=true | second arm | ErrS3MalformedKey |

All assert errors.Is(err, ErrS3MalformedKey). ✓

One new gap (Low) — S3UploadPartPrefix path untested. parseUploadFamily dispatches on two prefixes: S3UploadMetaPrefix (routed through decodeBucketSegmentForRouting, tested here) and S3UploadPartPrefix (routed through s3keys.ParseUploadPartKey, not tested). For a crafted UploadPartKey(".", ...), ParseUploadPartKey would extract bucket = "." and the same guard at line 501 would fire. The guard location is prefix-agnostic — both paths reach the same switch — so correctness is guaranteed by code inspection. A test using S3UploadPartPrefix would be a belt-and-suspenders coverage improvement.


Carried from prior rounds (DynamoDB encoder — both Low):

  1. Three-generation scenario untested. No test exercises totalStaleItemsExcluding with more than one stale generation (schema gen=7, migrationSource=6, debris at gen=5).

  2. No Finalize-level N-key migration-conflict test. TestDDB_CanonicalNumberKeySegment unit-tests N canonicalization; no end-to-end test with N-type hash keys in both source and active gens reproduces the round 9 bug scenario at the file-system level.


Summary

| Finding | Severity |
|---|---|
| Codex P1 round 13 follow-up: dot-segment guard in HandleIncompleteUpload | ✅ Correctly fixed |
| Empty-bucket ok=true from parseUploadFamily caught by bucket=="" arm | ✅ Verified |
| Guard fires before bucketState, MkdirAll, openJSONL — no state pollution | ✅ Verified |
| flushBucket (Finalize) + HandleIncompleteUpload (Handle-time) guards complement each other | ✅ Verified |
| S3UploadPartPrefix dispatch branch not covered by new test | Low — correctness guaranteed by guard placement; coverage gap only |
| Three-generation scenario (DynamoDB encoder) untested | Low — carried from round 3 |
| No Finalize-level N-key migration conflict test (DynamoDB encoder) | Low — carried from round 4 |

The fix is solid. The three Low findings are test coverage improvements; none affect backup safety or correctness.

@chatgpt-codex-connector Bot left a comment

💡 Codex Review


Reviewed commit: edfc944c78


Comment thread internal/backup/s3.go Outdated
		return errors.Wrapf(ErrS3MalformedKey,
			"orphan object key %q is a dot segment (would escape orphan dir)", obj.object)
	}
	dir := filepath.Join(bucketDir, "_orphans", EncodeSegment([]byte(obj.object)))

P2 — Include generation in orphan chunk output paths

When WithIncludeOrphans(true) is enabled, stale objects from different generations are all written under the same _orphans/<encoded-object>/ directory, and chunk filenames are derived only from (uploadID, partNo, chunkNo, partVersion). If two generations share those values (which is possible during delete/recreate cleanup windows), later writes overwrite earlier orphan chunks, silently dropping forensic data instead of preserving all stale generations.


Codex P2 round 13 (commit edfc944):

`flushOrphanObject` wrote stale chunks to
`<bucket>/_orphans/<encoded-object>/<scratchName>` where
scratchName is `(uploadID, partNo, chunkNo, partVersion)`. Two
stale generations sharing those four identifiers — possible
during delete/recreate cleanup windows where the live system
recycled the upload identifier — would overwrite one another,
silently dropping forensic data the operator opted in to preserve
via --include-orphans.

Embed the object's generation in the path:
`<bucket>/_orphans/<encoded-object>/gen-<N>/`. Each stale gen
lands in its own subdir; restore tools can iterate per-gen.

Caller audit (per loop's audit-semantics rule for output-shape
change): single internal caller (flushBucketObjects); no public
API change, only the on-disk layout under _orphans/ becomes one
level deeper. The existing
TestS3_OrphanChunksWrittenWhenIncludeOrphans test was updated to
match the new path.

Tests:
  - TestS3_OrphanChunksWrittenWhenIncludeOrphans updated for
    per-gen path.
  - NEW TestS3_OrphanChunksPreservedAcrossGenerations drives two
    stale gens sharing (uploadID, partNo, chunkNo, partVersion)
    and asserts both survive distinctly under their own
    gen-<N>/ dirs.
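
A sketch of the path rule this commit describes; the function name is illustrative and encodedObject stands in for the output of the project's EncodeSegment:

```go
package backupsketch

import (
	"fmt"
	"path/filepath"
)

// orphanChunkDir sketches the post-fix layout
// <bucket>/_orphans/<encoded-object>/gen-<N>/. The "gen-" prefix means
// the component can never be "", ".", or "..", and two stale
// generations that recycled the same (uploadID, partNo, chunkNo,
// partVersion) scratch name land in distinct directories instead of
// overwriting each other.
func orphanChunkDir(bucketDir, encodedObject string, generation uint64) string {
	return filepath.Join(bucketDir, "_orphans", encodedObject,
		fmt.Sprintf("gen-%d", generation))
}
```
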
@bootjp
Owner Author

bootjp commented May 2, 2026

@claude review

Round 8 fix for Codex P2 round 13 on commit edfc944 (commit ae7c6e2a):

flushOrphanObject wrote stale chunks to <bucket>/_orphans/<encoded-object>/<scratchName> where the scratch name is (uploadID, partNo, chunkNo, partVersion). Two stale generations sharing those four identifiers — possible during delete/recreate cleanup windows where the live system recycled the upload identifier — would overwrite one another, silently dropping forensic data the operator opted in to preserve via --include-orphans.

Embedded the object's generation in the path: <bucket>/_orphans/<encoded-object>/gen-<N>/. Each stale gen lands in its own subdir; restore tools can iterate per-gen.

Caller audit (per loop's audit-semantics rule for output-shape change): single internal caller flushBucketObjects; no public API change, only the on-disk layout under _orphans/ becomes one level deeper. Existing TestS3_OrphanChunksWrittenWhenIncludeOrphans updated to match the new path; new TestS3_OrphanChunksPreservedAcrossGenerations drives two stale gens sharing identifiers and asserts both survive under their own gen-N/ dirs.

go test -race and golangci-lint clean.

@claude

claude Bot commented May 2, 2026

Claude finished @bootjp's task in 3m 16s


Review: ae7c6e2 — backup: per-generation orphan chunk paths (PR #716, round 8)

  • Read PR context and understand request
  • Read internal/backup/s3.go
  • Read internal/backup/s3_test.go
  • Diff against base branch (origin/feat/backup-phase0a-sqs)
  • Review: Data loss lens
  • Review: Concurrency/distributed failures lens
  • Review: Performance lens
  • Review: Data consistency lens
  • Review: Test coverage lens

Codex P2 round 13 is correctly addressed. Details per lens below.


Lens 1 — Data loss

The fix is minimal and targeted. The entire change in s3.go is one new fmt.Sprintf line and one path join that adds genDir as an extra path component. Before:

_orphans/<encoded-object>/<blobScratchName>

After:

_orphans/<encoded-object>/gen-<N>/<blobScratchName>

obj.generation source is correct. HandleBlob calls ParseBlobKey → objectState(bucket, gen, object). Each unique (object, generation) pair gets its own s3ObjectState entry in b.objects (keyed by object + "\x00" + uint64Hex(gen)). flushOrphanObject is passed the per-generation state, so obj.generation is always the generation that produced those specific chunks. No aliasing or cross-contamination. ✓

Collision scenario reproduced and solved. Two stale gens at (uploadID="u", partNo=1, chunkNo=0, partVersion=0) previously produced the same blobScratchName, so the second writeFileAtomic overwrote the first. With gen-1/ and gen-2/ subdirs each chunk lands on an independent path — both survive via writeFileAtomic's tmp+rename. ✓

flushOrphanObject is called from two paths — both correctly handled:

  • flushBucketObjects (stale gen route, line 628): stale gen → gen-<staleN>/
  • flushObjectWithCollision (no manifest, line 707): any gen → gen-<objGen>/

No data loss issues.


Lens 2 — Concurrency / distributed failures

Sequential Handle→Finalize contract. No issues.


Lens 3 — Performance

fmt.Sprintf("gen-%d", uint64) is O(1), one extra filepath.Join component. Called once per orphan object at Finalize time. No impact.


Lens 4 — Data consistency

genDir is always a safe filesystem component. fmt.Sprintf("gen-%d", obj.generation) produces "gen-0" through "gen-18446744073709551615". The gen- prefix ensures it is never the empty string, never ".", and never "..". No path traversal risk from this component. ✓

Dot-segment guard (line 752–754) is pre-existing and still correct. flushOrphanObject returns ErrS3MalformedKey before EncodeSegment runs for obj.object == "." or "..". The new genDir component is inserted after the EncodeSegment([]byte(obj.object)) component, so the guard at line 752 remains sufficient. ✓

One minor inaccuracy in the caller audit. The commit message states "single internal caller flushBucketObjects" but flushOrphanObject has two callers: flushBucketObjects (line 628) and flushObjectWithCollision (line 707). Both are internal and the path change is correct for both, so this is a documentation inaccuracy only — no correctness impact.

No data consistency issues.


Lens 5 — Test coverage

Both tests are correct:

TestS3_OrphanChunksWrittenWhenIncludeOrphans — active gen=1, blob driven at gen=1 with no manifest → routes through flushObjectWithCollision → flushOrphanObject. obj.generation=1 → gen-1/ dir asserted to exist. ✓

TestS3_OrphanChunksPreservedAcrossGenerations — active gen=3, blobs at gen=1 and gen=2 sharing (uploadID="u", partNo=1, chunkNo=0, no manifest) → both stale → flushBucketObjects stale path → gen-1/ and gen-2/ both asserted to exist. This is the direct regression test for the before-fix collision scenario. ✓

One Low gap — file content not verified in regression test. TestS3_OrphanChunksPreservedAcrossGenerations asserts both gen-1/ and gen-2/ directories exist, but doesn't read the chunk files to verify g1-bytes vs g2-bytes survived intact. The test correctly proves the directories are distinct (which is the regression), but a content assertion would make the test fail in any scenario where both dirs exist but the wrong bytes landed in one.


Carried from prior rounds (DynamoDB encoder — both Low):

  1. Three-generation scenario untested. No test exercises totalStaleItemsExcluding with more than one stale generation (schema gen=7, migrationSource=6, debris at gen=5).


  2. No Finalize-level N-key migration conflict test. TestDDB_CanonicalNumberKeySegment unit-tests N canonicalization; no end-to-end test with N-type hash keys in both source and active gens at the filesystem level.


Summary

| Finding | Severity |
|---|---|
| Codex P2 round 13: per-generation orphan chunk paths | ✅ Correctly fixed |
| obj.generation source is correct (from ParseBlobKey via objectState) | ✅ Verified |
| Collision scenario: same blobScratchName in two gens now safely separated | ✅ Verified |
| Both callers of flushOrphanObject (stale path + no-manifest path) handled correctly | ✅ Verified |
| genDir = "gen-<N>" is always a safe, non-dot path component | ✅ Verified |
| Caller audit in commit message omits second caller (flushObjectWithCollision) | Minor doc inaccuracy — no correctness impact |
| Regression test asserts dirs exist but not chunk content | Low — test proves directory isolation; content verification would add robustness |
| Three-generation scenario (DynamoDB encoder) untested | Low — carried from round 3 |
| No Finalize-level N-key migration conflict test (DynamoDB encoder) | Low — carried from round 4 |

The fix is solid. This PR is correct and ready from a backup-encoder safety standpoint.

@chatgpt-codex-connector

💡 Codex Review

	encoded, err := stripPrefixSegment(key, []byte(SQSQueueGenPrefix))
	if err != nil {
		return err
	}
	gen, err := strconv.ParseUint(strings.TrimSpace(string(value)), 10, 64) //nolint:mnd // 10 == decimal radix
	if err != nil {
		return errors.Wrap(ErrSQSMalformedKey, err.Error())
	}
	s.queueState(encoded).activeGen = gen

P2 — Validate queue segment before accepting queue-gen records

HandleQueueGen parses the generation value but never validates that the key suffix is a valid base64url queue segment. As a result, malformed keys like !sqs|queue|gen|%%% (or an empty segment) are accepted and silently create orphan queue state instead of failing as malformed input, which is inconsistent with the stricter validation in HandleQueueMeta/HandleMessageData and can mask snapshot corruption during backup decoding.
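
A sketch of the stricter check the finding asks for, assuming queue segments are unpadded base64url (the exact padding convention and the function name are guesses, not the package's API):

```go
package backupsketch

import (
	"encoding/base64"
	"fmt"
)

// validateQueueSegment refuses queue-gen records whose key suffix is
// not well-formed base64url, instead of silently creating orphan
// queue state for keys like "!sqs|queue|gen|%%%".
func validateQueueSegment(encoded string) error {
	if encoded == "" {
		return fmt.Errorf("empty queue segment")
	}
	if _, err := base64.RawURLEncoding.DecodeString(encoded); err != nil {
		return fmt.Errorf("queue segment %q is not base64url: %w", encoded, err)
	}
	return nil
}
```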


bootjp added a commit that referenced this pull request May 2, 2026
## Summary
- First of the Phase 0a wide-column encoders. Translates `!hs|meta|` +
`!hs|fld|` records on the live store into one `hashes/<encoded>.json`
file per Redis hash, per the design at
`docs/design/2026_04_29_proposed_snapshot_logical_decoder.md:327`.
- Stacked on top of #713 (the strings/HLL/TTL encoder); reuses
`RedisDB`, the SHA-fallback KEYMAP path, and the dirsCreated cache.
- TTL is folded inline into `expire_at_ms` (not a sidecar) so a restorer
can replay each hash in one shot. `HandleTTL` adds a `redisKindHash`
route.
- Field values reuse the UTF-8-or-base64 envelope from PR #714's SQS
body for binary safety. Field names percent-encode (`EncodeSegment`)
when not valid UTF-8 to keep JSON object keys as strings (a sketch of
the envelope follows this commit message).
- Empty hashes still emit a file (HLEN==0 is observable). Length
mismatches between `meta.Len` and observed `!hs|fld|` keys surface as
`redis_hash_length_mismatch` warnings.
- Output is sorted by user key + field name for deterministic dumps
(matches the round-9 sort pattern from #716).

## Self-review (per CLAUDE.md "five lenses")
1. **Data loss** — empty hashes preserved; declared/observed length
mismatch warned; field value bytes preserved verbatim via base64
envelope when not UTF-8; SHA-fallback paths feed KEYMAP.
2. **Concurrency** — `RedisDB` is single-writer per the existing
contract; the new `hashes` map follows the same per-key state model used
by `kindByKey`/`inlineTTLEmitted`.
3. **Performance** — fields buffered per key (Redis hashes are typically
small); `flushHashes` runs once at Finalize. Sort is `O(n log n)` over
user keys + per-hash field names — acceptable at dump cadence.
4. **Data consistency** — `kindByKey` registration on both meta and
field arrival prevents TTL mis-routing across snapshot record-ordering
scenarios.
5. **Test coverage** — 8 sub-tests: round-trip, empty hash, inline TTL
routing, length mismatch warning, binary value envelope, malformed meta,
truncated field key, SHA-fallback keymap.

## Test plan
- [x] `go test -race ./internal/backup/...` — clean.
- [x] `GOOS=windows GOARCH=amd64 go build ./internal/backup/...` —
clean.
- [x] `GOOS=js GOARCH=wasm go build ./internal/backup/...` — clean.
- [x] `golangci-lint run` — clean.
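
The UTF-8-or-base64 envelope this commit mentions could look roughly like the sketch below; the JSON field names are assumptions, not the encoder's actual schema:

```go
package backupsketch

import (
	"encoding/base64"
	"unicode/utf8"
)

// valueEnvelope sketches the binary-safe value encoding: valid UTF-8
// is stored as a plain string, anything else goes to a distinct
// base64 field so a restorer can reverse it losslessly.
type valueEnvelope struct {
	UTF8   string `json:"utf8,omitempty"`
	Base64 string `json:"b64,omitempty"`
}

func encodeValue(b []byte) valueEnvelope {
	if utf8.Valid(b) {
		return valueEnvelope{UTF8: string(b)}
	}
	return valueEnvelope{Base64: base64.StdEncoding.EncodeToString(b)}
}
```
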
Base automatically changed from feat/backup-phase0a-sqs to main May 2, 2026 14:26

@coderabbitai Bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (1)
internal/backup/dynamodb.go (1)

313-313: ⚡ Quick win

Remove the new //nolint:mnd by replacing literals with named constants.

Line 313 can satisfy mnd without suppression.

Proposed refactor
+const (
+	parseUintBase10 = 10
+	parseUintBits64 = 64
+)
+
 func parseDDBItemKey(key []byte) (string, uint64, error) {
@@
-	gen, err := strconv.ParseUint(afterTable[:genEnd], 10, 64) //nolint:mnd // 10 == decimal radix; 64 == uint64 width
+	gen, err := strconv.ParseUint(afterTable[:genEnd], parseUintBase10, parseUintBits64)

As per coding guidelines, avoid //nolint annotations — refactor instead.



📥 Commits

Reviewing files that changed from the base of the PR and between c7ed1a1 and 389c4f7.

📒 Files selected for processing (4)
  • internal/backup/dynamodb.go
  • internal/backup/dynamodb_test.go
  • internal/backup/s3.go
  • internal/backup/s3_test.go

Comment on lines +891 to +959
func TestS3_DuplicateChunksWithMissingIndexRejected(t *testing.T) {
	t.Parallel()
	enc, _ := newS3Encoder(t)
	bucket := "b"
	gen := uint64(1)
	object := "obj"
	uploadID := "u"
	if err := enc.HandleBucketMeta(s3keys.BucketMetaKey(bucket), encodeS3BucketMetaValue(t, map[string]any{
		"bucket_name": bucket, "generation": gen,
	})); err != nil {
		t.Fatal(err)
	}
	if err := enc.HandleObjectManifest(s3keys.ObjectManifestKey(bucket, gen, object), encodeS3ManifestValue(t, map[string]any{
		"upload_id": uploadID, "size_bytes": 9, "parts": []map[string]any{
			{"part_no": 1, "size_bytes": 9, "chunk_count": 3},
		},
	})); err != nil {
		t.Fatal(err)
	}
	// Drive `{0, 2, 2}`: count satisfies, maxIndex satisfies, but
	// chunk_no=1 is missing. The set-based validator must reject.
	if err := enc.HandleBlob(s3keys.BlobKey(bucket, gen, object, uploadID, 1, 0), []byte("AAA")); err != nil {
		t.Fatal(err)
	}
	if err := enc.HandleBlob(s3keys.BlobKey(bucket, gen, object, uploadID, 1, 2), []byte("CC1")); err != nil {
		t.Fatal(err)
	}
	if err := enc.HandleBlob(s3keys.BlobKey(bucket, gen, object, uploadID, 1, 2), []byte("CC2")); err != nil {
		t.Fatal(err)
	}
	err := enc.Finalize()
	if !errors.Is(err, ErrS3IncompleteBlobChunks) {
		t.Fatalf("err=%v want ErrS3IncompleteBlobChunks for chunk-set with duplicate+missing", err)
	}
}

func TestS3_IncompleteChunksRejected(t *testing.T) {
	t.Parallel()
	enc, _ := newS3Encoder(t)
	bucket := "b"
	gen := uint64(1)
	object := "obj"
	uploadID := "u"
	if err := enc.HandleBucketMeta(s3keys.BucketMetaKey(bucket), encodeS3BucketMetaValue(t, map[string]any{
		"bucket_name": bucket, "generation": gen,
	})); err != nil {
		t.Fatal(err)
	}
	// Manifest declares 3 chunks for partNo=1 but the snapshot only
	// has 2 (chunkNo=0 and chunkNo=2; chunkNo=1 missing). Codex P1
	// #729.
	if err := enc.HandleObjectManifest(s3keys.ObjectManifestKey(bucket, gen, object), encodeS3ManifestValue(t, map[string]any{
		"upload_id": uploadID, "size_bytes": 9, "parts": []map[string]any{
			{"part_no": 1, "size_bytes": 9, "chunk_count": 3},
		},
	})); err != nil {
		t.Fatal(err)
	}
	if err := enc.HandleBlob(s3keys.BlobKey(bucket, gen, object, uploadID, 1, 0), []byte("AAA")); err != nil {
		t.Fatal(err)
	}
	if err := enc.HandleBlob(s3keys.BlobKey(bucket, gen, object, uploadID, 1, 2), []byte("CCC")); err != nil {
		t.Fatal(err)
	}
	err := enc.Finalize()
	if !errors.Is(err, ErrS3IncompleteBlobChunks) {
		t.Fatalf("err=%v want ErrS3IncompleteBlobChunks for missing chunk", err)
	}
}

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add a regression test for chunk_count=0 with stray chunks.

Current completeness tests don’t cover the zero-count-part boundary. A case declaring chunk_count: 0 and emitting chunk_no: 0 should fail with ErrS3IncompleteBlobChunks.
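
A sketch of the requested test, reusing the helpers visible in the tests quoted above. Note it asserts the strict behavior CodeRabbit proposes (reject); the fix that actually merged (round 9 below) instead silently discards stray chunks for zero-count parts, so the shipped test asserts a clean body rather than an error:

```go
func TestS3_ZeroChunkCountWithStrayChunkRejected(t *testing.T) {
	t.Parallel()
	enc, _ := newS3Encoder(t)
	bucket, object, uploadID := "b", "obj", "u"
	gen := uint64(1)
	if err := enc.HandleBucketMeta(s3keys.BucketMetaKey(bucket), encodeS3BucketMetaValue(t, map[string]any{
		"bucket_name": bucket, "generation": gen,
	})); err != nil {
		t.Fatal(err)
	}
	// Manifest declares a part with zero chunks...
	if err := enc.HandleObjectManifest(s3keys.ObjectManifestKey(bucket, gen, object), encodeS3ManifestValue(t, map[string]any{
		"upload_id": uploadID, "size_bytes": 0, "parts": []map[string]any{
			{"part_no": 1, "size_bytes": 0, "chunk_count": 0},
		},
	})); err != nil {
		t.Fatal(err)
	}
	// ...yet the snapshot carries a stray chunk for that part.
	if err := enc.HandleBlob(s3keys.BlobKey(bucket, gen, object, uploadID, 1, 0), []byte("STRAY")); err != nil {
		t.Fatal(err)
	}
	if err := enc.Finalize(); !errors.Is(err, ErrS3IncompleteBlobChunks) {
		t.Fatalf("err=%v want ErrS3IncompleteBlobChunks for stray chunk on zero-count part", err)
	}
}
```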


Comment thread internal/backup/s3.go
…nd 9)

CodeRabbit Major round 13 (commit 389c4f7):

`verifyChunkCompleteness` skipped parts where
`want.chunkCount == 0`, but `filterChunksForManifest` still
included any chunks recorded for those (partNo, partVersion)
keys. The asymmetry let stray chunks for a manifest-declared
zero-count part bleed into the assembled body while the
completeness check passed — a corrupted body that violated the
declared-part contract without raising any error.

Make `filterChunksForManifest` symmetric: skip chunks whose
declared part has `chunkCount == 0`. The two paths now agree on
the "zero-count part contributes no bytes" invariant.

Caller audit (per loop's audit-semantics rule for behavior
change): single caller `assembleObjectBody`. The change only
narrows what chunks reach the assembler; downstream completeness
check already skips zero-count parts so the consistency is
restored, not broken. No public API change.

Test:
  - TestS3_ZeroChunkCountPartIgnoresStrayChunks drives a
    two-part manifest where part 1 has chunk_count=1 (real
    content "hello") and part 2 has chunk_count=0 with a stray
    "STRAY" chunk recorded. Asserts the assembled body is
    exactly "hello" — no stray bytes leak.
@bootjp
Owner Author

bootjp commented May 2, 2026

@claude review

Round 9 fix for CodeRabbit Major round 13 on commit 389c4f7 (commit 6a1bef14):

verifyChunkCompleteness skipped parts where want.chunkCount == 0, but filterChunksForManifest still included any chunks recorded for those (partNo, partVersion) keys. The asymmetry let stray chunks for a manifest-declared zero-count part bleed into the assembled body while the completeness check passed — a corrupted body that violated the declared-part contract without raising any error.

Made filterChunksForManifest symmetric: skip chunks whose declared part has chunkCount == 0. The two paths now agree on the "zero-count part contributes no bytes" invariant.

Caller audit (per loop's audit-semantics rule for behavior change): single caller assembleObjectBody. The change only narrows what chunks reach the assembler; downstream completeness check already skips zero-count parts so the consistency is restored. No public API change.

Test: TestS3_ZeroChunkCountPartIgnoresStrayChunks drives a two-part manifest (part 1 chunk_count=1 with content "hello"; part 2 chunk_count=0 with stray "STRAY" chunk) and asserts the assembled body is exactly "hello" — no stray bytes leak.

go test -race and golangci-lint clean. Also covers the CodeRabbit Minor request for a zero-count regression test.


@claude

claude Bot commented May 2, 2026

Claude finished @bootjp's task in 2m 34s


Review: 6a1bef1 — backup: skip stray chunks for zero-count manifest parts (PR #716, round 9)

  • Read PR context and understand request
  • Read internal/backup/s3.go (filterChunksForManifest, verifyChunkCompleteness, assembleObjectBody)
  • Read internal/backup/s3_test.go (new test)
  • Diff against base branch
  • Review: Data loss lens
  • Review: Concurrency/distributed failures lens
  • Review: Performance lens
  • Review: Data consistency lens
  • Review: Test coverage lens

CodeRabbit Major round 13 is correctly addressed. Details per lens below.


Lens 1 — Data loss

Fix placement is correct. The new declared.chunkCount == 0 guard (s3.go:1030) is inserted after the existing !ok guard (line 1016), so declared is always initialized when the new check runs. The guard is in filterChunksForManifest, which produces the chunks slice that assembleObjectBody iterates. No zero-count-part bytes reach appendFile. The subsequent writeFileAtomic (tmp+rename) in flushBucketObjects ensures the assembled file is either fully written or absent.

Execution path for the stray-chunk scenario:

  1. filterChunksForManifest: {partNo=2, partVersion=0, chunkNo=0} → declared.chunkCount == 0 → continue (filtered out) ✓
  2. verifyChunkCompleteness receives the filtered slice (no zero-count-part chunks). For {partNo=2, partVersion=0} in declaredParts: want.chunkCount == 0 → continue (passes without error) ✓
  3. assembleObjectBody loop: only {partNo=1, chunkNo=0} is present → appendFile("hello") → body is "hello" ✓

No data loss issues.


Lens 2 — Concurrency / distributed failures

Sequential Handle→Finalize contract. No issues.


Lens 3 — Performance

declared.chunkCount == 0 is one uint64 == 0 comparison per chunk key in filterChunksForManifest. O(1) per chunk. No impact on backup throughput.


Lens 4 — Data consistency

The fix restores the invariant stated in the manifest contract. A manifest part declaring chunkCount=0 means that part is logically empty — zero bytes contributed to the object body. Before the fix, filterChunksForManifest and verifyChunkCompleteness disagreed: the completeness check skipped zero-count parts, but the filter passed their stray chunks to the assembler. Now both agree: zero-count parts contribute nothing.

Design choice noted — lenient vs. strict: CodeRabbit's alternative suggestion was to make verifyChunkCompleteness reject stray chunks for zero-count parts with ErrS3IncompleteBlobChunks. The chosen approach silently discards them. Both are semantically defensible, but the lenient approach is appropriate for a backup encoder that may encounter cleanup debris from concurrent deletes, and is consistent with filterChunksForManifest's role as a filter rather than a validator.

partVersion interaction is safe. The filter key is {partNo, partVersion}. A zero-count declaration for {partNo=2, partVersion=0} does not affect chunks at {partNo=2, partVersion=1} — those would be a distinct s3PartKey entry in declaredParts, handled independently. No cross-version contamination.

No data consistency issues.


Lens 5 — Test coverage

TestS3_ZeroChunkCountPartIgnoresStrayChunks is a complete regression test:

| Input | Expected |
|---|---|
| Part 1: chunk_count=1, real chunk "hello" | Present in body |
| Part 2: chunk_count=0, stray chunk "STRAY" | Absent from body |
| Finalize() | Succeeds (no error) |
| Assembled body | Exactly "hello" |

The test drives the full stack: HandleBucketMeta → HandleObjectManifest → HandleBlob (×2) → Finalize → file read. It directly reproduces the pre-fix bug scenario and verifies the post-fix invariant. ✓

One Low gap: No test exercises the edge case where the manifest contains only zero-count parts with stray chunks (no real content). In that case filterChunksForManifest returns an empty slice, verifyChunkCompleteness passes, and assembleObjectBody produces an empty file. Correct by inspection; untested.

Carried //nolint:mnd (CodeRabbit Nitpick — carried from prior round):

dynamodb.go:313 still uses //nolint:mnd for strconv.ParseUint(..., 10, 64). CLAUDE.md explicitly says "Avoid //nolint — refactor instead." The fix is the two named constants (decimal radix and uint64 bit width) already proposed in CodeRabbit's diff above.

Carried from prior rounds (DynamoDB encoder — both Low):

  1. Three-generation scenario untested. No test exercises totalStaleItemsExcluding with more than one stale generation (schema gen=7, migrationSource=6, debris at gen=5).


  2. No Finalize-level N-key migration conflict test. TestDDB_CanonicalNumberKeySegment unit-tests N canonicalization; no end-to-end test with N-type hash keys in both source and active gens at the filesystem level.


Summary

| Finding | Severity |
|---|---|
| CodeRabbit Major round 13: filterChunksForManifest skips zero-count-part chunks | ✅ Correctly fixed |
| Fix placement: after !ok guard, declared is always initialized | ✅ Verified |
| verifyChunkCompleteness + filterChunksForManifest now symmetric on zero-count parts | ✅ Verified |
| Lenient (discard stray) vs. strict (error) — lenient is correct for backup encoder | ✅ Verified design choice |
| partVersion scoping: zero-count filter is per {partNo, partVersion} key only | ✅ Verified |
| No test for all-zero-count-parts scenario (empty body result) | Low |
| //nolint:mnd on dynamodb.go:313 (CLAUDE.md says refactor, not suppress) | Low — nitpick |
| Three-generation scenario (DynamoDB encoder) untested | Low — carried from round 3 |
| No Finalize-level N-key migration conflict test (DynamoDB encoder) | Low — carried from round 4 |

The core fix is correct and directly addresses the CodeRabbit Major finding. The PR is sound from a correctness standpoint.

@bootjp merged commit 3dc5b6d into main May 4, 2026
9 checks passed
@bootjp deleted the feat/backup-phase0a-dynamodb branch May 4, 2026 11:43