Skip to content

Changes to move to protobufs for DDL messages rather than JSON#1195

Merged
garthgoodson merged 13 commits intomainfrom
1076-protobuf-ddls
Apr 30, 2026
Merged

Changes to move to protobufs for DDL messages rather than JSON#1195
garthgoodson merged 13 commits intomainfrom
1076-protobuf-ddls

Conversation

@craigsoules
Copy link
Copy Markdown
Collaborator

No description provided.

@craigsoules craigsoules marked this pull request as ready for review March 31, 2026 23:10
craigsoules and others added 12 commits April 18, 2026 15:02
…-in-use error

- Updated the integration test workflow to capture the logs when there's a failure during startup
- Fixed a compilation error in release
- pg_log_reader: cap valid_until on cached pre-resync schemas synchronously
  inside _check_sync_commit, before the reader can decode subsequent messages.
  The committer thread's later record_schema_change call (committer.cc:847)
  was leaving a window after SyncTracker::clear_tables during which add_mutation
  fell into the cache path and stamped a stale ExtentSchemaPtr into the
  TableEntry, causing the producer to misinterpret post-ALTER 6-column messages
  through a 5-column pg_fields decoder.

- committer: register min_index_xid via insert_index_xid before the per-xid
  commit loop runs, instead of after process_requests is invoked. The previous
  ordering let last_committed_xid advance past the future idx._xid while
  min_index_xid was still unregistered, briefly raising the vacuum cutoff and
  letting the vacuumer hole-punch extents the upcoming index reconciliation
  needed to follow via prev_offset.

Two safety-net DCHECKs are added so future regressions of either invariant
fail loudly:

- committer::_process_extent: DCHECK_EQ producer/consumer row_size on every
  write-cache extent so a schema-version mismatch surfaces at the source of
  the misread, not as an "Invalid operation: 0" downstream.
- indexer::_reconcile_index: DCHECK that prev_extent read from disk is
  non-null while min_index_xid is registered.

Repro: PARALLEL_DML_And_DDL.sql integration test.
When a single batch's final_xid carries multiple index DDLs (e.g. a
drop_index plus several create_index requests on the same table), every
resulting IndexState ends up bucketed under the same idx._xid. The Redis
sorted set backing min_index_xid keys entries by xid, so all siblings
share one entry. _reconcile_index was calling remove_index_xid per
IndexState, so the first sibling to finish (typically the drop, which
has no build phase) yanked the pin while the rest were still walking
prev_offset chains -- letting the vacuumer punch chain extents the
in-flight reconciliation needed and tripping the indexer.cc:655 DCHECK.

Move the remove_index_xid call into process_index_reconciliation,
keyed off reconcile_xid, so the pin is dropped exactly once after every
sibling has finished.

Repro: PARALLEL_DML_And_DDL.sql nightly integration run; observed
cutoff_xid jumping from 250 to 252 mid-reconciliation when 34189's drop
released the pin ahead of 34246/34247/34248's catch-up walks.
Copy link
Copy Markdown
Contributor

@garthgoodson garthgoodson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks ok.

@garthgoodson garthgoodson merged commit b4b7971 into main Apr 30, 2026
10 checks passed
@garthgoodson garthgoodson deleted the 1076-protobuf-ddls branch April 30, 2026 04:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants