Skip to content

Spoc 558#478

Open
rasifr wants to merge 3 commits into
mainfrom
SPOC-558
Open

Spoc 558#478
rasifr wants to merge 3 commits into
mainfrom
SPOC-558

Conversation

@rasifr
Copy link
Copy Markdown
Member

@rasifr rasifr commented May 21, 2026

Three commits cherry-picked from v5 to main.

  • 6a52d10 -- Change apply_replay_bytes from int to uint64
  • 1cababf -- fix: redirect CurrentResourceOwner in delta-apply subtransaction
  • b8e2471 -- Fix apply_delay dangling pointer in subscription_fromtuple.

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 21, 2026

Review Change Stack

📝 Walkthrough

Walkthrough

This PR widens the in-memory replay byte counter to uint64, runs delta-update conflict resolution inside an internal subtransaction while explicitly managing CurrentResourceOwner, and deep-copies subscription apply_delay interval values to allocated memory.

Changes

Replication Safety and Sizing Improvements

Layer / File(s) Summary
Apply queue byte tracking upgrade
src/spock_apply.c
apply_replay_bytes is widened from int to uint64 and the spill-activation debug log uses UINT64_FORMAT.
Delta-apply conflict resolution refactoring
src/spock_apply_heap.c
Adds PostgreSQL includes and refactors the delta-apply UPDATE path to run inside an internal subtransaction, temporarily switch CurrentResourceOwner to TopTransactionResourceOwner for executor calls, restore the original owner in PG_CATCH and after PG_END_TRY, and perform subtransaction cleanup within the delta branch.
Subscription apply_delay memory safety
src/spock_node.c
subscription_fromtuple() now deep-copies the apply_delay Interval into freshly allocated memory instead of storing a tuple-owned pointer.

Poem

🐇 I nibbled bytes and widened the store,
I wrapped the delta in a safe little door,
I copied the delay to a fresh-claimed bed,
Now replay and updates sleep safe in their stead.
Hooray — small hops, steady code! 🌿

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (2 warnings)

Check name Status Explanation Resolution
Title check ⚠️ Warning The title 'Spoc 558' refers only to a ticket number and does not describe any actual change in the pull request. Use a descriptive title that summarizes the main changes, such as 'Fix subscriber crashes and memory issues in spock apply operations'.
Docstring Coverage ⚠️ Warning Docstring coverage is 66.67% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (3 passed)
Check name Status Explanation
Description check ✅ Passed The description clearly lists the three commits being cherry-picked with their purposes, directly relating to the changeset modifications.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch SPOC-558

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@codacy-production
Copy link
Copy Markdown

codacy-production Bot commented May 21, 2026

Up to standards ✅

🟢 Issues 0 issues

Results:
0 new issues

View in Codacy

🟢 Metrics 0 duplication

Metric Results
Duplication 0

View in Codacy

NEW Get contextual insights on your PRs based on Codacy's metrics, along with PR and Jira context, without leaving GitHub. Enable AI reviewer
TIP This summary will be updated as you push new changes.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
src/spock_apply_heap.c (1)

762-780: 💤 Low value

Consider renaming edata to avoid confusion with the common naming pattern.

Throughout this file, edata typically refers to ApplyExecutionData* (e.g., in spock_apply_heap_insert, spock_apply_heap_update, spock_apply_heap_delete). Using edata here for ErrorData* could cause confusion during maintenance.

Suggested rename
 				if (!is_insert)
 				{
-					ErrorData  *edata;
+					ErrorData  *errdata;

 					MemoryContextSwitchTo(MessageContext);
-					edata = CopyErrorData();
+					errdata = CopyErrorData();

-					if (edata->sqlerrcode == ERRCODE_UNIQUE_VIOLATION)
+					if (errdata->sqlerrcode == ERRCODE_UNIQUE_VIOLATION)
 					{
 						elog(spock_conflict_log_level,
 							 "CONFLICT: detected %s on %s.%s: %s",
 							 SpockConflictTypeName(SPOCK_CT_UPDATE_EXISTS),
 							 rel->nspname,
 							 RelationGetRelationName(rel->rel),
-							 edata->message);
+							 errdata->message);
 `#if` PG_VERSION_NUM >= 180000
 						spock_stat_report_subscription_conflict(
 							MyApplyWorker->subid, SPOCK_CT_UPDATE_EXISTS);
 `#endif`
 					}
-					FreeErrorData(edata);
+					FreeErrorData(errdata);
 				}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/spock_apply_heap.c` around lines 762 - 780, Rename the local ErrorData*
variable currently named edata in the error-handling block to a distinct name
(e.g., errdata or err_info) to avoid confusion with ApplyExecutionData*
variables named edata elsewhere; update all references in this block (the
CopyErrorData() call, edata->sqlerrcode, edata->message, and FreeErrorData()) to
the new name and ensure no other symbols (SpockConflictTypeName,
SPOCK_CT_UPDATE_EXISTS, spock_stat_report_subscription_conflict, MyApplyWorker,
spock_conflict_log_level, rel->nspname, RelationGetRelationName(rel->rel)) are
modified.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@src/spock_apply_heap.c`:
- Around line 762-780: Rename the local ErrorData* variable currently named
edata in the error-handling block to a distinct name (e.g., errdata or err_info)
to avoid confusion with ApplyExecutionData* variables named edata elsewhere;
update all references in this block (the CopyErrorData() call,
edata->sqlerrcode, edata->message, and FreeErrorData()) to the new name and
ensure no other symbols (SpockConflictTypeName, SPOCK_CT_UPDATE_EXISTS,
spock_stat_report_subscription_conflict, MyApplyWorker,
spock_conflict_log_level, rel->nspname, RelationGetRelationName(rel->rel)) are
modified.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 81c2b3c6-5b6f-4d58-a4f3-309d7a93e77f

📥 Commits

Reviewing files that changed from the base of the PR and between 3886058 and 6a52d10.

📒 Files selected for processing (3)
  • src/spock_apply.c
  • src/spock_apply_heap.c
  • src/spock_node.c

rasifr and others added 2 commits May 21, 2026 16:39
Strict ResourceOwner tracking raises a buffer-pin error when
ExecSimpleRelationUpdate acquires pins under BeginInternalSubTransaction
that are not released before ReleaseCurrentSubTransaction().

Redirect CurrentResourceOwner to TopTransactionResourceOwner for the
duration of the ExecSimpleRelationUpdate call so that TupleDesc refs and
buffer pins are registered under the top-level owner and survive the
subtransaction release.  Restore the saved resowner before releasing so
CommitSubTransaction's assertion holds.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Transactions with more than 2GB cause a crash in the subscriber.
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
src/spock_apply.c (1)

4475-4493: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Use a 64-bit spill-threshold expression too.

apply_replay_bytes is now uint64, but the limit on Line 4476 is still computed as spock_replay_queue_size * 1024L * 1024L. On 32-bit long builds that overflows above ~2047 MB before the comparison, so spill activation can still happen at the wrong boundary for large queue settings.

Suggested fix
-	if (!apply_replay_spilling && spock_replay_queue_size > 0 &&
-		apply_replay_bytes + msg->len > spock_replay_queue_size * 1024L * 1024L)
+	if (!apply_replay_spilling && spock_replay_queue_size > 0 &&
+		apply_replay_bytes + (uint64) msg->len >
+		(uint64) spock_replay_queue_size * UINT64CONST(1024) * UINT64CONST(1024))
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/spock_apply.c` around lines 4475 - 4493, The comparison uses a 32-bit
arithmetic expression for the threshold (spock_replay_queue_size * 1024L *
1024L) while apply_replay_bytes is uint64, which can overflow on 32-bit builds;
change the threshold computation to 64-bit by casting or using 64-bit constants
so the multiplication is done in uint64 (e.g., compute
(uint64)spock_replay_queue_size * 1024ULL * 1024ULL) and replace the old
expression in the if that checks apply_replay_bytes > threshold (the block
setting apply_replay_spilling and creating apply_replay_spill_file).
src/spock_apply_heap.c (1)

730-789: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Wrap the entire delta-apply subtransaction lifecycle in the local PG_TRY/PG_CATCH.

After BeginInternalSubTransaction("SpockDeltaApply"), only ExecSimpleRelationUpdate() is protected. An error in EvalPlanQualSetSlot(), SubTransactionIdSetCommitTsData(), or ReleaseCurrentSubTransaction() skips the restore path entirely, and the current PG_CATCH rethrows without RollbackAndReleaseCurrentSubTransaction(). In DISCARD/SUB_DISABLE flows this function already runs inside the caller's own internal subtransaction, so leaking the inner one will break later cleanup/assertions.

Suggested structure
-		if (is_delta_apply)
-		{
-			BeginInternalSubTransaction("SpockDeltaApply");
-			save_resowner = CurrentResourceOwner;
-			CurrentResourceOwner = TopTransactionResourceOwner;
-		}
-
-		EvalPlanQualSetSlot(epqstate, remoteslot);
-
 		PG_TRY();
 		{
+			if (is_delta_apply)
+			{
+				BeginInternalSubTransaction("SpockDeltaApply");
+				save_resowner = CurrentResourceOwner;
+				CurrentResourceOwner = TopTransactionResourceOwner;
+			}
+
+			EvalPlanQualSetSlot(epqstate, remoteslot);
+
 			ExecSimpleRelationUpdate(relinfo, estate, epqstate,
 									 localslot, remoteslot);
+
+			if (is_delta_apply)
+			{
+				CurrentResourceOwner = save_resowner;
+				SubTransactionIdSetCommitTsData(GetCurrentTransactionId(),
+												local_ts, local_origin);
+				ReleaseCurrentSubTransaction();
+			}
 		}
 		PG_CATCH();
 		{
 			if (is_delta_apply)
-				CurrentResourceOwner = save_resowner;
+			{
+				CurrentResourceOwner = save_resowner;
+				RollbackAndReleaseCurrentSubTransaction();
+			}
 
 			/*
 			 * existing unique-violation logging...
 			 */
 			PG_RE_THROW();
 		}
 		PG_END_TRY();
-
-		if (is_delta_apply)
-		{
-			CurrentResourceOwner = save_resowner;
-			SubTransactionIdSetCommitTsData(GetCurrentTransactionId(),
-											local_ts, local_origin);
-			ReleaseCurrentSubTransaction();
-		}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/spock_apply_heap.c` around lines 730 - 789, Begin and protect the entire
delta-apply subtransaction lifecycle (from
BeginInternalSubTransaction("SpockDeltaApply") through
SubTransactionIdSetCommitTsData() and ReleaseCurrentSubTransaction()) in a
single PG_TRY/PG_CATCH so any error in EvalPlanQualSetSlot(),
ExecSimpleRelationUpdate(), SubTransactionIdSetCommitTsData(), or
ReleaseCurrentSubTransaction() restores CurrentResourceOwner and correctly rolls
back/releases the subtransaction; specifically, move BeginInternalSubTransaction
and saving/restoring save_resowner outside/around the try block, ensure the
PG_CATCH calls RollbackAndReleaseCurrentSubTransaction() (or Release/Commit as
appropriate) before resetting CurrentResourceOwner, and only call
SubTransactionIdSetCommitTsData() and ReleaseCurrentSubTransaction() in the
PG_TRY success path (or commit path) to avoid leaking the inner subtransaction.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@src/spock_apply_heap.c`:
- Around line 730-789: Begin and protect the entire delta-apply subtransaction
lifecycle (from BeginInternalSubTransaction("SpockDeltaApply") through
SubTransactionIdSetCommitTsData() and ReleaseCurrentSubTransaction()) in a
single PG_TRY/PG_CATCH so any error in EvalPlanQualSetSlot(),
ExecSimpleRelationUpdate(), SubTransactionIdSetCommitTsData(), or
ReleaseCurrentSubTransaction() restores CurrentResourceOwner and correctly rolls
back/releases the subtransaction; specifically, move BeginInternalSubTransaction
and saving/restoring save_resowner outside/around the try block, ensure the
PG_CATCH calls RollbackAndReleaseCurrentSubTransaction() (or Release/Commit as
appropriate) before resetting CurrentResourceOwner, and only call
SubTransactionIdSetCommitTsData() and ReleaseCurrentSubTransaction() in the
PG_TRY success path (or commit path) to avoid leaking the inner subtransaction.

In `@src/spock_apply.c`:
- Around line 4475-4493: The comparison uses a 32-bit arithmetic expression for
the threshold (spock_replay_queue_size * 1024L * 1024L) while apply_replay_bytes
is uint64, which can overflow on 32-bit builds; change the threshold computation
to 64-bit by casting or using 64-bit constants so the multiplication is done in
uint64 (e.g., compute (uint64)spock_replay_queue_size * 1024ULL * 1024ULL) and
replace the old expression in the if that checks apply_replay_bytes > threshold
(the block setting apply_replay_spilling and creating apply_replay_spill_file).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 5d2d842c-bd5a-4dfd-8adf-814933a32928

📥 Commits

Reviewing files that changed from the base of the PR and between 6a52d10 and eedf764.

📒 Files selected for processing (2)
  • src/spock_apply.c
  • src/spock_apply_heap.c

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants