Skip to content

kvcoord: deflake TestTxnCoordSenderRetriesAcrossEndTxn#170836

Open
pav-kv wants to merge 3 commits into
cockroachdb:masterfrom
pav-kv:fix-170829
Open

kvcoord: deflake TestTxnCoordSenderRetriesAcrossEndTxn#170836
pav-kv wants to merge 3 commits into
cockroachdb:masterfrom
pav-kv:fix-170829

Conversation

@pav-kv
Copy link
Copy Markdown
Collaborator

@pav-kv pav-kv commented May 22, 2026

The first subtest asserted ExpectPusheeTxnRecordNotFound, which requires the push to find no txn record at all. However, the heartbeat loop can race and write a PENDING txn record during CommitInBatch execution, causing the push to find that record and fail the assertion.

Replace with the new ExpectNoTxnRecovery expectation, which verifies the actual invariant: no STAGING record was found and no transaction recovery was performed. This tolerates a PENDING record from the heartbeat loop while still checking that the txn was not implicitly committed.

Fixes-25.4 #170829
Epic: none

pav-kv and others added 3 commits May 22, 2026 22:06
…iesAcrossEndTxn

Add a sleep during the first CPut attempt to give the heartbeat loop
time to write a txn record before CommitInBatch completes. This
reliably reproduces the race where CheckPushResult finds an existing
txn record instead of getting "pushee txn record not found".

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The first subtest asserted ExpectPusheeTxnRecordNotFound, which requires
the push to find no txn record at all. However, the heartbeat loop can
race and write a PENDING txn record during CommitInBatch execution,
causing the push to find that record and fail the assertion.

Replace with the new ExpectNoTxnRecovery expectation, which verifies the
actual invariant: no STAGING record was found and no transaction recovery
was performed. This tolerates a PENDING record from the heartbeat loop
while still checking that the txn was not implicitly committed.

Fixes: cockroachdb#170829
Release note: None

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@pav-kv pav-kv requested a review from arulajmani May 22, 2026 21:17
@pav-kv pav-kv requested a review from a team as a code owner May 22, 2026 21:17
@pav-kv pav-kv added the backport-all Flags PRs that need to be backported to all supported release branches label May 22, 2026
@trunk-io
Copy link
Copy Markdown
Contributor

trunk-io Bot commented May 22, 2026

Merging to master in this repository is managed by Trunk.

  • To merge this pull request, check the box to the left or comment /trunk merge below.

After your PR is submitted to the merge queue, this comment will be automatically updated with its status. If the PR fails, failure details will also be posted here

@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

@pav-kv
Copy link
Copy Markdown
Collaborator Author

pav-kv commented May 22, 2026

@arulajmani I'll squash the first and last commit out when merging, this is just a repro.

@pav-kv pav-kv requested a review from stevendanna May 22, 2026 21:20
// transaction recovery. This is weaker than ExpectPusheeTxnRecordNotFound: it
// tolerates a PENDING txn record (e.g. written by the heartbeat loop) but
// verifies no STAGING record was found and recovered.
ExpectNoTxnRecovery
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@arulajmani @stevendanna Please verify the wording here and all places. I don't have much intuition here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backport-all Flags PRs that need to be backported to all supported release branches

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants