release-25.4: server: wait for full replication in TestCheckRestartSafe_Criticality#170133
The test asserts zero under-replicated ranges before draining, but doesn't wait for the cluster to finish initial replication after startup. Under CI resource pressure, ranges may still be up-replicating when the assertion fires. Also remove a duplicate assertion.
Fixes: cockroachdb#167850
Epic: none
Release note: None
Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>
131b342 to 5744a0e
Thanks for opening a backport. Before merging, please confirm that it falls into one of the following categories (select one):
Add a brief release justification to the PR description explaining your selection. Also, confirm that the change does not break backward compatibility and complies with all aspects of the backport policy. All backports must be reviewed by the TL and EM for the owning area.
Detected infrastructure failure (matched: self-hosted runner lost communication with the server). Automatically rerunning failed jobs.
Reminder: it has been 2 weeks. Please merge or close your backport!
Backport 1/1 commits from #169990 on behalf of @dhartunian.
The test asserts zero under-replicated ranges before draining, but
doesn't wait for the cluster to finish initial replication after
startup. Under CI resource pressure (the failing run logged "disk
slowness detected: unable to sync log files within 10s"), ranges may
still be up-replicating when the assertion fires, causing 78
under-replicated ranges where zero were expected.
Add `WaitForFullReplication()` after cluster start, matching the
pattern already used by `TestCheckRestartSafe_RangeStatus` and other
tests in the same file. Also remove a duplicate assertion.
Fixes: #167850
Epic: none
Release note: None
Release justification: