Skip to content

Conversation

@oneby-wang
Copy link
Contributor

@oneby-wang oneby-wang commented Dec 28, 2025

Motivation

Since nextValidLedger:-1 is a valid markDeletePosition, we should move newPosition to nextValidLedger:-1 to avoid cursor position and ledger inconsistency in ManagedCursorImpl whenever possible.

PR #25087 also made this change.

Modifications

  1. Modify ManagedCursorImpl.asyncMarkDelete() method, move newPosition to nextValidLedger:-1 if we reach the following conditions:
    a. if lastConfirmedEntry >= newPosition, and next ledger exists, and current ledger entries are all consumed.
    b. if lastConfirmedEntry < newPosition, next ledger exists, and newPosition == nextValidLedger:-1, and current ledger entries are all consumed.

I think the previous code might have a little problem. If newPosition == nextValidLedger:n, n is an non-negative number, we might set new newPosition to nextValidLedger:n which is greater than lastConfirmedEntry.

Position newPosition = ackBatchPosition(position);
if (ledger.getLastConfirmedEntry().compareTo(newPosition) < 0) {
boolean shouldCursorMoveForward = false;
try {
long ledgerEntries = ledger.getLedgerInfo(markDeletePosition.getLedgerId()).get().getEntries();
Long nextValidLedger = ledger.getNextValidLedger(ledger.getLastConfirmedEntry().getLedgerId());
shouldCursorMoveForward = nextValidLedger != null
&& (markDeletePosition.getEntryId() + 1 >= ledgerEntries)
&& (newPosition.getLedgerId() == nextValidLedger);
} catch (Exception e) {
log.warn("Failed to get ledger entries while setting mark-delete-position", e);
}

And snapshot positions into a local variable to avoid race condition.

  1. Add testAsyncMarkDeleteMoveToNextLedgerInNonRolloverScenario, testAsyncMarkDeleteMayMoveToNextLedgerInRolloverScenario, testAsyncMarkDeleteMoveToNextLedgerOneByOne, testAsyncMarkDeleteNextLedgerMinusOneEntryIdPosition tests in ManagedCursorImpl to verify the code change.

  2. Fix tests due to this PR's code change.

  3. Fix some flaky tests introduced by PR [fix][broker] Fix cursor position persistence in ledger trimming #25087.

Verifying this change

  • Make sure that the change passes the CI checks.

Does this pull request potentially affect one of the following parts:

If the box was checked, please highlight the changes

  • Dependencies (add or upgrade a dependency)
  • The public API
  • The schema
  • The default values of configurations
  • The threading model
  • The binary protocol
  • The REST endpoints
  • The admin CLI options
  • The metrics
  • Anything that affects deployment

Documentation

  • doc
  • doc-required
  • doc-not-needed
  • doc-complete

Matching PR in forked repository

PR in forked repository: oneby-wang#19

…position and ledger inconsistency in ManagedCursorImpl
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

doc-not-needed Your PR changes do not impact docs

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant