
MM2: Add offset gap detection and basic topic reset handling in MirrorSourceTask#21831

Draft
PoojanSmart wants to merge 1 commit into apache:trunk from PoojanSmart:mm2-offset-recovery


@PoojanSmart

Summary

This PR adds enhancements that handle data inconsistencies in MM2 cluster topic replication:

  • Fail-fast detection for offset gaps (log truncation)
  • Basic handling for topic reset scenarios (delete + recreate)

Problem

MM2 currently does not handle the following scenarios:

  1. Silent data loss (fail fast) - If source data is deleted due to retention, MM2 continues replicating without detecting the missing offsets.
  2. Topic reset (delete + recreate) - If a topic in the primary cluster is deleted and recreated, MM2 does not automatically restart from the beginning of the new topic. It continues from the previous offset chain, which makes the replication offset tracking invalid.

Solution

Both scenarios are addressed by modifying MirrorSourceTask. The modifications are:

  1. Offset gap detection:

    • Added validation for each topic+partition combination (internal topics are ignored): the new offset must equal the previous offset + 1
    • An exception is thrown when this condition is not satisfied, since the gap can make replication invalid
    • Added appropriate error logs
  2. Topic reset handling:

    • Added validation: if the previous offset was >= 0 and offset 0 is received again, this is detected as a topic reset and the consumer updates its current offset
    • Added an appropriate warning log
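
The two checks described above can be sketched roughly as follows. This is a minimal, standalone illustration and not the code in this PR: the class `OffsetContinuitySketch`, the `Outcome` enum, the `observe` method, and the simplified `isInternalTopic` predicate are all hypothetical names, and the sketch uses no Kafka classes. It assumes offsets within a partition are expected to be contiguous, as the description states.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper illustrating the described checks; not the PR's code. */
final class OffsetContinuitySketch {

    enum Outcome { OK, TOPIC_RESET }

    // Last offset observed per "topic-partition" key.
    private final Map<String, Long> lastOffsets = new HashMap<>();

    // Simplified stand-in for an internal-topic check (an assumption).
    static boolean isInternalTopic(String topic) {
        return topic.startsWith("__") || topic.endsWith(".internal");
    }

    /** Validates one record offset; throws if a gap is detected. */
    Outcome observe(String topic, int partition, long offset) {
        if (isInternalTopic(topic)) {
            return Outcome.OK; // internal topics are not validated
        }
        String key = topic + "-" + partition;
        Long previous = lastOffsets.get(key);
        if (previous == null) {
            // First record seen for this partition: nothing to compare against.
            lastOffsets.put(key, offset);
            return Outcome.OK;
        }
        if (previous >= 0 && offset == 0L) {
            // Offset 0 seen again: treat as delete + recreate and restart tracking.
            lastOffsets.put(key, offset);
            return Outcome.TOPIC_RESET;
        }
        if (offset != previous + 1) {
            // Fail fast; the tracked offset is left unchanged.
            throw new IllegalStateException("Offset gap on " + key
                    + ": expected " + (previous + 1) + " but got " + offset);
        }
        lastOffsets.put(key, offset);
        return Outcome.OK;
    }

    public static void main(String[] args) {
        OffsetContinuitySketch tracker = new OffsetContinuitySketch();
        System.out.println(tracker.observe("orders", 0, 41L));
        System.out.println(tracker.observe("orders", 0, 42L));
        System.out.println(tracker.observe("orders", 0, 0L));
        try {
            tracker.observe("orders", 0, 5L);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this sketch the reset check runs before the gap check, so a jump back to offset 0 is reported as a reset rather than a gap. Whether the actual change orders the checks this way is not stated in the description.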

github-actions bot added the triage (PRs from the community), connect, mirror-maker-2, and small (Small PRs) labels on Mar 20, 2026
@github-actions

A label of 'needs-attention' was automatically added to this PR in order to raise the
attention of the committers. Once this issue has been triaged, the triage label
should be removed to prevent this automation from happening again.
