apache#18549) Reviewers: Ismael Juma <ismael@juma.me.uk>, Gaurav Narula <gaurav_narula2@apple.com>, TengYao Chi <kitingiao@gmail.com>
…pache#18543) Reviewers: Bruno Cadonna <bruno@confluent.io>, Alieh Saeedi <asaeedi@confluent.io>
Reviewers: Bill Bejeck <bill@confluent.io>
…ly (apache#18490) RocksDBTimeOrderedKeyValueBuffer is not initialized with the serdes provided via Joined, but always uses the serdes from StreamsConfig. Reviewers: Bill Bejeck <bill@confluent.io>
…n without records (apache#18448) Fix the issue where producer.commitTransaction under transaction version 2 throws error if no partition or offset is added to transaction. The solution is to avoid sending the endTxnRequest unless producer.send or producer.sendOffsetsToTransaction is triggered. Reviewers: Justine Olshan <jolshan@confluent.io>
Reviewers: Bill Bejeck <bill@confluent.io>
A prior commit introduced a check of a node's version as part of the move to log4j2, but it caused the error AttributeError("'ClusterNode' object has no attribute 'version'"). This PR uses the get_version method from version.py, which checks whether the node has a version attribute and so prevents the error.
Reviewers: Matthias Sax <mjsax@apache.org>
Removed Optional for SharePartitionManager and ClientMetricsManager as zookeeper code is being removed. Also removed asScala and asJava conversion in KafkaApis.handleListClientMetricsResources, moved to java stream. Reviewers: Mickael Maison <mickael.maison@gmail.com>, Ismael Juma <ismael@juma.me.uk>, Chia-Ping Tsai <chia7712@gmail.com>
…fter removing zookeeper (apache#18365) This patch introduces a new page to document the configs and metrics that have been removed in the transition to 4.0. While these removed items lack a formal deprecation cycle as they are part of KIP-500, KIP-500 itself does not provide an exhaustive list of all impacted configs and metrics. Therefore, this new page aims to assist Kafka users in understanding the specific configs and metrics that have been removed in the 4.0 release. Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
The PR removes the dependency of the server module on share-coordinator; the dependency should go the other way. Moved the ShareCoordinatorConfig class from server to share-coordinator. Reviewers: Ismael Juma <ismael@juma.me.uk>, Chia-Ping Tsai <chia7712@gmail.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>, Chia-Ping Tsai <chia7712@gmail.com>
…18414) In 4.0, there is no ZK mode and both of these configs are required in kraft mode. Reviewers: Ismael Juma <ismael@juma.me.uk>
Reviewers: Luke Chen <showuon@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>, Divij Vaidya <diviv@amazon.com>
…8508) Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
…18535) Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
Reviewers: Luke Chen <showuon@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
…EFAULT and NUM_RECOVERY_THREADS_PER_DATA_DIR_CONFIG (apache#18106) Reviewers: Divij Vaidya <diviv@amazon.com>
…P-1030 (apache#18140) Reviewers: Divij Vaidya <diviv@amazon.com>
…ns (apache#18568) Reviewers: Mickael Maison <mickael.maison@gmail.com>
…rtitionsMetadata, ZkConfigRepository, DelayedDeleteTopics (apache#18574) Reviewers: Mickael Maison <mickael.maison@gmail.com>
…rtitionsAssignedCallback (apache#18515) Reviewers: Lianet Magrans <lmagrans@confluent.io>
Reviewers: Andrew Schofield <aschofield@confluent.io>
Reviewers: Mickael Maison <mickael.maison@gmail.com>, Ismael Juma <ismael@juma.me.uk>
…pache#18565) Since the example.com DNS lookup changed the second time within one year, we rewrote the unit tests for ClientUtils so that they do not make a real DNS lookup to the outside but use mocks. Reviewers: PoAn Yang <payang@apache.org>, Chia-Ping Tsai <chia7712@gmail.com>, Lianet Magrans <lmagrans@confluent.io>
Remove KafkaController and related unused references: * ControllerChannelContext * ControllerChannelManager * ControllerEventManager * ControllerState * PartitionStateMachine * ReplicaStateMachine * TopicDeletionManager * ZkBrokerEpochManager Reviewers: Ismael Juma <ismael@juma.me.uk>
…apache#18224) Reviewers: David Jacot <djacot@confluent.io>
…#18406) Add some logs when offline/online happens. Reviewers: David Jacot <djacot@confluent.io>
…pache#19146) Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
In "Upgrading to 4.0.0 from any version 0.8.x through 3.9.x" section, we directly give instructions about [Upgrading to KRaft-based clusters](https://kafka.apache.org/documentation/#upgrade_390_kraft), but there might still be some users under ZK cluster before upgrading to v4.0.0. We need to make it clear that they need to upgrade to KRaft mode first before upgrading to v4.0.0 in "Upgrading to 4.0.0 from any version 0.8.x through 3.9.x" section. Reviewers: TengYao Chi <kitingiao@gmail.com>, Ken Huang <s7133700@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
Add new section for Kafka 4.0 compatibility metrics for user, so user can check server and client in this section. Reviewers: Divij Vaidya <diviv@amazon.com>, Matthias J. Sax <matthias@confluent.io>
This patch adds a section about upgrading clients to the upgrade notes. Reviewers: Ismael Juma <ismael@juma.me.uk>, David Jacot <djacot@confluent.io>
Skip kraft.version when applying FeatureLevelRecord records. The kraft.version is stored as control records and not as metadata records. This solution has the benefits of removing from snapshots any FeatureLevelRecord for kraft.version that was incorrectly written to the log and allows ApiVersions to report the correct finalized kraft.version. Reviewers: Colin P. McCabe <cmccabe@apache.org>
…ion (apache#19164) Fixes two issues: - only commit TX if no revoked tasks need to be committed - commit revoked tasks after punctuation triggered Reviewers: Lucas Brutschy <lbrutschy@confluent.io>, Anna Sophie Blee-Goldman <sophie@responsive.dev>, Bruno Cadonna <bruno@confluent.io>, Bill Bejeck <bill@confluent.io>
Kafka Data Replication Project
Problem Statement
Overview
This project implements fault-tolerant data replication between two Kafka clusters using an enhanced MirrorMaker 2 (MM2) connector. The solution addresses two critical failure modes that vanilla MM2 does not handle:
Log Truncation: When Kafka's retention policy deletes messages before MM2 replicates them, vanilla MM2 silently skips those messages, leaving the disaster recovery (DR) cluster with permanent, invisible gaps in its Write-Ahead Log (WAL).
Solution: Added a `checkForLogTruncation()` method that is called at the top of every `poll()` cycle.
Why fail-fast? Log truncation represents irrecoverable data loss: the DR cluster is now permanently incomplete. Continuing replication would produce a DR cluster that appears healthy but cannot fully reconstruct system state on failover.
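The truncation check described above can be sketched without any Kafka dependency. This is a minimal illustration, not the code from this PR: the class `TruncationCheck`, the exception `LogTruncatedException`, and the parameter names are hypothetical. In a real task the two offsets would presumably come from the consumer, for example `Consumer.position()` for the next offset to read and `Consumer.beginningOffsets()` for the earliest offset still retained.

```java
/** Hypothetical helper; all names here are illustrative, not taken from the PR. */
final class TruncationCheck {

    /** Thrown to stop the task rather than replicate past a gap. */
    static final class LogTruncatedException extends RuntimeException {
        LogTruncatedException(String message) {
            super(message);
        }
    }

    /**
     * Number of records that retention removed before they were replicated.
     *
     * @param nextOffsetToReplicate   offset the connector expects to read next
     * @param earliestAvailableOffset log start offset reported by the source partition
     */
    static long missingRecords(long nextOffsetToReplicate, long earliestAvailableOffset) {
        return Math.max(0L, earliestAvailableOffset - nextOffsetToReplicate);
    }

    /** Fail fast: throw as soon as a gap is detected, otherwise return normally. */
    static void checkForLogTruncation(String partition,
                                      long nextOffsetToReplicate,
                                      long earliestAvailableOffset) {
        long missing = missingRecords(nextOffsetToReplicate, earliestAvailableOffset);
        if (missing > 0) {
            throw new LogTruncatedException(
                "Log truncation on " + partition + ": expected offset "
                    + nextOffsetToReplicate + " but earliest available is "
                    + earliestAvailableOffset + " (" + missing + " records lost)");
        }
    }
}
```

Throwing an unchecked exception is one way to make the task fail visibly; the key property is that the gap is reported before any later record is forwarded to the DR cluster.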
Topic Reset: When an operator deletes and recreates a source topic, vanilla MM2 either crashes with `OffsetOutOfRangeException` or silently skips new messages, because its stored checkpoint offsets no longer exist.
Solution: Two complementary detection mechanisms:
Leader Epoch Regression: Every `ConsumerRecord` carries the partition's leader epoch. A backward jump in that epoch signals a topic reset.
`OffsetOutOfRangeException` Fallback: Catches resets that happened while MM2 was paused.
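The leader-epoch mechanism can be illustrated with a small tracker that remembers the last epoch seen per partition. This is a sketch under assumptions, not the PR's implementation: `LeaderEpochTracker` and `isRegression` are hypothetical names, and partitions are keyed by a plain string for brevity. The `Optional<Integer>` parameter mirrors the return type of `ConsumerRecord.leaderEpoch()`.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical tracker; names are illustrative, not taken from the PR. */
final class LeaderEpochTracker {

    // Last leader epoch observed for each partition, keyed by e.g. "topic-0".
    private final Map<String, Integer> lastEpochByPartition = new HashMap<>();

    /**
     * Records the epoch of the latest record and reports whether it moved backwards.
     * Records without an epoch are ignored and never count as a regression.
     */
    boolean isRegression(String partition, Optional<Integer> leaderEpoch) {
        if (!leaderEpoch.isPresent()) {
            return false;
        }
        int epoch = leaderEpoch.get();
        // put() returns the previous value, or null on first sight of the partition.
        Integer previous = lastEpochByPartition.put(partition, epoch);
        return previous != null && epoch < previous;
    }
}
```

Because the tracker stores the new, lower epoch when it reports a regression, later records from the recreated topic are not flagged again.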
Why auto-recover? A topic reset is deliberate. The operator chose to wipe the source. The correct response is to replicate the new topic from the beginning.
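The two policies described in this section, fail-fast for truncation and auto-recovery for a topic reset, can be summarized as a small decision table. The sketch below is only an illustration of that reasoning: the `RecoveryPolicy` class, its enums, and `resumeOffset` are hypothetical and do not appear in the PR.

```java
/** Hypothetical policy table; names are illustrative, not taken from the PR. */
final class RecoveryPolicy {

    enum Condition { NONE, LOG_TRUNCATION, TOPIC_RESET }

    enum Action { CONTINUE, FAIL_TASK, RESTART_FROM_BEGINNING }

    /** Maps a detected condition to the response described in the text. */
    static Action actionFor(Condition condition) {
        switch (condition) {
            case LOG_TRUNCATION:
                // Data is gone for good, so stop instead of hiding the gap.
                return Action.FAIL_TASK;
            case TOPIC_RESET:
                // The operator wiped the source on purpose; replicate the new topic.
                return Action.RESTART_FROM_BEGINNING;
            default:
                return Action.CONTINUE;
        }
    }

    /** Offset to resume from: the earliest available offset after a reset, else unchanged. */
    static long resumeOffset(Action action, long currentPosition, long earliestAvailableOffset) {
        return action == Action.RESTART_FROM_BEGINNING ? earliestAvailableOffset : currentPosition;
    }
}
```

Keeping the mapping in one place makes the asymmetry explicit: only a deliberate reset moves the read position, while truncation never does.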