Oops! Something went wrong while submitting the form.
We use cookies to improve your browsing experience on our website, to show you personalised content and to analize our website traffic. By browsing our website, you consent to our use of cookies. Read privacy policy.
We are covering how Kafka MirrorMaker operates, how to set it up, and how to test mirror data.
MirrorMaker 2.0 is the new replication feature of Kafka 2.4, defined as part of the Kafka Improvement Process - KIP 382. Kafka MirrorMaker 2 is designed to replicate or mirror topics from one Kafka cluster to another. It uses the Kafka Connect framework to simplify the configuration and scaling. MirrorMaker dynamically detects changes to source topics and ensures source and target topic properties are synchronized, including topic data, offsets, and partitions. The topic, together with topic data, offsets, and partitions, is replicated in the target cluster when a new topic is created in the source cluster.
Use Cases
Disaster Recovery
Though Kafka is highly distributed and provides a high level of fault tolerance, disasters can still happen, and data can still become temporarily unavailable—or lost altogether. The best way to mitigate the risks is to have a copy of your data in another Kafka cluster in a different data center. MirrorMaker translates and syncs consumer offsets to the target cluster. That way, we can switch clients to it relatively seamlessly, moving to an alternative deployment on the fly with minor or no service interruptions.
Closer Read / Writes
Kafka producer clients often prefer to write locally to achieve low latency, but business requirements demand the data be read by different consumers, often deployed in multiple regions. This can easily make deployments complex due to VPC peering. MirrorMaker can handle all complex replication, making it easier to write and read local mechanisms.
Data Analytics
Aggregation is also a factor in data pipelines, which might require the consolidation of data from regional Kafka clusters into a single one. That aggregate cluster then broadcasts that data to other clusters and/or data systems for analysis and visualization.
Supported Topologies
Active/Passive or Active/Standby high availability deployments - (ClusterA => ClusterB)
Active/Active HA Deployment - (ClusterA => ClusterB and ClusterB => ClusterA)
Aggregation (e.g., from many clusters to one): (ClusterA => ClusterK, ClusterB => ClusterK, ClusterC => ClusterK)
Mirrors Topic and Topic Configuration - Detects and mirrors new topics and config changes automatically, including the number of partitions and replication factors.
Mirrors ACLs - Mirrors Topic ACLs as well, though we found issues in replicating WRITE permission. Also, replicated topics often contain source cluster names as a prefix, which means existing ACLs need to be tweaked, or ACL replication may need to be managed externally if the topologies are more complex.
Mirrors Consumer Groups and Offsets - Seamlessly translates and syncs Consumer Group Offsets to target clusters to make it easier to switch from one cluster to another in case of disaster.
Ability to Update MM2 Config Dynamically - MirrorMaker is backed by Kafka Connect Framework, which provides REST APIs through which MirrorMaker configurations like replicating new topics, stopping replicating certain topics, etc. can be updated without restarting the cluster.
Fault-Tolerant and Horizontally Scalable Operations - The number of processes can be scaled horizontally to increase performance.
How Kafka MirrorMaker 2 Works
MirrorMaker uses a set of standard Kafka connectors. Each connector has its own role. The listing of connectors and their functions is provided below.
MirrorSourceConnector: Replicates topics, topic ACLs, and configs from the source cluster to the target cluster.
MirrorCheckpointConnector: Syncs consumer offsets, emits checkpoints, and enables failover.
MirrorHeartBeatConnector: Checks connectivity between the source and target clusters.
MirrorMaker Running Modes
There are three ways to run MirrorMaker:
As a dedicated MirrorMaker cluster (can be distributed with multiple replicas having the same config): In this mode, MirrorMaker does not require an existing Connect cluster. Instead, a high-level driver manages a collection of Connect workers.
As a standalone Connect worker: In this mode, a single Connect worker runs MirrorSourceConnector. This does not support multi-clusters, but it’s useful for small workloads or for testing.
In legacy mode, using existing MirrorMaker scripts: After legacy MirrorMaker is deprecated, the existing ./bin/kafka-mirror-maker.sh scripts will be updated to run MM2 in legacy mode:
Setting up MirrorMaker 2
We recommend running MirrorMaker as a dedicated MirrorMaker cluster since it does not require an existing Connect cluster. Instead, a high-level driver manages a collection of Connect workers. The cluster can be easily converted to a distributed cluster just by adding multiple replicas of the same configuration. A distributed cluster is required to reduce the load on a single node cluster and also to increase MirrorMaker throughput.
Prerequisites
Docker
Docker Compose
Steps to Set Up MirrorMaker 2
Set up a single node source, target Kafka cluster, and a MirrorMaker node to run MirrorMaker 2.
2. Run the below command to start the Kafka clusters and the MirrorMaker Docker container:
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
5. Monitor the logs of the MirrorMaker container—it should be something like this:
[2024-02-05 04:07:39,450] INFO [MirrorCheckpointConnector|task-0] sync idle consumer group offset from source to target took 0 ms (org.apache.kafka.connect.mirror.Scheduler:95)
[2024-02-05 04:07:49,246] INFO [MirrorCheckpointConnector|worker] refreshing consumer groups took 1 ms (org.apache.kafka.connect.mirror.Scheduler:95)
[2024-02-05 04:07:49,337] INFO [MirrorSourceConnector|worker] refreshing topics took 3 ms (org.apache.kafka.connect.mirror.Scheduler:95)
[2024-02-05 04:07:49,450] INFO [MirrorCheckpointConnector|task-0] refreshing idle consumers group offsets at target cluster took 2 ms (org.apache.kafka.connect.mirror.Scheduler:95)
6. Create a topic at the source cluster:
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
9. Check whether the topic is mirrored in the target cluster.
Note: The mirrored topic will have a source cluster name prefix to be able to identify which source cluster the topic is mirrored from.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
10. Consume 5 messages from the source kafka cluster:
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
11. Describe the consumer group at the source and destination to verify that consumer offsets are also mirrored:
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
12. Consume five messages from the target Kafka cluster. The messages should start from the committed offset in the source cluster. In this case, the message offset will start at 6.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
We’ve seen how to set up MirrorMaker 2.0 in a dedicated instance. This running mode does not need a running Connect cluster as it leverages a high-level driver that creates a set of Connect workers based on the MirrorMaker properties configuration file.
Thank you! Your submission has been received!
Oops! Something went wrong while submitting the form.
A Comprehensive Guide to Unlock Kafka MirrorMaker 2.0
Overview
We are covering how Kafka MirrorMaker operates, how to set it up, and how to test mirror data.
MirrorMaker 2.0 is the new replication feature of Kafka 2.4, defined as part of the Kafka Improvement Process - KIP 382. Kafka MirrorMaker 2 is designed to replicate or mirror topics from one Kafka cluster to another. It uses the Kafka Connect framework to simplify the configuration and scaling. MirrorMaker dynamically detects changes to source topics and ensures source and target topic properties are synchronized, including topic data, offsets, and partitions. The topic, together with topic data, offsets, and partitions, is replicated in the target cluster when a new topic is created in the source cluster.
Use Cases
Disaster Recovery
Though Kafka is highly distributed and provides a high level of fault tolerance, disasters can still happen, and data can still become temporarily unavailable—or lost altogether. The best way to mitigate the risks is to have a copy of your data in another Kafka cluster in a different data center. MirrorMaker translates and syncs consumer offsets to the target cluster. That way, we can switch clients to it relatively seamlessly, moving to an alternative deployment on the fly with minor or no service interruptions.
Closer Read / Writes
Kafka producer clients often prefer to write locally to achieve low latency, but business requirements demand the data be read by different consumers, often deployed in multiple regions. This can easily make deployments complex due to VPC peering. MirrorMaker can handle all complex replication, making it easier to write and read local mechanisms.
Data Analytics
Aggregation is also a factor in data pipelines, which might require the consolidation of data from regional Kafka clusters into a single one. That aggregate cluster then broadcasts that data to other clusters and/or data systems for analysis and visualization.
Supported Topologies
Active/Passive or Active/Standby high availability deployments - (ClusterA => ClusterB)
Active/Active HA Deployment - (ClusterA => ClusterB and ClusterB => ClusterA)
Aggregation (e.g., from many clusters to one): (ClusterA => ClusterK, ClusterB => ClusterK, ClusterC => ClusterK)
Mirrors Topic and Topic Configuration - Detects and mirrors new topics and config changes automatically, including the number of partitions and replication factors.
Mirrors ACLs - Mirrors Topic ACLs as well, though we found issues in replicating WRITE permission. Also, replicated topics often contain source cluster names as a prefix, which means existing ACLs need to be tweaked, or ACL replication may need to be managed externally if the topologies are more complex.
Mirrors Consumer Groups and Offsets - Seamlessly translates and syncs Consumer Group Offsets to target clusters to make it easier to switch from one cluster to another in case of disaster.
Ability to Update MM2 Config Dynamically - MirrorMaker is backed by Kafka Connect Framework, which provides REST APIs through which MirrorMaker configurations like replicating new topics, stopping replicating certain topics, etc. can be updated without restarting the cluster.
Fault-Tolerant and Horizontally Scalable Operations - The number of processes can be scaled horizontally to increase performance.
How Kafka MirrorMaker 2 Works
MirrorMaker uses a set of standard Kafka connectors. Each connector has its own role. The listing of connectors and their functions is provided below.
MirrorSourceConnector: Replicates topics, topic ACLs, and configs from the source cluster to the target cluster.
MirrorCheckpointConnector: Syncs consumer offsets, emits checkpoints, and enables failover.
MirrorHeartBeatConnector: Checks connectivity between the source and target clusters.
MirrorMaker Running Modes
There are three ways to run MirrorMaker:
As a dedicated MirrorMaker cluster (can be distributed with multiple replicas having the same config): In this mode, MirrorMaker does not require an existing Connect cluster. Instead, a high-level driver manages a collection of Connect workers.
As a standalone Connect worker: In this mode, a single Connect worker runs MirrorSourceConnector. This does not support multi-clusters, but it’s useful for small workloads or for testing.
In legacy mode, using existing MirrorMaker scripts: After legacy MirrorMaker is deprecated, the existing ./bin/kafka-mirror-maker.sh scripts will be updated to run MM2 in legacy mode:
Setting up MirrorMaker 2
We recommend running MirrorMaker as a dedicated MirrorMaker cluster since it does not require an existing Connect cluster. Instead, a high-level driver manages a collection of Connect workers. The cluster can be easily converted to a distributed cluster just by adding multiple replicas of the same configuration. A distributed cluster is required to reduce the load on a single node cluster and also to increase MirrorMaker throughput.
Prerequisites
Docker
Docker Compose
Steps to Set Up MirrorMaker 2
Set up a single node source, target Kafka cluster, and a MirrorMaker node to run MirrorMaker 2.
2. Run the below command to start the Kafka clusters and the MirrorMaker Docker container:
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
5. Monitor the logs of the MirrorMaker container—it should be something like this:
[2024-02-05 04:07:39,450] INFO [MirrorCheckpointConnector|task-0] sync idle consumer group offset from source to target took 0 ms (org.apache.kafka.connect.mirror.Scheduler:95)
[2024-02-05 04:07:49,246] INFO [MirrorCheckpointConnector|worker] refreshing consumer groups took 1 ms (org.apache.kafka.connect.mirror.Scheduler:95)
[2024-02-05 04:07:49,337] INFO [MirrorSourceConnector|worker] refreshing topics took 3 ms (org.apache.kafka.connect.mirror.Scheduler:95)
[2024-02-05 04:07:49,450] INFO [MirrorCheckpointConnector|task-0] refreshing idle consumers group offsets at target cluster took 2 ms (org.apache.kafka.connect.mirror.Scheduler:95)
6. Create a topic at the source cluster:
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
9. Check whether the topic is mirrored in the target cluster.
Note: The mirrored topic will have a source cluster name prefix to be able to identify which source cluster the topic is mirrored from.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
10. Consume 5 messages from the source kafka cluster:
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
11. Describe the consumer group at the source and destination to verify that consumer offsets are also mirrored:
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
12. Consume five messages from the target Kafka cluster. The messages should start from the committed offset in the source cluster. In this case, the message offset will start at 6.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
We’ve seen how to set up MirrorMaker 2.0 in a dedicated instance. This running mode does not need a running Connect cluster as it leverages a high-level driver that creates a set of Connect workers based on the MirrorMaker properties configuration file.
Velotio Technologies is an outsourced software product development partner for top technology startups and enterprises. We partner with companies to design, develop, and scale their products. Our work has been featured on TechCrunch, Product Hunt and more.
We have partnered with our customers to built 90+ transformational products in areas of edge computing, customer data platforms, exascale storage, cloud-native platforms, chatbots, clinical trials, healthcare and investment banking.
Since our founding in 2016, our team has completed more than 90 projects with 220+ employees across the following areas:
Building web/mobile applications
Architecting Cloud infrastructure and Data analytics platforms