
MirrorMaker vs. Confluent Replicator: A Deep Dive into Kafka Data Replication



Overview

Apache Kafka has become the backbone of real-time data streaming for countless organizations. As data volumes grow and the need for geographically distributed systems, disaster recovery, and cross-environment data sharing increases, robust data replication between Kafka clusters becomes crucial. Two popular solutions for this task are Apache Kafka's own MirrorMaker 2 (MM2) and Confluent Replicator. This post compares their concepts, architecture, features, and best practices to help you choose between them.


Core Concepts and Architecture

Understanding how these tools are built and operate is key to choosing the right one for your needs.

Apache Kafka MirrorMaker 2 (MM2)

MirrorMaker 2 was introduced as a significant improvement over the original MirrorMaker (MM1) and is designed to replicate data and topic configurations between Kafka clusters [1]. It is built upon the Kafka Connect framework, which provides a scalable and fault-tolerant way to stream data in and out of Kafka [2].

MM2 employs a set of Kafka Connect connectors to perform its tasks:

  • MirrorSourceConnector: This connector fetches data from topics in the source Kafka cluster and produces it to the target Kafka cluster. It also handles the replication of topic configurations and ACLs [2].

  • MirrorCheckpointConnector: This connector emits "checkpoints" that track consumer group offsets in the source cluster. These checkpoints are crucial for translating and synchronizing consumer group offsets to the target cluster, enabling consumers to resume processing from the correct point after a failover or migration [3].

  • MirrorHeartbeatConnector: This connector emits heartbeats to both source and target clusters. These heartbeats can be used to monitor the health and connectivity of the replication flow and to ensure that MM2 instances are active [3].

MM2 creates several internal topics in both source and target clusters to manage its operations, such as mm2-offset-syncs.<source-cluster-alias>.internal, <source-cluster-alias>.checkpoints.internal, and heartbeats [2]. By default, MM2 renames topics in the target cluster by prefixing them with the source cluster's alias (e.g., sourceClusterAlias.myTopic). This helps prevent topic name collisions and aids in routing, especially in complex multi-cluster topologies [4]. This behavior can be overridden using the IdentityReplicationPolicy if identical topic names are required across clusters [5].
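For illustration, a minimal MM2 configuration sketch showing cluster aliases, a one-way replication flow, and the replication policy. The cluster names and bootstrap addresses are placeholders:

```properties
# Minimal MM2 configuration sketch; aliases and addresses are placeholders.
clusters = primary, backup
primary.bootstrap.servers = primary-kafka:9092
backup.bootstrap.servers = backup-kafka:9092

# Replicate from primary to backup only.
primary->backup.enabled = true
primary->backup.topics = .*

# By default, topics arrive in the target as "primary.<topic>".
# Uncomment to keep identical topic names across clusters
# (requires separate loop-prevention planning in bi-directional setups):
# replication.policy.class = org.apache.kafka.connect.mirror.IdentityReplicationPolicy
```

A file like this is passed to connect-mirror-maker.sh when running MM2 in its dedicated-cluster mode.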

Confluent Replicator

Confluent Replicator is a commercial offering from Confluent, designed for robust, enterprise-grade replication between Kafka clusters. Like MM2, it is also built on the Kafka Connect framework and runs as a set of connectors within a Kafka Connect cluster, typically deployed near the destination Kafka cluster [6].

Key architectural aspects of Confluent Replicator include:

  • Data and Metadata Replication: Replicator copies messages, topic configurations (including partition counts and replication factors, with some caveats), and consumer group offset translations [6].

  • Schema Registry Integration: A significant feature of Replicator is its integration with Confluent Schema Registry: it can migrate schemas associated with topics and handle schema translation. For Confluent Platform 7.0.0 and later, Confluent recommends Cluster Linking over Replicator's schema translation for certain use cases, though Replicator still supports schema migration, which remains useful for older platform versions and specific scenarios [7]. When migrating schemas, Replicator can be configured with modes like READONLY on the source and IMPORT on the destination for continuous migration [8].

  • Provenance Headers: To prevent circular replication in active-active or bi-directional setups, Replicator (version 5.0.1+) automatically adds provenance headers to messages, allowing it to identify and drop messages that have already been replicated, avoiding infinite loops [9].

  • Licensing: Replicator is a proprietary, licensed component of the Confluent Platform [10].


Feature Comparison: MirrorMaker 2 vs. Confluent Replicator

Let's compare these tools across several key features:

| Feature | Apache Kafka MirrorMaker 2 (MM2) | Confluent Replicator |
| --- | --- | --- |
| Underlying framework | Kafka Connect | Kafka Connect |
| Licensing | Open-source (Apache 2.0 License) | Commercial (part of Confluent Platform subscription) [10] |
| Topic configuration sync | Yes, syncs topic configurations (e.g., partitions, replication factor with caveats); can be enabled/disabled via sync.topic.configs.enabled [2]. Exact replication-factor matching is limited if the target has fewer brokers than the source RF. | Yes, copies topic configurations and can match partition counts and replication factors (if destination cluster capacity allows) [6]. |
| ACL sync | Yes, syncs topic ACLs; can be enabled/disabled via sync.topic.acls.enabled [11]. Limitations include not creating service accounts in the target and downgrading ALL permissions to read-only on some managed Kafka offerings [12]. | Leverages Kafka security and requires appropriate ACLs/RBAC for its own operations [7]; ACLs are typically managed at the cluster level rather than replicated as metadata the way MM2 does. |
| Consumer offset sync | Yes, via MirrorCheckpointConnector and the internal offset-sync topic; sync.group.offsets.enabled=true (Kafka 2.7+) writes translated offsets directly to __consumer_offsets in the target [3]. | Yes, translates consumer offsets (primarily for Java consumers using standard offset commits) and writes them to __consumer_offsets in the destination [6]. |
| Topic renaming/prefixing | Yes; prefixes topics with the source cluster alias by default (DefaultReplicationPolicy); IdentityReplicationPolicy disables prefixing [4, 5]. | Yes; supports renaming via topic.rename.format with variables like ${topic}, which can also implement prefixing/suffixing [13]. |
| Schema Registry integration | No direct integration; schemas must be managed independently on source and target registries. Some managed MM2 offerings explicitly state that schemas are not synced [14]. | Yes; tight integration with Confluent Schema Registry, including schema migration and translation (e.g., DefaultSubjectTranslator or custom translators) and schema ID mapping [8, 13]. |
| Loop prevention | Primarily through default topic prefixing; IdentityReplicationPolicy in bi-directional setups requires careful design to avoid loops [4]. | Built in via provenance headers (Replicator 5.0.1+) [9]. |
| Data consistency | Generally at-least-once semantics, because offset commits are asynchronous relative to data replication [15]; some managed services offer configurations aiming for exactly-once semantics (EOS) with specific flags [16]. | At-least-once delivery semantics [7]. |
| Monitoring | Standard Kafka Connect JMX metrics; heartbeats can be used for liveness checks [1]. | Extensive monitoring via Confluent Control Center (latency, message rates, lag); also exposes JMX metrics and a Replicator Monitoring Extension REST API [17]. |
| Configuration management | Via Kafka Connect worker configuration files, or the REST API when running a dedicated MM2 cluster with REST enabled (KIP-710) [18]. | Via Kafka Connect worker configuration files or REST API; rich set of Replicator-specific options [13]. |
| Ease of use & setup | Can be complex to configure optimally, especially for advanced scenarios like active-active; requires Kafka Connect knowledge. | Often simpler for common use cases within Confluent Platform thanks to Control Center integration and defined configurations; still requires Kafka Connect knowledge. |
| Multi-DC topologies | Supports hub-spoke, DR, and other topologies; active-active requires careful planning around offsets and potential re-consumption [19]. | Designed for multi-DC deployments (active-passive, active-active, aggregation) [6]; provenance headers simplify active-active. |

How They Work: Data Flow and Offset Management

MirrorMaker 2

  1. Data Replication: The MirrorSourceConnector reads messages from whitelisted topics in the source cluster and produces them to topics in the target cluster (prefixed by default).

  2. Configuration Sync: The MirrorSourceConnector also periodically checks for new topics or configuration changes (if sync.topic.configs.enabled=true) and ACL changes (if sync.topic.acls.enabled=true) in the source cluster and applies them to the target cluster [2, 11].

  3. Offset Tracking & Sync:

    • The MirrorSourceConnector emits OffsetSync records to an internal mm2-offset-syncs.<source-cluster-alias>.internal topic. Each record pairs a source-cluster offset with the offset of the corresponding replicated message in the target [2].

    • The MirrorCheckpointConnector consumes these OffsetSync records and translates source consumer group offsets to their equivalents in the target cluster.

    • If sync.group.offsets.enabled=true (available since Kafka 2.7), the MirrorCheckpointConnector writes these translated offsets directly into the __consumer_offsets topic in the target cluster, so consumers in the target cluster can pick up where their counterparts left off in the source [3].

    • The MirrorHeartbeatConnector periodically sends heartbeats to confirm connectivity and active replication [3].
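The checkpoint mechanism above can be illustrated with a simplified sketch of the offset-translation arithmetic. This is not MirrorCheckpointConnector's actual code (the real connector is more conservative about gaps between syncs), but it shows how an OffsetSync pair maps a source consumer offset onto the target cluster:

```python
# Simplified sketch of MM2-style offset translation; illustrative only.
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class OffsetSync:
    upstream_offset: int    # offset of a record in the source cluster
    downstream_offset: int  # offset of the same record in the target cluster

def translate_offset(syncs: List[OffsetSync], consumer_offset: int) -> Optional[int]:
    """Translate a source consumer-group offset using the most recent
    OffsetSync at or below it; None if no usable sync exists."""
    usable = [s for s in syncs if s.upstream_offset <= consumer_offset]
    if not usable:
        return None
    latest = max(usable, key=lambda s: s.upstream_offset)
    # Assume offsets advance in lockstep past the last known sync point.
    return latest.downstream_offset + (consumer_offset - latest.upstream_offset)

# Ten source records were filtered or compacted away before offset 100,
# so target offsets lag source offsets by 10 from that point on.
syncs = [OffsetSync(0, 0), OffsetSync(100, 90)]
print(translate_offset(syncs, 150))  # 140
```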

Confluent Replicator

  1. Data Replication: Replicator's source connector reads messages from specified topics in the source cluster, preserves message timestamps by default, and produces the messages to the target cluster, adding provenance headers if configured.

  2. Topic Management: Replicator can automatically create topics in the destination cluster if they don't exist, attempting to match the source topic's partition count and replication factor (if topic auto-creation is enabled and destination capacity allows) [6]. It can also rename topics using topic.rename.format [13].

  3. Schema Migration: If integrated with Schema Registry, Replicator reads schemas from the source registry and writes them to the destination registry, handling subject name translation when topic.rename.format is used and an appropriate schema.subject.translator.class is configured [8, 13]. How it handles ongoing schema evolution during active replication is less explicitly detailed in public documentation, but it relies on the destination Schema Registry's compatibility rules.

  4. Offset Translation: Replicator monitors committed consumer offsets in the source cluster, translates them to corresponding offsets in the target cluster (typically based on timestamps), and writes them to the __consumer_offsets topic in the destination. This primarily covers Java clients using the standard consumer offset commit mechanism [6].
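As an illustration of these steps, a Replicator connector configuration posted to the Kafka Connect REST API might look roughly like the following. Treat the exact option names as assumptions to verify against the Replicator configuration reference; bootstrap addresses and topic names are placeholders:

```json
{
  "name": "replicator-dc1-to-dc2",
  "config": {
    "connector.class": "io.confluent.connect.replicator.ReplicatorSourceConnector",
    "src.kafka.bootstrap.servers": "source-kafka:9092",
    "dest.kafka.bootstrap.servers": "dest-kafka:9092",
    "topic.whitelist": "orders,payments",
    "topic.rename.format": "dc1.${topic}",
    "provenance.header.enable": "true",
    "key.converter": "io.confluent.connect.replicator.util.ByteArrayConverter",
    "value.converter": "io.confluent.connect.replicator.util.ByteArrayConverter"
  }
}
```

The ByteArrayConverter keeps message bytes opaque so records are copied verbatim rather than deserialized and re-serialized.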


Common Issues and Considerations

  • Data Duplication (At-Least-Once Semantics): Both MM2 and Replicator generally provide at-least-once delivery. In certain failure scenarios (e.g., a Replicator or MM2 task failing after producing messages but before committing its source consumer offsets), messages may be re-replicated, leading to duplicates in the target cluster [15]. Applications consuming from replicated topics should ideally be idempotent or have deduplication logic.

  • Configuration Complexity:

    • MM2: Fine-tuning MM2 for optimal performance and reliability (e.g., number of tasks, buffer sizes, batch sizes for the embedded producer/consumer) can be complex. Correctly configuring offset-syncs.topic.location (source or target) is crucial for DR scenarios [19].

    • Replicator: While often simpler to start with within Confluent Platform, advanced configurations like custom subject translators or complex topic routing rules still require careful setup [13].

  • Resource Management: Both tools run on Kafka Connect and require sufficient resources (CPU, memory, network bandwidth) for the Connect workers. Under-provisioning can lead to high replication lag.

  • Replication Lag: Monitoring replication lag is critical. High lag can stem from network latency between clusters, insufficient resources for Connect workers, misconfigured Connect tasks, or overloaded source/target Kafka clusters.

  • Active-Active Challenges:

    • MM2: Requires careful planning to avoid data duplication and ensure consistent offset translation. Topic prefixing is the default way to keep data streams distinct; if IdentityReplicationPolicy is used, applications or external mechanisms may be needed for loop prevention in complex setups [4].

    • Replicator: Simplified by provenance headers, but careful consideration of consumer offset management and application design is still needed for seamless failover/failback [9].

  • Schema Management (MM2): With MM2, schema evolution must be managed independently across clusters, which can be a significant operational overhead if not automated.

  • Licensing Costs (Replicator): Confluent Replicator is a commercial product, and its cost can be a factor for some organizations [10].
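Given the at-least-once caveat above, consumers of replicated topics often carry lightweight deduplication. A minimal sketch, assuming each message exposes a unique ID; the Deduplicator class and its bounded-LRU window are illustrative, not part of either tool:

```python
# Consumer-side deduplication sketch for at-least-once replication.
# A bounded LRU of seen message IDs trades memory for a dedup window:
# duplicates arriving after eviction will slip through.
from collections import OrderedDict

class Deduplicator:
    def __init__(self, max_ids: int = 100_000):
        self._seen = OrderedDict()
        self._max_ids = max_ids

    def is_duplicate(self, msg_id: str) -> bool:
        if msg_id in self._seen:
            self._seen.move_to_end(msg_id)  # refresh recency
            return True
        self._seen[msg_id] = None
        if len(self._seen) > self._max_ids:
            self._seen.popitem(last=False)  # evict the oldest ID
        return False

dedup = Deduplicator(max_ids=3)
events = ["a", "b", "a", "c", "d", "b"]  # "b" was evicted before its repeat
print([e for e in events if not dedup.is_duplicate(e)])  # ['a', 'b', 'c', 'd', 'b']
```

The tail of the example shows the window's limit: the repeated "b" passes because it was already evicted, which is why the window must be sized to the expected duplicate horizon.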


Best Practices

  • Deployment Location:

    • MM2: It's generally recommended to run the MM2 Kafka Connect cluster in the target data center or environment. This is often referred to as the "consume from remote, produce to local" pattern, which is more resilient to network issues between data centers [20].

    • Replicator: Similarly, Confluent recommends deploying Replicator in the destination data center, close to the destination Kafka cluster [6].

  • Dedicated Connect Cluster: For critical replication flows, run MM2 or Replicator on a dedicated Kafka Connect cluster rather than sharing it with other Connect jobs. This provides resource isolation and simplifies tuning.

  • Monitoring:

    • MM2: Monitor Kafka Connect JMX metrics (e.g., task status, lag, throughput), MM2-specific metrics where available (e.g., via heartbeats), and Kafka broker metrics on both clusters [1].

    • Replicator: Leverage Confluent Control Center for comprehensive monitoring, and also watch standard Kafka Connect JMX metrics [17]. Key metrics include MBeans such as kafka.connect.replicator:type=replicated-messages,topic=([-.\w]+),source=([-.\w]+),target=([-.\w]+) for message lag and throughput.

  • Capacity Planning: Ensure both source and target Kafka clusters, as well as the Kafka Connect cluster, have adequate resources (brokers, disk, network, CPU, memory) to handle the replication load.

  • Topic Filtering: Replicate only the topics you need (via MM2's topics list/regex, or topic.whitelist / topic.regex.list for Replicator). Avoid replicating all topics unless essential [13].

  • Configuration Synchronization:

    • MM2: Understand which topic configurations are synced (sync.topic.configs.enabled) and be aware of limitations (e.g., the replication factor cannot exceed the number of brokers in the target cluster) [2].

    • Replicator: Replicator also syncs topic configurations, but verify critical settings after topic creation [6].

  • Failover Testing: Regularly test your disaster recovery and failover procedures to ensure consumer applications can correctly switch to the replicated cluster and resume processing from the correct offsets.

  • Security: Secure communication between Kafka Connect and the Kafka clusters using TLS/SSL and SASL, and configure appropriate ACLs/RBAC for the MM2/Replicator principals in both source and target clusters [7].
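Tying several of these practices together, a simple lag check can compare per-partition end offsets between clusters. This sketch keeps the arithmetic as a pure function; fetching the offset dicts (e.g., via a Kafka client's end-offsets API) is left out, and since raw offsets need not align one-to-one across clusters (compaction, filtering, prefixed topic names), message-count lag like this is only a heuristic:

```python
# Heuristic replication-lag check: compare per-partition end offsets on
# the source cluster against the target. The offset dicts map
# (topic, partition) -> end offset and would come from a Kafka client.

def replication_lag(source_end, target_end):
    """Per-partition lag = source end offset minus target end offset.
    Partitions missing on the target count their full source offset."""
    return {
        tp: end - target_end.get(tp, 0)
        for tp, end in source_end.items()
    }

source = {("orders", 0): 1200, ("orders", 1): 800}
target = {("orders", 0): 1150}  # partition 1 not yet replicated
print(replication_lag(source, target))  # {('orders', 0): 50, ('orders', 1): 800}
```

For alerting, a timestamp-based comparison (latest message timestamp per partition on each side) is usually more robust than raw offset deltas.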


Conclusion

Both MirrorMaker 2 and Confluent Replicator are powerful tools for Kafka data replication, each with its strengths and ideal use cases.

  • MirrorMaker 2 is an excellent open-source choice for organizations looking for a flexible, Kafka-native solution. It's well-suited for disaster recovery, data migration, and distributing data across clusters, especially when deep integration with a commercial schema registry isn't a primary concern or when schema management is handled externally. Its learning curve can be steeper for complex configurations, and achieving exactly-once semantics often requires careful design or reliance on features from managed Kafka providers.

  • Confluent Replicator , as a commercial offering, provides a more batteries-included experience, especially for users within the Confluent ecosystem. Its tight integration with Confluent Schema Registry, robust monitoring via Control Center, and built-in features like provenance headers for active-active setups make it attractive for enterprises needing comprehensive multi-datacenter replication solutions with strong support. The licensing cost is a key consideration.

The choice between MM2 and Confluent Replicator depends on your specific technical requirements (like schema management needs), operational capabilities, existing Kafka ecosystem, and budget. Thoroughly evaluate your use cases against the features and considerations outlined in this blog to make an informed decision.


If you find this content helpful, you might also be interested in our product AutoMQ. AutoMQ is a cloud-native alternative to Kafka that decouples durability onto S3 and EBS: 10x more cost-effective, no cross-AZ traffic cost, autoscaling in seconds, and single-digit-millisecond latency. AutoMQ is now source-available on GitHub, and big companies worldwide are already using it. Check our case studies to learn more.


References

  1. Apache Kafka Documentation - Geo-Replication

  2. Kafka Replication using MirrorMaker 2.0 - Principles and Internal Topics

  3. Understanding MirrorMaker 2.0 Connectors

  4. MirrorMaker 2.0 Deep Dive

  5. Understanding Identity Replication Flows

  6. Confluent Replicator Overview

  7. Confluent Replicator Security Guide

  8. Schema Translation with Confluent Replicator

  9. 15 Facts About Kafka Replicator Every Engineer Should Know

  10. Confluent Community License FAQ

  11. Google Cloud Pub/Sub Lite - Kafka MirrorMaker Replication

  12. Redpanda - Mirroring Data with MirrorMaker 2.0

  13. Confluent Replicator Configuration Options

  14. Aiven Kafka - Setting up Active-Active Replication

  15. Does Kafka MirrorMaker 2 Guarantee Exactly-Once Delivery?

  16. Exactly-Once Semantics in MirrorMaker 2.0

  17. Monitoring Confluent Replicator

  18. KIP-710: Distributed Mode Support in MirrorMaker 2.0

  19. MirrorMaker: Beyond the Basics - Kafka Summit Europe 2021

  20. Kafka MirrorMaker 2 Deep Dive Part 2: Managed MirrorMaker 2

  21. Kafka MirrorMaker 2(MM2): Usages & Best Practices

  22. Replicate Multi-Datacenter Topics Across Kafka Clusters in Confluent Platform