Geo-Replicating Kafka Data with MirrorMaker


I've been reading some documentation on Kafka replication and decided to summarize my notes in this blog post.

MirrorMaker

Data replication across Kafka clusters is handled by MirrorMaker. It is a solution based on Kafka Connect that utilizes a set of Producers and Consumers to read from source clusters and write to target clusters.

Basic Configuration

The basic MirrorMaker configuration specifies an alias for each of the clusters. These aliases are then used to specify the broker addresses for each cluster.
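As a minimal sketch (the aliases primary and backup and the broker addresses are placeholders), this part of an mm2.properties file might look like:

```properties
# Define an alias for each cluster taking part in replication.
clusters = primary, backup

# Map each alias to that cluster's bootstrap servers.
primary.bootstrap.servers = primary-broker1:9092,primary-broker2:9092
backup.bootstrap.servers = backup-broker1:9092,backup-broker2:9092
```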

Replication Flows

The aliases specified in the basic configuration can be used to define a replication flow. 

This replication flow needs to be enabled, and additional properties can be configured per replication flow. A full list of properties can be found in the Kafka documentation.

Since MirrorMaker is based on Connect, it inherits Connect properties too. A full list of those is available in the Kafka Connect documentation.
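A hedged sketch of a flow definition, assuming two clusters aliased primary and backup (the aliases, topic pattern, and task count are all illustrative):

```properties
# Enable the flow from the cluster aliased "primary" to "backup".
primary->backup.enabled = true

# Per-flow property: which topics to replicate.
primary->backup.topics = .*

# An inherited Kafka Connect property: how many tasks share the load.
tasks.max = 4
```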

Replication Strategies

Active-Passive

An Active-Passive replication strategy is unidirectional and is ideal when you need to back up topics for disaster recovery.

Active-Active

An Active-Active replication strategy is bi-directional and topics are replicated in both directions. This is useful for high availability.

Multi-Cluster Geo-Replication

In a multi-cluster geo-replication setup, a designated leader cluster in each region replicates to the leader cluster of another region, while the clusters within a region replicate amongst themselves.


Target Topic Names

By default, the topic names on the source and target clusters will not match: the target topic has the source cluster's alias prefixed to its name, along with a delimiter that can be customized.
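For illustration (the alias and topic name here are made up): with the default policy, a topic named orders replicated from a cluster aliased primary appears on the target as primary.orders. The delimiter defaults to a dot and can be changed:

```properties
# Assumes a source cluster aliased "primary"; with the default "."
# separator, its topic "orders" becomes "primary.orders" on the target.
replication.policy.separator = .
```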

You can also set replication.policy.class to IdentityReplicationPolicy to avoid renaming topics with the prefix. This makes sense if you are replicating a topic one-way or migrating from a legacy Kafka cluster. When using this policy, MirrorMaker cannot prevent cycles, so it is necessary to ensure that the replication topology is acyclic.
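A sketch of that setting; note that the full property name is replication.policy.class:

```properties
# Keep source and target topic names identical. Safe only for
# one-way (acyclic) replication flows or cluster migrations.
replication.policy.class = org.apache.kafka.connect.mirror.IdentityReplicationPolicy
```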

Consuming from Topics in Active Active Replicated Clusters

As seen in the previous section, topics are renamed when replicated to another cluster. Consumers therefore need to aggregate data from both the original topic and the replicated topic(s), for example by subscribing to both names or to a pattern that matches them.

Alternatively, data from these multiple topics can be aggregated into a single topic via a solution like ksqlDB. This simplifies the consumer side.
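A rough ksqlDB sketch of that aggregation, assuming an orders topic in an active-active pair of clusters aliased primary and backup (all stream, topic, and column names here are hypothetical):

```sql
-- Stream over the locally produced topic.
CREATE STREAM orders_local (id VARCHAR, amount DOUBLE)
  WITH (KAFKA_TOPIC = 'orders', VALUE_FORMAT = 'JSON');

-- Stream over the copy replicated from the other cluster.
CREATE STREAM orders_replica (id VARCHAR, amount DOUBLE)
  WITH (KAFKA_TOPIC = 'backup.orders', VALUE_FORMAT = 'JSON');

-- Merge both into a single topic that consumers can read on its own.
CREATE STREAM orders_all AS SELECT * FROM orders_local EMIT CHANGES;
INSERT INTO orders_all SELECT * FROM orders_replica EMIT CHANGES;
```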

Best Practices 

Consistent Configuration

If two MirrorMaker instances have different configurations, the configuration of the leader instance takes effect, which can cause issues if that wasn't the desired configuration.

Consuming from the Source Rather than Producing to the Target

Kafka MirrorMaker is based on Connect and has a set of Producers and Consumers. Consumers in Kafka are more resilient at recovering from failure, while Producers can run into issues when network latency is high. It therefore makes sense to host the MirrorMaker instances closer to the target cluster, so that the latency-sensitive produce path stays local.

We can enforce the production of data to a nearby or local cluster by launching MirrorMaker with the --clusters flag followed by the target cluster's alias, e.g. connect-mirror-maker.sh mm2.properties --clusters backup (the alias here is illustrative).

Starting and Stopping MirrorMaker

MirrorMaker is based on Kafka Connect and multiple instances can coordinate with each other (via Kafka) to share the replication load. This makes scaling horizontally a matter of adding new instances with the same configuration.

MirrorMaker can be stopped by killing the process, e.g. by sending it a termination signal with kill.

Monitoring

Kafka MirrorMaker exposes all the JMX metrics provided by Kafka Connect, and adds a few more to monitor records transferred and latency:
  • record-count - the number of records transferred / replicated
  • replication-latency-ms - the time taken for a record to be replicated
  • checkpoint-latency-ms - the time taken to transfer consumer offsets
All of these metrics are recorded at the topic level and for each partition in the topic.

Sources