Spin up new MirrorMaker in 5 minutes

A step-by-step walkthrough with Kubernetes deployment scripts

Recently included in Apache Kafka and introduced in my previous blog post, the new MirrorMaker (MirrorMaker 2, or MM2) is now the official open-source tool for replicating data between two Kafka clusters across datacenters.

To get first-hand experience with the new MirrorMaker, this article walks through an end-to-end deployment on a local Kubernetes cluster.

As a prerequisite, Minikube and a hypervisor (e.g. VirtualBox, VMWare Fusion…) must be installed locally before you start the following steps.

Note: the scripts used below can also be applied to a regular Kubernetes cluster, but they are not guaranteed to produce a production-quality deployment.

Step 1: start local Kubernetes

minikube start --driver=<driver_name> --kubernetes-version=v1.15.12 --cpus 4 --memory 8192

If you use VirtualBox, <driver_name> will be "virtualbox".
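Before moving on, it can help to confirm that the Minikube node has actually come up. The following is a minimal sketch, not part of the demo repo: the helper name nodes_ready is hypothetical, and it assumes the default column layout of kubectl get nodes (node name first, status second).

```shell
# Hypothetical helper: succeeds only if at least one node exists
# and every node reports exactly "Ready" in the STATUS column.
nodes_ready() {
  kubectl get nodes --no-headers |
    awk '{ n++; if ($2 != "Ready") bad = 1 }
         END { if (n == 0 || bad) exit 1 }'
}

# Example usage (commented out so the snippet only defines the helper):
# nodes_ready && echo "cluster is ready" || echo "cluster is not ready yet"
```

If the check fails right after minikube start, waiting a few seconds and retrying is usually enough.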

Step 2: clone Kubernetes deployment scripts and spin up Kafka

Clone the repo (https://github.com/ning2008wisc/minikube-mm2-demo) and run the following commands from its root directory to create the namespace and two Kafka clusters, each backed by its own ZooKeeper:

kubectl apply -f 00-namespace
kubectl apply -f 01-zookeeper
kubectl apply -f 02-kafka
kubectl apply -f 03-zookeeper
kubectl apply -f 04-kafka

Then verify that both Kafka clusters are running, each with 3 brokers:

kubectl config set-context --current --namespace=kafka
kubectl get pods
NAME                               READY   STATUS    RESTARTS   AGE
kafka-0                            1/1     Running   0          2m5s
kafka-1                            1/1     Running   0          86s
kafka-2                            1/1     Running   0          84s
kafka2-0                           1/1     Running   0          119s
kafka2-1                           1/1     Running   0          84s
kafka2-2                           1/1     Running   0          82s
zookeeper-<hash>                   1/1     Running   0          2m8s
zookeeper-backup-<hash>            1/1     Running   0          2m2s
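Instead of re-running kubectl get pods until everything shows Running, the built-in kubectl wait command can block until the pods are ready. This is a small sketch under the assumption that all pods live in the kafka namespace, as in the output above; the wrapper name wait_for_pods and the default timeout are illustrative choices, not something the demo repo provides.

```shell
# Hypothetical wrapper around "kubectl wait": blocks until every pod in the
# namespace has the Ready condition, or fails once the timeout expires.
wait_for_pods() {
  local namespace="${1:-kafka}"   # namespace used throughout this walkthrough
  local timeout="${2:-300s}"      # assumed default; adjust to your machine
  kubectl wait --for=condition=Ready pod --all \
    --namespace "$namespace" --timeout="$timeout"
}

# Example usage (commented out so the snippet only defines the helper):
# wait_for_pods kafka 300s
```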

Step 3: deploy new MirrorMaker

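MirrorMaker is deployed with Helm. Install Helm locally, then initialize it as follows. Note that these commands target Helm 2, which relies on Tiller; Helm 3 removed both Tiller and helm init.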

helm init --tiller-namespace kafka
kubectl create serviceaccount --namespace kafka tiller
kubectl create clusterrolebinding tiller-cluster-rule --clusterrole=cluster-admin --serviceaccount=kafka:tiller
kubectl patch deploy --namespace kafka tiller-deploy -p '{"spec":{"template":{"spec":{"serviceAccount":"tiller"}}}}'

To minimize the footprint, MirrorMaker is deployed as a standalone, distributed Kubernetes service, rather than by setting up a Kafka Connect cluster and deploying MirrorMaker through the Kafka Connect REST interface.

cd kafka-mm
helm --tiller-namespace kafka install ./ --name kafka-mm

Check the MM2 log to make sure it is running properly:

kubectl logs -f kafka-mm-<hash> -c kafka-mm-server
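The <hash> suffix differs on every deployment, so it has to be looked up first. One way to avoid copying it by hand is sketched below. The helper name mm_pod_name is hypothetical, and it assumes the MirrorMaker pod name starts with kafka-mm- as shown in the command above.

```shell
# Hypothetical helper: prints the first pod whose name starts with "kafka-mm-",
# in the "pod/<name>" form that kubectl logs and kubectl port-forward accept.
mm_pod_name() {
  local namespace="${1:-kafka}"
  kubectl get pods --namespace "$namespace" -o name |
    grep -m 1 '^pod/kafka-mm-'
}

# Example usage (commented out so the snippet only defines the helper):
# kubectl logs -f "$(mm_pod_name)" -c kafka-mm-server
```

If more than one MirrorMaker pod is running, this sketch only returns the first match.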

Step 4: test MirrorMaker with Kafka instances

Now, let’s produce some messages on the source Kafka cluster (kafka-{0,1,2}) and consume them from the target cluster (kafka2-{0,1,2}) to verify that the data is mirrored in near real time.

Open a new terminal, switch to the kafka namespace, log in to a broker of the source Kafka cluster, then start the console producer:

kubectl exec -i -t kafka-0 -- /bin/bash
bash-4.4# unset JMX_PORT
bash-4.4# /opt/kafka/bin/kafka-console-producer.sh  --broker-list localhost:9092 --topic test

Open another terminal, switch to the kafka namespace, log in to a broker of the target Kafka cluster, then start the console consumer:

kubectl exec -i -t kafka2-0 -- /bin/bash
bash-4.4# unset JMX_PORT
bash-4.4# /opt/kafka/bin/kafka-console-consumer.sh  --bootstrap-server localhost:9092 --topic primary.test

Now type some random characters into the console producer. The same characters should appear in the console consumer almost immediately.
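Note that the consumer reads from primary.test rather than test. With MirrorMaker 2's default replication policy, a mirrored topic on the target cluster is named after the source cluster alias, followed by a dot and the original topic name. The sketch below just spells out that rule; the function name remote_topic is hypothetical, and the alias primary is inferred from the topic name used in the consumer command above.

```shell
# Hypothetical helper: derives the mirrored topic name on the target cluster
# from the source cluster alias and the original topic name, assuming the
# default "." separator of MirrorMaker 2's default replication policy.
remote_topic() {
  local source_alias="$1"
  local topic="$2"
  printf '%s.%s\n' "$source_alias" "$topic"
}

# Example usage inside the target broker (commented out on purpose):
# /opt/kafka/bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 \
#   --topic "$(remote_topic primary test)" --from-beginning
```

Adding --from-beginning makes the consumer replay messages that were mirrored before it started, which is handy if the producer was run first.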

Step 5: monitor MirrorMaker

To track its performance and health, MirrorMaker exposes many metrics via JMX beans. Here is how to quickly inspect them in raw form via port forwarding:

kubectl port-forward kafka-mm-<hash> 8081:8081

Open a web browser and go to http://localhost:8081/ to view the metrics in raw plain-text format.
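The same endpoint can be read from a terminal, which makes it easier to narrow the output down to the lines of interest. This is a rough sketch: the helper name filter_metrics is hypothetical, and the default pattern mirror is only a guess at how the MirrorMaker metrics are named in the exported text, since that depends on how the exporter in the chart is configured.

```shell
# Hypothetical helper: reads metrics text on stdin and keeps only the lines
# matching a case-insensitive pattern (default: "mirror", an assumption).
filter_metrics() {
  local pattern="${1:-mirror}"
  grep -i -- "$pattern"
}

# Example usage while the port-forward is running (commented out on purpose):
# curl -s http://localhost:8081/ | filter_metrics
# curl -s http://localhost:8081/ | filter_metrics latency
```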

Conclusion

In the next few blog posts, I plan to cover more topics around the new MirrorMaker, including:

  • exactly-once message delivery guarantee across datacenters
  • tools to migrate from existing mirroring solutions to MM2

Please stay tuned for more articles!

