Example: Self-Balancing when Cluster Nodes Change

This document uses the Kafka CLI tools to verify automatic partition reassignment and data balancing while an AutoMQ cluster scales. The tools run from a Docker image provided by AutoMQ. The test consists of two steps:

  1. Create a topic with 16 partitions and send a balanced load to it.

  2. Stop and start a broker, and check whether partitions are automatically reassigned across the available brokers.

Automatic data balancing is built into AutoMQ and keeps data evenly distributed across the cluster. By observing the partition distribution and the broker load, you can confirm that automatic partition balancing works as expected.

Prerequisites

Before running the automatic rebalancing test, make sure the following conditions are met:

An AutoMQ cluster is installed and deployed.

If you deployed the cluster by following Deploy Multi-Nodes Cluster on Linux or Deploy Multi-Nodes Cluster on Kubernetes, make sure that autobalancer.controller.enable is set to true when the Controller starts. This setting enables automatic data rebalancing.
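The setting can be checked before the Controller starts. The following is a minimal sketch that assumes the Controller reads its configuration from a properties file. The function name autobalancer_enabled and the file path in the usage comment are hypothetical, and the check does not apply if you pass the setting some other way, for example on the command line.

```shell
# Return success if the given properties file sets
# autobalancer.controller.enable=true (surrounding whitespace is tolerated).
autobalancer_enabled() {
  properties_file="$1"
  grep -Eq '^[[:space:]]*autobalancer\.controller\.enable[[:space:]]*=[[:space:]]*true[[:space:]]*$' \
    "$properties_file"
}

# Usage (the path is a placeholder for your own Controller configuration):
#   autobalancer_enabled /path/to/controller.properties && echo "auto balancer enabled"
```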

The host that runs the test commands also needs:

  • Linux, macOS, or Windows Subsystem for Linux

  • Docker

Experience Partition Reassignment Triggered by Cluster Node Changes

If you deployed the AutoMQ cluster by following the guide Deploy Multi-Nodes Test Cluster on Docker, the cluster bootstrap address is likely "server1:9092,server2:9092,server3:9092", and the cluster runs in the "automq_net" Docker network.

In the commands below, replace the bootstrap server address with the actual address of your cluster.
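One way to avoid editing the address in every command is to keep the deployment-specific values in shell variables. The sketch below is only a convenience wrapper around the docker run pattern used in this document. The variable names and the kafka_cli helper are hypothetical, and the defaults are the values from the Docker test-cluster guide.

```shell
# Assumed defaults from the Docker test-cluster guide; replace with your own values.
AUTOMQ_NETWORK="automq_net"
AUTOMQ_IMAGE="automqinc/automq:1.5.5"
BOOTSTRAP_SERVERS="server1:9092,server2:9092,server3:9092"

# Hypothetical helper: run one Kafka CLI tool inside the AutoMQ image.
# Arguments are joined with spaces, so they must not contain spaces themselves.
kafka_cli() {
  tool="$1"
  shift
  docker run --network "$AUTOMQ_NETWORK" "$AUTOMQ_IMAGE" \
    /bin/bash -c "/opt/kafka/kafka/bin/${tool} $*"
}

# Usage:
#   kafka_cli kafka-topics.sh --topic self-balancing-topic --describe \
#     --bootstrap-server "$BOOTSTRAP_SERVERS"
```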

Create Topic


docker run --network automq_net automqinc/automq:1.5.5 /bin/bash -c "/opt/kafka/kafka/bin/kafka-topics.sh --partitions 16 --create --topic self-balancing-topic --bootstrap-server server1:9092,server2:9092,server3:9092"

View Partition Distribution


docker run --network automq_net automqinc/automq:1.5.5 /bin/bash -c "/opt/kafka/kafka/bin/kafka-topics.sh --topic self-balancing-topic --describe --bootstrap-server server1:9092,server2:9092,server3:9092"


Topic: self-balancing-topic TopicId: AjoAB22YRRq7w6MdtZ4hDA PartitionCount: 16 ReplicationFactor: 1 Configs: min.insync.replicas=1,elasticstream.replication.factor=1,segment.bytes=1073741824
Topic: self-balancing-topic Partition: 0 Leader: 2 Replicas: 2 Isr: 2
Topic: self-balancing-topic Partition: 1 Leader: 1 Replicas: 1 Isr: 1
Topic: self-balancing-topic Partition: 2 Leader: 1 Replicas: 1 Isr: 1
Topic: self-balancing-topic Partition: 3 Leader: 2 Replicas: 2 Isr: 2
Topic: self-balancing-topic Partition: 4 Leader: 1 Replicas: 1 Isr: 1
Topic: self-balancing-topic Partition: 5 Leader: 2 Replicas: 2 Isr: 2
Topic: self-balancing-topic Partition: 6 Leader: 1 Replicas: 1 Isr: 1
Topic: self-balancing-topic Partition: 7 Leader: 2 Replicas: 2 Isr: 2
Topic: self-balancing-topic Partition: 8 Leader: 1 Replicas: 1 Isr: 1
Topic: self-balancing-topic Partition: 9 Leader: 2 Replicas: 2 Isr: 2
Topic: self-balancing-topic Partition: 10 Leader: 1 Replicas: 1 Isr: 1
Topic: self-balancing-topic Partition: 11 Leader: 2 Replicas: 2 Isr: 2
Topic: self-balancing-topic Partition: 12 Leader: 2 Replicas: 2 Isr: 2
Topic: self-balancing-topic Partition: 13 Leader: 1 Replicas: 1 Isr: 1
Topic: self-balancing-topic Partition: 14 Leader: 1 Replicas: 1 Isr: 1
Topic: self-balancing-topic Partition: 15 Leader: 2 Replicas: 2 Isr: 2
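Reading 16 lines by eye is error-prone, so it can help to summarize the describe output as a partition count per leader. The following sketch assumes the output format shown above. The function name count_partitions_per_leader is hypothetical and is not part of the Kafka CLI.

```shell
# Summarize kafka-topics.sh --describe output read from stdin as
# "broker <id>: <n> partitions", one line per leader, sorted by text.
count_partitions_per_leader() {
  awk '
    {
      is_partition = 0
      leader = ""
      # Per-partition lines carry both a "Partition:" and a "Leader:" field.
      for (i = 1; i < NF; i++) {
        if ($i == "Partition:") { is_partition = 1 }
        if ($i == "Leader:") { leader = $(i + 1) }
      }
      if (is_partition && leader != "") { count[leader]++ }
    }
    END {
      for (id in count) { printf "broker %s: %d partitions\n", id, count[id] }
    }
  ' | sort
}

# Usage: pipe the describe command shown above into the function, for example
#   docker run ... kafka-topics.sh --topic self-balancing-topic --describe ... \
#     | count_partitions_per_leader
```

For the output listed above, the summary is 8 partitions on broker 1 and 8 partitions on broker 2.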

Launch Producer


docker run --network automq_net automqinc/automq:1.5.5 /bin/bash -c "/opt/kafka/kafka/bin/kafka-producer-perf-test.sh --topic self-balancing-topic --num-records=1024000 --throughput 5120 --record-size 1024 --producer-props bootstrap.servers=server1:9092,server2:9092,server3:9092 linger.ms=100 batch.size=524288 buffer.memory=134217728 max.request.size=67108864"
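A quick calculation, using only the flags in the producer command above, shows how long the producer runs and what write rate it targets. This helps with timing the broker stop and restart in the later steps. The variable names are illustrative, and the real run can take longer if retries slow the producer down.

```shell
# Values copied from the kafka-producer-perf-test.sh command above.
num_records=1024000   # --num-records
throughput=5120       # --throughput, in records per second
record_size=1024      # --record-size, in bytes

# Time needed to send every record at the throttled rate.
duration_seconds=$(( num_records / throughput ))

# Target write rate in bytes per second, then in MiB per second.
bytes_per_second=$(( throughput * record_size ))
mib_per_second=$(( bytes_per_second / 1024 / 1024 ))

# prints "duration: 200s, rate: 5 MiB/s"
echo "duration: ${duration_seconds}s, rate: ${mib_per_second} MiB/s"
```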

Start the Consumer


docker run --network automq_net automqinc/automq:1.5.5 /bin/bash -c "/opt/kafka/kafka/bin/kafka-consumer-perf-test.sh --topic self-balancing-topic --show-detailed-stats --timeout 300000 --messages=1024000 --reporting-interval 1000 --bootstrap-server=server1:9092,server2:9092,server3:9092"

Stop the Broker

Stop one broker so that its partitions are reassigned to other nodes, then observe how the producer and consumer recover.


docker stop automq-server3

After the broker stops, the producer logs warnings similar to the following:


[2024-04-29 05:00:03,436] WARN [Producer clientId=perf-producer-client] Got error produce response with correlation id 49732 on topic-partition self-balancing-topic-7, retrying (2147483641 attempts left). Error: NOT_LEADER_OR_FOLLOWER (org.apache.kafka.clients.producer.internals.Sender)
[2024-04-29 05:00:03,438] WARN [Producer clientId=perf-producer-client] Received invalid metadata error in produce request on partition self-balancing-topic-7 due to org.apache.kafka.common.errors.NotLeaderOrFollowerException: For requests intended only for the leader, this error indicates that the broker is not the current leader. For requests intended for any replica, this error indicates that the broker is not a replica of the topic partition.. Going to request metadata update now (org.apache.kafka.clients.producer.internals.Sender)

After a few seconds, production and consumption return to normal:


[2024-05-07 11:56:08,920] WARN [Producer clientId=perf-producer-client] Received invalid metadata error in produce request on partition self-balancing-topic-3 due to org.apache.kafka.common.errors.NotLeaderOrFollowerException: For requests intended only for the leader, this error indicates that the broker is not the current leader. For requests intended for any replica, this error indicates that the broker is not a replica of the topic partition.. Going to request metadata update now (org.apache.kafka.clients.producer.internals.Sender)
[2024-05-07 11:56:08,920] WARN [Producer clientId=perf-producer-client] Got error produce response with correlation id 42141 on topic-partition self-balancing-topic-3, retrying (2147483646 attempts left). Error: NOT_LEADER_OR_FOLLOWER (org.apache.kafka.clients.producer.internals.Sender)
[2024-05-07 11:56:08,920] WARN [Producer clientId=perf-producer-client] Received invalid metadata error in produce request on partition self-balancing-topic-3 due to org.apache.kafka.common.errors.NotLeaderOrFollowerException: For requests intended only for the leader, this error indicates that the broker is not the current leader. For requests intended for any replica, this error indicates that the broker is not a replica of the topic partition.. Going to request metadata update now (org.apache.kafka.clients.producer.internals.Sender)
[2024-05-07 11:56:08,588] 25693 records sent, 5138.6 records/sec (5.02 MB/sec), 8.9 ms avg latency, 1246.0 ms max latency.
[2024-05-07 11:56:13,589] 25607 records sent, 5120.4 records/sec (5.00 MB/sec), 1.8 ms avg latency, 44.0 ms max latency.
[2024-05-07 11:56:18,591] 25621 records sent, 5121.1 records/sec (5.00 MB/sec), 1.6 ms avg latency, 10.0 ms max latency.
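To see which partitions were affected by the stop, the retry warnings can be counted per topic-partition. The sketch below assumes the producer output was saved to a file and follows the log format shown above. The function name count_leader_errors and the file name producer.log are hypothetical.

```shell
# Count NOT_LEADER_OR_FOLLOWER retries per topic-partition in producer output.
# Reads the log on stdin and prints "<count> <topic-partition>" lines.
count_leader_errors() {
  grep 'Error: NOT_LEADER_OR_FOLLOWER' \
    | sed -E 's/.* on topic-partition ([^,]+),.*/\1/' \
    | sort \
    | uniq -c \
    | awk '{ print $1, $2 }'
}

# Usage (producer.log is a placeholder for your saved producer output):
#   count_leader_errors < producer.log
```

Only the "Got error produce response" lines are counted. The accompanying "Received invalid metadata error" lines describe the same failures and would otherwise be counted twice.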

Review Partition Distribution Again

After the producer resumes writing, check the partition distribution again. Every partition now reports broker 1 as its leader: AutoMQ has quickly reassigned the partitions of the stopped node and moved their traffic to a running broker.


docker run --network automq_net automqinc/automq:1.5.5 /bin/bash -c "/opt/kafka/kafka/bin/kafka-topics.sh --topic self-balancing-topic --describe --bootstrap-server server1:9092,server2:9092,server3:9092"


Topic: self-balancing-topic TopicId: AjoAB22YRRq7w6MdtZ4hDA PartitionCount: 16 ReplicationFactor: 1 Configs: min.insync.replicas=1,elasticstream.replication.factor=1,segment.bytes=1073741824
Topic: self-balancing-topic Partition: 0 Leader: 1 Replicas: 1 Isr: 1
Topic: self-balancing-topic Partition: 1 Leader: 1 Replicas: 1 Isr: 1
Topic: self-balancing-topic Partition: 2 Leader: 1 Replicas: 1 Isr: 1
Topic: self-balancing-topic Partition: 3 Leader: 1 Replicas: 1 Isr: 1
Topic: self-balancing-topic Partition: 4 Leader: 1 Replicas: 1 Isr: 1
Topic: self-balancing-topic Partition: 5 Leader: 1 Replicas: 1 Isr: 1
Topic: self-balancing-topic Partition: 6 Leader: 1 Replicas: 1 Isr: 1
Topic: self-balancing-topic Partition: 7 Leader: 1 Replicas: 1 Isr: 1
Topic: self-balancing-topic Partition: 8 Leader: 1 Replicas: 1 Isr: 1
Topic: self-balancing-topic Partition: 9 Leader: 1 Replicas: 1 Isr: 1
Topic: self-balancing-topic Partition: 10 Leader: 1 Replicas: 1 Isr: 1
Topic: self-balancing-topic Partition: 11 Leader: 1 Replicas: 1 Isr: 1
Topic: self-balancing-topic Partition: 12 Leader: 1 Replicas: 1 Isr: 1
Topic: self-balancing-topic Partition: 13 Leader: 1 Replicas: 1 Isr: 1
Topic: self-balancing-topic Partition: 14 Leader: 1 Replicas: 1 Isr: 1
Topic: self-balancing-topic Partition: 15 Leader: 1 Replicas: 1 Isr: 1
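To list exactly which partitions moved, two saved describe outputs can be compared. This is a minimal sketch that assumes each output was redirected to a file. The function name list_moved_partitions and the file names before.txt and after.txt are hypothetical.

```shell
# Print "partition <n>: leader <old> -> <new>" for every partition whose leader
# differs between two saved kafka-topics.sh --describe outputs.
list_moved_partitions() {
  before_file="$1"
  after_file="$2"
  awk '
    {
      partition = ""
      leader = ""
      for (i = 1; i < NF; i++) {
        if ($i == "Partition:") { partition = $(i + 1) }
        if ($i == "Leader:") { leader = $(i + 1) }
      }
      # Skip the topic summary line and anything else without both fields.
      if (partition == "" || leader == "") { next }
      # While reading the first file, remember each leader.
      if (NR == FNR) { before[partition] = leader; next }
      # While reading the second file, report leaders that changed.
      if ((partition in before) && before[partition] != leader) {
        printf "partition %s: leader %s -> %s\n", partition, before[partition], leader
      }
    }
  ' "$before_file" "$after_file"
}

# Usage (file names are placeholders for your saved outputs):
#   list_moved_partitions before.txt after.txt
```

Applied to the two outputs shown in this document, the comparison reports partitions 0, 3, 5, 7, 9, 11, 12, and 15, each moving from broker 2 to broker 1.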

Restart the Broker

Restart automq-server3 to trigger automatic partition reassignment. The producer and consumer retry for several seconds and then resume normal operation.


docker start automq-server3

Review the partition distribution once more at this stage to confirm that the partitions have been automatically reassigned:


Topic: self-balancing-topic TopicId: AjoAB22YRRq7w6MdtZ4hDA PartitionCount: 16 ReplicationFactor: 1 Configs: min.insync.replicas=1,elasticstream.replication.factor=1,segment.bytes=1073741824
Topic: self-balancing-topic Partition: 0 Leader: 1 Replicas: 1 Isr: 1
Topic: self-balancing-topic Partition: 1 Leader: 2 Replicas: 2 Isr: 2
Topic: self-balancing-topic Partition: 2 Leader: 1 Replicas: 1 Isr: 1
Topic: self-balancing-topic Partition: 3 Leader: 1 Replicas: 1 Isr: 1
Topic: self-balancing-topic Partition: 4 Leader: 2 Replicas: 2 Isr: 2
Topic: self-balancing-topic Partition: 5 Leader: 1 Replicas: 1 Isr: 1
Topic: self-balancing-topic Partition: 6 Leader: 1 Replicas: 1 Isr: 1
Topic: self-balancing-topic Partition: 7 Leader: 1 Replicas: 1 Isr: 1
Topic: self-balancing-topic Partition: 8 Leader: 2 Replicas: 2 Isr: 2
Topic: self-balancing-topic Partition: 9 Leader: 2 Replicas: 2 Isr: 2
Topic: self-balancing-topic Partition: 10 Leader: 1 Replicas: 1 Isr: 1
Topic: self-balancing-topic Partition: 11 Leader: 2 Replicas: 2 Isr: 2
Topic: self-balancing-topic Partition: 12 Leader: 2 Replicas: 2 Isr: 2
Topic: self-balancing-topic Partition: 13 Leader: 2 Replicas: 2 Isr: 2
Topic: self-balancing-topic Partition: 14 Leader: 1 Replicas: 1 Isr: 1
Topic: self-balancing-topic Partition: 15 Leader: 1 Replicas: 1 Isr: 1