Example: Self-Balancing when Cluster Nodes Change
This document shows how to use the Kafka CLI tools to verify AutoMQ's automatic partition reassignment and data balancing during cluster scaling operations. The Kafka CLI tools are run from a Docker image provided by AutoMQ. The verification proceeds as follows:
- Create a topic with 16 partitions and send a balanced load.
- When starting and stopping brokers, observe whether partitions automatically reassign between different brokers.
This automatic data balancing is an inherent feature of AutoMQ, ensuring that data is automatically and evenly distributed across the cluster. By monitoring the distribution of partitions and the load status of brokers, you can verify whether automatic partition balancing is working as expected.
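For example, one way to monitor this is to poll the partition distribution in a loop while brokers are stopped and started. A minimal sketch, assuming the Docker-based deployment and bootstrap addresses used in the first scenario below:
# Poll the partition distribution every 10 seconds while brokers are stopped and started.
while true; do
  docker run --network automq_net automqinc/automq:latest /bin/bash -c \
    "/opt/kafka/kafka/bin/kafka-topics.sh --topic self-balancing-topic --describe --bootstrap-server server1:9092,server2:9092,server3:9092"
  sleep 10
done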
Prerequisites
Before performing the automatic partition data rebalancing test, ensure the following conditions are met:
Complete the installation and deployment of the AutoMQ cluster, for example by following Deploy Multi-Nodes Test Cluster on Docker▸, Deploy Multi-Nodes Cluster on Linux▸, or Deploy Multi-Nodes Cluster on Kubernetes▸.
If you deploy the cluster using Deploy Multi-Nodes Cluster on Linux▸ or Deploy Multi-Nodes Cluster on Kubernetes▸, make sure that autobalancer.controller.enable is set to true when starting the Controller so that automatic data rebalancing is enabled.
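For example, before starting each Controller you can enable the auto balancer in its configuration. A minimal sketch, assuming a properties-style configuration file (the exact file and path depend on your deployment):
autobalancer.controller.enable=true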
Additionally, the host running the test program must meet the following conditions:
- Linux/Mac/Windows Subsystem for Linux
- Docker
If downloading container images is slow, please refer to Docker Hub Mirror Configuration▸.
The following steps are shown for two deployment scenarios:
- Deploy Multi-Nodes Test Cluster on Docker
- Deploy Multi-Nodes Cluster on Linux
If the previous AutoMQ cluster was deployed according to Deploy Multi-Nodes Test Cluster on Docker▸, the cluster bootstrap address will be similar to "server1:9092,server2:9092,server3:9092", and the AutoMQ cluster will be on the "automq_net" Docker network.
Please replace the bootstrap-server address below with the actual cluster address based on your deployment configuration.
Create Topic
docker run --network automq_net automqinc/automq:latest /bin/bash -c "/opt/kafka/kafka/bin/kafka-topics.sh --partitions 16 --create --topic self-balancing-topic --bootstrap-server server1:9092,server2:9092,server3:9092"
View Partition Distribution
docker run --network automq_net automqinc/automq:latest /bin/bash -c "/opt/kafka/kafka/bin/kafka-topics.sh --topic self-balancing-topic --describe --bootstrap-server server1:9092,server2:9092,server3:9092"
Topic: self-balancing-topic TopicId: AjoAB22YRRq7w6MdtZ4hDA PartitionCount: 16 ReplicationFactor: 1 Configs: min.insync.replicas=1,elasticstream.replication.factor=1,segment.bytes=1073741824
Topic: self-balancing-topic Partition: 0 Leader: 2 Replicas: 2 Isr: 2
Topic: self-balancing-topic Partition: 1 Leader: 1 Replicas: 1 Isr: 1
Topic: self-balancing-topic Partition: 2 Leader: 1 Replicas: 1 Isr: 1
Topic: self-balancing-topic Partition: 3 Leader: 2 Replicas: 2 Isr: 2
Topic: self-balancing-topic Partition: 4 Leader: 1 Replicas: 1 Isr: 1
Topic: self-balancing-topic Partition: 5 Leader: 2 Replicas: 2 Isr: 2
Topic: self-balancing-topic Partition: 6 Leader: 1 Replicas: 1 Isr: 1
Topic: self-balancing-topic Partition: 7 Leader: 2 Replicas: 2 Isr: 2
Topic: self-balancing-topic Partition: 8 Leader: 1 Replicas: 1 Isr: 1
Topic: self-balancing-topic Partition: 9 Leader: 2 Replicas: 2 Isr: 2
Topic: self-balancing-topic Partition: 10 Leader: 1 Replicas: 1 Isr: 1
Topic: self-balancing-topic Partition: 11 Leader: 2 Replicas: 2 Isr: 2
Topic: self-balancing-topic Partition: 12 Leader: 2 Replicas: 2 Isr: 2
Topic: self-balancing-topic Partition: 13 Leader: 1 Replicas: 1 Isr: 1
Topic: self-balancing-topic Partition: 14 Leader: 1 Replicas: 1 Isr: 1
Topic: self-balancing-topic Partition: 15 Leader: 2 Replicas: 2 Isr: 2
Start Producer
docker run --network automq_net automqinc/automq:latest /bin/bash -c "/opt/kafka/kafka/bin/kafka-producer-perf-test.sh --topic self-balancing-topic --num-records=1024000 --throughput 5120 --record-size 1024 --producer-props bootstrap.servers=server1:9092,server2:9092,server3:9092 linger.ms=100 batch.size=524288 buffer.memory=134217728 max.request.size=67108864"
Start Consumer
docker run --network automq_net automqinc/automq:1.0.4 /bin/bash -c "/opt/kafka/kafka/bin/kafka-consumer-perf-test.sh --topic self-balancing-topic --show-detailed-stats --timeout 300000 --messages=1024000 --reporting-interval 1000 --bootstrap-server=server1:9092,server2:9092,server3:9092"
Stop the Broker
Stop one of the brokers so that its partitions are reassigned to the other nodes. After stopping it, observe how the producer and consumer recover.
docker stop automq-server3
After stopping, you can see the following logs from the producer:
[2024-04-29 05:00:03,436] WARN [Producer clientId=perf-producer-client] Got error produce response with correlation id 49732 on topic-partition self-balancing-topic-7, retrying (2147483641 attempts left). Error: NOT_LEADER_OR_FOLLOWER (org.apache.kafka.clients.producer.internals.Sender)
[2024-04-29 05:00:03,438] WARN [Producer clientId=perf-producer-client] Received invalid metadata error in produce request on partition self-balancing-topic-7 due to org.apache.kafka.common.errors.NotLeaderOrFollowerException: For requests intended only for the leader, this error indicates that the broker is not the current leader. For requests intended for any replica, this error indicates that the broker is not a replica of the topic partition.. Going to request metadata update now (org.apache.kafka.clients.producer.internals.Sender)
After waiting for several seconds, you can see that production and consumption have returned to normal.
[2024-05-07 11:56:08,920] WARN [Producer clientId=perf-producer-client] Received invalid metadata error in produce request on partition self-balancing-topic-3 due to org.apache.kafka.common.errors.NotLeaderOrFollowerException: For requests intended only for the leader, this error indicates that the broker is not the current leader. For requests intended for any replica, this error indicates that the broker is not a replica of the topic partition.. Going to request metadata update now (org.apache.kafka.clients.producer.internals.Sender)
[2024-05-07 11:56:08,920] WARN [Producer clientId=perf-producer-client] Got error produce response with correlation id 42141 on topic-partition self-balancing-topic-3, retrying (2147483646 attempts left). Error: NOT_LEADER_OR_FOLLOWER (org.apache.kafka.clients.producer.internals.Sender)
[2024-05-07 11:56:08,920] WARN [Producer clientId=perf-producer-client] Received invalid metadata error in produce request on partition self-balancing-topic-3 due to org.apache.kafka.common.errors.NotLeaderOrFollowerException: For requests intended only for the leader, this error indicates that the broker is not the current leader. For requests intended for any replica, this error indicates that the broker is not a replica of the topic partition.. Going to request metadata update now (org.apache.kafka.clients.producer.internals.Sender)
[2024-05-07 11:56:08,588] 25693 records sent, 5138.6 records/sec (5.02 MB/sec), 8.9 ms avg latency, 1246.0 ms max latency.
[2024-05-07 11:56:13,589] 25607 records sent, 5120.4 records/sec (5.00 MB/sec), 1.8 ms avg latency, 44.0 ms max latency.
[2024-05-07 11:56:18,591] 25621 records sent, 5121.1 records/sec (5.00 MB/sec), 1.6 ms avg latency, 10.0 ms max latency.
Re-examine Partition Distribution
After the producer resumes writing, we re-examine the partition distribution and observe that all partitions are now located on broker1. AutoMQ quickly and efficiently completes the reassignment of partitions from the stopped node and rebalances the traffic.
docker run --network automq_net automqinc/automq:latest /bin/bash -c "/opt/kafka/kafka/bin/kafka-topics.sh --topic self-balancing-topic --describe --bootstrap-server server1:9092,server2:9092,server3:9092"
Topic: self-balancing-topic TopicId: AjoAB22YRRq7w6MdtZ4hDA PartitionCount: 16 ReplicationFactor: 1 Configs: min.insync.replicas=1,elasticstream.replication.factor=1,segment.bytes=1073741824
Topic: self-balancing-topic Partition: 0 Leader: 1 Replicas: 1 Isr: 1
Topic: self-balancing-topic Partition: 1 Leader: 1 Replicas: 1 Isr: 1
Topic: self-balancing-topic Partition: 2 Leader: 1 Replicas: 1 Isr: 1
Topic: self-balancing-topic Partition: 3 Leader: 1 Replicas: 1 Isr: 1
Topic: self-balancing-topic Partition: 4 Leader: 1 Replicas: 1 Isr: 1
Topic: self-balancing-topic Partition: 5 Leader: 1 Replicas: 1 Isr: 1
Topic: self-balancing-topic Partition: 6 Leader: 1 Replicas: 1 Isr: 1
Topic: self-balancing-topic Partition: 7 Leader: 1 Replicas: 1 Isr: 1
Topic: self-balancing-topic Partition: 8 Leader: 1 Replicas: 1 Isr: 1
Topic: self-balancing-topic Partition: 9 Leader: 1 Replicas: 1 Isr: 1
Topic: self-balancing-topic Partition: 10 Leader: 1 Replicas: 1 Isr: 1
Topic: self-balancing-topic Partition: 11 Leader: 1 Replicas: 1 Isr: 1
Topic: self-balancing-topic Partition: 12 Leader: 1 Replicas: 1 Isr: 1
Topic: self-balancing-topic Partition: 13 Leader: 1 Replicas: 1 Isr: 1
Topic: self-balancing-topic Partition: 14 Leader: 1 Replicas: 1 Isr: 1
Topic: self-balancing-topic Partition: 15 Leader: 1 Replicas: 1 Isr: 1
Restart the Broker
Restarting automq-server3 triggers the automatic reassignment of partitions. After a few seconds of retries, the producer and consumer can resume operations.
docker start automq-server3
At this point, we confirm the partition distribution again and find that the partitions have successfully undergone automatic reassignment.
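The distribution can be checked with the same describe command used above:
docker run --network automq_net automqinc/automq:latest /bin/bash -c "/opt/kafka/kafka/bin/kafka-topics.sh --topic self-balancing-topic --describe --bootstrap-server server1:9092,server2:9092,server3:9092"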
Topic: self-balancing-topic TopicId: AjoAB22YRRq7w6MdtZ4hDA PartitionCount: 16 ReplicationFactor: 1 Configs: min.insync.replicas=1,elasticstream.replication.factor=1,segment.bytes=1073741824
Topic: self-balancing-topic Partition: 0 Leader: 1 Replicas: 1 Isr: 1
Topic: self-balancing-topic Partition: 1 Leader: 2 Replicas: 2 Isr: 2
Topic: self-balancing-topic Partition: 2 Leader: 1 Replicas: 1 Isr: 1
Topic: self-balancing-topic Partition: 3 Leader: 1 Replicas: 1 Isr: 1
Topic: self-balancing-topic Partition: 4 Leader: 2 Replicas: 2 Isr: 2
Topic: self-balancing-topic Partition: 5 Leader: 1 Replicas: 1 Isr: 1
Topic: self-balancing-topic Partition: 6 Leader: 1 Replicas: 1 Isr: 1
Topic: self-balancing-topic Partition: 7 Leader: 1 Replicas: 1 Isr: 1
Topic: self-balancing-topic Partition: 8 Leader: 2 Replicas: 2 Isr: 2
Topic: self-balancing-topic Partition: 9 Leader: 2 Replicas: 2 Isr: 2
Topic: self-balancing-topic Partition: 10 Leader: 1 Replicas: 1 Isr: 1
Topic: self-balancing-topic Partition: 11 Leader: 2 Replicas: 2 Isr: 2
Topic: self-balancing-topic Partition: 12 Leader: 2 Replicas: 2 Isr: 2
Topic: self-balancing-topic Partition: 13 Leader: 2 Replicas: 2 Isr: 2
Topic: self-balancing-topic Partition: 14 Leader: 1 Replicas: 1 Isr: 1
Topic: self-balancing-topic Partition: 15 Leader: 1 Replicas: 1 Isr: 1
If the previous AutoMQ cluster was set up according to Deploy Multi-Nodes Cluster on Linux▸, the resulting cluster bootstrap address would look like "192.168.0.1:9092,192.168.0.2:9092,192.168.0.3:9092".
Please substitute the bootstrap-server address below with the real address from your deployed cluster configuration.
Create Topic
docker run automqinc/automq:latest /bin/bash -c "/opt/kafka/kafka/bin/kafka-topics.sh --partitions 16 --create --topic self-balancing-topic --bootstrap-server 192.168.0.1:9092,192.168.0.2:9092,192.168.0.3:9092"
Check Partition Distribution
docker run automqinc/automq:latest /bin/bash -c "/opt/kafka/kafka/bin/kafka-topics.sh --topic self-balancing-topic --describe --bootstrap-server 192.168.0.1:9092,192.168.0.2:9092,192.168.0.3:9092"
Topic: self-balancing-topic TopicId: 7mYxhYliSMq-FCrl4pcD5Q PartitionCount: 16 ReplicationFactor: 1 Configs: min.insync.replicas=1,segment.bytes=1073741824
Topic: self-balancing-topic Partition: 0 Leader: 4 Replicas: 4 Isr: 4
Topic: self-balancing-topic Partition: 1 Leader: 0 Replicas: 0 Isr: 0
Topic: self-balancing-topic Partition: 2 Leader: 1 Replicas: 1 Isr: 1
Topic: self-balancing-topic Partition: 3 Leader: 2 Replicas: 2 Isr: 2
Topic: self-balancing-topic Partition: 4 Leader: 3 Replicas: 3 Isr: 3
Topic: self-balancing-topic Partition: 5 Leader: 2 Replicas: 2 Isr: 2
Topic: self-balancing-topic Partition: 6 Leader: 1 Replicas: 1 Isr: 1
Topic: self-balancing-topic Partition: 7 Leader: 3 Replicas: 3 Isr: 3
Topic: self-balancing-topic Partition: 8 Leader: 0 Replicas: 0 Isr: 0
Topic: self-balancing-topic Partition: 9 Leader: 4 Replicas: 4 Isr: 4
Topic: self-balancing-topic Partition: 10 Leader: 2 Replicas: 2 Isr: 2
Topic: self-balancing-topic Partition: 11 Leader: 3 Replicas: 3 Isr: 3
Topic: self-balancing-topic Partition: 12 Leader: 1 Replicas: 1 Isr: 1
Topic: self-balancing-topic Partition: 13 Leader: 4 Replicas: 4 Isr: 4
Topic: self-balancing-topic Partition: 14 Leader: 0 Replicas: 0 Isr: 0
Topic: self-balancing-topic Partition: 15 Leader: 3 Replicas: 3 Isr: 3
Start Producer
docker run automqinc/automq:latest /bin/bash -c "/opt/kafka/kafka/bin/kafka-producer-perf-test.sh --topic self-balancing-topic --num-records=1024000 --throughput 5120 --record-size 1024 --producer-props bootstrap.servers=192.168.0.1:9092,192.168.0.2:9092,192.168.0.3:9092 linger.ms=100 batch.size=524288 buffer.memory=134217728 max.request.size=67108864"
Start Consumer
docker run automqinc/automq:1.0.4 /bin/bash -c "/opt/kafka/kafka/bin/kafka-consumer-perf-test.sh --topic self-balancing-topic --show-detailed-stats --timeout 300000 --messages=1024000 --reporting-interval 1000 --bootstrap-server=192.168.0.1:9092,192.168.0.2:9092,192.168.0.3:9092"
Stop the Broker
Stop one of the brokers so that its partitions are reassigned to the remaining brokers. Run the following command on the host of the broker you want to stop, then observe how the producer and consumer recover.
jps | grep Kafka | awk '{print $1}' | xargs kill
When stopping the broker, the producer logs the following:
[2024-04-29 05:00:03,436] WARN [Producer clientId=perf-producer-client] Got error produce response with correlation id 49732 on topic-partition self-balancing-topic-7, retrying (2147483641 attempts left). Error: NOT_LEADER_OR_FOLLOWER (org.apache.kafka.clients.producer.internals.Sender)
[2024-04-29 05:00:03,438] WARN [Producer clientId=perf-producer-client] Received invalid metadata error in produce request on partition self-balancing-topic-7 due to org.apache.kafka.common.errors.NotLeaderOrFollowerException: For requests intended only for the leader, this error indicates that the broker is not the current leader. For requests intended for any replica, this error indicates that the broker is not a replica of the topic partition.. Going to request metadata update now (org.apache.kafka.clients.producer.internals.Sender)
After waiting for several seconds, you can see that production and consumption return to normal.
[2024-05-07 10:59:16,474] 25616 records sent, 5123.2 records/sec (5.00 MB/sec), 1.7 ms avg latency, 16.0 ms max latency.
[2024-05-07 10:59:26,238] WARN [Producer clientId=perf-producer-client] Got error produce response with correlation id 121226 on topic-partition self-balancing-topic-4, retrying (2147483646 attempts left). Error: NOT_LEADER_OR_FOLLOWER (org.apache.kafka.clients.producer.internals.Sender)
[2024-05-07 10:59:26,240] WARN [Producer clientId=perf-producer-client] Received invalid metadata error in produce request on partition self-balancing-topic-4 due to org.apache.kafka.common.errors.NotLeaderOrFollowerException: For requests intended only for the leader, this error indicates that the broker is not the current leader. For requests intended for any replica, this error indicates that the broker is not a replica of the topic partition.. Going to request metadata update now (org.apache.kafka.clients.producer.internals.Sender)
[2024-05-07 10:59:26,241] WARN [Producer clientId=perf-producer-client] Got error produce response with correlation id 121227 on topic-partition self-balancing-topic-4, retrying (2147483646 attempts left). Error: NOT_LEADER_OR_FOLLOWER (org.apache.kafka.clients.producer.internals.Sender)
[2024-05-07 10:59:26,241] WARN [Producer clientId=perf-producer-client] Received invalid metadata error in produce request on partition self-balancing-topic-4 due to org.apache.kafka.common.errors.NotLeaderOrFollowerException: For requests intended only for the leader, this error indicates that the broker is not the current leader. For requests intended for any replica, this error indicates that the broker is not a replica of the topic partition.. Going to request metadata update now (org.apache.kafka.clients.producer.internals.Sender)
[2024-05-07 10:59:21,477] 25599 records sent, 5114.7 records/sec (4.99 MB/sec), 1.8 ms avg latency, 19.0 ms max latency.
[2024-05-07 10:59:26,486] 25667 records sent, 5132.4 records/sec (5.01 MB/sec), 4.8 ms avg latency, 2284.0 ms max latency.
Review Partition Distribution Again
After the producer resumes writing, we review the partition distribution and observe that the partitions previously led by the stopped broker have been redistributed across the remaining brokers. AutoMQ swiftly and automatically reassigns the partitions from the stopped broker and rebalances the traffic.
docker run automqinc/automq:latest /bin/bash -c "/opt/kafka/kafka/bin/kafka-topics.sh --topic self-balancing-topic --describe --bootstrap-server 192.168.0.1:9092,192.168.0.2:9092,192.168.0.3:9092"
Topic: self-balancing-topic TopicId: 7mYxhYliSMq-FCrl4pcD5Q PartitionCount: 16 ReplicationFactor: 1 Configs: min.insync.replicas=1,segment.bytes=1073741824
Topic: self-balancing-topic Partition: 0 Leader: 4 Replicas: 4 Isr: 4
Topic: self-balancing-topic Partition: 1 Leader: 0 Replicas: 0 Isr: 0
Topic: self-balancing-topic Partition: 2 Leader: 1 Replicas: 1 Isr: 1
Topic: self-balancing-topic Partition: 3 Leader: 2 Replicas: 2 Isr: 2
Topic: self-balancing-topic Partition: 4 Leader: 1 Replicas: 1 Isr: 1
Topic: self-balancing-topic Partition: 5 Leader: 2 Replicas: 2 Isr: 2
Topic: self-balancing-topic Partition: 6 Leader: 1 Replicas: 1 Isr: 1
Topic: self-balancing-topic Partition: 7 Leader: 0 Replicas: 0 Isr: 0
Topic: self-balancing-topic Partition: 8 Leader: 0 Replicas: 0 Isr: 0
Topic: self-balancing-topic Partition: 9 Leader: 4 Replicas: 4 Isr: 4
Topic: self-balancing-topic Partition: 10 Leader: 2 Replicas: 2 Isr: 2
Topic: self-balancing-topic Partition: 11 Leader: 4 Replicas: 4 Isr: 4
Topic: self-balancing-topic Partition: 12 Leader: 1 Replicas: 1 Isr: 1
Topic: self-balancing-topic Partition: 13 Leader: 4 Replicas: 4 Isr: 4
Topic: self-balancing-topic Partition: 14 Leader: 0 Replicas: 0 Isr: 0
Topic: self-balancing-topic Partition: 15 Leader: 2 Replicas: 2 Isr: 2
Restart the Broker
Restarting the stopped broker triggers the automatic reassignment of partitions. After several seconds of retries, the producer and consumer resume normal operation.
For details on how to start the broker, refer to Deploy Multi-Nodes Cluster on Linux▸.
At this point, confirming the partition distribution shows that the partitions have already undergone automatic reassignment.
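The same describe command as before can be used to confirm this:
docker run automqinc/automq:latest /bin/bash -c "/opt/kafka/kafka/bin/kafka-topics.sh --topic self-balancing-topic --describe --bootstrap-server 192.168.0.1:9092,192.168.0.2:9092,192.168.0.3:9092"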
Topic: self-balancing-topic TopicId: 7mYxhYliSMq-FCrl4pcD5Q PartitionCount: 16 ReplicationFactor: 1 Configs: min.insync.replicas=1,segment.bytes=1073741824
Topic: self-balancing-topic Partition: 0 Leader: 1 Replicas: 1 Isr: 1
Topic: self-balancing-topic Partition: 1 Leader: 2 Replicas: 2 Isr: 2
Topic: self-balancing-topic Partition: 2 Leader: 3 Replicas: 3 Isr: 3
Topic: self-balancing-topic Partition: 3 Leader: 4 Replicas: 4 Isr: 4
Topic: self-balancing-topic Partition: 4 Leader: 0 Replicas: 0 Isr: 0
Topic: self-balancing-topic Partition: 5 Leader: 2 Replicas: 2 Isr: 2
Topic: self-balancing-topic Partition: 6 Leader: 4 Replicas: 4 Isr: 4
Topic: self-balancing-topic Partition: 7 Leader: 3 Replicas: 3 Isr: 3
Topic: self-balancing-topic Partition: 8 Leader: 1 Replicas: 1 Isr: 1
Topic: self-balancing-topic Partition: 9 Leader: 0 Replicas: 0 Isr: 0
Topic: self-balancing-topic Partition: 10 Leader: 1 Replicas: 1 Isr: 1
Topic: self-balancing-topic Partition: 11 Leader: 4 Replicas: 4 Isr: 4
Topic: self-balancing-topic Partition: 12 Leader: 3 Replicas: 3 Isr: 3
Topic: self-balancing-topic Partition: 13 Leader: 2 Replicas: 2 Isr: 2
Topic: self-balancing-topic Partition: 14 Leader: 0 Replicas: 0 Isr: 0
Topic: self-balancing-topic Partition: 15 Leader: 1 Replicas: 1 Isr: 1