Example: Continuous Data Self-Balancing

This document shows how to test automatic data rebalancing in an AutoMQ cluster using the Kafka CLI tools, which are run from the Docker image provided by AutoMQ. The test has two steps:

  1. Create a topic with multiple partitions and manually reassign partitions to specific nodes to induce an imbalance in partition distribution.

  2. Send an evenly distributed load to all partitions and observe whether the partitions are automatically reassigned across brokers.

Automatic data balancing is a core feature of AutoMQ that keeps data evenly distributed across the cluster. By monitoring the partition distribution and the load on brokers, you can verify that automatic partition balancing works as expected.

Prerequisites

Before running the automatic rebalancing test, make sure the following conditions are met:

An AutoMQ cluster is installed and deployed. Refer to the following methods for installing and deploying AutoMQ:

Tip

If you deployed the cluster via Deploy Multi-Nodes Cluster on Linux or Deploy Multi-Nodes Cluster on Kubernetes, ensure that autobalancer.controller.enable is set to true when starting the Controller to enable automatic data rebalancing.

Additionally, the host running the test program needs to meet the following criteria:

  • Linux/Mac/Windows Subsystem for Linux

  • Docker

Info

If downloading container images is slow, refer to Docker Hub Mirror Configuration.

If the AutoMQ cluster was deployed by following the instructions in Deploy Multi-Nodes Test Cluster on Docker, the cluster Bootstrap address will be similar to "server1:9092,server2:9092,server3:9092", and the cluster will be attached to the "automq_net" Docker network.

Tip

Replace the bootstrap-server address in the commands below with the actual address of your cluster.

Create Topic


docker run --network automq_net automqinc/automq:latest /bin/bash -c "/opt/kafka/kafka/bin/kafka-topics.sh --partitions 8 --create --topic continuous-self-balancing-topic --bootstrap-server server1:9092,server2:9092,server3:9092"

View Partition Distribution


docker run --network automq_net automqinc/automq:latest /bin/bash -c "/opt/kafka/kafka/bin/kafka-topics.sh --topic continuous-self-balancing-topic --describe --bootstrap-server server1:9092,server2:9092,server3:9092"


Topic: continuous-self-balancing-topic TopicId: DNZe6gBQTrCOEAruQ_y2tg PartitionCount: 8 ReplicationFactor: 1 Configs: min.insync.replicas=1,segment.bytes=1073741824
Topic: continuous-self-balancing-topic Partition: 0 Leader: 2 Replicas: 2 Isr: 2
Topic: continuous-self-balancing-topic Partition: 1 Leader: 1 Replicas: 1 Isr: 1
Topic: continuous-self-balancing-topic Partition: 2 Leader: 1 Replicas: 1 Isr: 1
Topic: continuous-self-balancing-topic Partition: 3 Leader: 2 Replicas: 2 Isr: 2
Topic: continuous-self-balancing-topic Partition: 4 Leader: 1 Replicas: 1 Isr: 1
Topic: continuous-self-balancing-topic Partition: 5 Leader: 2 Replicas: 2 Isr: 2
Topic: continuous-self-balancing-topic Partition: 6 Leader: 1 Replicas: 1 Isr: 1
Topic: continuous-self-balancing-topic Partition: 7 Leader: 2 Replicas: 2 Isr: 2
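
To compare distributions at a glance, the describe output can be summarized as a count of partitions per leader. The following is a minimal sketch rather than part of the AutoMQ tooling: the helper name count_leaders is hypothetical, and it assumes the output format shown above.

```shell
# Hypothetical helper: count how many partitions each broker leads.
# Reads `kafka-topics.sh --describe` output on stdin.
count_leaders() {
  awk '
    # Only per-partition lines contain "Partition: <n>"; the summary
    # line has "PartitionCount:" instead and is skipped.
    /Partition: [0-9]/ {
      for (i = 1; i < NF; i++)
        if ($i == "Leader:") count[$(i + 1)]++
    }
    END {
      for (b in count) print "broker " b ": " count[b] " partition(s)"
    }' | sort
}

# Example usage (assumes the describe output was saved to describe.txt):
#   count_leaders < describe.txt
```

Applied to the output above, this reports four partitions each for brokers 1 and 2.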

Manually Reassign Partitions

To make continuous data rebalancing easier to observe, manually reassign all partitions to node 2 (broker ID 2).

  1. Create a partition reassignment plan.

echo '{
"partitions": [
{"topic": "continuous-self-balancing-topic", "partition": 0, "replicas": [2]},
{"topic": "continuous-self-balancing-topic", "partition": 1, "replicas": [2]},
{"topic": "continuous-self-balancing-topic", "partition": 2, "replicas": [2]},
{"topic": "continuous-self-balancing-topic", "partition": 3, "replicas": [2]},
{"topic": "continuous-self-balancing-topic", "partition": 4, "replicas": [2]},
{"topic": "continuous-self-balancing-topic", "partition": 5, "replicas": [2]},
{"topic": "continuous-self-balancing-topic", "partition": 6, "replicas": [2]},
{"topic": "continuous-self-balancing-topic", "partition": 7, "replicas": [2]}
],
"version": 1
}' > move.json

  2. Execute the partition reassignment plan.

docker run --network automq_net -v "$(pwd)/move.json:/move.json" automqinc/automq:latest /bin/bash -c "/opt/kafka/kafka/bin/kafka-reassign-partitions.sh --bootstrap-server server1:9092,server2:9092,server3:9092 --reassignment-json-file /move.json --execute"

  3. After the manual reassignment, describe the topic again. The partition distribution is as follows:

Topic: continuous-self-balancing-topic TopicId: HtVB3bM7TYaNKKKmm7khQw PartitionCount: 8 ReplicationFactor: 1 Configs: min.insync.replicas=1,segment.bytes=1073741824
Topic: continuous-self-balancing-topic Partition: 0 Leader: 2 Replicas: 2 Isr: 2
Topic: continuous-self-balancing-topic Partition: 1 Leader: 2 Replicas: 2 Isr: 2
Topic: continuous-self-balancing-topic Partition: 2 Leader: 2 Replicas: 2 Isr: 2
Topic: continuous-self-balancing-topic Partition: 3 Leader: 2 Replicas: 2 Isr: 2
Topic: continuous-self-balancing-topic Partition: 4 Leader: 2 Replicas: 2 Isr: 2
Topic: continuous-self-balancing-topic Partition: 5 Leader: 2 Replicas: 2 Isr: 2
Topic: continuous-self-balancing-topic Partition: 6 Leader: 2 Replicas: 2 Isr: 2
Topic: continuous-self-balancing-topic Partition: 7 Leader: 2 Replicas: 2 Isr: 2
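
For topics with many partitions, writing the reassignment plan from step 1 by hand becomes tedious. As a sketch, the same JSON can be generated with a small loop. The function name make_plan and its arguments are hypothetical and only illustrate the structure of the plan file used above.

```shell
# Hypothetical helper: print a reassignment plan that places every
# partition of a topic on a single broker.
# Arguments: topic name, partition count, target broker ID.
make_plan() {
  topic=$1
  count=$2
  broker=$3
  printf '{\n"partitions": [\n'
  i=0
  while [ "$i" -lt "$count" ]; do
    # Every entry except the last is followed by a comma.
    if [ "$i" -lt $((count - 1)) ]; then sep=','; else sep=''; fi
    printf '{"topic": "%s", "partition": %d, "replicas": [%d]}%s\n' \
      "$topic" "$i" "$broker" "$sep"
    i=$((i + 1))
  done
  printf '],\n"version": 1\n}\n'
}

# Example usage, equivalent to the plan in step 1:
#   make_plan continuous-self-balancing-topic 8 2 > move.json
```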

Start the Producer


docker run --network automq_net automqinc/automq:latest /bin/bash -c "/opt/kafka/kafka/bin/kafka-producer-perf-test.sh --topic continuous-self-balancing-topic --num-records=1024000 --throughput 5120 --record-size 1024 --producer-props bootstrap.servers=server1:9092,server2:9092,server3:9092 linger.ms=100 batch.size=524288 buffer.memory=134217728 max.request.size=67108864"
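
The producer parameters determine the offered load and roughly how long the test runs. The arithmetic below is a quick check using only the values passed to the command above; the variable names are illustrative.

```shell
records=1024000   # --num-records
rate=5120         # --throughput, in records per second
size=1024         # --record-size, in bytes

# Target write rate in MiB/s and the time needed to send all records.
mib_per_sec=$((rate * size / 1024 / 1024))
duration_sec=$((records / rate))

echo "target throughput: ${mib_per_sec} MiB/s"   # prints 5
echo "expected duration: ${duration_sec} s"      # prints 200
```

The 5 MiB/s target is consistent with the roughly 5.00 MB/sec that the producer reports in its output, and the 200-second run leaves time for the rebalancing to be observed while load is still being produced.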

Start the Consumer


docker run --network automq_net automqinc/automq:1.0.4 /bin/bash -c "/opt/kafka/kafka/bin/kafka-consumer-perf-test.sh --topic continuous-self-balancing-topic --show-detailed-stats --timeout 300000 --messages=1024000 --reporting-interval 1000 --bootstrap-server=server1:9092,server2:9092,server3:9092"

Check the Partition Distribution Again

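After a while, the producer logs warnings like the following. The NOT_LEADER_OR_FOLLOWER errors indicate that the broker receiving the request is no longer the leader of the partition; the producer refreshes its metadata and retries.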


[2024-05-16 10:29:50,448] 25622 records sent, 5123.4 records/sec (5.00 MB/sec), 15.7 ms avg latency, 41.0 ms max latency.
[2024-05-16 10:30:00,372] WARN [Producer clientId=perf-producer-client] Got error produce response with correlation id 10354 on topic-partition continuous-self-balancing-topic-7, retrying (2147483646 attempts left). Error: NOT_LEADER_OR_FOLLOWER (org.apache.kafka.clients.producer.internals.Sender)
[2024-05-16 10:30:00,373] WARN [Producer clientId=perf-producer-client] Received invalid metadata error in produce request on partition continuous-self-balancing-topic-7 due to org.apache.kafka.common.errors.NotLeaderOrFollowerException: For requests intended only for the leader, this error indicates that the broker is not the current leader. For requests intended for any replica, this error indicates that the broker is not a replica of the topic partition. Going to request metadata update now (org.apache.kafka.clients.producer.internals.Sender)
[2024-05-16 10:30:00,373] WARN [Producer clientId=perf-producer-client] Got error produce response with correlation id 10354 on topic-partition continuous-self-balancing-topic-0, retrying (2147483646 attempts left). Error: NOT_LEADER_OR_FOLLOWER (org.apache.kafka.clients.producer.internals.Sender)
[2024-05-16 10:30:00,373] WARN [Producer clientId=perf-producer-client] Received invalid metadata error in produce request on partition continuous-self-balancing-topic-0 due to org.apache.kafka.common.errors.NotLeaderOrFollowerException: For requests intended only for the leader, this error indicates that the broker is not the current leader. For requests intended for any replica, this error indicates that the broker is not a replica of the topic partition. Going to request metadata update now (org.apache.kafka.clients.producer.internals.Sender)
[2024-05-16 10:30:00,384] WARN [Producer clientId=perf-producer-client] Got error produce response with correlation id 10356 on topic-partition continuous-self-balancing-topic-7, retrying (2147483646 attempts left). Error: NOT_LEADER_OR_FOLLOWER (org.apache.kafka.clients.producer.internals.Sender)
[2024-05-16 10:30:00,384] WARN [Producer clientId=perf-producer-client] Received invalid metadata error in produce request on partition continuous-self-balancing-topic-7 due to org.apache.kafka.common.errors.NotLeaderOrFollowerException: For requests intended only for the leader, this error indicates that the broker is not the current leader. For requests intended for any replica, this error indicates that the broker is not a replica of the topic partition. Going to request metadata update now (org.apache.kafka.clients.producer.internals.Sender)
[2024-05-16 10:30:00,384] WARN [Producer clientId=perf-producer-client] Got error produce response with correlation id 10356 on topic-partition continuous-self-balancing-topic-0, retrying (2147483646 attempts left). Error: NOT_LEADER_OR_FOLLOWER (org.apache.kafka.clients.producer.internals.Sender)
[2024-05-16 10:30:00,384] WARN [Producer clientId=perf-producer-client] Received invalid metadata error in produce request on partition continuous-self-balancing-topic-0 due to org.apache.kafka.common.errors.NotLeaderOrFollowerException: For requests intended only for the leader, this error indicates that the broker is not the current leader. For requests intended for any replica, this error indicates that the broker is not a replica of the topic partition. Going to request metadata update now (org.apache.kafka.clients.producer.internals.Sender)
[2024-05-16 10:30:00,384] WARN [Producer clientId=perf-producer-client] Got error produce response with correlation id 10356 on topic-partition continuous-self-balancing-topic-6, retrying (2147483646 attempts left). Error: NOT_LEADER_OR_FOLLOWER (org.apache.kafka.clients.producer.internals.Sender)
[2024-05-16 10:30:00,384] WARN [Producer clientId=perf-producer-client] Received invalid metadata error in produce request on partition continuous-self-balancing-topic-6 due to org.apache.kafka.common.errors.NotLeaderOrFollowerException: For requests intended only for the leader, this error indicates that the broker is not the current leader. For requests intended for any replica, this error indicates that the broker is not a replica of the topic partition. Going to request metadata update now (org.apache.kafka.clients.producer.internals.Sender)
[2024-05-16 10:30:00,385] WARN [Producer clientId=perf-producer-client] Got error produce response with correlation id 10357 on topic-partition continuous-self-balancing-topic-0, retrying (2147483646 attempts left). Error: NOT_LEADER_OR_FOLLOWER (org.apache.kafka.clients.producer.internals.Sender)
[2024-05-16 10:30:00,385] WARN [Producer clientId=perf-producer-client] Received invalid metadata error in produce request on partition continuous-self-balancing-topic-0 due to org.apache.kafka.common.errors.NotLeaderOrFollowerException: For requests intended only for the leader, this error indicates that the broker is not the current leader. For requests intended for any replica, this error indicates that the broker is not a replica of the topic partition. Going to request metadata update now (org.apache.kafka.clients.producer.internals.Sender)
[2024-05-16 10:30:00,397] WARN [Producer clientId=perf-producer-client] Got error produce response with correlation id 10358 on topic-partition continuous-self-balancing-topic-7, retrying (2147483646 attempts left). Error: NOT_LEADER_OR_FOLLOWER (org.apache.kafka.clients.producer.internals.Sender)
[2024-05-16 10:30:00,397] WARN [Producer clientId=perf-producer-client] Received invalid metadata error in produce request on partition continuous-self-balancing-topic-7 due to org.apache.kafka.common.errors.NotLeaderOrFollowerException: For requests intended only for the leader, this error indicates that the broker is not the current leader. For requests intended for any replica, this error indicates that the broker is not a replica of the topic partition. Going to request metadata update now (org.apache.kafka.clients.producer.internals.Sender)
[2024-05-16 10:30:00,397] WARN [Producer clientId=perf-producer-client] Got error produce response with correlation id 10358 on topic-partition continuous-self-balancing-topic-6, retrying (2147483646 attempts left). Error: NOT_LEADER_OR_FOLLOWER (org.apache.kafka.clients.producer.internals.Sender)
[2024-05-16 10:30:00,397] WARN [Producer clientId=perf-producer-client] Received invalid metadata error in produce request on partition continuous-self-balancing-topic-6 due to org.apache.kafka.common.errors.NotLeaderOrFollowerException: For requests intended only for the leader, this error indicates that the broker is not the current leader. For requests intended for any replica, this error indicates that the broker is not a replica of the topic partition. Going to request metadata update now (org.apache.kafka.clients.producer.internals.Sender)
[2024-05-16 10:30:00,397] WARN [Producer clientId=perf-producer-client] Got error produce response with correlation id 10358 on topic-partition continuous-self-balancing-topic-4, retrying (2147483646 attempts left). Error: NOT_LEADER_OR_FOLLOWER (org.apache.kafka.clients.producer.internals.Sender)
[2024-05-16 10:30:00,397] WARN [Producer clientId=perf-producer-client] Received invalid metadata error in produce request on partition continuous-self-balancing-topic-4 due to org.apache.kafka.common.errors.NotLeaderOrFollowerException: For requests intended only for the leader, this error indicates that the broker is not the current leader. For requests intended for any replica, this error indicates that the broker is not a replica of the topic partition. Going to request metadata update now (org.apache.kafka.clients.producer.internals.Sender)
[2024-05-16 10:30:00,398] WARN [Producer clientId=perf-producer-client] Got error produce response with correlation id 10360 on topic-partition continuous-self-balancing-topic-6, retrying (2147483646 attempts left). Error: NOT_LEADER_OR_FOLLOWER (org.apache.kafka.clients.producer.internals.Sender)
[2024-05-16 10:30:00,398] WARN [Producer clientId=perf-producer-client] Received invalid metadata error in produce request on partition continuous-self-balancing-topic-6 due to org.apache.kafka.common.errors.NotLeaderOrFollowerException: For requests intended only for the leader, this error indicates that the broker is not the current leader. For requests intended for any replica, this error indicates that the broker is not a replica of the topic partition. Going to request metadata update now (org.apache.kafka.clients.producer.internals.Sender)
[2024-05-16 10:30:00,398] WARN [Producer clientId=perf-producer-client] Got error produce response with correlation id 10360 on topic-partition continuous-self-balancing-topic-4, retrying (2147483646 attempts left). Error: NOT_LEADER_OR_FOLLOWER (org.apache.kafka.clients.producer.internals.Sender)
[2024-05-16 10:30:00,398] WARN [Producer clientId=perf-producer-client] Received invalid metadata error in produce request on partition continuous-self-balancing-topic-4 due to org.apache.kafka.common.errors.NotLeaderOrFollowerException: For requests intended only for the leader, this error indicates that the broker is not the current leader. For requests intended for any replica, this error indicates that the broker is not a replica of the topic partition. Going to request metadata update now (org.apache.kafka.clients.producer.internals.Sender)
[2024-05-16 10:30:00,411] WARN [Producer clientId=perf-producer-client] Got error produce response with correlation id 10361 on topic-partition continuous-self-balancing-topic-4, retrying (2147483646 attempts left). Error: NOT_LEADER_OR_FOLLOWER (org.apache.kafka.clients.producer.internals.Sender)
[2024-05-16 10:30:00,412] WARN [Producer clientId=perf-producer-client] Received invalid metadata error in produce request on partition continuous-self-balancing-topic-4 due to org.apache.kafka.common.errors.NotLeaderOrFollowerException: For requests intended only for the leader, this error indicates that the broker is not the current leader. For requests intended for any replica, this error indicates that the broker is not a replica of the topic partition. Going to request metadata update now (org.apache.kafka.clients.producer.internals.Sender)
[2024-05-16 10:30:00,412] WARN [Producer clientId=perf-producer-client] Got error produce response with correlation id 10362 on topic-partition continuous-self-balancing-topic-4, retrying (2147483646 attempts left). Error: NOT_LEADER_OR_FOLLOWER (org.apache.kafka.clients.producer.internals.Sender)
[2024-05-16 10:30:00,412] WARN [Producer clientId=perf-producer-client] Received invalid metadata error in produce request on partition continuous-self-balancing-topic-4 due to org.apache.kafka.common.errors.NotLeaderOrFollowerException: For requests intended only for the leader, this error indicates that the broker is not the current leader. For requests intended for any replica, this error indicates that the broker is not a replica of the topic partition. Going to request metadata update now (org.apache.kafka.clients.producer.internals.Sender)
[2024-05-16 10:29:55,450] 25327 records sent, 5064.4 records/sec (4.95 MB/sec), 15.3 ms avg latency, 80.0 ms max latency.
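
Because the warnings are verbose, it can help to reduce them to a count of retried requests per partition. The following is a sketch under the assumption that the producer output was saved to a file; the helper name count_leader_errors is hypothetical, and it relies only on the log format shown above.

```shell
# Hypothetical helper: count NOT_LEADER_OR_FOLLOWER produce responses
# per partition. Reads producer log lines on stdin.
count_leader_errors() {
  awk '
    /Error: NOT_LEADER_OR_FOLLOWER/ {
      for (i = 1; i < NF; i++)
        if ($i == "topic-partition") {
          p = $(i + 1)
          sub(/,$/, "", p)   # drop the trailing comma after the name
          count[p]++
        }
    }
    END { for (p in count) print p, count[p] }' | sort
}

# Example usage (assumes the producer output was saved to producer.log):
#   count_leader_errors < producer.log
```

In the log above, only partitions 0, 4, 6, and 7 appear in these warnings, which are the partitions whose leader changes in the distribution shown next.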

After a few seconds, production returns to normal. Then check the partition distribution again:


Topic: continuous-self-balancing-topic TopicId: HtVB3bM7TYaNKKKmm7khQw PartitionCount: 8 ReplicationFactor: 1 Configs: min.insync.replicas=1,segment.bytes=1073741824
Topic: continuous-self-balancing-topic Partition: 0 Leader: 1 Replicas: 1 Isr: 1
Topic: continuous-self-balancing-topic Partition: 1 Leader: 2 Replicas: 2 Isr: 2
Topic: continuous-self-balancing-topic Partition: 2 Leader: 2 Replicas: 2 Isr: 2
Topic: continuous-self-balancing-topic Partition: 3 Leader: 2 Replicas: 2 Isr: 2
Topic: continuous-self-balancing-topic Partition: 4 Leader: 1 Replicas: 1 Isr: 1
Topic: continuous-self-balancing-topic Partition: 5 Leader: 2 Replicas: 2 Isr: 2
Topic: continuous-self-balancing-topic Partition: 6 Leader: 1 Replicas: 1 Isr: 1
Topic: continuous-self-balancing-topic Partition: 7 Leader: 1 Replicas: 1 Isr: 1

Because all partitions had been moved to node 2, all messages were sent to node 2, creating a hotspot on that node. This triggered AutoMQ's Self-Balancing feature, which moved partitions 0, 4, 6, and 7 to node 1 so that the load is spread across brokers again.
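
To confirm which partitions were moved, two saved describe outputs can be compared directly. This is a minimal sketch, assuming the output before and after rebalancing was saved to two files; the function name moved_partitions and the file names are hypothetical.

```shell
# Hypothetical helper: list partitions whose leader differs between two
# saved `kafka-topics.sh --describe` outputs.
# Arguments: file with the earlier output, file with the later output.
moved_partitions() {
  awk '
    {
      p = ""; l = ""
      for (i = 1; i < NF; i++) {
        if ($i == "Partition:") p = $(i + 1)
        if ($i == "Leader:")    l = $(i + 1)
      }
      if (p == "" || l == "") next   # skip the topic summary line
      if (FNR == NR)
        before[p] = l                # first file: remember the leader
      else if (before[p] != l)
        print "partition " p ": broker " before[p] " -> broker " l
    }' "$1" "$2"
}

# Example usage:
#   moved_partitions before.txt after.txt
```

For the two distributions shown in this document, this would list partitions 0, 4, 6, and 7 as moving from broker 2 to broker 1.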