Example: Continuous Data Self-Balancing

This document shows how to test automatic data rebalancing in an AutoMQ cluster using the Kafka CLI tools. The CLI tools are run from the Docker image provided by AutoMQ.

The test creates a topic with multiple partitions and manually reassigns the partitions to a specific node to create an imbalanced partition distribution.
It then sends an evenly distributed load to all partitions so that you can observe whether partitions are automatically reassigned across brokers.

Automatic data balancing is a fundamental feature of AutoMQ that keeps data evenly distributed across the cluster. By monitoring the partition distribution and the load on each broker, you can verify that automatic partition balancing works as expected.

Prerequisites

Before running the automatic partition rebalance test, make sure the following conditions are met:

Complete the installation and deployment of the AutoMQ cluster. For instructions, refer to:

Deploy Multi-Nodes Test Cluster on Docker▸

If you deploy the cluster using Deploy Multi-Nodes Cluster on Linux▸ or Deploy Multi-Nodes Cluster on Kubernetes▸, make sure that autobalancer.controller.enable is set to true when starting the Controller. This setting enables automatic data rebalancing.
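If the Controller reads its settings from a properties file, a quick check like the following can confirm the setting before the test. This is a minimal sketch under assumptions: the function name autobalancer_enabled is hypothetical, and the location of the properties file depends on your deployment, so it is passed in as an argument. If the setting is supplied another way (for example on the start command line), this check does not apply.

```shell
# Hypothetical helper: succeeds (exit status 0) only if the given properties
# file contains a line that sets autobalancer.controller.enable to true.
# Usage: autobalancer_enabled /path/to/your/controller.properties
autobalancer_enabled() {
  grep -Eq '^[[:space:]]*autobalancer\.controller\.enable[[:space:]]*=[[:space:]]*true[[:space:]]*$' "$1"
}
```

The function prints nothing; it only sets the exit status, so it can be used directly in an if statement.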

Additionally, the host running the test program needs to meet the following criteria:

Linux/Mac/Windows Subsystem for Linux
Docker

Experience: Continuous Data Rebalance

If you deployed the AutoMQ cluster using Deploy Multi-Nodes Test Cluster on Docker▸, the cluster bootstrap address is similar to "server1:9092,server2:9092,server3:9092", and the cluster runs in the "automq_net" Docker network.

Replace the bootstrap-server address in the commands below with the actual address of your cluster.
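Every command in this walkthrough repeats the same Docker network, image, and bootstrap address, so it can help to define them once. The following is a minimal sketch and not part of AutoMQ: the function name automq_cli and the variables BOOTSTRAP_SERVERS, AUTOMQ_NETWORK, AUTOMQ_IMAGE, and DOCKER are hypothetical names introduced here for illustration. The default values are the ones used throughout this document.

```shell
# Hypothetical settings; override them to match your deployment.
BOOTSTRAP_SERVERS="${BOOTSTRAP_SERVERS:-server1:9092,server2:9092,server3:9092}"
AUTOMQ_NETWORK="${AUTOMQ_NETWORK:-automq_net}"
AUTOMQ_IMAGE="${AUTOMQ_IMAGE:-automqinc/automq:1.5.1}"

# Run one Kafka CLI script from the AutoMQ image.
# All arguments are joined with single spaces, so an argument that itself
# contains spaces is not preserved as one argument.
# Setting DOCKER=echo prints the docker arguments instead of running them.
automq_cli() {
  "${DOCKER:-docker}" run --network "$AUTOMQ_NETWORK" "$AUTOMQ_IMAGE" \
    /bin/bash -c "/opt/kafka/kafka/bin/$*"
}

# Example (requires a running cluster):
#   automq_cli kafka-topics.sh --describe \
#     --topic continuous-self-balancing-topic \
#     --bootstrap-server "$BOOTSTRAP_SERVERS"
```

The commands in the rest of this document are written out in full, so the wrapper is optional.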

Create Topic

docker run  --network automq_net  automqinc/automq:1.5.1 /bin/bash -c "/opt/kafka/kafka/bin/kafka-topics.sh --partitions 8 --create --topic continuous-self-balancing-topic --bootstrap-server server1:9092,server2:9092,server3:9092"

View Partition Distribution

docker run  --network automq_net   automqinc/automq:1.5.1 /bin/bash -c "/opt/kafka/kafka/bin/kafka-topics.sh --topic continuous-self-balancing-topic --describe --bootstrap-server server1:9092,server2:9092,server3:9092"

Topic: continuous-self-balancing-topic TopicId: DNZe6gBQTrCOEAruQ_y2tg PartitionCount: 8       ReplicationFactor: 1    Configs: min.insync.replicas=1,segment.bytes=1073741824
    Topic: continuous-self-balancing-topic Partition: 0    Leader: 2   Replicas: 2     Isr: 2
    Topic: continuous-self-balancing-topic Partition: 1    Leader: 1   Replicas: 1     Isr: 1
    Topic: continuous-self-balancing-topic Partition: 2    Leader: 1   Replicas: 1     Isr: 1
    Topic: continuous-self-balancing-topic Partition: 3    Leader: 2   Replicas: 2     Isr: 2
    Topic: continuous-self-balancing-topic Partition: 4    Leader: 1   Replicas: 1     Isr: 1
    Topic: continuous-self-balancing-topic Partition: 5    Leader: 2   Replicas: 2     Isr: 2
    Topic: continuous-self-balancing-topic Partition: 6    Leader: 1   Replicas: 1     Isr: 1
    Topic: continuous-self-balancing-topic Partition: 7    Leader: 2   Replicas: 2     Isr: 2  
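With more partitions, counting leaders by eye becomes tedious. The sketch below summarizes the describe output as one line per broker. It assumes the output format shown above (a "Leader:" label followed by the broker ID on each partition line); the function name count_partitions_per_leader is hypothetical.

```shell
# Hypothetical helper: reads "kafka-topics.sh --describe" output on stdin and
# prints "<broker id> <number of partitions led>" for each broker, sorted by ID.
# Brokers that lead no partitions do not appear in the output.
count_partitions_per_leader() {
  awk '
    {
      # Find the "Leader:" label and count the broker ID that follows it.
      for (i = 1; i < NF; i++)
        if ($i == "Leader:") n[$(i + 1)]++
    }
    END { for (b in n) print b, n[b] }
  ' | sort -n
}
```

To use it, pipe the output of the describe command above into count_partitions_per_leader. For the distribution shown above it reports four partitions each for brokers 1 and 2.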

Manually Reassign Partitions

To make continuous data rebalancing easier to observe, manually reassign all partitions to node2 (broker ID 2).

Create a plan for partition reassignment.

echo '{
    "partitions": [
        {"topic": "continuous-self-balancing-topic", "partition": 0, "replicas": [2]},
        {"topic": "continuous-self-balancing-topic", "partition": 1, "replicas": [2]},
        {"topic": "continuous-self-balancing-topic", "partition": 2, "replicas": [2]},
        {"topic": "continuous-self-balancing-topic", "partition": 3, "replicas": [2]},
        {"topic": "continuous-self-balancing-topic", "partition": 4, "replicas": [2]},
        {"topic": "continuous-self-balancing-topic", "partition": 5, "replicas": [2]},
        {"topic": "continuous-self-balancing-topic", "partition": 6, "replicas": [2]},
        {"topic": "continuous-self-balancing-topic", "partition": 7, "replicas": [2]}
    ],
    "version": 1
}' > move.json
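For topics with many partitions, the same plan can be generated instead of typed by hand. This is a minimal sketch: the function name make_reassignment_plan and its three arguments (topic, partition count, target broker ID) are hypothetical, and it only covers the case used here, where every partition gets a single replica on the same broker.

```shell
# Hypothetical helper: prints a reassignment plan that places partitions
# 0..(count-1) of one topic on a single broker.
# Usage: make_reassignment_plan <topic> <partition count> <broker id>
# The body runs in a subshell so its variables do not leak into the caller.
make_reassignment_plan() (
  topic="$1"
  partitions="$2"
  broker="$3"
  printf '{\n    "partitions": [\n'
  p=0
  while [ "$p" -lt "$partitions" ]; do
    # Every entry except the last needs a trailing comma to keep the JSON valid.
    if [ "$p" -lt $((partitions - 1)) ]; then sep=","; else sep=""; fi
    printf '        {"topic": "%s", "partition": %d, "replicas": [%d]}%s\n' \
      "$topic" "$p" "$broker" "$sep"
    p=$((p + 1))
  done
  printf '    ],\n    "version": 1\n}\n'
)

# Example: produce the same plan as the echo command above.
#   make_reassignment_plan continuous-self-balancing-topic 8 2 > move.json
```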

Execute the partition reassignment plan.

docker run --network automq_net -v "$(pwd)/move.json:/move.json" automqinc/automq:1.5.1 /bin/bash -c "/opt/kafka/kafka/bin/kafka-reassign-partitions.sh --bootstrap-server server1:9092,server2:9092,server3:9092 --reassignment-json-file /move.json --execute"

After the manual reassignment, view the partition distribution as follows:

Topic: continuous-self-balancing-topic TopicId: HtVB3bM7TYaNKKKmm7khQw PartitionCount: 8   ReplicationFactor: 1    Configs: min.insync.replicas=1,segment.bytes=1073741824
    Topic: continuous-self-balancing-topic Partition: 0    Leader: 2   Replicas: 2 Isr: 2
    Topic: continuous-self-balancing-topic Partition: 1    Leader: 2   Replicas: 2 Isr: 2
    Topic: continuous-self-balancing-topic Partition: 2    Leader: 2   Replicas: 2 Isr: 2
    Topic: continuous-self-balancing-topic Partition: 3    Leader: 2   Replicas: 2 Isr: 2
    Topic: continuous-self-balancing-topic Partition: 4    Leader: 2   Replicas: 2 Isr: 2
    Topic: continuous-self-balancing-topic Partition: 5    Leader: 2   Replicas: 2 Isr: 2
    Topic: continuous-self-balancing-topic Partition: 6    Leader: 2   Replicas: 2 Isr: 2
    Topic: continuous-self-balancing-topic Partition: 7    Leader: 2   Replicas: 2 Isr: 2

Start the Producer.

docker run  --network automq_net  automqinc/automq:1.5.1 /bin/bash -c  "/opt/kafka/kafka/bin/kafka-producer-perf-test.sh --topic continuous-self-balancing-topic --num-records=1024000 --throughput 5120 --record-size 1024 --producer-props bootstrap.servers=server1:9092,server2:9092,server3:9092 linger.ms=100 batch.size=524288 buffer.memory=134217728 max.request.size=67108864"
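As a rough sanity check on the producer settings above, the expected run time and data rate follow directly from the three flags --num-records, --throughput, and --record-size. The sketch below does that arithmetic; the function name perf_test_estimate is hypothetical, and the estimate assumes the producer sustains the target throughput for the whole run.

```shell
# Hypothetical helper: estimates run time and data rate of a producer perf test.
# Usage: perf_test_estimate <num-records> <throughput records/s> <record-size bytes>
perf_test_estimate() {
  echo "duration_s=$(( $1 / $2 )) bytes_per_s=$(( $2 * $3 ))"
}

# Values from the producer command above.
perf_test_estimate 1024000 5120 1024
# prints: duration_s=200 bytes_per_s=5242880
```

With these settings the producer runs for about 200 seconds at 5242880 bytes per second, which is 5 MB/sec when 1 MB is counted as 1024 × 1024 bytes.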

Start the Consumer.

docker run --network automq_net  automqinc/automq:1.5.1 /bin/bash -c "/opt/kafka/kafka/bin/kafka-consumer-perf-test.sh --topic continuous-self-balancing-topic --show-detailed-stats --timeout 300000 --messages=1024000 --reporting-interval 1000 --bootstrap-server=server1:9092,server2:9092,server3:9092"

Rechecking Partition Distribution

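After some time, the producer logs warnings like the following. The NOT_LEADER_OR_FOLLOWER errors indicate that the broker the producer contacted is no longer the leader of that partition, so the producer refreshes its metadata and retries.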

[2024-05-16 10:29:50,448] 25622 records sent, 5123.4 records/sec (5.00 MB/sec), 15.7 ms avg latency, 41.0 ms max latency.
[2024-05-16 10:30:00,372] WARN [Producer clientId=perf-producer-client] Got error produce response with correlation id 10354 on topic-partition continuous-self-balancing-topic-7, retrying (2147483646 attempts left). Error: NOT_LEADER_OR_FOLLOWER (org.apache.kafka.clients.producer.internals.Sender)
[2024-05-16 10:30:00,373] WARN [Producer clientId=perf-producer-client] Received invalid metadata error in produce request on partition continuous-self-balancing-topic-7 due to org.apache.kafka.common.errors.NotLeaderOrFollowerException: For requests intended only for the leader, this error indicates that the broker is not the current leader. For requests intended for any replica, this error indicates that the broker is not a replica of the topic partition. Going to request metadata update now (org.apache.kafka.clients.producer.internals.Sender)
[2024-05-16 10:30:00,373] WARN [Producer clientId=perf-producer-client] Got error produce response with correlation id 10354 on topic-partition continuous-self-balancing-topic-0, retrying (2147483646 attempts left). Error: NOT_LEADER_OR_FOLLOWER (org.apache.kafka.clients.producer.internals.Sender)
[2024-05-16 10:30:00,373] WARN [Producer clientId=perf-producer-client] Received invalid metadata error in produce request on partition continuous-self-balancing-topic-0 due to org.apache.kafka.common.errors.NotLeaderOrFollowerException: For requests intended only for the leader, this error indicates that the broker is not the current leader. For requests intended for any replica, this error indicates that the broker is not a replica of the topic partition. Going to request metadata update now (org.apache.kafka.clients.producer.internals.Sender)
[2024-05-16 10:30:00,384] WARN [Producer clientId=perf-producer-client] Got error produce response with correlation id 10356 on topic-partition continuous-self-balancing-topic-7, retrying (2147483646 attempts left). Error: NOT_LEADER_OR_FOLLOWER (org.apache.kafka.clients.producer.internals.Sender)
[2024-05-16 10:30:00,384] WARN [Producer clientId=perf-producer-client] Received invalid metadata error in produce request on partition continuous-self-balancing-topic-7 due to org.apache.kafka.common.errors.NotLeaderOrFollowerException: For requests intended only for the leader, this error indicates that the broker is not the current leader. For requests intended for any replica, this error indicates that the broker is not a replica of the topic partition. Going to request metadata update now (org.apache.kafka.clients.producer.internals.Sender)
[2024-05-16 10:30:00,384] WARN [Producer clientId=perf-producer-client] Got error produce response with correlation id 10356 on topic-partition continuous-self-balancing-topic-0, retrying (2147483646 attempts left). Error: NOT_LEADER_OR_FOLLOWER (org.apache.kafka.clients.producer.internals.Sender)
[2024-05-16 10:30:00,384] WARN [Producer clientId=perf-producer-client] Received invalid metadata error in produce request on partition continuous-self-balancing-topic-0 due to org.apache.kafka.common.errors.NotLeaderOrFollowerException: For requests intended only for the leader, this error indicates that the broker is not the current leader. For requests intended for any replica, this error indicates that the broker is not a replica of the topic partition. Going to request metadata update now (org.apache.kafka.clients.producer.internals.Sender)
[2024-05-16 10:30:00,384] WARN [Producer clientId=perf-producer-client] Got error produce response with correlation id 10356 on topic-partition continuous-self-balancing-topic-6, retrying (2147483646 attempts left). Error: NOT_LEADER_OR_FOLLOWER (org.apache.kafka.clients.producer.internals.Sender)
[2024-05-16 10:30:00,384] WARN [Producer clientId=perf-producer-client] Received invalid metadata error in produce request on partition continuous-self-balancing-topic-6 due to org.apache.kafka.common.errors.NotLeaderOrFollowerException: For requests intended only for the leader, this error indicates that the broker is not the current leader. For requests intended for any replica, this error indicates that the broker is not a replica of the topic partition. Going to request metadata update now (org.apache.kafka.clients.producer.internals.Sender)
[2024-05-16 10:30:00,385] WARN [Producer clientId=perf-producer-client] Got error produce response with correlation id 10357 on topic-partition continuous-self-balancing-topic-0, retrying (2147483646 attempts left). Error: NOT_LEADER_OR_FOLLOWER (org.apache.kafka.clients.producer.internals.Sender)
[2024-05-16 10:30:00,385] WARN [Producer clientId=perf-producer-client] Received invalid metadata error in produce request on partition continuous-self-balancing-topic-0 due to org.apache.kafka.common.errors.NotLeaderOrFollowerException: For requests intended only for the leader, this error indicates that the broker is not the current leader. For requests intended for any replica, this error indicates that the broker is not a replica of the topic partition. Going to request metadata update now (org.apache.kafka.clients.producer.internals.Sender)
[2024-05-16 10:30:00,397] WARN [Producer clientId=perf-producer-client] Got error produce response with correlation id 10358 on topic-partition continuous-self-balancing-topic-7, retrying (2147483646 attempts left). Error: NOT_LEADER_OR_FOLLOWER (org.apache.kafka.clients.producer.internals.Sender)
[2024-05-16 10:30:00,397] WARN [Producer clientId=perf-producer-client] Received invalid metadata error in produce request on partition continuous-self-balancing-topic-7 due to org.apache.kafka.common.errors.NotLeaderOrFollowerException: For requests intended only for the leader, this error indicates that the broker is not the current leader. For requests intended for any replica, this error indicates that the broker is not a replica of the topic partition. Going to request metadata update now (org.apache.kafka.clients.producer.internals.Sender)
[2024-05-16 10:30:00,397] WARN [Producer clientId=perf-producer-client] Got error produce response with correlation id 10358 on topic-partition continuous-self-balancing-topic-6, retrying (2147483646 attempts left). Error: NOT_LEADER_OR_FOLLOWER (org.apache.kafka.clients.producer.internals.Sender)
[2024-05-16 10:30:00,397] WARN [Producer clientId=perf-producer-client] Received invalid metadata error in produce request on partition continuous-self-balancing-topic-6 due to org.apache.kafka.common.errors.NotLeaderOrFollowerException: For requests intended only for the leader, this error indicates that the broker is not the current leader. For requests intended for any replica, this error indicates that the broker is not a replica of the topic partition. Going to request metadata update now (org.apache.kafka.clients.producer.internals.Sender)
[2024-05-16 10:30:00,397] WARN [Producer clientId=perf-producer-client] Got error produce response with correlation id 10358 on topic-partition continuous-self-balancing-topic-4, retrying (2147483646 attempts left). Error: NOT_LEADER_OR_FOLLOWER (org.apache.kafka.clients.producer.internals.Sender)
[2024-05-16 10:30:00,397] WARN [Producer clientId=perf-producer-client] Received invalid metadata error in produce request on partition continuous-self-balancing-topic-4 due to org.apache.kafka.common.errors.NotLeaderOrFollowerException: For requests intended only for the leader, this error indicates that the broker is not the current leader. For requests intended for any replica, this error indicates that the broker is not a replica of the topic partition. Going to request metadata update now (org.apache.kafka.clients.producer.internals.Sender)
[2024-05-16 10:30:00,398] WARN [Producer clientId=perf-producer-client] Got error produce response with correlation id 10360 on topic-partition continuous-self-balancing-topic-6, retrying (2147483646 attempts left). Error: NOT_LEADER_OR_FOLLOWER (org.apache.kafka.clients.producer.internals.Sender)
[2024-05-16 10:30:00,398] WARN [Producer clientId=perf-producer-client] Received invalid metadata error in produce request on partition continuous-self-balancing-topic-6 due to org.apache.kafka.common.errors.NotLeaderOrFollowerException: For requests intended only for the leader, this error indicates that the broker is not the current leader. For requests intended for any replica, this error indicates that the broker is not a replica of the topic partition. Going to request metadata update now (org.apache.kafka.clients.producer.internals.Sender)
[2024-05-16 10:30:00,398] WARN [Producer clientId=perf-producer-client] Got error produce response with correlation id 10360 on topic-partition continuous-self-balancing-topic-4, retrying (2147483646 attempts left). Error: NOT_LEADER_OR_FOLLOWER (org.apache.kafka.clients.producer.internals.Sender)
[2024-05-16 10:30:00,398] WARN [Producer clientId=perf-producer-client] Received invalid metadata error in produce request on partition continuous-self-balancing-topic-4 due to org.apache.kafka.common.errors.NotLeaderOrFollowerException: For requests intended only for the leader, this error indicates that the broker is not the current leader. For requests intended for any replica, this error indicates that the broker is not a replica of the topic partition. Going to request metadata update now (org.apache.kafka.clients.producer.internals.Sender)
[2024-05-16 10:30:00,411] WARN [Producer clientId=perf-producer-client] Got error produce response with correlation id 10361 on topic-partition continuous-self-balancing-topic-4, retrying (2147483646 attempts left). Error: NOT_LEADER_OR_FOLLOWER (org.apache.kafka.clients.producer.internals.Sender)
[2024-05-16 10:30:00,412] WARN [Producer clientId=perf-producer-client] Received invalid metadata error in produce request on partition continuous-self-balancing-topic-4 due to org.apache.kafka.common.errors.NotLeaderOrFollowerException: For requests intended only for the leader, this error indicates that the broker is not the current leader. For requests intended for any replica, this error indicates that the broker is not a replica of the topic partition. Going to request metadata update now (org.apache.kafka.clients.producer.internals.Sender)
[2024-05-16 10:30:00,412] WARN [Producer clientId=perf-producer-client] Got error produce response with correlation id 10362 on topic-partition continuous-self-balancing-topic-4, retrying (2147483646 attempts left). Error: NOT_LEADER_OR_FOLLOWER (org.apache.kafka.clients.producer.internals.Sender)
[2024-05-16 10:30:00,412] WARN [Producer clientId=perf-producer-client] Received invalid metadata error in produce request on partition continuous-self-balancing-topic-4 due to org.apache.kafka.common.errors.NotLeaderOrFollowerException: For requests intended only for the leader, this error indicates that the broker is not the current leader. For requests intended for any replica, this error indicates that the broker is not a replica of the topic partition. Going to request metadata update now (org.apache.kafka.clients.producer.internals.Sender)
[2024-05-16 10:29:55,450] 25327 records sent, 5064.4 records/sec (4.95 MB/sec), 15.3 ms avg latency, 80.0 ms max latency.
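The warnings above name the partitions for which the producer had to look up a new leader. The sketch below extracts those partition numbers from a saved producer log. It assumes the log line format shown above ("on topic-partition <topic>-<partition>, retrying ... NOT_LEADER_OR_FOLLOWER"); the function name partitions_with_leader_errors is hypothetical.

```shell
# Hypothetical helper: reads producer log lines on stdin and prints the
# distinct partition numbers that received a NOT_LEADER_OR_FOLLOWER error,
# one per line in ascending order.
partitions_with_leader_errors() {
  sed -n 's/.*on topic-partition .*-\([0-9][0-9]*\), retrying.*NOT_LEADER_OR_FOLLOWER.*/\1/p' |
    sort -n -u
}
```

Applied to the log excerpt above, it lists partitions 0, 4, 6, and 7.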

After a few seconds, the producer resumes normal operation. Check the partition distribution again:

Topic: continuous-self-balancing-topic TopicId: HtVB3bM7TYaNKKKmm7khQw PartitionCount: 8   ReplicationFactor: 1    Configs: min.insync.replicas=1,segment.bytes=1073741824
    Topic: continuous-self-balancing-topic Partition: 0    Leader: 1   Replicas: 1 Isr: 1
    Topic: continuous-self-balancing-topic Partition: 1    Leader: 2   Replicas: 2 Isr: 2
    Topic: continuous-self-balancing-topic Partition: 2    Leader: 2   Replicas: 2 Isr: 2
    Topic: continuous-self-balancing-topic Partition: 3    Leader: 2   Replicas: 2 Isr: 2
    Topic: continuous-self-balancing-topic Partition: 4    Leader: 1   Replicas: 1 Isr: 1
    Topic: continuous-self-balancing-topic Partition: 5    Leader: 2   Replicas: 2 Isr: 2
    Topic: continuous-self-balancing-topic Partition: 6    Leader: 1   Replicas: 1 Isr: 1
    Topic: continuous-self-balancing-topic Partition: 7    Leader: 1   Replicas: 1 Isr: 1

Because all partitions were reassigned to node2, all messages were sent to node2 during production. This created a local hotspot on node2 and triggered AutoMQ's self-balancing. AutoMQ then reassigned partitions to restore balance across the nodes: in the output above, partitions 0, 4, 6, and 7 have moved to broker 1.
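One simple way to put a number on "balanced" is the difference between the largest and smallest number of partitions led by any broker. The sketch below computes that spread from the describe output. It is an illustration only: the function name leader_spread is hypothetical, the broker IDs to compare must be passed as arguments (a broker that leads nothing never appears in the describe output), and partition count is only a proxy for load, which is reasonable here because every partition receives the same traffic.

```shell
# Hypothetical helper: reads "kafka-topics.sh --describe" output on stdin and
# prints the difference between the highest and lowest number of partitions
# led by the brokers whose IDs are given as arguments.
# Usage: <describe output> | leader_spread 1 2
leader_spread() {
  awk -v brokers="$*" '
    BEGIN {
      # Start every listed broker at zero so idle brokers are counted too.
      n = split(brokers, ids, " ")
      for (i = 1; i <= n; i++) count[ids[i]] = 0
    }
    {
      for (i = 1; i < NF; i++)
        if ($i == "Leader:" && ($(i + 1) in count)) count[$(i + 1)]++
    }
    END {
      min = -1; max = 0
      for (b in count) {
        if (min < 0 || count[b] < min) min = count[b]
        if (count[b] > max) max = count[b]
      }
      if (min < 0) min = 0
      print max - min
    }
  '
}
```

For brokers 1 and 2, the distribution right after the manual reassignment gives a spread of 8, and the distribution after self-balancing gives a spread of 0.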
