Skip to main content

示例:持续数据自平衡

本文档介绍使用 Kafka CLI 工具对 AutoMQ 集群进行数据自动重平衡测试。其中 Kafka CLI 工具通过 AutoMQ 提供的 docker 镜像运行。

  1. 创建一个具有多个分区的 Topic,并手动将分区迁移到特定节点,制造出分区分布的不均衡。

  2. 然后向所有分区发送均衡的负载,观察分区是否会在不同的 Broker 之间自动迁移。

这种自动数据平衡是 AutoMQ 的内置功能,可以确保数据在集群中自动平衡分布。通过监控分区的分布情况和 Broker 的负载情况,可以验证分区自动平衡是否按预期工作。

前置条件

进行分区数据自动重平衡测试前,需要满足如下条件:

完成 AutoMQ 集群的安装部署 ,您可以参考以下方式安装部署 AutoMQ:

tip

如果通过Linux 主机部署多节点集群▸ 或者Kubernetes 部署多节点集群▸ 部署集群,需要确保启动 Controller 时设置 autobalancer.controller.enable 为 true 才能开启数据自动重平衡。

此外,运行测试程序的主机 需要满足如下条件:

  • Linux/Mac/Windows Subsystem for Linux

  • Docker

info

如果下载容器镜像速度慢,请参照 Docker Hub 镜像加速▸

如果此前的 AutoMQ 集群参考Docker 部署多节点测试集群▸ 部署,则获取的集群 Bootstrap 地址是类似 “server1:9092,server2:9092,server3:9092 ”,且 AutoMQ 集群位于“automq_net ” Docker 网络下。

tip

请根据部署的实际配置,将下方的 bootstrap-server 地址更换成实际集群的地址。

创建 Topic


docker run --network automq_net automqinc/automq:latest /bin/bash -c "/opt/kafka/kafka/bin/kafka-topics.sh --partitions 8 --create --topic continuous-self-balancing-topic --bootstrap-server server1:9092,server2:9092,server3:9092"

查看分区分布


docker run --network automq_net automqinc/automq:latest /bin/bash -c "/opt/kafka/kafka/bin/kafka-topics.sh --topic continuous-self-balancing-topic --describe --bootstrap-server server1:9092,server2:9092,server3:9092"


Topic: continuous-self-balancing-topic TopicId: DNZe6gBQTrCOEAruQ_y2tg PartitionCount: 8 ReplicationFactor: 1 Configs: min.insync.replicas=1,segment.bytes=1073741824
Topic: continuous-self-balancing-topic Partition: 0 Leader: 2 Replicas: 2 Isr: 2
Topic: continuous-self-balancing-topic Partition: 1 Leader: 1 Replicas: 1 Isr: 1
Topic: continuous-self-balancing-topic Partition: 2 Leader: 1 Replicas: 1 Isr: 1
Topic: continuous-self-balancing-topic Partition: 3 Leader: 2 Replicas: 2 Isr: 2
Topic: continuous-self-balancing-topic Partition: 4 Leader: 1 Replicas: 1 Isr: 1
Topic: continuous-self-balancing-topic Partition: 5 Leader: 2 Replicas: 2 Isr: 2
Topic: continuous-self-balancing-topic Partition: 6 Leader: 1 Replicas: 1 Isr: 1
Topic: continuous-self-balancing-topic Partition: 7 Leader: 2 Replicas: 2 Isr: 2

手动迁移分区

为了便于观察到持续数据自平衡,我们手动地将分区迁移到 node2 上。

  1. 创建分区迁移计划。

echo '{
"partitions": [
{"topic": "continuous-self-balancing-topic", "partition": 0, "replicas": [2]},
{"topic": "continuous-self-balancing-topic", "partition": 1, "replicas": [2]},
{"topic": "continuous-self-balancing-topic", "partition": 2, "replicas": [2]},
{"topic": "continuous-self-balancing-topic", "partition": 3, "replicas": [2]},
{"topic": "continuous-self-balancing-topic", "partition": 4, "replicas": [2]},
{"topic": "continuous-self-balancing-topic", "partition": 5, "replicas": [2]},
{"topic": "continuous-self-balancing-topic", "partition": 6, "replicas": [2]},
{"topic": "continuous-self-balancing-topic", "partition": 7, "replicas": [2]}
],
"version": 1
}' > move.json

  1. 执行分区迁移计划。

docker run --network automq_net -v $(pwd)/move.json:/move.json automqinc/automq:latest /bin/bash -c "/opt/kafka/kafka/bin/kafka-reassign-partitions.sh --bootstrap-server server1:9092,server2:9092,server3:9092 --reassignment-json-file /move.json --execute"

  1. 手动迁移后,查看分区分布如下:

Topic: continuous-self-balancing-topic TopicId: HtVB3bM7TYaNKKKmm7khQw PartitionCount: 8 ReplicationFactor: 1 Configs: min.insync.replicas=1,segment.bytes=1073741824
Topic: continuous-self-balancing-topic Partition: 0 Leader: 2 Replicas: 2 Isr: 2
Topic: continuous-self-balancing-topic Partition: 1 Leader: 2 Replicas: 2 Isr: 2
Topic: continuous-self-balancing-topic Partition: 2 Leader: 2 Replicas: 2 Isr: 2
Topic: continuous-self-balancing-topic Partition: 3 Leader: 2 Replicas: 2 Isr: 2
Topic: continuous-self-balancing-topic Partition: 4 Leader: 2 Replicas: 2 Isr: 2
Topic: continuous-self-balancing-topic Partition: 5 Leader: 2 Replicas: 2 Isr: 2
Topic: continuous-self-balancing-topic Partition: 6 Leader: 2 Replicas: 2 Isr: 2
Topic: continuous-self-balancing-topic Partition: 7 Leader: 2 Replicas: 2 Isr: 2

启动生产者


docker run --network automq_net automqinc/automq:latest /bin/bash -c "/opt/kafka/kafka/bin/kafka-producer-perf-test.sh --topic continuous-self-balancing-topic --num-records=1024000 --throughput 5120 --record-size 1024 --producer-props bootstrap.servers=server1:9092,server2:9092,server3:9092 linger.ms=100 batch.size=524288 buffer.memory=134217728 max.request.size=67108864"

启动消费者


docker run --network automq_net automqinc/automq:1.0.4 /bin/bash -c "/opt/kafka/kafka/bin/kafka-consumer-perf-test.sh --topic continuous-self-balancing-topic --show-detailed-stats --timeout 300000 --messages=1024000 --reporting-interval 1000 --bootstrap-server=server1:9092,server2:9092,server3:9092"

再次查看分区分布

经过一段时间后,你会观察到生产者产生以下日志。


[2024-05-16 10:29:50,448] 25622 records sent, 5123.4 records/sec (5.00 MB/sec), 15.7 ms avg latency, 41.0 ms max latency.
[2024-05-16 10:30:00,372] WARN [Producer clientId=perf-producer-client] Got error produce response with correlation id 10354 on topic-partition continuous-self-balancing-topic-7, retrying (2147483646 attempts left). Error: NOT_LEADER_OR_FOLLOWER (org.apache.kafka.clients.producer.internals.Sender)
[2024-05-16 10:30:00,373] WARN [Producer clientId=perf-producer-client] Received invalid metadata error in produce request on partition continuous-self-balancing-topic-7 due to org.apache.kafka.common.errors.NotLeaderOrFollowerException: For requests intended only for the leader, this error indicates that the broker is not the current leader. For requests intended for any replica, this error indicates that the broker is not a replica of the topic partition. Going to request metadata update now (org.apache.kafka.clients.producer.internals.Sender)
[2024-05-16 10:30:00,373] WARN [Producer clientId=perf-producer-client] Got error produce response with correlation id 10354 on topic-partition continuous-self-balancing-topic-0, retrying (2147483646 attempts left). Error: NOT_LEADER_OR_FOLLOWER (org.apache.kafka.clients.producer.internals.Sender)
[2024-05-16 10:30:00,373] WARN [Producer clientId=perf-producer-client] Received invalid metadata error in produce request on partition continuous-self-balancing-topic-0 due to org.apache.kafka.common.errors.NotLeaderOrFollowerException: For requests intended only for the leader, this error indicates that the broker is not the current leader. For requests intended for any replica, this error indicates that the broker is not a replica of the topic partition. Going to request metadata update now (org.apache.kafka.clients.producer.internals.Sender)
[2024-05-16 10:30:00,384] WARN [Producer clientId=perf-producer-client] Got error produce response with correlation id 10356 on topic-partition continuous-self-balancing-topic-7, retrying (2147483646 attempts left). Error: NOT_LEADER_OR_FOLLOWER (org.apache.kafka.clients.producer.internals.Sender)
[2024-05-16 10:30:00,384] WARN [Producer clientId=perf-producer-client] Received invalid metadata error in produce request on partition continuous-self-balancing-topic-7 due to org.apache.kafka.common.errors.NotLeaderOrFollowerException: For requests intended only for the leader, this error indicates that the broker is not the current leader. For requests intended for any replica, this error indicates that the broker is not a replica of the topic partition. Going to request metadata update now (org.apache.kafka.clients.producer.internals.Sender)
[2024-05-16 10:30:00,384] WARN [Producer clientId=perf-producer-client] Got error produce response with correlation id 10356 on topic-partition continuous-self-balancing-topic-0, retrying (2147483646 attempts left). Error: NOT_LEADER_OR_FOLLOWER (org.apache.kafka.clients.producer.internals.Sender)
[2024-05-16 10:30:00,384] WARN [Producer clientId=perf-producer-client] Received invalid metadata error in produce request on partition continuous-self-balancing-topic-0 due to org.apache.kafka.common.errors.NotLeaderOrFollowerException: For requests intended only for the leader, this error indicates that the broker is not the current leader. For requests intended for any replica, this error indicates that the broker is not a replica of the topic partition. Going to request metadata update now (org.apache.kafka.clients.producer.internals.Sender)
[2024-05-16 10:30:00,384] WARN [Producer clientId=perf-producer-client] Got error produce response with correlation id 10356 on topic-partition continuous-self-balancing-topic-6, retrying (2147483646 attempts left). Error: NOT_LEADER_OR_FOLLOWER (org.apache.kafka.clients.producer.internals.Sender)
[2024-05-16 10:30:00,384] WARN [Producer clientId=perf-producer-client] Received invalid metadata error in produce request on partition continuous-self-balancing-topic-6 due to org.apache.kafka.common.errors.NotLeaderOrFollowerException: For requests intended only for the leader, this error indicates that the broker is not the current leader. For requests intended for any replica, this error indicates that the broker is not a replica of the topic partition. Going to request metadata update now (org.apache.kafka.clients.producer.internals.Sender)
[2024-05-16 10:30:00,385] WARN [Producer clientId=perf-producer-client] Got error produce response with correlation id 10357 on topic-partition continuous-self-balancing-topic-0, retrying (2147483646 attempts left). Error: NOT_LEADER_OR_FOLLOWER (org.apache.kafka.clients.producer.internals.Sender)
[2024-05-16 10:30:00,385] WARN [Producer clientId=perf-producer-client] Received invalid metadata error in produce request on partition continuous-self-balancing-topic-0 due to org.apache.kafka.common.errors.NotLeaderOrFollowerException: For requests intended only for the leader, this error indicates that the broker is not the current leader. For requests intended for any replica, this error indicates that the broker is not a replica of the topic partition. Going to request metadata update now (org.apache.kafka.clients.producer.internals.Sender)
[2024-05-16 10:30:00,397] WARN [Producer clientId=perf-producer-client] Got error produce response with correlation id 10358 on topic-partition continuous-self-balancing-topic-7, retrying (2147483646 attempts left). Error: NOT_LEADER_OR_FOLLOWER (org.apache.kafka.clients.producer.internals.Sender)
[2024-05-16 10:30:00,397] WARN [Producer clientId=perf-producer-client] Received invalid metadata error in produce request on partition continuous-self-balancing-topic-7 due to org.apache.kafka.common.errors.NotLeaderOrFollowerException: For requests intended only for the leader, this error indicates that the broker is not the current leader. For requests intended for any replica, this error indicates that the broker is not a replica of the topic partition. Going to request metadata update now (org.apache.kafka.clients.producer.internals.Sender)
[2024-05-16 10:30:00,397] WARN [Producer clientId=perf-producer-client] Got error produce response with correlation id 10358 on topic-partition continuous-self-balancing-topic-6, retrying (2147483646 attempts left). Error: NOT_LEADER_OR_FOLLOWER (org.apache.kafka.clients.producer.internals.Sender)
[2024-05-16 10:30:00,397] WARN [Producer clientId=perf-producer-client] Received invalid metadata error in produce request on partition continuous-self-balancing-topic-6 due to org.apache.kafka.common.errors.NotLeaderOrFollowerException: For requests intended only for the leader, this error indicates that the broker is not the current leader. For requests intended for any replica, this error indicates that the broker is not a replica of the topic partition. Going to request metadata update now (org.apache.kafka.clients.producer.internals.Sender)
[2024-05-16 10:30:00,397] WARN [Producer clientId=perf-producer-client] Got error produce response with correlation id 10358 on topic-partition continuous-self-balancing-topic-4, retrying (2147483646 attempts left). Error: NOT_LEADER_OR_FOLLOWER (org.apache.kafka.clients.producer.internals.Sender)
[2024-05-16 10:30:00,397] WARN [Producer clientId=perf-producer-client] Received invalid metadata error in produce request on partition continuous-self-balancing-topic-4 due to org.apache.kafka.common.errors.NotLeaderOrFollowerException: For requests intended only for the leader, this error indicates that the broker is not the current leader. For requests intended for any replica, this error indicates that the broker is not a replica of the topic partition. Going to request metadata update now (org.apache.kafka.clients.producer.internals.Sender)
[2024-05-16 10:30:00,398] WARN [Producer clientId=perf-producer-client] Got error produce response with correlation id 10360 on topic-partition continuous-self-balancing-topic-6, retrying (2147483646 attempts left). Error: NOT_LEADER_OR_FOLLOWER (org.apache.kafka.clients.producer.internals.Sender)
[2024-05-16 10:30:00,398] WARN [Producer clientId=perf-producer-client] Received invalid metadata error in produce request on partition continuous-self-balancing-topic-6 due to org.apache.kafka.common.errors.NotLeaderOrFollowerException: For requests intended only for the leader, this error indicates that the broker is not the current leader. For requests intended for any replica, this error indicates that the broker is not a replica of the topic partition. Going to request metadata update now (org.apache.kafka.clients.producer.internals.Sender)
[2024-05-16 10:30:00,398] WARN [Producer clientId=perf-producer-client] Got error produce response with correlation id 10360 on topic-partition continuous-self-balancing-topic-4, retrying (2147483646 attempts left). Error: NOT_LEADER_OR_FOLLOWER (org.apache.kafka.clients.producer.internals.Sender)
[2024-05-16 10:30:00,398] WARN [Producer clientId=perf-producer-client] Received invalid metadata error in produce request on partition continuous-self-balancing-topic-4 due to org.apache.kafka.common.errors.NotLeaderOrFollowerException: For requests intended only for the leader, this error indicates that the broker is not the current leader. For requests intended for any replica, this error indicates that the broker is not a replica of the topic partition. Going to request metadata update now (org.apache.kafka.clients.producer.internals.Sender)
[2024-05-16 10:30:00,411] WARN [Producer clientId=perf-producer-client] Got error produce response with correlation id 10361 on topic-partition continuous-self-balancing-topic-4, retrying (2147483646 attempts left). Error: NOT_LEADER_OR_FOLLOWER (org.apache.kafka.clients.producer.internals.Sender)
[2024-05-16 10:30:00,412] WARN [Producer clientId=perf-producer-client] Received invalid metadata error in produce request on partition continuous-self-balancing-topic-4 due to org.apache.kafka.common.errors.NotLeaderOrFollowerException: For requests intended only for the leader, this error indicates that the broker is not the current leader. For requests intended for any replica, this error indicates that the broker is not a replica of the topic partition. Going to request metadata update now (org.apache.kafka.clients.producer.internals.Sender)
[2024-05-16 10:30:00,412] WARN [Producer clientId=perf-producer-client] Got error produce response with correlation id 10362 on topic-partition continuous-self-balancing-topic-4, retrying (2147483646 attempts left). Error: NOT_LEADER_OR_FOLLOWER (org.apache.kafka.clients.producer.internals.Sender)
[2024-05-16 10:30:00,412] WARN [Producer clientId=perf-producer-client] Received invalid metadata error in produce request on partition continuous-self-balancing-topic-4 due to org.apache.kafka.common.errors.NotLeaderOrFollowerException: For requests intended only for the leader, this error indicates that the broker is not the current leader. For requests intended for any replica, this error indicates that the broker is not a replica of the topic partition. Going to request metadata update now (org.apache.kafka.clients.producer.internals.Sender)
[2024-05-16 10:29:55,450] 25327 records sent, 5064.4 records/sec (4.95 MB/sec), 15.3 ms avg latency, 80.0 ms max latency.

等待若干秒后,生产将恢复正常。然后再次检查分区状态。


Topic: continuous-self-balancing-topic TopicId: HtVB3bM7TYaNKKKmm7khQw PartitionCount: 8 ReplicationFactor: 1 Configs: min.insync.replicas=1,segment.bytes=1073741824
Topic: continuous-self-balancing-topic Partition: 0 Leader: 1 Replicas: 1 Isr: 1
Topic: continuous-self-balancing-topic Partition: 1 Leader: 2 Replicas: 2 Isr: 2
Topic: continuous-self-balancing-topic Partition: 2 Leader: 2 Replicas: 2 Isr: 2
Topic: continuous-self-balancing-topic Partition: 3 Leader: 2 Replicas: 2 Isr: 2
Topic: continuous-self-balancing-topic Partition: 4 Leader: 1 Replicas: 1 Isr: 1
Topic: continuous-self-balancing-topic Partition: 5 Leader: 2 Replicas: 2 Isr: 2
Topic: continuous-self-balancing-topic Partition: 6 Leader: 1 Replicas: 1 Isr: 1
Topic: continuous-self-balancing-topic Partition: 7 Leader: 1 Replicas: 1 Isr: 1

观察到,由于我们将分区全部迁移到了 node2 上,发送消息时全部消息都会被发送到 node2 上,造成了 node2 的局部热点,触发了 AutoMQ 的 Self-Balancing。AutoMQ 将分区重新迁移到各个节点均衡的状态。