AutoMQ vs. Apache Kafka Benchmarks and Cost

AutoMQ rebuilds the storage layer of Apache Kafka on top of cloud storage in a cloud-native way. It remains fully compatible with the Apache Kafka API while costing roughly 14 times less than Apache Kafka at a data throughput of 1 GiB/s (before compression): the total monthly cost of ownership drops from $65,385 with Apache Kafka to $4,380. The detailed monthly costs are provided in the table below:

Comparison Item             AutoMQ     Apache Kafka
Compute                     $714       $5,242
Storage                     $1,723     $20,183
S3 API                      $1,928     $0
Inter-Zone data transfer    $15        $39,960
Total                       $4,380     $65,385

Benchmark Preparation

For ease of comparison with other products, AutoMQ selected an industry-standard load test scenario: using a traffic rate of 1 GiB/s (before compression) to conduct production and consumption load testing on a topic with 256 partitions.

AutoMQ Cluster

AutoMQ can be deployed on Kubernetes using the Bitnami Kafka Helm Chart. You can refer to Install AutoMQ by Helm Chart to deploy a 3-availability zone AutoMQ cluster on AWS:

  • A total of 6 nodes, including 3 servers and 3 brokers, are evenly distributed across 3 availability zones;

  • Instance type: m7g.xlarge (4 vCPUs, 16 GiB memory), network baseline 238 MiB/s, priced at $119.14 per month.

Load Tester

The load tester deploys one m6n.8xlarge in each of the 3 availability zones to simulate multi-availability zone workloads. The load testing is conducted using the automq-perf-test.sh script provided by AutoMQ with the following configuration.


KAFKA_HEAP_OPTS="-Xmx32g -Xms32g" nohup ./bin/automq-perf-test.sh --bootstrap-server $bootstrap_server --record-size 1024 --random-ratio 0.25 --topics 1 --partitions-per-topic 256 --topic-prefix perf --producers-per-topic 32 --groups-per-topic 1 --consumers-per-group 32 --send-rate 350000 --warmup-duration 2 --test-duration 180 --producer-configs batch.size=1048576 linger.ms=100 buffer.memory=134217728 max.request.size=67108864 compression.type=lz4 client.id='automq_az=apse1-az1' --consumer-configs fetch.max.bytes=104857600 max.partition.fetch.bytes=104857600 client.id='automq_az=apse1-az1' --reset &> nohup.log &

KAFKA_HEAP_OPTS="-Xmx32g -Xms32g" nohup ./bin/automq-perf-test.sh --bootstrap-server $bootstrap_server ... client.id='automq_az=apse1-az2' --await-topic-ready false &> nohup.log &

KAFKA_HEAP_OPTS="-Xmx32g -Xms32g" nohup ./bin/automq-perf-test.sh --bootstrap-server $bootstrap_server ... client.id='automq_az=apse1-az3' --await-topic-ready false &> nohup.log &

With this configuration, the load tester simulates the following scenario:

  • A total of 96 Producers and 96 Consumers are evenly distributed across 3 availability zones. Each client sets client.id to automq_az=<zone ID> (for example, client.id=automq_az=apse1-az1) to indicate the availability zone it runs in.

  • 1,050,000 records of 1 KiB each are sent per second (350,000 per load tester), for a total data rate of about 1 GiB/s before compression, with a random-data ratio of 0.25 (--random-ratio 0.25).

  • Sending parameters are additionally configured with batch.size=1048576 and linger.ms=100 to achieve better batching performance, suitable for most high-throughput Kafka scenarios.
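The aggregate figures in this list follow directly from the per-tester flags. The sketch below recomputes them, assuming that --send-rate applies to each load tester separately (which is what the 1,050,000 records/s total implies); the variable names are illustrative only.

```python
# Per-tester flags taken from the automq-perf-test.sh commands above.
TESTERS = 3                     # one load tester per availability zone
SEND_RATE_PER_TESTER = 350_000  # --send-rate, records per second (assumed per tester)
RECORD_SIZE_BYTES = 1024        # --record-size
PRODUCERS_PER_TESTER = 32       # --producers-per-topic with --topics 1
CONSUMERS_PER_TESTER = 32       # --consumers-per-group with --groups-per-topic 1

total_records_per_sec = TESTERS * SEND_RATE_PER_TESTER
total_mib_per_sec = total_records_per_sec * RECORD_SIZE_BYTES / 1024**2
total_producers = TESTERS * PRODUCERS_PER_TESTER
total_consumers = TESTERS * CONSUMERS_PER_TESTER

print(f"{total_records_per_sec:,} records/s")
print(f"{total_mib_per_sec:.1f} MiB/s = {total_mib_per_sec / 1024:.3f} GiB/s before compression")
print(f"{total_producers} producers, {total_consumers} consumers")
```

The uncompressed rate comes out slightly above 1 GiB/s, which is why the text rounds it to 1 GiB/s.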

Costs

In a scenario with data retention of 3 days (log.retention.hours=72), using AWS's us-east-1 region as an example, the total cost of ownership for AutoMQ is $4,380 per month. The total cost of ownership includes compute, storage, S3 API, and Inter-Zone traffic.

  • Compute: In this performance test scenario, AutoMQ utilizes 6 m7g.xlarge instances, with a total compute resource cost of 6 * $119.14 = $714 per month.

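  • Storage: Compressed traffic * 3 days * S3 price = (296 / 1024) * (60 * 60 * 24 * 3) * 0.023 = $1,723 per month, where 296 MiB/s is the write rate after compression and 0.023 is the S3 storage price in USD per GB per month.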

  • S3 API: At this traffic level, AutoMQ issues an average of 416 GET requests and 115.52 PUT requests per second, which costs $1,928 per month.

  • Cross-Zone Traffic: Although AutoMQ ensures that clients send and receive messages only through Brokers in their own availability zone, Brokers still generate a small number of RPC requests to synchronize KRaft metadata and forward ZoneRouterProduceRequest. We used iftop -t -s 60 -L 100 to monitor cross-zone traffic on Broker nodes. The average cross-zone Network In + Network Out for one node is 0.1 MiB/s, so the total cross-zone traffic cost for 6 nodes is 0.1 / 1024 * 6 * (60 * 60 * 24 * 30) * 0.01 ≈ $15 per month.

The total cost of AutoMQ is: Compute + Storage + S3 API + Cross-Zone Traffic = $714 + $1,723 + $1,928 + $15 = $4,380 per month.
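The four AutoMQ line items can be recomputed with a few lines of arithmetic. The sketch below uses the figures quoted in this section. The S3 request prices ($0.0004 per 1,000 GET and $0.005 per 1,000 PUT) are not stated in the text and are assumed here, along with a 30-day month; with those assumptions the S3 API figure is reproduced.

```python
SECONDS_PER_DAY = 60 * 60 * 24
SECONDS_PER_MONTH = SECONDS_PER_DAY * 30   # assumption: 30-day month

# Figures quoted in the text.
NODES = 6
INSTANCE_PRICE = 119.14                # USD per m7g.xlarge per month
COMPRESSED_MIB_PER_SEC = 296           # write traffic after compression
RETENTION_DAYS = 3
S3_STORAGE_PRICE = 0.023               # USD per GB-month, applied as in the text
GET_PER_SEC = 416
PUT_PER_SEC = 115.52
CROSS_ZONE_MIB_PER_SEC_PER_NODE = 0.1
CROSS_ZONE_PRICE = 0.01                # USD per GB, applied as in the text

# Assumed S3 request prices (not stated in the text).
GET_PRICE_PER_1000 = 0.0004
PUT_PRICE_PER_1000 = 0.005

compute = NODES * INSTANCE_PRICE
storage = (COMPRESSED_MIB_PER_SEC / 1024) * SECONDS_PER_DAY * RETENTION_DAYS * S3_STORAGE_PRICE
s3_api = SECONDS_PER_MONTH / 1000 * (
    GET_PER_SEC * GET_PRICE_PER_1000 + PUT_PER_SEC * PUT_PRICE_PER_1000
)
cross_zone = (
    CROSS_ZONE_MIB_PER_SEC_PER_NODE / 1024 * NODES * SECONDS_PER_MONTH * CROSS_ZONE_PRICE
)
total = compute + storage + s3_api + cross_zone

for name, value in [("compute", compute), ("storage", storage),
                    ("s3_api", s3_api), ("cross_zone", cross_zone), ("total", total)]:
    print(f"{name:<11} ${value:,.2f}")
```

Truncating each line item to whole dollars gives $714, $1,723, $1,928, and $15, which add up to the $4,380 shown in the table. The unrounded sum is a little under $4,382.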

Compared to Apache Kafka, This Represents a 14x Cost Reduction.

The same workload on Apache Kafka requires $65,385 per month, which is 14 times that of AutoMQ. The total cost of ownership includes compute, storage, and Inter-Zone traffic.

  • Storage: 3 days of data amount to (296 / 1024 / 1024) * (60 * 60 * 24 * 3) ≈ 73 TiB. With 3 replicas, this consumes 219 TiB of space. Allowing for buffering and uneven data distribution, and assuming 50% effective disk utilization, 438 TiB of disk space should be provisioned. Using st1 volumes to keep disk costs down, the storage cost is 438 * 1024 * 0.045 = $20,183 per month.

  • Compute: A single EBS volume can be at most 16 TiB, so 438 TiB requires about 27 volumes (438 / 16 ≈ 27.4), corresponding to 27 brokers. According to production practices, the smallest recommended instance type is r4.xlarge, so the 27 brokers cost $5,242 per month.

  • Inter-Zone traffic: With partitions and traffic evenly distributed, 2/3 of the production traffic is sent across availability zones. Consumption can avoid Inter-Zone traffic via Fetch From Follower, but that requires the replicas to be distributed across all 3 availability zones, so ISR replication between Brokers adds two more copies of the production traffic across zones. Overall Inter-Zone traffic cost is: production traffic * (2 / 3 + 2) * 30 days * transfer price = (296 * (2 / 3 + 2) / 1024) * (60 * 60 * 24 * 30) * 0.02 = $39,960 per month.

The total cost of Apache Kafka is: compute + storage + Inter-Zone traffic = $65,385 per month.
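The Apache Kafka estimate can be checked the same way. The sketch below follows the rounding used in the text (3 days of data rounded to 73 TiB before replication) and takes the $5,242 compute cost as given, since the per-instance r4.xlarge price is not stated. The 30-day month and the variable names are assumptions.

```python
SECONDS_PER_DAY = 60 * 60 * 24
SECONDS_PER_MONTH = SECONDS_PER_DAY * 30   # assumption: 30-day month

# Figures quoted in the text.
COMPRESSED_MIB_PER_SEC = 296
RETENTION_DAYS = 3
REPLICAS = 3
DISK_UTILIZATION = 0.5       # 50% effective disk utilization
ST1_PRICE = 0.045            # USD per GB-month, applied as in the text
MAX_EBS_TIB = 16
INTER_ZONE_PRICE = 0.02      # USD per GB, applied as in the text
COMPUTE = 5242               # USD per month, taken as given from the text
AUTOMQ_TOTAL = 4380          # USD per month, from the AutoMQ estimate

# Storage: round the logical data size to whole TiB first, as the text does.
logical_tib = round(COMPRESSED_MIB_PER_SEC / 1024**2 * SECONDS_PER_DAY * RETENTION_DAYS)
provisioned_tib = logical_tib * REPLICAS / DISK_UTILIZATION
storage = provisioned_tib * 1024 * ST1_PRICE
volumes_needed = provisioned_tib / MAX_EBS_TIB

# Inter-zone: 2/3 of produce traffic crosses zones, plus 2 replica copies.
inter_zone = (
    COMPRESSED_MIB_PER_SEC * (2 / 3 + 2) / 1024 * SECONDS_PER_MONTH * INTER_ZONE_PRICE
)

total = COMPUTE + storage + inter_zone
print(f"logical data  {logical_tib} TiB, provisioned {provisioned_tib:.0f} TiB")
print(f"EBS volumes   {volumes_needed:.1f} at {MAX_EBS_TIB} TiB each")
print(f"storage       ${storage:,.2f}")
print(f"inter-zone    ${inter_zone:,.2f}")
print(f"total         ${total:,.2f}  ({total / AUTOMQ_TOTAL:.1f}x the AutoMQ total)")
```

Inter-zone transfer dominates the Kafka bill in this model, so the result is most sensitive to the replication factor and to how much produce traffic crosses zones.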

In addition to reducing static costs, the stateless architecture of AutoMQ delivers further cost advantages:

  • AutoBalancing: AutoMQ includes a load balancing component within the Controller, which automatically balances the load based on traffic between nodes. Partition reassignment is completed in less than 2 seconds, eliminating the need for manual load balancing by operations personnel.

  • No Overprovisioning: With stateless nodes and partition reassignment that takes seconds, AutoMQ can scale out or in within minutes, and most of that time is spent preparing resources. After scaling out, load balancing completes in seconds. AutoMQ therefore does not need to reserve resources for peak traffic or run rebalancing operations that take hours.

Performance:

1 GiB/s of data, after compression, is reduced to 296 MiB/s, with the CPU utilization of each AutoMQ Broker at 50%. There is ample CPU headroom to handle traffic spikes and additional small packet requests.

AutoMQ writes to S3 using default batch parameters of 8 MiB / 250 ms. Records are persisted to S3 before a successful acknowledgment is returned to the client. In this test, the latency performance is as follows:

  • Send: avg 369ms, p99 642ms.

  • Consume E2E: avg 1577ms, p99 3162ms.


2025-05-15 09:34:09 - INFO Summary | Prod rate 350044.42 msg/s / 341.84 MiB/s | Prod total 630.31 M msg / 601.11 GiB / 0.00 K err | Cons rate 393002.48 msg/s / 383.79 MiB/s | Cons total 707.66 M msg / 674.88 GiB | Prod Latency (ms) avg: 369.097 - 50%: 362.969 - 75%: 431.169 - 90%: 490.921 - 95%: 529.383 - 99%: 642.595 - 99.9%: 830.523 - 99.99%: 1084.759 - Max: 1469.495 | E2E Latency (ms) avg: 1577.448 - 50%: 1692.823 - 75%: 2086.991 - 90%: 2363.135 - 95%: 2513.055 - 99%: 3162.623 - 99.9%: 6984.543 - 99.99%: 7713.023 - Max: 8380.255
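To compare runs, it can help to pull the latency figures out of a summary line like the one above. The following is a minimal sketch that assumes the line keeps the " | "-separated layout shown here; parse_latency is a hypothetical helper, not part of the AutoMQ tooling, and the sample string is an excerpt of the line above.

```python
import re

# Excerpt of the summary line above (latency sections copied verbatim).
SUMMARY = (
    "Summary | Prod rate 350044.42 msg/s / 341.84 MiB/s"
    " | Prod Latency (ms) avg: 369.097 - 50%: 362.969 - 75%: 431.169"
    " - 90%: 490.921 - 95%: 529.383 - 99%: 642.595 - 99.9%: 830.523"
    " - 99.99%: 1084.759 - Max: 1469.495"
    " | E2E Latency (ms) avg: 1577.448 - 50%: 1692.823 - 75%: 2086.991"
    " - 90%: 2363.135 - 95%: 2513.055 - 99%: 3162.623 - 99.9%: 6984.543"
    " - 99.99%: 7713.023 - Max: 8380.255"
)


def parse_latency(summary: str, label: str) -> dict[str, float]:
    """Return the metrics (avg, percentiles, Max) of the section starting with `label`."""
    section = next(part for part in summary.split(" | ") if part.startswith(label))
    pairs = re.findall(r"(avg|[\d.]+%|Max): ([\d.]+)", section)
    return {name: float(value) for name, value in pairs}


prod = parse_latency(SUMMARY, "Prod Latency")
e2e = parse_latency(SUMMARY, "E2E Latency")
print(f"Send:        avg {prod['avg']:.0f} ms, p99 {prod['99%']:.0f} ms")
print(f"Consume E2E: avg {e2e['avg']:.0f} ms, p99 {e2e['99%']:.0f} ms")
```

The helper raises StopIteration if the requested section is missing, which is acceptable for a quick check but would need explicit handling in a real script.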