Kafka Cross-AZ Replication Cost | The Hidden AWS Bill

May 2, 2026
AutoMQ Team
8 min read

Kafka cross-AZ cost on AWS comes from three places: broker-to-broker replication, producer traffic that lands on a leader in another Availability Zone (AZ), and consumer reads that cross AZ boundaries. In a balanced three-AZ Apache Kafka deployment, a 300 MiB/s write workload with 2x read fanout can create roughly 1,200 MiB/s of cross-AZ traffic before you count any other application traffic. That is why Kafka replication cost on AWS often shows up as a network bill, not as a Kafka line item.

The frustrating part is where the charge appears. AWS does not label it "Kafka replication." It usually lands under generic data transfer categories. A platform team can spend weeks tuning brokers, EBS volumes, and instance sizes while the largest line item is quietly being generated by the shape of the replication path.

How AWS Charges for Cross-AZ Data Transfer

AWS charges for some traffic that crosses AZ boundaries inside the same Region. In AWS's own data-transfer guidance, traffic between services in the same Region but different AZs can be charged at around $0.01/GB per direction, depending on the exact service path and region. For this article, the cost model uses AWS us-east-1 pricing as of May 2026 and the same assumptions used by the AutoMQ pricing calculator: cross-AZ Kafka traffic is modeled at approximately $0.02/GiB for a full cross-AZ transfer path.

That distinction matters. Data transfer inside the same AZ is not the problem. A producer talking to a broker in the same AZ is usually fine. The bill grows when Kafka's leader, followers, producers, and consumers are spread across AZs, because the same logical record can travel across zone boundaries several times.

The charge is not AWS being unfair. AWS is billing for regional network movement. The surprise comes from how much network movement traditional Kafka creates when you run it as a highly available, multi-AZ system.

Why Kafka Multi-AZ Replication Creates So Much Traffic

Kafka uses leader/follower replication. For each partition, one replica is elected leader, and the other replicas are followers. Producers send records to the leader. Followers fetch those records from the leader and keep their local logs caught up. Kafka's in-sync replica (ISR) set tracks the replicas that are sufficiently current to participate in durability guarantees.

In a production multi-AZ deployment, teams commonly use replication.factor=3 and spread replicas across AZs. That is a sensible durability model: if one broker or one AZ has a problem, Kafka can continue serving data from other replicas. The cost issue is that durability is achieved by copying full record data between brokers.

So every byte written to a leader may be copied to two follower replicas in other AZs. Consumer traffic adds another layer. Classic Kafka consumers read from the leader by default, which means a consumer in AZ-A can read from a leader in AZ-B even if a follower replica exists locally.

You are not paying for "replication" as a Kafka feature. You are paying AWS for the network bytes that replication produces.

The Formula: Kafka Cross-AZ Traffic at 300 MiB/s

Use a simple three-AZ model:

  • W = write throughput (MiB/s)
  • F = read fanout (how many times each written byte is consumed)
  • Leaders, producers, and consumers are evenly distributed across three AZs
  • Consumers fetch from leaders, not from the closest followers

Under those assumptions, Kafka cross-AZ traffic is approximately:

  cross_az_throughput ~= 2W + (2/3)W + (2/3 x F)W

The three terms come from different parts of the data path. Broker replication is 2W because a replication.factor=3 topic writes one leader copy and two follower copies. Producer-to-leader traffic contributes (2/3)W because, with even distribution, two out of three producer writes are expected to land on a leader in another AZ. Consumer reads contribute (2/3 x F)W for the same locality reason, multiplied by read fanout.

For W = 300 MiB/s and F = 2, the formula becomes:

  2W + (2/3)W + (2/3 x 2)W
  = 600 + 200 + 400
  = 1,200 MiB/s of cross-AZ traffic

[Chart: Kafka cross-AZ traffic sources]

This is an estimate, not a packet-level trace. Real clusters vary with partition leadership, client placement, rack awareness, consumer group assignment, and failover events. The point is that the multiplier is structural. A 300 MiB/s Kafka workload does not create 300 MiB/s of billable cross-AZ movement. Under common assumptions, it can create about 4x that amount.
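The estimate above can be reproduced with a small Python helper. The inputs are the same as in the model: W in MiB/s and the read fanout F, with the three-AZ, even-distribution, leader-read assumptions baked in. The function name and structure are illustrative, not part of any tool mentioned here.

```python
def cross_az_throughput(w_mibs: float, fanout: float) -> dict:
    """Estimate cross-AZ traffic (MiB/s) for a three-AZ Kafka cluster with
    replication.factor=3, even leader/client placement across AZs, and
    consumers that always read from the leader."""
    broker_replication = 2 * w_mibs           # leader ships data to 2 followers
    producer_to_leader = 2 * w_mibs / 3       # 2 of 3 writes hit a remote leader
    consumer_reads = 2 * fanout * w_mibs / 3  # same locality odds, times fanout
    return {
        "broker_replication": broker_replication,
        "producer_to_leader": producer_to_leader,
        "consumer_reads": consumer_reads,
        "total": broker_replication + producer_to_leader + consumer_reads,
    }

# The 300 MiB/s, 2x fanout example from this article:
estimate = cross_az_throughput(300, 2)
print(estimate["total"])  # 1200.0 MiB/s of cross-AZ traffic
```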

Kafka AWS Network Cost: What the Bill Looks Like

The AutoMQ pricing calculator reports 769,921.88 GiB of monthly ingress for this workload: 300 MiB/s sustained writes over a 730-hour month, with 72h retention and 2,000 partitions. Applying the cross-AZ traffic model above gives the following monthly network cost:

Cross-AZ source | Formula | Monthly cost
Producer to leader | (2/3)W | $10,265.63
Consumer reads | (2/3 x 2)W | $20,531.25
Broker replication | 2W | $30,796.88
Total cross-AZ | 4W | $61,593.75

The consumer line is the one that keeps growing with fanout. If you add another independent consumer group, you add another (2/3)W worth of cross-AZ read traffic under this model. Follower Fetching can reduce this part, but it does not touch the broker replication term, which is the largest slice in the chart.
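These dollar figures follow directly from the model inputs. The sketch below uses the $0.02/GiB cross-AZ rate and the 730-hour month stated earlier; both are modeling assumptions from the calculator, not AWS list prices.

```python
MIB_PER_GIB = 1024
HOURS_PER_MONTH = 730   # calculator's average month length
CROSS_AZ_PRICE = 0.02   # $/GiB, modeled full cross-AZ transfer path

w_mibs = 300            # sustained write throughput
monthly_ingress_gib = w_mibs * 3600 * HOURS_PER_MONTH / MIB_PER_GIB

# Multipliers from the three-AZ formula 2W + (2/3)W + (2/3 x F)W, with F = 2
terms = {
    "Producer to leader": 2 / 3,
    "Consumer reads": 4 / 3,
    "Broker replication": 2.0,
}
costs = {name: m * monthly_ingress_gib * CROSS_AZ_PRICE for name, m in terms.items()}
costs["Total cross-AZ"] = sum(costs.values())

print(f"Monthly ingress: {monthly_ingress_gib:,.2f} GiB")  # 769,921.88 GiB
for name, cost in costs.items():
    print(f"{name}: ${cost:,.2f}")
```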

Full Kafka Multi-AZ Cost Breakdown

For the same 300 MiB/s write, 2x read fanout, 72h retention workload on AWS us-east-1, the calculator estimates self-managed Kafka at $103,194.63/month:

Cost item | Monthly cost | Share
Cross-AZ traffic | $61,593.75 | 59.7%
EBS storage | $36,450.00 | 35.3%
Compute | $5,150.88 | 5.0%
Total | $103,194.63 | 100%

[Chart: Kafka AWS cost breakdown]

This is the part that catches teams off guard. The broker fleet is not the main cost. Compute is only about 5% in this scenario. EBS storage is large because Kafka stores three replicated copies and production clusters need headroom. But the largest single line item is network movement between AZs.

That is why Kafka AWS network cost can feel hard to control. You can right-size instances and tune storage, but if your architecture still moves full record data across AZs for durability, the network bill scales with throughput.

Why Common Fixes Only Reduce Part of the Problem

There are legitimate ways to reduce part of the bill. Follower Fetching, introduced through KIP-392, allows consumers to fetch from a closer replica instead of always reading from the leader. That can reduce consumer-side cross-AZ traffic, especially for read-heavy workloads.
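Enabling Follower Fetching takes two pieces of configuration: brokers opt in with a rack-aware replica selector, and each consumer declares which "rack" (here, an AZ) it runs in. A minimal sketch, with the AZ identifiers as placeholder values:

```properties
# Broker server.properties: tag each broker with its AZ and enable
# rack-aware replica selection (KIP-392)
broker.rack=use1-az1
replica.selector.class=org.apache.kafka.common.replica.RackAwareReplicaSelector

# Consumer configuration: declare the consumer's AZ so fetches can be
# served by an in-sync same-AZ replica instead of a remote leader
client.rack=use1-az1
```

With this in place, a consumer in use1-az1 can read from a local follower even when the partition leader sits in another AZ, shrinking the consumer term of the formula.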

Rack-aware placement and careful client deployment also help. If producers and consumers connect to same-AZ brokers whenever possible, the producer and consumer terms in the formula shrink. These practices are worth doing, especially in high-throughput clusters.

The catch is that neither approach eliminates broker-to-broker replication. Kafka still has to copy records from the leader to follower replicas. Lowering replication.factor can reduce that traffic, but it changes the durability and availability trade-off. Most production teams keep replication.factor=3 because the risk of losing a critical topic is worse than the bill.

These are optimizations. They reduce slices of the pie. They do not remove the pie.

How S3-Native Architecture Eliminates Kafka Cross-AZ Cost

AutoMQ changes the cost model by changing where durability lives. Instead of using broker-to-broker replication as the durable storage layer, AutoMQ stores Kafka data in S3-compatible object storage and makes brokers stateless. S3 provides regional durability internally, so brokers do not need to replicate full record data across AZs for Kafka data durability.

The design principle is "Stay Local, Store Regional." Clients connect to same-AZ brokers where possible. Brokers write data to shared storage. Only metadata and coordination traffic needs to cross AZ boundaries, which is negligible compared with full record replication.

For the same workload, the pricing calculator estimates:

Platform | Monthly cost | Cross-AZ data replication cost
Apache Kafka on AWS | $103,194.63 | $61,593.75
AutoMQ BYOC | $21,804.35 | $0

That is a 78.9% reduction in total monthly cost for this scenario. The point is not that every Kafka workload will have the same number. The point is that the largest Kafka cost driver in this example is architectural. Once broker-to-broker data replication is removed from the durability path, the economics change.
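The 78.9% figure is just the ratio of the two totals in the table:

```python
kafka_monthly = 103_194.63   # self-managed Apache Kafka on AWS, from the table
automq_monthly = 21_804.35   # AutoMQ BYOC, from the table

savings_pct = (1 - automq_monthly / kafka_monthly) * 100
print(f"{savings_pct:.1f}% lower total monthly cost")  # 78.9%
```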

Look for the Network Line First

When a Kafka bill on AWS starts to look wrong, look at the regional data transfer line before you blame the brokers. In a high-throughput multi-AZ cluster, cross-AZ traffic may not be noise. It may be the main bill.

The math is not complicated once you split the data path into producer traffic, broker replication, and consumer reads. A 300 MiB/s write workload with 2x read fanout can behave like a 1,200 MiB/s cross-AZ network workload under common placement assumptions. That is the hidden AWS bill behind Kafka replication.

You can reduce parts of it with follower reads and strict AZ-aware placement. To eliminate the largest piece, you have to move durability out of broker-to-broker replication. Run your own workload through the AutoMQ pricing calculator, or read how the AutoMQ Diskless Engine uses S3-native storage to remove Kafka cross-AZ data replication from the cost equation.
