Kafka tiered storage is a serious improvement for teams that have been carrying long retention on broker-local disks. It changes the retention equation: completed log segments can move to a remote tier, often backed by object storage, while recent active data remains close to the broker.
The limitation is not that tiered storage fails to solve its own problem. The limitation is that many architecture reviews expect it to solve a broader problem than it was designed to address. Remote log storage reduces pressure from older segments; it does not automatically turn broker compute into stateless capacity, remove local hot-data responsibilities, or make every reassignment and recovery path independent of broker-local state.
That boundary is the whole decision. If the root pain is longer retention without expanding local disks forever, tiered storage may be right. If broker-local storage still drives capacity, recovery, elasticity, and operational risk, remote log storage is a partial step.
What tiered storage actually changes
Apache Kafka's tiered storage work, introduced through KIP-405, separates the log into a local tier and a remote tier. The local tier is still the normal broker log path for active segments. The remote tier stores copied log segments after they are rolled, and Kafka can serve older fetches from that remote location when the data no longer exists locally.
That is a meaningful operating model. It lets teams tune local retention separately from total retention, offload cold history, and avoid sizing every broker for the full historical footprint. It can also reduce the blast radius of some disk-capacity incidents because old data is no longer competing with hot data for the same local capacity.
But the model still has a two-tier contract:
- Hot and active data stays local. The broker still needs local capacity and I/O for active segments, leader writes, follower fetches, page cache, and operational headroom.
- Remote data is older data. The remote tier is reached when the requested segment has moved out of the local tier.
- The architecture still has broker-local responsibilities. Remote storage can reduce local history, but it does not by itself make persistent log ownership independent of broker identity.
This is why "Kafka stores data in object storage" can be a misleading shorthand. A tiered-storage cluster may use object storage, but object storage is not necessarily the primary durable substrate for the active log. The active write path and hot-read path still need to be evaluated as broker-local Kafka first, with remote storage added for older segments.
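To make the two-tier contract concrete, here is a minimal sizing sketch. It assumes constant ingest, no compaction, and no upload lag, and it treats the remote tier as one logical copy of the retained history. The class and function names are hypothetical; only `retention.ms` and `local.retention.ms` are real Kafka topic configs, and the sketch does not handle the special value `-1`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TopicRetention:
    """Illustrative holder; fields mirror the Kafka topic configs named in the comments."""
    retention_ms: int        # retention.ms: total retention across both tiers
    local_retention_ms: int  # local.retention.ms: how long rolled segments stay on broker disk


def steady_state_footprint(ingest_bytes_per_sec: float,
                           cfg: TopicRetention,
                           replication_factor: int) -> dict:
    """Rough steady-state bytes per tier for one topic.

    Assumptions of this sketch: constant ingest, no compaction, no
    segment-roll granularity, no index overhead, no upload lag.
    """
    if not 0 <= cfg.local_retention_ms <= cfg.retention_ms:
        raise ValueError("local retention must be between 0 and total retention")
    total = ingest_bytes_per_sec * cfg.retention_ms / 1000
    local = ingest_bytes_per_sec * cfg.local_retention_ms / 1000
    return {
        # Every replica keeps the local window on its own disk.
        "local_disk_bytes_tiered": local * replication_factor,
        # Without tiering, every replica keeps the full history.
        "local_disk_bytes_untiered": total * replication_factor,
        # The remote tier holds roughly one logical copy of the retained history.
        "remote_bytes": total,
    }


if __name__ == "__main__":
    week_ms, six_hours_ms = 7 * 24 * 3_600_000, 6 * 3_600_000
    print(steady_state_footprint(50e6, TopicRetention(week_ms, six_hours_ms), 3))
```

The local figure shrinks with the local retention window, but it never reaches zero: the window itself, multiplied by the replication factor, is still broker disk.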
Limitation 1: hot data remains a local-disk problem
Most production Kafka traffic is tail-heavy. Producers append to the end of the log, consumer groups often read close to the latest offsets, and stream processors usually chase the log end. Tiered storage is strongest when the data being offloaded is not part of that hot working set. It is less transformative when the cluster's pain comes from active ingest, hot fan-out, or short-retention data that still must live locally.
The result is that local disk does not disappear from capacity planning. Operators still need enough disk for active segments, replication lag, compaction behavior, retention buffers, and surge periods. They also need enough local throughput for write amplification, fetch load, and follower replication. A topic with a large active working set may keep most of its meaningful operational pressure on broker disks even if older segments are successfully uploaded to the remote tier.
This matters during sizing discussions. A team can reduce locally stored history and still overrun local disks during traffic spikes, consumer stalls, or compaction-heavy workloads. Tiered storage changes the retention curve; it does not guarantee that the hot set is small, stable, or evenly distributed.
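A rough per-broker model shows why the hot set can still overrun local disks. This is a sketch under stated assumptions, not a sizing formula from the Kafka documentation: it assumes each partition replica holds one nearly full active segment, and that rolled segments stay on local disk until they are both past local retention and uploaded to the remote tier. The function name, parameters, and the 30% default headroom are hypothetical.

```python
def local_disk_needed(partition_replicas: int,
                      segment_bytes: int,
                      peak_ingest_bytes_per_sec: float,
                      local_retention_sec: float,
                      upload_lag_sec: float = 0.0,
                      headroom: float = 0.3) -> float:
    """Estimate local bytes one broker needs at peak.

    peak_ingest_bytes_per_sec is everything the broker writes to disk,
    including follower replication traffic it receives.
    """
    # Worst case: every replica on this broker has a nearly full active segment.
    active = partition_replicas * segment_bytes
    # Assumption: rolled segments stay local for the local retention window,
    # or longer when uploads to the remote tier fall behind.
    window_sec = max(local_retention_sec, upload_lag_sec)
    rolled = peak_ingest_bytes_per_sec * window_sec
    return (active + rolled) * (1 + headroom)


if __name__ == "__main__":
    gib = 1024 ** 3
    normal = local_disk_needed(400, gib, 80e6, 6 * 3600)
    stalled = local_disk_needed(400, gib, 80e6, 6 * 3600, upload_lag_sec=18 * 3600)
    print(f"normal: {normal / gib:.0f} GiB, uploads stalled 18h: {stalled / gib:.0f} GiB")
```

Comparing the two calls shows how an upload stall or a traffic spike moves the local requirement even when total retention is unchanged.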
Limitation 2: local disk still shapes broker operations
Traditional Kafka operations are deeply tied to where partition replicas live. Broker placement, leadership, local log directories, ISR membership, and replica movement all assume that brokers carry local data responsibilities. Tiered storage does not erase those responsibilities for active data.
That means teams still need to plan for:
| Operational area | What tiered storage helps | What still needs local planning |
|---|---|---|
| Retention | Older segments can move remote | Local retention window and active segment headroom |
| Broker sizing | Less full-history disk pressure | Hot data, replication, cache, and disk throughput |
| Topic growth | Longer history can be less local-disk bound | Partition skew and leader-local workload imbalance |
| Maintenance | Less cold data may need local handling | Broker drain, local replica state, and validation |
This is a reminder that the local tier remains part of the platform's reliability model. A cluster may be easier to operate after enabling remote log storage and still remain a stateful broker-local system.
Limitation 3: reassignment and recovery are not fully rewritten
Partition reassignment and broker recovery are where remote-log assumptions often become too optimistic. If a team expects tiered storage to make broker replacement behave like replacing a stateless compute node, they need to inspect the active-data path carefully.
When a broker fails, Kafka still depends on leader election, ISR state, and the availability of in-sync replicas for active data. Remote segments can help with older history, but they do not automatically remove the need to preserve and restore the active local replica model. Similarly, when partitions are moved for balancing, the system still has to reason about local replicas, leadership, and catch-up for data that belongs in the local tier.
The important distinction is between reducing data movement and eliminating durable-data coupling. Tiered storage may reduce how much historical data must be copied or retained locally. But if a broker still owns active persistent state on local disk, recovery and reassignment remain data-aware operations.
For high-churn environments, that difference is visible. Cloud infrastructure teams may add and remove nodes often. Kubernetes teams may replace pods, rotate nodes, or rebalance capacity. FinOps teams may want tighter right-sizing. If every change still requires careful handling of broker-local active data, the cluster is not operating like a fully elastic storage-compute separated system.
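The difference between reducing data movement and eliminating it can be sketched with a simple catch-up model. It assumes a new replica only has to copy the part of the log still in the local tier when tiered storage is enabled, and that the copy competes with ongoing ingest. The function names and parameters are hypothetical, and real catch-up also depends on throttles, fetch sizes, and leader load.

```python
import math


def bytes_to_copy(total_partition_bytes: int, local_tier_bytes: int, tiered: bool) -> int:
    """Bytes a new replica must fetch from the leader before it can join the ISR."""
    if tiered:
        # Assumption: history that lives only in the remote tier is not re-copied.
        return min(local_tier_bytes, total_partition_bytes)
    return total_partition_bytes


def catch_up_seconds(backlog_bytes: float,
                     copy_rate_bytes_per_sec: float,
                     ingest_bytes_per_sec: float) -> float:
    """Time to close the gap while producers keep appending to the partition."""
    net_rate = copy_rate_bytes_per_sec - ingest_bytes_per_sec
    if net_rate <= 0:
        return math.inf  # the replica never catches up at this copy rate
    return backlog_bytes / net_rate


if __name__ == "__main__":
    gb = 10 ** 9
    for tiered in (False, True):
        backlog = bytes_to_copy(2000 * gb, 60 * gb, tiered)
        print(tiered, round(catch_up_seconds(backlog, 100e6, 20e6)), "seconds")
```

Tiered storage shrinks the backlog term, but the result is still a nonzero, data-dependent window rather than a stateless swap.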
Limitation 4: remote reads change latency and cost behavior
Remote reads are useful because older data remains queryable without keeping every segment on local disk. They are also different from local reads. A consumer replaying old data may go through object storage or another remote storage backend, with latency, throughput, request, and network behavior that differs from a local fetch served largely from page cache.
For platform teams, the issue is not that remote reads are unusable. The issue is predictability. Backfills, reprocessing jobs, audit reads, and recovery workflows often arrive in bursts. A quiet remote tier can look cost-effective until a large replay creates object requests, egress or inter-zone traffic, and latency tails that were not part of the original hot-path SLO.
Ask four questions before assuming remote reads are operationally invisible:
- Which consumer groups are allowed to replay old data at high parallelism?
- Are object storage request costs and data transfer costs included in the Kafka cost model?
- How does the system protect hot traffic when remote-read demand spikes?
- What cache or prefetch behavior exists for repeated historical reads?
Many teams can tolerate slower historical reads because the requirement is recoverability rather than low-latency replay. But if replay speed is part of the service contract, tiered storage needs a runbook and a cost model.
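A starting point for that cost model is sketched below. Everything here is an assumption to replace with measured values: how many bytes each object request returns depends on the remote storage plugin, prices depend on the provider and network path, and the sketch assumes no caching, so each consumer group reads the full range from the remote tier. The function and parameter names are hypothetical.

```python
def replay_cost(replay_bytes: int,
                bytes_per_request: int,
                consumer_groups: int,
                price_per_1k_requests: float,
                price_per_gb_transfer: float) -> dict:
    """Estimate object requests, transferred data, and cost for one historical replay."""
    # Integer ceiling: a partial final chunk still costs one request.
    requests_per_group = (replay_bytes + bytes_per_request - 1) // bytes_per_request
    requests = requests_per_group * consumer_groups
    transferred_gb = replay_bytes * consumer_groups / 1e9
    cost = (requests / 1000) * price_per_1k_requests + transferred_gb * price_per_gb_transfer
    return {"requests": requests, "transferred_gb": transferred_gb, "cost": cost}


if __name__ == "__main__":
    # Placeholder prices, not real provider rates.
    estimate = replay_cost(
        replay_bytes=5 * 10 ** 12,
        bytes_per_request=8 * 10 ** 6,
        consumer_groups=3,
        price_per_1k_requests=0.0004,
        price_per_gb_transfer=0.01,
    )
    print(estimate)
```

Running the estimate for the largest plausible backfill, not the average one, is what exposes the bursts the paragraph above describes.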
Limitation 5: metadata and operations become a new surface area
Remote log storage adds an operational surface that traditional local-only Kafka did not have. The cluster now has remote log metadata, object lifecycle concerns, storage permissions, upload and deletion behavior, remote fetch paths, and failure modes that bridge Kafka and the external storage system.
This is a fair tradeoff when the retention benefit is large, but it is still a tradeoff. Operators need to answer where remote log metadata is stored, how metadata remains consistent with objects, what happens when uploads lag, what metrics indicate remote-tier health, and how object lifecycle policies interact with Kafka retention.
The hard part is that many failures are cross-system failures. A broker can be healthy while remote uploads are delayed. Object storage can be reachable but throttled. Metadata can indicate a remote segment that an operator cannot fetch because permissions changed. Deletion can become a correctness and cost issue if Kafka and bucket lifecycle rules disagree.
Tiered storage therefore moves some problems out of broker disks and into storage integration. That is often worth doing. It should still be treated as a production dependency with its own observability, quotas, access controls, and incident drills.
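One cross-system check that is easy to automate is comparing Kafka's time-based retention with any expiration rule on the bucket. The sketch below is illustrative only: the function name, the finding strings, and the seven-day safety margin are hypothetical, and it looks at `retention.ms` alone, not size-based retention.

```python
from typing import List, Optional

MS_PER_DAY = 86_400_000


def lifecycle_findings(retention_ms: int,
                       bucket_expiration_days: Optional[int],
                       safety_margin_days: int = 7) -> List[str]:
    """Flag bucket expiration rules that could delete segments Kafka still expects."""
    if bucket_expiration_days is None:
        return []  # no bucket expiry: only Kafka deletes remote segments
    if retention_ms == -1:
        # retention.ms = -1 means no time limit, so any expiry rule conflicts.
        return ["correctness: bucket expires objects but Kafka retention is unlimited"]
    retention_days = retention_ms / MS_PER_DAY
    if bucket_expiration_days < retention_days:
        return ["correctness: bucket may expire objects Kafka metadata still references"]
    if bucket_expiration_days < retention_days + safety_margin_days:
        return ["risk: bucket expiry is within the safety margin of Kafka retention"]
    return []


if __name__ == "__main__":
    print(lifecycle_findings(30 * MS_PER_DAY, 14))
```

A check like this belongs alongside upload-lag and remote-fetch alerts, because a broker-side dashboard alone will not show a lifecycle rule changing in the storage account.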
Why these limits come from an "older data, remote tier" design
The common thread is architectural priority. In tiered storage, the remote tier is a place for older log segments after they leave the hot local tier. It is not necessarily the primary home of all durable Kafka data from the moment records are acknowledged. The broker-local tier remains the first-class participant for active writes and reads.
That explains the limits without turning them into surprises:
- Hot data remains local because the remote tier is not the primary hot write path.
- Broker disks still matter because the local tier still carries active log responsibility.
- Reassignment still has data concerns because active replicas still exist locally.
- Remote reads have different performance and cost because they are historical fetches through an external backend.
- Metadata and operations expand because the system now coordinates two storage tiers.
The design is coherent. The mistake is evaluating it as if it were shared primary storage.
What shared primary storage changes
Shared primary storage asks a different question: can the durable log be externalized from broker-local disks so brokers behave more like replaceable compute over shared durable state? In that model, object storage or S3-compatible storage is not merely a remote tier for older segments. It is part of the primary storage architecture, typically combined with a write-ahead path, metadata ownership rules, fencing, caching, and object layout designed for streaming workloads.
This is where AutoMQ naturally enters the discussion. AutoMQ is a Kafka-compatible streaming platform that uses S3Stream, a shared streaming storage layer built on object storage, to separate durable stream storage from broker-local disks. Its stateless broker architecture is not the same claim as "Kafka tiered storage is enabled." The claim is that persistent Kafka data is designed around shared object storage, while brokers handle Kafka protocol serving, leadership, caching, and runtime responsibilities without being the long-lived home of durable log history.
That distinction matters for platform teams because it changes the questions in an architecture review:
- Can broker replacement proceed as compute reassignment rather than local log reconstruction?
- Can capacity planning separate storage growth from broker count more cleanly?
- Can scaling events focus on traffic and partition responsibility rather than copying durable history between broker disks?
- Does the system preserve Kafka protocol behavior while changing the storage substrate?
AutoMQ still needs to be evaluated with production-shaped tests. Shared storage does not make latency, fencing, cache warm-up, object storage throttling, or metadata correctness vanish. It changes where the hard engineering sits. For teams whose pain is storage-compute coupling rather than long retention alone, that is the more relevant architecture question.
A decision matrix after tiered storage
After a team has already studied or adopted tiered storage, the next decision should be specific. Do not ask whether remote storage is good in the abstract. Ask which constraint remains after remote log storage is in place.
Use this matrix in the architecture review:
| Remaining constraint | Tiered storage may be enough when | Evaluate shared primary storage when |
|---|---|---|
| Hot data | Hot working set is modest and predictable | Hot data still drives broker disk and I/O sizing |
| Local disk | Disk pain is mostly long retention | Broker disks still dominate capacity planning |
| Recovery | Broker replacement is infrequent and well rehearsed | Recovery and scaling windows are operational bottlenecks |
| Remote reads | Historical replay can be slower and controlled | Replay speed and cost are part of the service contract |
| Operations | Team can operate remote-tier metadata and storage policies | Team wants storage and compute responsibilities separated |
This framing keeps tiered storage in its proper place: a valuable feature for extending retention and relieving local-disk pressure. It also prevents teams from stopping one step early when the real objective is less broker-local durable state.
Evaluation checklist for platform teams
Before declaring tiered storage "enough," run a small but serious review:
- Map the hot set. Estimate the data volume and throughput that remains local under normal traffic, peak traffic, consumer lag, and compaction.
- Model remote reads. Include object requests, data transfer, throttling behavior, replay parallelism, and hot-traffic isolation.
- Drill broker replacement. Measure what happens when a broker disappears under produce traffic and when replacement capacity joins.
- Trace metadata failure modes. Validate remote segment metadata, object access, deletion, lifecycle policies, and alerting.
- Separate retention goals from elasticity goals. If the goal is longer history, tiered storage may be direct. If the goal is stateless broker behavior, evaluate shared primary storage.
The most useful outcome is not a universal winner. It is a precise statement of what problem remains. Tiered storage can remove a painful retention ceiling. Shared primary storage, including AutoMQ's object-storage-backed architecture, becomes relevant when the platform team wants Kafka compatibility while changing the broker-local storage contract itself.
References
- Apache Kafka Tiered Storage documentation
- Apache Kafka KIP-405: Kafka Tiered Storage
- Apache Kafka Tiered Storage configuration
- AutoMQ architecture overview
- AutoMQ S3Stream shared streaming storage overview
- AutoMQ Stateless Broker documentation
FAQ
What is the main limitation of Kafka tiered storage?
The main limitation is scope. Kafka tiered storage can offload older completed log segments to a remote tier, reducing local retention pressure, but active data still lives on broker disks and broker-local operational responsibilities remain.
Does Kafka tiered storage make brokers stateless?
No. Tiered storage does not automatically make Kafka brokers stateless. A stateless broker architecture requires durable log ownership to be separated from broker-local disks, not merely older segments copied to remote storage.
When is Kafka tiered storage enough?
Tiered storage can be enough when long retention is the dominant problem, historical reads are controlled, and broker-local recovery and scaling are already acceptable. It is a targeted improvement to retention, not a change to the broker storage model.
Why can remote reads become expensive?
Remote reads may use object storage or another external backend. Large replays can create request volume, data transfer, throttling exposure, and latency tails that are different from local fetch behavior.
How is shared primary storage different from remote log storage?
Remote log storage usually stores older segments outside the broker. Shared primary storage treats the external storage layer as the primary durable home for stream data, with brokers acting more like replaceable compute over shared state.
Where does AutoMQ fit?
AutoMQ fits when the desired change is storage-compute separation rather than retention offload alone. It keeps Kafka protocol compatibility while using S3Stream and shared object storage so persistent stream data is not tied to broker-local disks.