The phrase "object storage for Kafka" sounds precise until two teams use it differently. One team may mean tiered storage: recent data stays on broker-local disks, and older segments are offloaded to S3-compatible object storage. Another team may mean object-storage-native or shared-storage Kafka: object storage is the primary durable repository, while brokers act more like compute nodes with WAL and cache around the hot path. Those designs can both involve S3, but they do not create the same cost model, recovery behavior, or scaling story.
This distinction matters when the search starts with Redpanda tiered storage. Redpanda's documentation describes Tiered Storage as a feature that offloads log segments to object storage and lets operators configure how much data remains local. Apache Kafka's Tiered Storage documentation uses the same basic split between a local tier and a remote tier. Object-storage-native Kafka takes a different architectural bet: broker-local disks are no longer the long-term source of truth for the log.
The practical question is not whether object storage appears in the diagram. It is whether object storage is a cold tier behind a broker-local hot path or the durable center of the system.
Why Storage Terminology Matters
Kafka operators rarely evaluate storage in isolation. Retention affects disk sizing, disk sizing affects broker count, broker count affects replication traffic, and replication traffic affects the failure plan. A small vocabulary mistake can turn into a large architecture mistake because "we use object storage" does not answer where active durable state lives.
Tiered storage starts from the broker-local mental model. Producers write to a leader, recent log segments remain local, and retention pressure is reduced by moving older segments to a remote object store. If a consumer asks for older offsets, the broker can fetch from the remote tier, but the local tier still serves the active workload and participates in the immediate write path.
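The local-first read path described above can be sketched as a small routing decision. This is a toy model, not Redpanda or Apache Kafka code: the class name, field names, and the `tier_for` method are hypothetical, and it assumes a single boundary offset separates the local tier from the remote tier.

```python
from dataclasses import dataclass


@dataclass
class TieredPartition:
    """Toy model of one partition in a tiered (local-first) log.

    All names are hypothetical and chosen for illustration only.
    """
    log_start_offset: int    # oldest offset still retained in any tier
    local_start_offset: int  # oldest offset still on the broker's local disk
    log_end_offset: int      # next offset to be written

    def tier_for(self, offset: int) -> str:
        """Return which tier would serve a fetch at `offset`."""
        if offset < self.log_start_offset or offset >= self.log_end_offset:
            return "out_of_range"
        if offset >= self.local_start_offset:
            return "local"   # hot path: served from broker-local segments
        return "remote"      # older data: fetched from object storage


partition = TieredPartition(log_start_offset=100,
                            local_start_offset=9_000,
                            log_end_offset=10_000)
print(partition.tier_for(9_500))  # local
print(partition.tier_for(500))    # remote
```

The point of the sketch is that the remote tier only comes into play below the local boundary; writes and tailing reads never leave the local branch.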
Object-storage-native Kafka starts from the opposite direction. Durable log data is stored in shared object storage as the primary repository. Brokers still matter, but they hold far less durable state of their own. They process requests, coordinate leadership, maintain cache, and write through a WAL layer, but long-lived data is not tied to a broker disk. AutoMQ is an example: its public architecture describes S3Stream as replacing Kafka log storage with WAL storage, S3 storage, and data caching.
That difference changes the test plan:
- Retention growth: Does longer retention require larger broker disks, or does it mainly grow object storage?
- Broker replacement: Does a replacement node need large local data movement before it becomes useful?
- Scale-in: Can compute shrink after a peak window without first evacuating a large local data footprint?
- Catch-up reads: Are older reads a remote-tier exception, or is remote-backed access part of the normal design?
- Cost accounting: Are storage, replication, cache, WAL, API operations, and network visible as separate levers?
None of these questions makes tiered storage wrong. It is a strong tool when the main pain is long retention on local disks. The point is narrower: tiered storage is not a synonym for shared storage.
What Redpanda Tiered Storage Solves
Redpanda is a Kafka API-compatible streaming platform with its own storage engine and Raft-based replication model. In a standard Redpanda deployment, broker-local storage remains central to the active log. Tiered Storage extends that model by offloading older log segments to object storage. Redpanda documents this as a way to retain more data without keeping all retained data on local disks, and its remote read path can serve historical reads from the object store.
That is useful for retention-heavy workloads. A team with short hot windows and long replay requirements may not want every broker sized for the full historical window. Apache Kafka's Tiered Storage follows a similar pattern: data is split between local and remote log tiers. The trade-off is that tiering does not remove the broker-local hot path. Local disk capacity, local IO, replica placement, leadership balance, and broker replacement still matter.
| Evaluation area | Redpanda Tiered Storage | What to verify |
|---|---|---|
| Long retention | Older log segments can be offloaded to object storage. | Local retention settings, remote read behavior, object storage cost, and restore assumptions. |
| Hot reads and writes | Recent data remains broker-local. | Local disk, CPU, cache, and Raft replication under peak traffic. |
| Historical replay | Consumers may read older data through the remote tier. | Catch-up throughput, latency, object request rate, and cache behavior. |
| Broker operations | Brokers still own local state for the active workload. | Replacement, draining, rebalance, and replica placement under failure. |
The table is mechanism-first because object storage is not a magic cost eraser. It reduces some storage pressure, but the surrounding data path decides whether the operational model changes.
What Object-Storage-Native Kafka Changes
Object-storage-native Kafka is a storage architecture shift, not a retention feature. Durable data moves out of broker-local disks and into shared object storage. The system then needs a design that turns object storage into a streaming substrate: a WAL for durable low-latency appends, object organization to avoid small-file and API-cost problems, metadata to map offsets to objects, and cache for hot reads.
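To make the "metadata to map offsets to objects" requirement concrete, here is a minimal sketch of an offset index for a single partition. It is an illustration under simplifying assumptions, not any vendor's metadata format: the `ObjectIndex` class and the object key strings are hypothetical, and objects are assumed to cover contiguous offset ranges registered in order.

```python
import bisect
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class ObjectIndex:
    """Hypothetical per-partition index from offset ranges to object keys."""
    base_offsets: List[int] = field(default_factory=list)         # sorted first offset per object
    entries: List[Tuple[int, str]] = field(default_factory=list)  # (end_offset_exclusive, key)

    def add(self, base_offset: int, end_offset: int, key: str) -> None:
        """Register an object holding offsets [base_offset, end_offset)."""
        if end_offset <= base_offset:
            raise ValueError("object must contain at least one offset")
        if self.entries and base_offset != self.entries[-1][0]:
            raise ValueError("offset ranges must be contiguous and in order")
        self.base_offsets.append(base_offset)
        self.entries.append((end_offset, key))

    def lookup(self, offset: int) -> Optional[str]:
        """Return the key of the object containing `offset`, or None."""
        i = bisect.bisect_right(self.base_offsets, offset) - 1
        if i < 0:
            return None
        end_offset, key = self.entries[i]
        return key if offset < end_offset else None


index = ObjectIndex()
index.add(0, 1_000, "partition-0/object-a")
index.add(1_000, 2_500, "partition-0/object-b")
print(index.lookup(1_200))  # partition-0/object-b
```

Binary search keeps lookups logarithmic in the number of objects, which is one reason object organization matters: many tiny objects inflate both this index and the request count against the object store.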
AutoMQ is useful as a concrete example because its public materials are explicit about the design. The architecture overview says object storage is the primary repository for data, while S3Stream documentation describes WAL storage, S3 storage, and data caching as the replacement for Kafka's native log storage. This does not mean object-storage-native Kafka is "Kafka with no storage work." It means the storage work moves to WAL, object storage, cache, metadata, and network paths.
The cost path changes because retention no longer has to scale broker-local disks in the same way. Long-lived data can grow in object storage, while compute can be sized around active traffic, cache behavior, and failure headroom. In cloud environments, that separation can matter when traffic is bursty or retention grows faster than throughput.
The failure path changes too. In a broker-local design, replacing a broker means the cluster has to account for local replicas, placement, and movement. In a shared-storage design, durable data already sits in shared storage, so another broker can take over the compute role with less local byte movement. Metadata correctness, WAL recovery, and cache warm-up remain critical, but the center of gravity shifts away from reconstructing a local disk footprint.
Retention Economics
Retention is the obvious reason teams search for tiered storage. Kafka-like systems often start with a short retention window, then analytics, fraud investigation, ML backfills, audit requirements, or replay-heavy pipelines stretch that window. At first, the answer looks like a bigger disk. Then bigger disks become bigger brokers, bigger brokers become slower replacement events, and retention starts shaping the whole cluster.
Tiered storage helps by reducing how much historical data must stay local. For workloads where most consumers read near the tail and only occasional jobs replay older data, that can be a strong match. The local tier remains sized for hot data, while the remote tier carries older segments. The team should still model object storage reads, remote fetch latency, and cache behavior during replay, because the bill and user experience move rather than disappear.
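A rough sizing sketch shows how the split plays out. The function below is a back-of-the-envelope model with hypothetical names and inputs. It assumes steady ingress, assumes local bytes are multiplied by the replication factor, and assumes the remote tier is counted as one logical copy of data older than the local window. Real systems may upload segments before they expire locally, so the actual remote footprint can be larger.

```python
def tier_footprint_gb(ingress_mb_per_s: float,
                      total_retention_h: float,
                      local_retention_h: float,
                      replication_factor: int = 3) -> dict:
    """Estimate cluster-wide local and remote bytes for a tiered setup.

    Simplified model for illustration; every parameter is an assumption
    supplied by the caller, not a measured or vendor-published value.
    """
    if local_retention_h > total_retention_h:
        raise ValueError("local retention cannot exceed total retention")
    mb_per_hour = ingress_mb_per_s * 3600
    return {
        # Hot window, stored once per replica on broker disks.
        "local_gb": mb_per_hour * local_retention_h * replication_factor / 1000,
        # Older data, counted as a single logical copy in object storage.
        "remote_gb": mb_per_hour * (total_retention_h - local_retention_h) / 1000,
        # Baseline: the same retention kept entirely on broker disks.
        "all_local_gb": mb_per_hour * total_retention_h * replication_factor / 1000,
    }


# 50 MB/s ingress, 7 days total retention, 6 hours kept local.
print(tier_footprint_gb(50, total_retention_h=168, local_retention_h=6))
```

In Apache Kafka the two windows correspond loosely to the topic settings `retention.ms` and `local.retention.ms`; the sketch ignores segment roll timing and byte-based limits.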
Object-storage-native Kafka changes the baseline. Retention growth is planned around object storage from the start, not bolted on as an overflow tier. Compute, WAL, cache, object storage, and network each become separate cost levers. Active traffic belongs mostly to compute and cache; retained bytes belong mainly to object storage; the WAL bridges streaming semantics with cloud storage primitives.
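One way to keep those levers visible is to model each as its own line item instead of a single blended per-broker cost. The sketch below is only a bookkeeping aid: the field names are hypothetical, every price is a placeholder the reader must supply from their own cloud bill, and cache is assumed to be folded into the compute node price.

```python
from dataclasses import dataclass


@dataclass
class MonthlyCostInputs:
    """Hypothetical inputs; none of these values come from a vendor price list."""
    compute_nodes: int
    node_price: float             # per node per month, cache included
    wal_gb: float
    wal_price_per_gb: float
    retained_gb: float
    object_price_per_gb: float
    api_requests_millions: float
    api_price_per_million: float
    cross_zone_gb: float
    network_price_per_gb: float


def monthly_cost_breakdown(c: MonthlyCostInputs) -> dict:
    """Return one line per cost lever plus a total."""
    lines = {
        "compute_and_cache": c.compute_nodes * c.node_price,
        "wal": c.wal_gb * c.wal_price_per_gb,
        "object_storage": c.retained_gb * c.object_price_per_gb,
        "api_requests": c.api_requests_millions * c.api_price_per_million,
        "network": c.cross_zone_gb * c.network_price_per_gb,
    }
    lines["total"] = sum(lines.values())
    return lines


# Placeholder numbers chosen only to exercise the function.
example = MonthlyCostInputs(4, 100.0, 10, 0.5, 1000, 0.25, 8, 0.5, 200, 0.125)
print(monthly_cost_breakdown(example))
```

Keeping the lines separate makes it easier to see which lever moves when retention doubles (object storage) versus when throughput doubles (compute, WAL, requests, network).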
Catch-Up Reads: Remote Tier vs Normal Design Path
Catch-up reads are where many architectures reveal their assumptions. A tailing consumer reads close to the end of the log and expects low-latency access to fresh records. A catch-up consumer may ask for hours, days, or weeks of older offsets after downtime, during a backfill, or when a new downstream application comes online. Those two access patterns stress different parts of the system.
In Redpanda Tiered Storage, remote reads are tied to the fact that older segments may live in object storage. The operational questions are direct: how fast can remote reads serve the replay, what happens to broker resources during that replay, and how do object storage request rates and network paths affect the incident?
In object-storage-native Kafka, older reads are not an exception to a local-first design. The system expects object storage to contain durable data and uses cache, prefetching, and metadata to serve reads efficiently. The hard problems are cache warming, object layout, metadata scale, batching, and avoiding inefficient object access patterns. Production evaluations need to cover both access patterns, along with failure behavior and storage metrics:
- Tailing reads under normal producer load, with realistic message sizes and consumer fan-out.
- Catch-up reads from offsets outside the hot window, including multiple consumers replaying at once.
- Broker replacement during replay, because the worst incidents combine failure and backlog.
- Object storage metrics, including request rate, throttling, latency, and data transfer path.
If tiered storage is enough, these tests should show that remote-tier reads meet the replay objective without making broker operations fragile. If object-storage-native Kafka is the better fit, they should show that shared durable storage and cache behavior reduce recovery and scaling pain without violating latency requirements.
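A replay objective is easier to test when it is written down as arithmetic first. The sketch below estimates how long a catch-up consumer needs to reach the tail while producers keep writing. It is a simplification that assumes constant rates; the function name and parameters are hypothetical, and the replay rate should come from a measured test, not a datasheet.

```python
import math


def catch_up_seconds(backlog_mb: float,
                     replay_mb_per_s: float,
                     produce_mb_per_s: float) -> float:
    """Time for a consumer to drain `backlog_mb` while new data keeps arriving.

    The consumer only gains on the log at (replay rate - produce rate).
    Returns math.inf when the consumer can never catch up.
    """
    if backlog_mb <= 0:
        return 0.0
    net_rate = replay_mb_per_s - produce_mb_per_s
    if net_rate <= 0:
        return math.inf
    return backlog_mb / net_rate


# 360 GB backlog, 150 MB/s measured replay, 50 MB/s ongoing ingress.
print(catch_up_seconds(360_000, 150, 50))  # 3600.0 seconds
```

Running the same calculation with the replay rate observed during a broker replacement drill shows quickly whether the combined failure-plus-backlog case still meets the objective.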
Scaling And Recovery: The Part Tiering Does Not Fully Change
Scaling is not only adding nodes. It is what happens to the data while nodes are added, removed, drained, or replaced. In local-storage architectures, data placement and compute placement are tightly coupled. Even with a remote tier, the active local footprint still has to be balanced across brokers.
Tiered storage can reduce the amount of historical data involved in local operations. That is valuable. But the hot set still lives locally, and the cluster must keep enough local capacity for active writes, tailing reads, and failure headroom.
Object-storage-native Kafka tries to make broker lifecycle closer to compute lifecycle. Since durable data is in shared storage, a replacement broker can become useful without first receiving the full historical log. The exact result depends on WAL recovery, metadata propagation, cache warm-up, controller behavior, and cloud storage performance.
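The difference can be framed as a simple estimate of time until a replacement broker is useful. This is a sketch under stated assumptions, not a benchmark: the function and its parameters are hypothetical, it treats local data copy as bandwidth-bound, and it lumps WAL recovery and metadata propagation into fixed delays that must be measured in a real drill.

```python
def replacement_seconds(local_bytes_gb: float,
                        copy_gb_per_s: float,
                        wal_recovery_s: float = 0.0,
                        metadata_s: float = 0.0) -> float:
    """Rough time until a replacement broker can serve its partitions.

    local_bytes_gb: data that must be copied onto the new broker's disk.
    copy_gb_per_s:  sustained copy bandwidth available for that movement.
    wal_recovery_s, metadata_s: fixed delays (assumed, to be measured).
    """
    if copy_gb_per_s <= 0:
        raise ValueError("copy bandwidth must be positive")
    return local_bytes_gb / copy_gb_per_s + wal_recovery_s + metadata_s


# Local-first design: the hot set for this broker has to be rebuilt on disk.
print(replacement_seconds(local_bytes_gb=540, copy_gb_per_s=0.5))  # 1080.0

# Shared-storage design: no bulk copy, but recovery steps still take time.
print(replacement_seconds(0, 0.5, wal_recovery_s=20, metadata_s=5))  # 25.0
```

The numbers are placeholders. The model deliberately leaves out cache warm-up, which affects read latency after the broker is back rather than the moment it becomes available.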
The useful distinction is simple: tiering reduces retained local bytes; shared storage reduces the durable state attached to a broker. Those are related, but they are not the same operation.
Where AutoMQ Fits
AutoMQ fits naturally into the object-storage-native/shared-storage Kafka category. It keeps Kafka protocol compatibility while replacing Kafka's broker-local log storage with S3Stream. Public AutoMQ documentation describes S3Stream as a shared streaming storage layer made of WAL storage, S3 storage, and data caching. It also describes stateless brokers, where durable data is not bound to local disks.
The important nuance is the WAL. Object storage is durable and cost-effective for retained data, but it is not a drop-in local disk for streaming appends. AutoMQ uses a WAL layer to make writes durable before data is uploaded and organized in object storage. AutoMQ Open Source uses S3-compatible storage for the WAL path, while AutoMQ commercial editions document additional WAL storage options.
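The write-then-upload pattern can be illustrated with a toy in-memory model. This is not AutoMQ's implementation or API: the `ToyWalLog` class, its fields, and the object key format are all hypothetical, a Python list stands in for durable WAL storage, and a dict stands in for an object store bucket.

```python
from typing import Dict, List


class ToyWalLog:
    """Toy model: acknowledge after the WAL append, upload in batches later."""

    def __init__(self, upload_threshold: int = 3) -> None:
        self.wal: List[bytes] = []                  # stand-in for durable WAL storage
        self.objects: Dict[str, List[bytes]] = {}   # stand-in for an object store
        self.next_offset = 0                        # next offset to assign
        self.wal_base_offset = 0                    # offset of the first record in the WAL
        self.upload_threshold = upload_threshold

    def append(self, record: bytes) -> int:
        """Persist to the WAL and return the assigned offset."""
        self.wal.append(record)       # the write could be acknowledged at this point
        offset = self.next_offset
        self.next_offset += 1
        if len(self.wal) >= self.upload_threshold:
            self.flush()              # batch upload avoids one object per record
        return offset

    def flush(self) -> None:
        """Upload buffered records as one object, then trim the WAL."""
        if not self.wal:
            return
        key = f"log/{self.wal_base_offset:020d}-{self.next_offset - 1:020d}"
        self.objects[key] = list(self.wal)
        self.wal_base_offset = self.next_offset
        self.wal.clear()


log = ToyWalLog(upload_threshold=3)
for payload in (b"a", b"b", b"c", b"d"):
    log.append(payload)
print(sorted(log.objects))  # one object covering offsets 0..2
print(log.wal)              # [b'd'] still waiting in the WAL
```

Batching is the design choice worth noticing: the WAL absorbs small, latency-sensitive appends so the object store only sees larger, less frequent writes.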
AutoMQ should not be evaluated as "tiered storage but more cloud-native." It is a different architecture category. The right comparison is against the specific workload constraint: long retention, broker replacement, scale-in, replay behavior, BYOC boundaries, or Kafka compatibility. If local disk usage for old data is the main pain, tiered storage may be enough. If every capacity or failure event behaves like a data movement project, object-storage-native Kafka deserves its own proof of concept.
Choosing The Right Storage Boundary
Redpanda Tiered Storage and object-storage-native Kafka both respond to a real cloud problem: broker-local disks are an awkward place to keep ever-growing streaming history. Redpanda addresses that problem by extending the local-storage model with a remote object tier. Object-storage-native Kafka addresses it by making object storage the durable center and designing the broker layer around that choice.
That is the decision boundary. If the pain is mostly historical retention, tiered storage can be the smaller change. If the pain is the coupling of compute, storage, scaling, and recovery, shared storage changes more of the operating model. Run the proof of concept with real producers, consumers, retention settings, replay patterns, broker failure drills, object storage metrics, and network placement.
The search may start with Redpanda tiered storage, but the real question is where durable state should live. Use tiering when a remote cold tier solves the problem; evaluate object-storage-native Kafka when the broker-local source of truth is the problem.
If your review points toward Kafka-compatible shared storage, start with the AutoMQ architecture overview and test the design against your own retention, replay, and broker replacement scenarios.
FAQ
Is Redpanda Tiered Storage the same as shared storage?
No. Redpanda Tiered Storage offloads older log segments to object storage while recent data and the active hot path remain broker-local. Shared storage means durable data is stored in a shared storage layer as the primary repository.
Does Apache Kafka Tiered Storage make Kafka object-storage-native?
No. Apache Kafka Tiered Storage adds a remote log tier to the Kafka storage model. It improves retention economics and adds a remote read path for older data, but it does not make object storage the primary durable log store behind stateless brokers.
When is tiered storage enough?
Tiered storage is often enough when long retention consumes too much local broker disk, while scaling, replacement, and recovery behavior remain acceptable. Test remote read throughput, object storage cost, and replay behavior.
When should object-storage-native Kafka be evaluated?
Evaluate object-storage-native Kafka when retention growth, broker replacement, elastic scaling, or replay-heavy workloads are painful because durable data is attached to brokers. Test tailing reads, catch-up reads, failure recovery, object storage behavior, and cost.
Is AutoMQ a tiered storage system?
No. AutoMQ is better described as a Kafka-compatible shared-storage, object-storage-native system. Its architecture uses S3Stream, WAL storage, S3-compatible object storage, and data caching.
Does shared storage remove the need for benchmarking?
No. Shared storage changes the failure and cost model, but results depend on WAL implementation, cache hit rate, object layout, metadata scale, cloud network paths, and workload shape. Benchmark with production-like client settings and failure drills.
References
- Redpanda Docs: Tiered Storage
- Redpanda Docs: Architecture
- Apache Kafka Docs: Tiered Storage
- Apache Kafka KIP: KIP-405: Kafka Tiered Storage
- AutoMQ Docs: Architecture overview
- AutoMQ Docs: S3Stream shared streaming storage
- AutoMQ Docs: WAL storage
- AutoMQ Docs: Stateless broker
- AutoMQ GitHub: AutoMQ repository