Kafka teams are asking a different storage question in 2026. The old question was how much broker disk to provision, how many replicas to keep, and how often to rebalance partitions. The newer question is whether the broker disk should be the primary durable storage layer at all. Object storage has become too durable, too elastic, and too operationally convenient for streaming vendors to ignore.
That does not mean every "Kafka on S3" product is the same thing. Some platforms write the active log path to object storage and make brokers mostly stateless. Some keep Kafka's local-disk write path and use object storage for older segments. Some are roadmap-level Apache Kafka work rather than generally available product features. If you blur those categories, you will make a bad architecture decision while feeling very current.
Quick Answer
If you want a deployable diskless Kafka-style platform today, shortlist AutoMQ, WarpStream, and Aiven Inkless first. If you want upstream Apache Kafka's direction, track KIP-1150 and its follow-up KIPs, but treat it as accepted design work rather than a finished Kafka feature. If you mostly need longer retention on lower-cost storage, Apache Kafka tiered storage, Confluent Platform tiered storage, Redpanda tiered storage, and Apache Pulsar's BookKeeper-plus-tiered-storage architecture may fit even though they are not the same as diskless Kafka.
| Platform or approach | Category | Object storage role | Kafka compatibility | 2026 maturity signal |
|---|---|---|---|---|
| AutoMQ | Diskless Kafka-compatible platform | Primary durable stream storage through S3-compatible object storage | Kafka protocol compatible | Deployable open source and private BYOC paths |
| WarpStream | Diskless Kafka-compatible service | Agents write/read data through object storage with cloud metadata | Kafka protocol compatible | Deployable BYOC product under Confluent |
| Aiven Inkless | Managed Kafka with diskless topics | Opt-in diskless topics store retained data in object storage | Standard Kafka APIs and clients | Available on Aiven Cloud and BYOC according to Aiven docs |
| Apache Kafka KIP-1150 | Upstream Kafka roadmap | Proposed diskless topic type using remote storage | Native Kafka when implemented | KIP accepted; implementation details live in follow-up KIPs |
| Apache Kafka / Confluent tiered storage | Tiered storage | Older closed segments move to remote storage | Native Kafka | Production-ready for retention use cases, not diskless active writes |
| Redpanda tiered storage | Kafka API-compatible tiered storage | Remote write/read offloads segments to object storage | Kafka API-compatible | Enterprise feature for retention and recovery patterns |
| Apache Pulsar | Adjacent streaming architecture | BookKeeper handles persistent storage; tiered storage offloads older data | Not Kafka-native by default | Mature streaming system, but not a Kafka-compatible drop-in by itself |
The ranking is less important than the category. Diskless Kafka changes the active write path and broker state model. Tiered storage changes long-term retention economics. Both use object storage, but they solve different parts of the problem.
What "Diskless Kafka" Actually Means
Diskless does not literally mean that no process ever touches a disk. Even Apache Kafka's KIP-1150 clarifies that diskless topics may still use broker disk for metadata, temporary buffering, or caching. The important distinction is whether broker disks remain the primary durable store for user data.
Classic Kafka writes records to broker-local log segments, replicates those segments to other brokers for durability, and moves partition ownership through data movement when the cluster changes. Tiered storage improves that model by moving inactive or closed segments to object storage, but the hot write path still depends on broker-local durable storage. Diskless Kafka tries to move durability for user data into shared remote storage earlier, so broker identity and durable data ownership are less tightly coupled.
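That shift in where the durable copy lives can be sketched as a toy model. Everything below is illustrative Python, not any vendor's implementation: the class names, the batch size of three, and the in-memory "object store" are assumptions made purely for the sketch.

```python
# Toy model of a diskless write path: the broker is compute plus a
# short-lived buffer, while durability lives in shared storage.
# Illustrative only; no real system works exactly like this.

class ObjectStore:
    """Stands in for S3: shared and durable, survives broker loss."""
    def __init__(self):
        self.objects = {}

    def put(self, key, data):
        self.objects[key] = bytes(data)

class DisklessBroker:
    """Compute node: buffers records, flushes batches to shared storage.
    Local state is only a buffer, never the system of record."""
    def __init__(self, store, metadata):
        self.store = store        # shared object store (durable)
        self.metadata = metadata  # shared offset/ordering metadata
        self.buffer = []          # local, lost on crash until flushed

    def produce(self, topic, record):
        self.buffer.append((topic, record))
        # Real systems tie the producer ack to durable persistence;
        # here we simply flush every batch of 3 records.
        if len(self.buffer) >= 3:
            self.flush()

    def flush(self):
        for topic, record in self.buffer:
            offset = self.metadata.setdefault(topic, 0)
            self.store.put(f"{topic}/{offset}", record)
            self.metadata[topic] = offset + 1
        self.buffer.clear()

store, metadata = ObjectStore(), {}
broker = DisklessBroker(store, metadata)
for i in range(3):
    broker.produce("orders", f"event-{i}".encode())

print(len(store.objects))  # 3
print(metadata["orders"])  # 3
```

The point of the toy: once a batch is flushed, losing the broker loses only whatever was still sitting in its buffer, because the log and the offset metadata live outside it. In classic Kafka, the same failure loses a durable replica that must be re-replicated.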
That design shift matters because it attacks several cloud-era pain points at once:
- Storage over-provisioning. Traditional Kafka capacity planning ties retention, peak throughput, replication factor, and broker disk sizing together. Shared object storage lets durable capacity scale outside the broker fleet.
- Rebalancing cost. When data is bound to brokers, partition movement means data movement. When durable data is in shared storage, compute nodes can be replaced or rescaled with less storage reshuffling.
- Cross-zone replication traffic. Kafka's replication model was designed for a world where direct server-to-server copies were a reasonable durability primitive. In public cloud, those copies can become a line item.
- Failure recovery. If a broker is mostly compute plus cache, losing it is different from losing a stateful storage owner.
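The cross-zone line item in particular is easy to size with back-of-envelope arithmetic. The throughput, replica placement, and price below are illustrative placeholders, not cloud price quotes:

```python
# Back-of-envelope cross-zone replication cost for classic Kafka.
# All figures are illustrative assumptions, not a quote.

ingress_mb_s = 100        # sustained produce throughput into the cluster
cross_zone_copies = 2     # followers assumed to sit in other zones
price_per_gb = 0.02       # illustrative cross-AZ transfer price, $/GB

seconds_per_month = 30 * 24 * 3600
gb_per_month = ingress_mb_s * seconds_per_month / 1024
replication_gb = gb_per_month * cross_zone_copies
monthly_cost = replication_gb * price_per_gb

print(round(gb_per_month))  # GB ingested per month
print(round(monthly_cost))  # $ of cross-zone replication copies
```

With these made-up numbers, a 100 MB/s cluster generates roughly half a petabyte of cross-zone copy traffic per month, which is why shifting durability to zone-spanning object storage shows up directly on the bill.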
The trade-off is real. Object storage has different latency, request cost, consistency, and throughput behavior than local NVMe or block storage. Good diskless systems hide much of that with write-ahead logs, batching, caching, metadata services, or topic-level controls. They do not repeal physics. For low-latency hot paths, you still need to understand the write path instead of stopping at the phrase "S3-native."
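Batching is where that trade-off becomes concrete: object-store request pricing rewards fewer, larger writes, while producer latency rewards frequent flushes. The sketch below prices PUT requests against flush interval; the request price and writer count are illustrative assumptions:

```python
# Why diskless systems batch aggressively: monthly PUT-request cost as
# a function of flush interval. Prices and counts are illustrative.

put_price = 0.005 / 1000   # $ per PUT request, illustrative
writers = 10               # concurrent flushing agents or brokers
seconds_per_month = 30 * 24 * 3600

def monthly_put_cost(flush_interval_s):
    puts = writers * seconds_per_month / flush_interval_s
    return puts * put_price

for interval in (0.01, 0.1, 0.5):
    print(interval, round(monthly_put_cost(interval), 2))
```

Under these assumptions, flushing every 10 ms costs roughly fifty times what flushing every 500 ms does, which is the economic pressure behind the WALs, batching, and caches mentioned above.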
Top 7 Platforms and Approaches
1. AutoMQ
AutoMQ is a Kafka-compatible streaming platform built around storage-compute separation. Its documentation describes S3-compatible object storage as the actual storage location for stream data, with a write-ahead log (WAL) used for write acceleration and recovery rather than as the long-term durable store. That makes AutoMQ one of the clearest examples of a diskless Kafka architecture rather than Kafka plus a cold-storage add-on.
AutoMQ is especially relevant for teams that want Kafka protocol compatibility but do not want broker-local disks to shape every scaling and recovery event. The project also has an open source path: AutoMQ's licensing documentation states that AutoMQ Open Source is covered under the Apache License 2.0. For enterprise teams, BYOC is the more practical evaluation path when they want the control plane and data plane inside their own cloud account: AutoMQ's BYOC environment overview describes the environment console/control plane and Kafka service/data plane as deployed in the user-defined network environment.
The main evaluation question is operational fit. AutoMQ changes Kafka's storage layer, so your proof of concept should test the workloads that normally reveal storage-path assumptions: high fan-out reads, consumer lag catch-up, broker replacement, topic expansion, and bursty write patterns. If those tests pass, the architecture is compelling because it addresses cloud cost and elasticity at the mechanism level rather than through retention tuning alone.
2. WarpStream
WarpStream's architecture documentation describes a stateless Agent that speaks the Apache Kafka protocol and communicates with object storage and a cloud metadata store. That is a different mental model from operating a physical Kafka broker cluster. WarpStream separates storage from compute, separates data from metadata, and separates the data plane from the control plane.
WarpStream is a strong fit for teams that like the Kafka protocol but want a BYOC operating model where data sits in their object storage. It is also now part of Confluent: Confluent announced the acquisition of WarpStream to advance BYOC data streaming. That gives WarpStream a different commercial context than independent open source projects or cloud-provider-native Kafka services.
The trade-off is the managed metadata/control-plane dependency and latency profile. WarpStream's design reduces stateful broker operations, but you should evaluate metadata availability, private connectivity, latency-sensitive workloads, and exit paths. For workloads that prioritize lower operational overhead and object-storage economics over the lowest possible broker-local write latency, it belongs near the top of the shortlist.
3. Aiven Inkless
Aiven Inkless is one of the most interesting developments because it brings diskless topics into a managed Apache Kafka service. Aiven documents Inkless as running on Kafka 4.x, supporting both classic and diskless topics in the same service, and remaining compatible with standard Kafka APIs and clients. Diskless topics are opt-in, which is a sensible design for mixed workloads.
That per-topic model is important. Many Kafka estates contain a blend of latency-critical topics, retention-heavy topics, bursty ingestion topics, and topics that mostly exist because downstream consumers fall behind. A single storage mode rarely fits all of them. Inkless gives teams a way to test diskless behavior without turning every topic into an object-storage-first topic at once.
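One way to use a per-topic model is a coarse triage pass over the estate before any migration work. The helper below is an illustrative sketch, not Aiven tooling; the thresholds, field names, and topic requirements are all assumptions:

```python
# Illustrative triage for a mixed Kafka estate: per topic, decide whether
# to trial an opt-in diskless mode or stay classic. The 20 ms and 7-day
# thresholds are assumptions for the sketch, not recommendations.

def pick_storage_mode(p99_latency_ms_required, retention_days):
    # Ultra-low-latency hot paths favor broker-local storage today.
    if p99_latency_ms_required < 20:
        return "classic"
    # Retention-heavy, lag-tolerant topics fit object-storage-first well.
    if retention_days >= 7:
        return "diskless"
    return "classic"

# Hypothetical topics: (required p99 latency in ms, retention in days).
topics = {
    "payments.authorizations": (10, 3),
    "clickstream.raw": (500, 30),
    "audit.log": (1000, 365),
}
plan = {name: pick_storage_mode(*req) for name, req in topics.items()}
print(plan)
```

The value of an opt-in design is exactly that this kind of triage can be executed one topic at a time, with the latency-critical topics left untouched.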
The caution is managed-service scope. Aiven documents availability on Aiven Cloud and BYOC deployments, but buyers still need to confirm region, cloud, feature, and commercial details for their environment. Inkless is a serious option, but it should be evaluated as Aiven's managed Kafka implementation with diskless topic support, not as a generic upstream Kafka feature that any operator can enable in any Kafka distribution.
4. Apache Kafka KIP-1150
KIP-1150 is the upstream Apache Kafka signal everyone should watch. The KIP's current state is Accepted, and it frames diskless topics as a new topic type that can write through to object storage while using local disks for caching rather than primary durable user-data storage. It also explicitly distinguishes diskless topics from existing tiered storage by noting that tiered storage does not remove the need for replication and durable storage of active segments.
That acceptance matters because it shows the Kafka community agrees the direction belongs in Kafka. It does not mean diskless topics are already a production feature in Apache Kafka. The KIP itself says acceptance establishes the need and end-user requirements, while implementation details are handled by follow-up KIPs such as KIP-1163 and KIP-1164.
For platform teams, the practical stance is to track KIP-1150 for roadmap alignment while using current products for current deployments. If your organization has a strict upstream-only posture, KIP-1150 may support waiting or contributing. If you need object-storage-first economics now, you need to evaluate deployable products rather than cite the KIP as if it already changed your clusters.
5. Apache Kafka and Confluent Tiered Storage
Apache Kafka's tiered storage documentation covers remote log storage for Kafka, and KIP-405 is the core upstream work behind that direction. Confluent Platform has also documented tiered storage for years, with support for object stores such as S3, Google Cloud Storage, Azure Blob Storage, and S3-compatible systems in supported configurations.
Tiered storage is valuable, but it is not the same as diskless Kafka. The hot write path still starts with Kafka brokers and local log segments. Remote storage helps with longer retention, historical reads, broker storage pressure, and some recovery scenarios. It does not make every broker a stateless compute node in the same way a diskless architecture tries to.
This option is the right fit when you are committed to Kafka's native architecture and mainly need to reduce retention cost or keep more historical data online. It is the wrong fit if your primary requirement is to eliminate broker-local durable storage from the active write path. The overlap in vocabulary is real, but the operational outcomes differ.
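The retention economics are easy to make concrete. The sketch below compares broker-local disk for a topic with and without a short local retention window, in the spirit of KIP-405's split between total retention and local retention; every number is an illustrative assumption:

```python
# What tiered storage changes: broker-local disk needed when only a short
# local window stays on brokers while full retention lives in object
# storage. Throughput and retention figures are illustrative.

ingest_mb_s = 50
replication_factor = 3
total_retention_days = 30     # full retention window
local_retention_hours = 12    # hot window kept on broker disks

def disk_gb(retention_seconds):
    return ingest_mb_s * retention_seconds * replication_factor / 1024

without_tiering = disk_gb(total_retention_days * 24 * 3600)
with_tiering = disk_gb(local_retention_hours * 3600)

print(round(without_tiering))  # broker disk for full local retention
print(round(with_tiering))     # broker disk with a 12-hour local window
```

With these assumptions, tiering cuts broker disk by roughly sixty times, yet the active write path, replication factor, and broker statefulness are unchanged, which is the core distinction from diskless designs.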
6. Redpanda Tiered Storage
Redpanda is Kafka API-compatible rather than Apache Kafka itself, and its tiered storage documentation describes remote write uploads to object storage and remote read fetches from object storage. Redpanda supports common cloud object storage backends and positions tiered storage as a way to offload log segments while retaining access to older data.
This makes Redpanda relevant for object-storage streaming evaluations, especially when teams already like Redpanda's operational model. It is not, however, the same category as a diskless Kafka platform whose primary durable write path is object storage from the start. Redpanda's docs describe tiered storage as lowering storage cost by offloading log segments, with configurable local retention.
Redpanda should be evaluated when Kafka API compatibility is enough, JVM Kafka is not a requirement, and tiered retention matters. If your shortlist is specifically "diskless Kafka," keep Redpanda in the adjacent category unless Redpanda documents a fully diskless active write path for the version and deployment model you plan to use.
7. Apache Pulsar
Apache Pulsar belongs on this list because it has long separated serving and storage concerns more explicitly than Kafka. Pulsar brokers serve traffic, while Apache BookKeeper provides persistent message storage; Pulsar also documents tiered storage/offload patterns for moving older data to blob storage such as S3 or GCS. That architecture is object-storage-adjacent and highly relevant to the broader streaming-storage conversation.
But Pulsar is not diskless Kafka. It has its own protocol, operational model, metadata stack, client ecosystem, and topic semantics. Some vendors and deployments expose Kafka-compatible interfaces around Pulsar-like storage systems, but Apache Pulsar itself should not be treated as a drop-in Kafka replacement without compatibility testing.
Pulsar is worth evaluating when you are open to a different streaming system and want multi-tenancy, geo-replication patterns, and a storage architecture that is not classic Kafka. It is less attractive when your constraint is "standard Kafka clients and ecosystem tools with minimal application change." In that case, a Kafka-compatible diskless platform or upstream Kafka roadmap is a closer match.
Current Products vs Roadmap-Level Architecture
The object-storage streaming market is noisy because vendors use overlapping words for different maturity levels. A clean evaluation separates the options by what you can deploy today and what role object storage plays in the write path.
Use this maturity ladder as a sanity check:
- Production-ready diskless platforms: AutoMQ and WarpStream are built around object storage as the durable data layer, with different open source, BYOC, and control-plane trade-offs. In AutoMQ's BYOC model the control plane resides in the customer's environment; WarpStream pairs customer-side agents with a managed control plane.
- Managed diskless Kafka topics: Aiven Inkless is important because it brings diskless topics into a managed Kafka service while keeping classic topics available.
- Upstream Kafka roadmap: KIP-1150 is accepted and strategically significant, but it is not the same as a feature you can enable in any Apache Kafka cluster today.
- Tiered storage systems: Apache Kafka tiered storage, Confluent tiered storage, and Redpanda tiered storage are strong retention tools, but they do not erase the active local write path.
- Adjacent streaming architectures: Pulsar proves that compute/storage separation has a long history in streaming, but its compatibility and operations story is different from Kafka.
This is also where the term "S3-native Kafka" can mislead. A system can use S3 for retention, backups, snapshots, remote reads, or the primary log path. Those are not interchangeable. Ask where the record goes before it is acknowledged, what happens when a broker dies, how consumers read lagged data, and which component owns ordering metadata.
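Those questions can be collapsed into a crude classifier: given honest answers about the write path, which category does a system actually fall into? The function below is an illustrative decision sketch; the argument names are not any vendor's terminology:

```python
# Crude classifier for "S3-native" claims, keyed on two write-path facts:
# whether the producer ack depends on durability in object storage, and
# whether only closed segments are offloaded. Illustrative only.

def classify(ack_durable_in_object_store, closed_segments_offloaded):
    if ack_durable_in_object_store:
        return "diskless: object storage is the primary durable log"
    if closed_segments_offloaded:
        return "tiered: object storage holds retention, brokers hold the hot log"
    return "classic: broker disks are the durable store end to end"

print(classify(True, True))
print(classify(False, True))
print(classify(False, False))
```

A vendor whose answers do not fit cleanly into one of these branches is usually describing backups, snapshots, or remote reads, which are useful but belong to yet another category.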
How to Choose
Start with the workload, not the category label. Diskless Kafka is attractive, but the correct choice depends on whether your real bottleneck is storage cost, broker operations, latency, elasticity, governance, or compatibility risk.
| If your priority is... | Start with... | Why |
|---|---|---|
| Lower cloud storage and replication overhead with Kafka compatibility | AutoMQ, WarpStream, Aiven Inkless | These are the clearest diskless or diskless-topic options available as products |
| Strict upstream Apache Kafka alignment | Kafka tiered storage now; track KIP-1150 | Tiered storage is current Kafka; diskless topics are accepted roadmap work |
| Long retention with fewer broker disks | Kafka/Confluent/Redpanda tiered storage | Object storage helps retention without changing the entire write path |
| Managed service with topic-level diskless adoption | Aiven Inkless | Classic and diskless topics can coexist in the same service |
| BYOC control-plane and data-plane control | AutoMQ first; compare WarpStream if a managed control plane is acceptable | AutoMQ places the environment console/control plane and Kafka service/data plane in the user's network environment; WarpStream uses a different managed-control-plane model |
| A non-Kafka streaming architecture | Pulsar | Pulsar has a mature separated architecture, but it is a platform change |
Proof-of-concept tests should be storage-path tests, not demo-topic tests. Measure produce latency under sustained load, consumer catch-up after lag, broker or agent replacement, scale-out behavior, object-store request patterns, and operational failure modes. Also verify the boring parts: ACLs, quotas, transactions, idempotent producers, compaction, schema tooling, connectors, observability, backup/export, and migration rollback.
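Storage-path tests need numbers rather than impressions. The helpers below are minimal illustrative sketches for summarizing produce-latency samples and estimating consumer catch-up time; the sample data is made up, and real runs would feed in measurements from client instrumentation:

```python
# Minimal PoC summarizers: latency percentiles from raw samples, and the
# time for a lagged consumer to catch up. Sample values are invented.

def percentile(samples, p):
    ordered = sorted(samples)
    idx = min(len(ordered) - 1, int(p / 100 * len(ordered)))
    return ordered[idx]

def catch_up_seconds(backlog_records, consume_rate, produce_rate):
    # Lag only shrinks if consumers outpace producers.
    if consume_rate <= produce_rate:
        return float("inf")
    return backlog_records / (consume_rate - produce_rate)

latencies_ms = [4, 5, 5, 6, 7, 9, 12, 15, 40, 120]  # example samples
print(percentile(latencies_ms, 50))   # median produce latency
print(percentile(latencies_ms, 99))   # tail produce latency
print(round(catch_up_seconds(5_000_000, 80_000, 50_000)))
```

The tail percentile and the catch-up estimate are the two numbers most likely to expose object-storage behavior: median latency often looks fine while p99 and lagged reads reveal the flush, cache, and request-pattern trade-offs.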
The best sign that a vendor is serious is not the phrase "object storage." It is a clear explanation of the write path, metadata path, read path, cache behavior, and failure model. If those are vague, the architecture probably is too.
Where AutoMQ Fits
AutoMQ fits the part of the market that wants Kafka compatibility with an object-storage-native storage layer, an open source foundation, and a stricter BYOC boundary. Its strongest argument is architectural: brokers stop being long-lived owners of durable user data, and the control plane can live inside the customer's environment, so scaling, recovery, and governance no longer have to follow the pattern of traditional Kafka partition movement plus an external managed-service control plane.
That does not make AutoMQ the automatic answer for every team. A Kafka estate with heavy compaction, ultra-low-latency workloads, specialized broker plugins, or strict managed-service procurement may need a narrower proof of concept. But for cloud teams that are tired of treating broker disks as the center of the streaming universe, AutoMQ is one of the few options that attacks the problem directly instead of tuning around it.
FAQ
Is diskless Kafka the same as Kafka tiered storage?
No. Tiered storage usually moves older or closed log segments to remote object storage while the active write path still uses broker-local storage. Diskless Kafka moves durable user data away from broker-local disks earlier in the write path, often making brokers closer to stateless compute plus cache.
Is KIP-1150 available in Apache Kafka today?
The Apache Kafka wiki lists KIP-1150 as accepted, but it is a design direction and umbrella proposal. The KIP points to follow-up work for implementation details. Treat it as a roadmap signal unless your Kafka distribution explicitly ships and supports the completed feature.
Which platforms are closest to true diskless Kafka?
AutoMQ, WarpStream, and Aiven Inkless are the closest options in this list. They differ in deployment model, control plane, open source posture, and topic-level behavior, so they should not be treated as interchangeable. The control-plane difference is especially important: AutoMQ BYOC keeps the environment console/control plane inside the user's network environment, while other managed BYOC models may keep orchestration outside the customer boundary.
Does object storage make Kafka slower?
It can, depending on the write path and workload. Object storage has different latency and request behavior from local disks or block storage. Diskless platforms use WALs, caches, batching, metadata services, or topic-level controls to manage that trade-off. The only reliable answer is to test your own hot path and lagged-read path.
Should I replace Kafka with Pulsar for object-storage architecture?
Only if you are comfortable evaluating a different streaming platform, not merely a Kafka storage mode. Pulsar has a mature separated architecture, but it changes clients, operations, ecosystem assumptions, and compatibility boundaries.
What should I ask vendors before choosing?
Ask where data is durably stored before acknowledgments, how ordering metadata is managed, what happens when compute nodes disappear, how lagged consumers read from remote storage, what Kafka APIs are unsupported, and how you migrate out. Those answers matter more than the label "S3-native."