Blog

Kafka Shared Storage: Architecture for Stateless Brokers

Kafka storage architecture has always carried an implicit assumption: a broker is both a compute process and the long-lived home of some partition replicas. That assumption works well when storage is local and sized together with the server. It becomes more complicated in elastic cloud environments, where compute instances are replaceable and storage services scale independently.

Kafka shared storage changes that assumption. Instead of treating each broker as the durable owner of local log replicas, a shared-storage design separates broker runtime responsibility from durable log placement. Brokers still handle client requests, partition leadership, fetch serving, and coordination. The difference is that persistent log data is not permanently bound to the disk of one broker.

That shift is the foundation behind stateless Kafka brokers. The term can be misleading because a broker still has runtime state: connections, caches, group coordination, leadership assignments, in-flight writes, and metrics. Stateless, in this context, means the broker does not own the durable log as local persistent state. If the broker disappears, recovery should reassign compute responsibility instead of reconstructing the log from the failed machine.

Shared-Nothing vs Shared-Storage Kafka

Traditional Kafka is shared-nothing

Traditional Kafka is usually described as a shared-nothing distributed system. Each broker has its own CPU, memory, network interface, and local disks. Partitions are split into replicas, and those replicas are placed across brokers. One replica is the leader for a partition; the others follow the leader by fetching records and writing them to their own local logs.

This design has a clear strength: the data path is explicit. A produce request is appended to the leader log, replicated to followers according to the topic replication factor and acknowledgment policy, and later served from broker storage. Availability is achieved by electing a new leader from the in-sync replica set when the old leader fails.

The shared-nothing model also gives Kafka its operational personality. Storage is not an external dependency in the hot path; the broker process and the broker disk are one operational unit. Capacity planning therefore starts with broker-local resources:

  • Disk capacity and retention. More retention means more local disk per broker or more brokers to distribute replicas.
  • Disk throughput. Produce traffic, follower replication, consumer catch-up, and log cleaner activity compete on the same storage devices.
  • Replica placement. Failure domains are modeled through replica distribution across brokers, racks, and availability zones.
  • Recovery time. A failed disk or broker can require copying replicas before the cluster returns to the desired layout.

Kafka was designed around append-only logs and sequential disk I/O, and local storage gave deployments a way to combine performance with replication. The trade-off is that broker identity and durable log placement become tightly coupled. If a broker is replaced, the cluster must reason about data and compute movement.

What shared storage means for Kafka

Kafka shared storage is not a single feature flag. It is an architectural pattern in which the persistent log layer is externalized from broker-local disks into a shared durable storage system that can be accessed by broker processes under controlled ownership rules.

The important word is controlled. Shared storage does not mean every broker writes anything anywhere. Kafka still needs ordering, fencing, leader epochs, offsets, and metadata consistency. The architecture has to define which broker may write, how readers find data, how ownership changes after failure, and when an acknowledgment is durable.

Persistent log data

The persistent log is the long-lived record stream: batches, offsets, indexes, and the metadata needed to read them correctly. In shared-nothing Kafka, this data is stored in broker log directories. In shared-storage Kafka, the durable log is placed in a shared layer so a replacement broker can attach to existing data.

This is where terminology gets messy. Remote log storage usually means storing log segments outside local broker disk. Tiered storage keeps newer data local while older completed segments move to a remote tier. Shared storage is broader: durable log data is accessible independently of any one broker's disk. Object storage is a common substrate, but it is not the architecture by itself.

The distinction matters because tiered storage can reduce the amount of local data that must be recovered, while still keeping brokers stateful for hot data. A fully shared-storage design goes further. It makes durable log ownership a metadata and storage-layer concern rather than a property of one broker's attached disks.

Broker compute and leadership

Brokers do not disappear in a shared-storage design. They remain the compute layer that implements the Kafka protocol. Producers and consumers still connect to brokers. Partition leaders still sequence writes, and consumers still fetch from brokers rather than talking directly to object storage in normal Kafka client flows.

The difference is the failure contract. In traditional Kafka, a broker is a compute process plus the local persistent replicas it owns. In shared-storage Kafka, a broker is closer to a runtime executor for partition responsibility. It may cache data, buffer requests, and serve reads efficiently, but the durable source of truth is outside local disk.

That makes leadership transfer a different kind of operation. The system still needs fencing so only the valid leader can append. It still needs ordering and partial-failure handling. But the target of failover is no longer "find another local replica with the data." It becomes "assign responsibility to a broker that can continue from shared durable state."

Metadata and control plane

Kafka's metadata layer prevents shared storage from becoming shared confusion. Topic definitions, partition assignments, leader epochs, broker registrations, and controller decisions must be consistent with the storage layer's notion of ownership. In KRaft mode, Kafka uses a metadata quorum rather than ZooKeeper to manage cluster metadata.

A shared-storage Kafka architecture therefore has three separable planes:

PlaneTraditional shared-nothing KafkaShared-storage Kafka
Data planeBrokers append to and read from local replicasBrokers serve protocol traffic while durable data lives in shared storage
Control planeController tracks partitions, leaders, ISR, and broker healthController also supports fast compute reassignment against externalized data
Storage planeBroker disks are the durable log locationShared durable storage is the long-lived log location

The table hides an important burden: the three planes must agree during failure. A broker that loses leadership must not keep appending. A replacement broker must know the correct offset and epoch. A read path must find records without violating ordering expectations.

Why shared storage enables stateless brokers

Stateless brokers are not created by deleting disks from a configuration file. They are created when the architecture removes long-lived persistent log ownership from broker-local storage. Once that happens, broker replacement becomes a compute scheduling problem instead of a data reconstruction problem.

Imagine a broker hosting many partition leaders. In shared-nothing Kafka, losing that broker affects both serving capacity and local replica placement. The cluster can elect other in-sync replicas, but it may also need to rebuild replicas later. That rebuild consumes network and disk I/O.

With shared storage, the recovery sequence can be shorter in shape, even when the implementation details are complex:

  • The control plane detects that the broker is gone.
  • Partition responsibility is reassigned to healthy broker processes.
  • The new broker obtains the relevant metadata and resumes serving from the shared durable log.
  • Local caches and buffers warm up as runtime state, not as a prerequisite for durable recovery.

This does not make distributed systems effortless. Shared storage introduces its own design questions: write amplification, object layout, tail latency, metadata scale, throttling, and read-path caching. It also requires a careful answer to the produce acknowledgment question. If a producer receives an acknowledgment before the record is durable in the shared layer or otherwise protected by a durable write-ahead path, the system has moved risk rather than removed it.

Stateless Broker Responsibility Map

The clean mental model is this: a stateless broker may hold operational state, but it should not hold irreplaceable log state. If the broker's local disk is lost, the cluster should lose cache warmth and temporary runtime context, not the authoritative copy of partition data.

How AutoMQ implements shared storage

AutoMQ is one example of a Kafka-compatible system built around shared storage. Its architecture keeps the Kafka protocol surface while moving the storage layer to S3-compatible object storage through S3Stream. The broker remains responsible for Kafka-facing compute, but durable stream data is organized in a shared storage layer rather than being permanently attached to broker-local disks.

The interesting part is not "Kafka plus S3" as a slogan. Object storage has a different API and performance model from a local append-only disk. A streaming system has to map Kafka's ordered log abstraction onto objects, indexes, metadata, and background maintenance without exposing those mechanics to clients. AutoMQ's S3Stream layer is the storage abstraction that makes object storage act like shared streaming storage for brokers.

KRaft metadata is the other half of the story. Kafka-compatible shared storage still needs a control plane that can track brokers, topics, partitions, and leadership. With KRaft, metadata is managed through Kafka's quorum-based controller architecture. In a design such as AutoMQ's, that metadata works alongside shared durable storage so broker replacement can be driven by ownership and metadata updates rather than by copying persistent log replicas to a new local disk.

This is also why stateless broker architecture should not be confused with client-side storage access. Producers and consumers continue to use Kafka APIs. The storage architecture is beneath the broker boundary, so clients and ecosystem tools should not have to learn a new storage API because the broker internals changed.

Shared storage vs tiered storage vs remote log storage

Architects often compare shared storage with Kafka tiered storage because both involve data outside broker-local disks. They solve related but different problems.

TermWhat it usually meansBroker state implication
Shared-nothing KafkaBrokers own local persistent log replicasBrokers are stateful because replicas live on local disks
Remote log storageLog data is copied to a remote systemDepends on whether hot data still requires local durable replicas
Tiered storageLocal tier for recent data, remote tier for older segmentsBrokers remain stateful for local active segments
Shared storage KafkaDurable log data is externalized from broker disksBrokers can become operationally stateless if recovery attaches to shared data
Stateless brokerBroker runtime state can be rebuilt or reassignedDurable log state is not pinned to local disk

Tiered storage is useful when retention growth is the main pressure. It can reduce local storage requirements and make older data less disruptive to broker disks. Shared storage asks a different question: can broker compute be scaled, replaced, and recovered without making local persistent log replicas the center of the operation?

Both patterns can use object storage. Evaluation should start with the failure and scaling model rather than the name of the storage service.

Evaluation checklist for architects

A shared-storage Kafka design should be evaluated as a distributed log system, not as a storage bill optimization exercise. The storage layer is only one part of the correctness envelope. The broker, controller, and storage planes have to preserve Kafka semantics together.

Architect Evaluation Checklist

Use these questions during architecture review:

  • Where does durability begin? Identify the exact point at which a produce acknowledgment is safe across broker loss, zone loss, and controller failover.
  • How is write ownership fenced? A stale leader must be unable to append after leadership changes, especially when multiple brokers can reach the same storage layer.
  • What is local and what is durable? Separate cache, buffer, and temporary index state from persistent record state. The architecture should be explicit about each category.
  • How does recovery behave under load? Broker replacement, reassignment, and consumer catch-up should be tested with real traffic patterns.
  • What happens when storage slows down? Shared storage is still infrastructure with quotas, throttling, latency variance, and operational incidents. Admission control and observability need to cover this path.
  • How compatible is the system above the broker boundary? Kafka protocol compatibility is necessary, but platform teams also care about ACLs, quotas, metrics, rebalancing behavior, backup processes, and deployment automation.

The most useful proof is a failure drill showing that a broker can be replaced without treating local log reconstruction as the critical path.

When shared storage is a good fit

Shared storage is most compelling when Kafka is operated as elastic infrastructure rather than as a fixed fleet of storage servers. Platform teams running many clusters, variable workloads, or frequent maintenance windows often feel broker-local data gravity. Every scale event becomes a storage event, and every replacement carries the shadow of replica movement.

The architecture is less compelling when the environment is small, static, and already operationally simple. A compact Kafka cluster with predictable retention, stable brokers, and mature operational practices may not need a storage architecture change. Shared storage is a response to a specific pressure: the mismatch between broker-local logs and cloud-native compute operations.

For architects, the decision comes down to which constraint dominates. If the constraint is long retention on local disks, tiered storage may be enough. If the constraint is broker replacement, elastic scaling, and recovery without data movement, shared storage deserves a deeper look. If the constraint is application compatibility, any answer has to preserve the Kafka contract before it optimizes storage.

References

FAQ

What is Kafka shared storage?

Kafka shared storage externalizes durable log data from broker-local disks into a shared durable storage layer. Brokers still serve Kafka clients and manage partition leadership, but persistent records are not permanently tied to one broker's local storage.

Is a stateless Kafka broker really stateless?

Not completely. A stateless broker still has runtime state such as connections, caches, buffers, metrics, and leadership assignments. The distinction is that long-lived persistent log data is not bound to local disk.

How is shared storage different from Kafka tiered storage?

Tiered storage typically keeps recent active data on local broker disks and moves older completed segments to remote storage. Shared storage can make the durable log layer independent of broker-local disks, enabling operationally stateless brokers.

Does shared storage remove the need for Kafka metadata?

No. Shared storage makes metadata more important. The system still needs consistent partition ownership, leader epochs, broker registration, fencing, and recovery decisions. Without a strong control plane, shared storage would create correctness risks.

Why use object storage for Kafka?

Object storage is attractive because it is durable, elastic, and widely available across cloud environments. A Kafka-compatible system still needs a storage abstraction that handles ordering, indexing, write ownership, and read paths above the raw object API.

Where does AutoMQ fit in this architecture?

AutoMQ is a Kafka-compatible implementation that uses S3Stream to place durable stream data on S3-compatible object storage while brokers remain responsible for Kafka protocol handling and runtime compute. It is one concrete example of the shared-storage pattern for stateless brokers.

Newsletter

Subscribe for the latest on cloud-native streaming data infrastructure, product launches, technical insights, and efficiency optimizations from the AutoMQ team.

Join developers worldwide who leverage AutoMQ's Apache 2.0 licensed platform to simplify streaming data infra. No spam, just actionable content.

I'm not a robot
reCAPTCHA

Never submit confidential or sensitive data (API keys, passwords, credit card numbers, or personal identification information) through this form.