Kafka to Pub/Sub Migration: What Changes When Moving from Kafka to Google Pub/Sub?

Moving from Apache Kafka to Google Cloud Pub/Sub sounds straightforward until the first application team asks where its offsets went. Both systems move events between producers and consumers, both can serve high-throughput cloud workloads, and both are common in data platform conversations. But a Kafka to Pub/Sub migration is not a broker swap. It changes the API contract, the consumption model, replay behavior, ordering assumptions, connector choices, and the operational signals your SREs use during incidents.

That does not make Pub/Sub the wrong choice. It means the migration has to be designed as a semantic migration, not a lift-and-shift infrastructure project. Pub/Sub is a Google Cloud-native messaging service with managed scaling and a topic/subscription model. Kafka is a distributed event log where topic partitions, offsets, consumer groups, and client libraries are part of the application contract. The hard part is not creating a Pub/Sub topic; it is deciding which Kafka assumptions your applications are allowed to lose.

[Figure: Kafka to Pub/Sub change map]

Why Teams Consider Moving from Kafka to Pub/Sub

Most Kafka-to-Pub/Sub discussions begin with operations. A team is tired of broker sizing, disk pressure, partition reassignments, patching, or capacity forecasting. Pub/Sub offers an attractive alternative when the workload is already committed to Google Cloud and the team wants a fully managed messaging substrate instead of operating Kafka clusters.

The strongest cases usually share three traits:

  • The application already lives on Google Cloud. If producers, consumers, IAM, observability, and downstream analytics are GCP-native, Pub/Sub can simplify platform ownership.
  • The workload is message-oriented rather than log-oriented. Pub/Sub fits event delivery and fan-out well when applications do not depend heavily on Kafka offsets, partition-local ordering, or long replay windows.
  • The team can tolerate application rewrites. Client code, consumer logic, connector configuration, schema handling, and monitoring dashboards all need attention.

The weaker cases are more revealing. If your real goal is to stop managing Kafka brokers while preserving Kafka APIs, Kafka Streams jobs, Kafka Connect pipelines, consumer groups, and offset-based replay, Pub/Sub may remove one operational burden while creating a different engineering project.

Kafka and Pub/Sub Optimize for Different Contracts

Kafka exposes a durable, partitioned log. Producers write records to topics, records land in partitions, consumers read from offsets, and consumer groups coordinate partition ownership. This gives application teams a precise mental model: a consumer can know where it is, move backward or forward, and reason about ordering inside a partition.

Pub/Sub exposes topics and subscriptions. Publishers send messages to a topic, subscriptions receive messages, and subscribers acknowledge delivery. Replay is tied to subscription state, message retention, snapshots, or seek operations rather than an always-visible partition offset. Ordering can be enabled with ordering keys, but it is not the same design as Kafka partition ordering. That difference matters because many Kafka applications do not explicitly say "I depend on offsets"; they encode that dependency in retry logic, backfills, lag dashboards, and exactly where a job restarts after failure.

| Area | Kafka expectation | Pub/Sub migration impact |
| --- | --- | --- |
| Producer API | Kafka client, topic, partitioning, headers, serializers | Rewrite publisher code or introduce an adapter; revisit batching, retries, and message attributes |
| Consumer model | Consumer group owns partitions and commits offsets | Replace group/offset logic with subscription delivery, acknowledgments, and redelivery handling |
| Replay | Seek to offsets or timestamps within retained log data | Use Pub/Sub retention, snapshots, and seek where available; verify replay window and operational workflow |
| Ordering | Ordered records within a topic partition | Use ordering keys where required; redesign workloads that assume partition-wide order |
| Ecosystem | Kafka Connect, Kafka Streams, ksqlDB, Kafka-native tooling | Replace, bridge, or retire Kafka ecosystem components |
| Operations | Broker, partition, ISR, offset lag, disk, controller metrics | Move to Pub/Sub subscription metrics, ack latency, delivery attempts, and backlog signals |

The table is the migration in miniature. The project is manageable when each row maps cleanly to a new design. It becomes risky when teams discover hidden dependencies late, especially around replay, ordering, and connector behavior.

API and Semantic Changes

The most visible rewrite is the producer and consumer API. Kafka producers usually carry serializers, partitioning rules, retry policy, and keys that drive partition placement. Pub/Sub publishers send messages with data and attributes to topics, and the service handles delivery to subscriptions. If a Kafka producer uses the record key to guarantee per-entity ordering or route workloads across partitions, the Pub/Sub design needs an explicit ordering-key strategy.
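
One way to make that ordering-key decision explicit is a small translation step between the two message shapes. The sketch below is illustrative only and does not use either client library: `KafkaRecord`, `PubSubMessage`, and `to_pubsub_message` are hypothetical names, and it assumes header values are UTF-8 text and that the last duplicate header wins.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class KafkaRecord:
    """Hypothetical stand-in for a consumed Kafka record."""
    key: Optional[bytes]
    value: bytes
    headers: List[Tuple[str, bytes]] = field(default_factory=list)


@dataclass
class PubSubMessage:
    """Hypothetical stand-in for the fields a publisher would send."""
    data: bytes
    attributes: Dict[str, str]
    ordering_key: str  # empty string means "no ordering requested"


def to_pubsub_message(record: KafkaRecord, ordered: bool) -> PubSubMessage:
    # Assumption: header values are UTF-8 text. Attributes are string-to-string,
    # so binary headers would need a different encoding. Duplicate header
    # names collapse to the last value seen.
    attributes = {name: raw.decode("utf-8") for name, raw in record.headers}
    # Assumption: the Kafka key identifies the entity whose events must stay
    # in order, so it becomes the ordering key only when ordering is required.
    ordering_key = ""
    if ordered and record.key is not None:
        ordering_key = record.key.decode("utf-8")
    return PubSubMessage(record.value, attributes, ordering_key)
```

Keeping the `ordered` flag explicit forces each topic owner to state whether the key was there for ordering or only for load distribution.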

Consumer code changes more deeply. A Kafka consumer polls records, processes them, and commits offsets. In Pub/Sub, subscribers receive messages and acknowledge them; a message that is not acknowledged before its ack deadline becomes eligible for redelivery. Kafka applications often use "commit after side effect" as the core correctness pattern. Pub/Sub applications need the same discipline, but the implementation uses acknowledgments, ack deadlines, retry policy, and dead-letter handling instead of offset commits.
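
The "acknowledge after side effect" discipline can be sketched without any client library. In the minimal example below, `FakeMessage` is a hypothetical test double that only mimics the `ack()` and `nack()` calls a received message exposes, and `handle` is an illustrative name rather than part of any API.

```python
from typing import Callable


class FakeMessage:
    """Hypothetical test double exposing ack() and nack()."""

    def __init__(self, data: bytes) -> None:
        self.data = data
        self.state = "outstanding"

    def ack(self) -> None:
        self.state = "acked"

    def nack(self) -> None:
        self.state = "nacked"


def handle(message: FakeMessage, side_effect: Callable[[bytes], None]) -> None:
    """Acknowledge only after the side effect has succeeded."""
    try:
        side_effect(message.data)
    except Exception:
        # An unacknowledged message stays eligible for redelivery; nack()
        # requests that explicitly instead of waiting for the ack deadline.
        message.nack()
        return
    message.ack()
```

The ordering inside `handle` is the point: acknowledging before the side effect would reproduce the classic "committed but never written" bug in a new vocabulary.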

The migration team should inspect code for these patterns before writing any bridge:

  • Manual offset commits. These often hide business rules about when a record is considered durable downstream.
  • Timestamp or offset seeking. Backfills, reprocessing jobs, and incident recovery playbooks may assume Kafka's log navigation model.
  • Partition-aware state. Some services shard local state by partition assignment; Pub/Sub does not hand you Kafka partition ownership.
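
A first pass over the codebase for the three patterns above can be automated. The sketch below assumes the services use the Java-style Kafka consumer method names (`commitSync`, `seek`, `offsetsForTimes`, `assign`, and so on); `PATTERNS` and `scan_source` are hypothetical names, and the regular expressions are a starting point rather than a complete detector.

```python
import re
from typing import Dict, List

# Assumption: the codebase uses Java-style Kafka consumer API names.
# Extend the patterns for other clients or in-house wrappers.
PATTERNS: Dict[str, str] = {
    "manual_commit": r"\bcommit(Sync|Async)\s*\(|enable\.auto\.commit",
    "seeking": r"\b(seek|seekToBeginning|seekToEnd|offsetsForTimes)\s*\(",
    "partition_aware": r"\bassign\s*\(|ConsumerRebalanceListener|\.partition\s*\(\)",
}


def scan_source(text: str) -> Dict[str, List[int]]:
    """Return 1-based line numbers where each pattern category appears."""
    hits: Dict[str, List[int]] = {name: [] for name in PATTERNS}
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in PATTERNS.items():
            if re.search(pattern, line):
                hits[name].append(lineno)
    return hits
```

A hit is a prompt for a conversation with the owning team, not a verdict; a text search cannot tell why the commit or seek is there.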

Classify every Kafka application by the semantics it uses, not by the volume it handles. A low-throughput service with offset-based replay can be harder to migrate than a high-throughput fire-and-forget event stream.

Offsets, Replay, and Retention

Replay is where many migrations become uncomfortable. In Kafka, offsets are stable coordinates in a partitioned log. Within the configured retention period, a consumer can typically reset position by offset or timestamp and reprocess data. That behavior makes Kafka useful for recovery, audit trails, and rebuilding downstream materialized views.

Pub/Sub supports replay through mechanisms such as message retention, snapshots, and seek, but the operational model is different. You need to decide how long messages must be retained, whether acknowledged messages should be retained for replay, which subscriptions need seek capability, and how teams will perform recovery without Kafka's offset vocabulary.
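
The interaction between retention, acknowledged-message retention, and seek is easier to reason about with a toy model. The sketch below is a simplified simulation, not the Pub/Sub client API: `ToySubscription` and its methods are hypothetical, times are plain integers, and the handling of a message published exactly at the seek time is an assumption of the model.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class ToySubscription:
    """Toy model of seek-to-time replay on a subscription."""
    retention_seconds: int
    retain_acked: bool
    publish_times: Dict[str, int] = field(default_factory=dict)  # id -> time
    acked: Set[str] = field(default_factory=set)

    def publish(self, msg_id: str, at: int) -> None:
        self.publish_times[msg_id] = at

    def ack(self, msg_id: str) -> None:
        if self.retain_acked:
            self.acked.add(msg_id)
        else:
            # Without acked-message retention, an acknowledged message
            # is no longer available for replay.
            self.publish_times.pop(msg_id, None)

    def expire(self, now: int) -> None:
        """Drop everything older than the retention window."""
        cutoff = now - self.retention_seconds
        for msg_id in [m for m, t in self.publish_times.items() if t < cutoff]:
            del self.publish_times[msg_id]
            self.acked.discard(msg_id)

    def seek(self, to_time: int) -> None:
        # Messages published before to_time count as acknowledged; retained
        # messages from to_time onward become deliverable again.
        self.acked = {m for m, t in self.publish_times.items() if t < to_time}

    def backlog(self) -> List[str]:
        return sorted(m for m in self.publish_times if m not in self.acked)
```

Running a recovery playbook against a model like this shows quickly whether the replay window the team assumes is one the subscription settings can actually deliver.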

For migration planning, it helps to sort replay requirements into buckets such as these:

  • Operational replay over a defined window. Pub/Sub may fit, but the team must verify retention settings, snapshot workflows, and how replay affects downstream consumers.
  • Log-style reprocessing as a product requirement. Kafka remains a better conceptual match when application teams routinely seek by offset or timestamp, run new consumers over historical data, or treat the stream as a durable event log.

This changes who owns recovery. In Kafka, the application or platform team often controls replay by changing consumer group position. In Pub/Sub, recovery becomes a subscription-level operation with different permissions and tooling.

Ordering and Delivery

Kafka's ordering promise is partition-scoped. Records in the same partition are read in order, and application designers often choose keys so related events land in the same partition. That model combines ordering, scaling, and consumer assignment into one familiar mechanism.

Pub/Sub can provide ordered delivery when messages are published with ordering keys and the feature is configured appropriately. That gives teams a path for per-key ordering, but it does not preserve Kafka's partition model. "Events for customer 123 are processed in order" can often be redesigned. "Partition 7 is owned by this consumer instance and drives this state shard" is a larger architectural change.
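
During a pilot, the "events for customer 123 are processed in order" requirement can be checked directly against what subscribers received. The sketch below assumes producers attach a per-key sequence number to each event, which is not something either system adds for you; `per_key_order_violations` is a hypothetical helper name.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Event = Tuple[str, int]  # (ordering key, per-key sequence number)


def per_key_order_violations(delivered: Iterable[Event]) -> Dict[str, List[int]]:
    """Return, per key, the sequence numbers that arrived after a higher one."""
    highest: Dict[str, int] = {}
    violations: Dict[str, List[int]] = defaultdict(list)
    for key, seq in delivered:
        if key in highest and seq < highest[key]:
            violations[key].append(seq)
        else:
            # Equal sequence numbers (redeliveries) are not treated as
            # ordering violations in this sketch.
            highest[key] = seq
    return dict(violations)
```

Interleaving across keys is expected and passes the check; only a regression within a single key is reported.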

Delivery semantics also need careful wording. Kafka and Pub/Sub both support reliable delivery patterns, but your end-to-end semantics depend on the application, idempotency, side effects, and retry behavior. Trace the full path from publish to side effect to acknowledgment or offset commit. That is where correctness lives.
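
Because redelivery is possible on either system, the side effect itself usually has to be idempotent. The sketch below shows one common approach, deduplicating on an event ID chosen by the producer. `IdempotentHandler` is a hypothetical name, and the in-memory set stands in for what would need to be a durable store in practice.

```python
from typing import Callable, Set


class IdempotentHandler:
    """Applies a side effect at most once per event ID (in-memory sketch)."""

    def __init__(self, apply: Callable[[str, bytes], None]) -> None:
        self._apply = apply
        # Assumption: a real deployment would keep this in a durable store
        # that is updated atomically with the side effect.
        self._seen: Set[str] = set()

    def handle(self, event_id: str, payload: bytes) -> bool:
        """Return True if the side effect ran, False if it was a duplicate."""
        if event_id in self._seen:
            return False
        self._apply(event_id, payload)
        # Record the ID only after the side effect succeeded, so a failure
        # leaves the event eligible for a retry.
        self._seen.add(event_id)
        return True
```

With a handler like this in place, a redelivered message and a replayed offset range become the same harmless case.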

Connector and Ecosystem Impact

Kafka's ecosystem is one reason teams adopt it in the first place. Kafka Connect, Kafka Streams, schema registries, ACL models, and monitoring conventions all attach to the Kafka contract.

When moving to Pub/Sub, each ecosystem dependency needs an explicit migration decision, and each choice comes with its own verification work:

| Dependency | Migration choice | What to verify |
| --- | --- | --- |
| Kafka Connect source | Replace with a Pub/Sub-native connector or Dataflow pipeline | Field mapping, schema compatibility, backpressure, retry behavior |
| Kafka Connect sink | Replace, bridge, or keep Kafka for that pipeline | Delivery guarantees, batching, destination idempotency |
| Kafka Streams app | Rewrite to a Pub/Sub-compatible processing framework | State stores, windowing, repartition topics, restore behavior |
| Schema Registry usage | Map to Pub/Sub schemas or another schema governance model | Compatibility rules, deployment workflow, producer validation |
| Monitoring and alerting | Rebuild around Pub/Sub metrics and logs | Backlog, ack latency, error rates, dead-letter volume |

Dataflow can help bridge or transform streams during migration. It can move bytes while the organization is rewriting semantics, but it does not decide what an offset, replay, ordering key, or schema compatibility rule means in the target system.

A Practical Migration Path

The safest Kafka to Pub/Sub migration starts with a workload inventory rather than a cluster inventory. Cluster metrics tell you throughput and retention. Application inventory tells you which teams depend on Kafka behavior, and that predicts migration effort more accurately.

[Figure: Kafka to Pub/Sub migration decision tree]

Begin with a pilot topic that has simple delivery requirements, low replay sensitivity, and owners who can rewrite code quickly. Run shadow delivery before changing the source of truth, and compare message counts, latency, error behavior, and downstream side effects.
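
The message-count part of a shadow comparison can be sketched as a simple reconciliation. The example below assumes every event carries a producer-assigned ID that survives both paths; `ShadowReport` and `compare_shadow` are hypothetical names, and latency and side-effect checks would need separate tooling.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class ShadowReport:
    missing: List[str]      # seen on the Kafka path only
    unexpected: List[str]   # seen on the Pub/Sub path only
    duplicated: List[str]   # delivered more than once on the Pub/Sub path


def compare_shadow(kafka_ids: Iterable[str], pubsub_ids: Iterable[str]) -> ShadowReport:
    """Reconcile event IDs observed on the existing and the shadow path."""
    kafka = Counter(kafka_ids)
    pubsub = Counter(pubsub_ids)
    return ShadowReport(
        missing=sorted(set(kafka) - set(pubsub)),
        unexpected=sorted(set(pubsub) - set(kafka)),
        duplicated=sorted(i for i, n in pubsub.items() if n > 1),
    )
```

Duplicates on the shadow path are not automatically a defect, but they show whether downstream consumers will need to handle redelivery before cutover.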

A practical sequence looks like this:

  1. Inventory semantics. Classify applications by offset use, replay expectations, ordering model, schema dependency, and connector usage.
  2. Design Pub/Sub equivalents. Define topics, subscriptions, ordering keys, retention, dead-letter handling, IAM, and monitoring.
  3. Bridge only where it reduces risk. Use a bridge or Dataflow pipeline to support phased migration, not to hide unresolved semantic differences.
  4. Run shadow traffic. Compare data integrity and operational signals before switching consumers.
  5. Cut over by dependency group. Move producers, consumers, connectors, and dashboards in an order that preserves rollback.
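
Step 1 of the sequence above can start as a small structured inventory rather than a spreadsheet of cluster metrics. The sketch below is one possible shape under that assumption; `AppProfile`, `semantic_dependencies`, and `pilot_candidates` are hypothetical names, and the flags would be filled in from code review and conversations with owners.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AppProfile:
    """Hypothetical record of the Kafka semantics one application uses."""
    name: str
    manual_offset_commits: bool = False
    seeks_by_offset_or_time: bool = False
    partition_aware_state: bool = False
    uses_kafka_streams: bool = False
    uses_kafka_connect: bool = False


def semantic_dependencies(app: AppProfile) -> List[str]:
    """List the Kafka semantics an application relies on."""
    flags = {
        "manual offset commits": app.manual_offset_commits,
        "offset or timestamp seeking": app.seeks_by_offset_or_time,
        "partition-aware state": app.partition_aware_state,
        "Kafka Streams": app.uses_kafka_streams,
        "Kafka Connect": app.uses_kafka_connect,
    }
    return [name for name, used in flags.items() if used]


def pilot_candidates(apps: List[AppProfile]) -> List[str]:
    """Names of applications with no recorded Kafka-specific dependency."""
    return sorted(a.name for a in apps if not semantic_dependencies(a))
```

Note that throughput does not appear in the profile at all, which matches the advice to classify by semantics rather than volume.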

The reason to move slowly is respect for the hidden contract Kafka often has with applications. You can migrate infrastructure quickly and still spend months repairing assumptions that no one documented.

When Pub/Sub Is the Right Target

Pub/Sub is compelling when the desired end state is a Google Cloud-native eventing architecture around services such as Cloud Run, GKE, BigQuery, Dataflow, and Google Cloud IAM.

Good candidates include event notification streams, asynchronous task fan-out, service integration events, and pipelines that fit Pub/Sub subscriptions and Google Cloud data services. In these cases, you are choosing a different messaging contract because it better fits the platform, not because Kafka failed.

Poor candidates are workloads where Kafka is not incidental. If a business process uses Kafka offsets, Kafka Streams, Kafka Connect, and partition assignment for state locality, moving to Pub/Sub becomes an application modernization program.

When AutoMQ Is a Lower-Change Alternative

Some teams start with "replace Kafka with Pub/Sub" when the real sentence is "stop operating traditional Kafka." If the problem is the Kafka protocol, ecosystem, or log model, Pub/Sub may be the right target. If the problem is broker operations, storage cost, scaling friction, or recovery time, a Kafka-compatible replacement can preserve more of the application contract.

AutoMQ fits that second category. It is a Kafka-compatible cloud-native streaming system that keeps Kafka protocol semantics while moving durable storage away from broker-local disks and into object storage. Brokers become more stateless, scaling is less tied to data movement, and existing Kafka clients and ecosystem tools can remain part of the architecture.

[Figure: Lower-change Kafka-compatible replacement path]

That distinction is useful during executive review. A Pub/Sub migration asks application teams to accept a new messaging contract. An AutoMQ migration asks platform teams to modernize the Kafka substrate while keeping the Kafka-facing contract familiar. They solve different versions of the problem.

Use this decision rule:

| Primary goal | Better-fit path |
| --- | --- |
| Adopt a GCP-native messaging model and rewrite applications around it | Pub/Sub migration |
| Preserve Kafka APIs, offsets, clients, and ecosystem while reducing traditional Kafka operations | Kafka-compatible replacement such as AutoMQ |
| Retire Kafka Connect and rebuild pipelines on Dataflow or other GCP services | Pub/Sub and GCP-native data pipeline redesign |
| Keep Kafka Streams or offset-heavy applications with minimal application change | Kafka-compatible architecture |

If the team cannot explain which row applies, pause the migration plan. The wrong target will not become right because the cutover date is close.

Decision Checklist Before You Commit

A Kafka to Pub/Sub migration should leave planning only after application owners have answered concrete questions:

  • Which applications use manual offset commits, timestamp seeking, or replay playbooks?
  • Which topics require per-key ordering, and which workloads depend on partition assignment?
  • Which connectors, stream processors, and schema systems must be replaced or bridged?
  • What is the required replay window, and how will Pub/Sub retention, snapshots, and seek support it?
  • What does rollback mean after producers have switched to Pub/Sub?
  • Is the goal to leave Kafka semantics, or mainly to leave Kafka operations?

The final question is the most important one. If the organization wants a GCP-native messaging contract, Pub/Sub deserves a clean design. If the organization wants Kafka behavior without traditional broker pain, evaluate Kafka-compatible cloud-native options before committing every application team to a rewrite. AutoMQ's documentation is a useful next step because it shows what a Kafka-compatible, object-storage-backed architecture looks like in practice.

FAQ

Is Kafka to Pub/Sub migration a lift-and-shift project?

No. It usually requires application and semantic changes because Kafka's topic partition, consumer group, offset, and replay model differs from Pub/Sub's topic, subscription, acknowledgment, and seek model. Some simple event streams migrate cleanly, but offset-heavy applications need deeper redesign.

Can Pub/Sub replace Kafka consumer groups?

Pub/Sub subscriptions can support fan-out and subscriber delivery, but they are not a drop-in replacement for Kafka consumer groups. Kafka consumer groups coordinate partition ownership and offset commits. Pub/Sub uses subscriptions, acknowledgments, ack deadlines, and redelivery behavior.

How should teams handle replay after moving from Kafka to Pub/Sub?

Define replay requirements before migration. Pub/Sub replay depends on retention, acknowledged-message retention settings, snapshots, and seek operations. Teams that frequently reprocess historical streams by offset or timestamp should verify that Pub/Sub's operational model supports their recovery playbooks.

Does Pub/Sub preserve Kafka ordering?

Pub/Sub can support ordered delivery with ordering keys, but it does not preserve Kafka's partition model. If your application only needs per-entity ordering, Pub/Sub may fit. If it depends on partition assignment for state locality, expect a larger rewrite.

When is AutoMQ a better alternative than Pub/Sub?

AutoMQ is worth evaluating when the goal is to reduce traditional Kafka operations while preserving Kafka APIs, clients, offsets, and ecosystem compatibility. Pub/Sub is a better fit when the team wants to adopt a GCP-native messaging contract and can rewrite applications around that model.
