Teams rarely move from Google Pub/Sub to Kafka because Pub/Sub failed at basic messaging. More often, the pressure comes from the systems around the message stream. A platform team wants Kafka Connect connectors instead of custom integration code. A data engineering team wants Kafka topics as the stable boundary for Flink, Kafka Streams, CDC, or cross-cloud pipelines. An SRE team wants replay controls that match existing Kafka runbooks. A CTO wants one streaming contract across Google Cloud, AWS, and on-prem environments.
That is why a Pub/Sub to Kafka migration should not be treated as a queue replacement. Pub/Sub and Kafka both publish and consume events, but they expose different operating models. Pub/Sub centers the application around topics, subscriptions, acknowledgements, delivery attempts, retention settings, and Google Cloud-managed elasticity. Kafka centers the application around topics, partitions, offsets, consumer groups, log retention, and a large ecosystem built on the Kafka protocol. The hard work is not moving bytes. It is deciding which semantics must survive the move.
When a Pub/Sub to Kafka Migration Makes Sense
The strongest reason to migrate is not dislike of Pub/Sub. It is dependency on Kafka semantics or Kafka ecosystem gravity. If the business has standardized on Kafka for event contracts, observability, connector operations, schema governance, or stream processing, keeping Pub/Sub as a separate messaging island can create operational friction.
There are several common triggers:
- Kafka ecosystem dependency. Teams need Kafka Connect, Kafka Streams, Flink jobs, schema registry workflows, or existing Kafka-compatible tools as the default integration layer.
- Cross-cloud consistency. The organization wants the same producer and consumer contract across Google Cloud and other environments instead of maintaining Pub/Sub-specific application branches.
- Replay and backfill control. Data teams want log-oriented replay by consumer group and offset, especially for CDC, feature pipelines, fraud systems, and audit streams.
- Application portability. A platform team wants to reduce coupling to Google Cloud APIs while keeping event streaming as a central platform capability.
- Operational consolidation. SREs already operate Kafka lag, partitions, connector tasks, and topic retention elsewhere, and Pub/Sub introduces a second mental model.
The inverse is also true. If the workload is deeply Google Cloud-native, does not require Kafka clients, and benefits from Pub/Sub's service-owned scaling and subscription model, migration can add complexity without much gain. Kafka becomes more compelling when the stream is not only a delivery path but also a shared event log and ecosystem boundary.
The Semantic Differences You Must Plan For
A migration plan that begins with a connector choice is already late. The connector can move records, but it cannot decide whether a Pub/Sub subscription maps to a Kafka consumer group, whether an ordering key maps to a partition key, or whether a retry path maps to a dead-letter topic. Those decisions belong in the architecture design because they affect application correctness.
The most useful first pass is a semantic mapping table:
| Pub/Sub concept | Kafka concept | Migration warning |
|---|---|---|
| Topic | Topic | Names can map cleanly, but payload format, attributes, and retention policy may change. |
| Subscription | Consumer group or dedicated topic | A subscription is not only a consumer identity; it also carries delivery, retry, filtering, and acknowledgement behavior. |
| Acknowledgement | Offset commit | Both mark progress, but Pub/Sub tracks ack state per message per subscription, while Kafka stores one committed offset per partition per consumer group. |
| Ordering key | Partition key | Ordering scope must be revalidated: Pub/Sub orders messages per ordering key, while Kafka orders records per partition, and many keys usually share a partition. |
| Message retention and seek | Log retention and offset reset | Replay may exist in both systems, but the runbook and blast radius change. |
| Message attributes | Headers or payload fields | Attribute mapping affects routing, filtering, schema validation, and observability. |
This table is deliberately conservative. It prevents the migration from assuming one-to-one equivalence where none exists. Kafka's model is an append-only log split into partitions; consumers track offsets and consumer groups coordinate partition ownership. Pub/Sub's model is a managed messaging service where each subscription receives its own copy of the messages published to the topic, subscribers acknowledge the messages they have processed, and the subscription can be configured with retention, retry, dead-letter, filtering, and ordering behavior.
Topics and Subscriptions vs Topics and Partitions
Pub/Sub topics are publish targets, and subscriptions define independent delivery paths from those topics. The subscription owns important behavior: acknowledgement deadline, retry policy, dead-lettering, message filtering, and whether acknowledged messages are retained for seek operations. Kafka topics are logs divided into partitions, while consumer groups read partitions and commit offsets. A Pub/Sub subscription with filtering and dead-letter behavior may become a Kafka consumer group plus a processor plus a dead-letter topic. Another subscription may map cleanly to a consumer group. Treat each subscription as a workload, not as a line in a migration spreadsheet.
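To make "treat each subscription as a workload" concrete, the sketch below turns a small inventory record into the list of Kafka-side components that subscription probably needs. The `SubscriptionProfile` fields, the `plan_kafka_components` helper, and the topic naming pattern are all hypothetical and exist only for illustration; a real inventory would carry more detail.

```python
from dataclasses import dataclass


@dataclass
class SubscriptionProfile:
    """Hypothetical inventory record for one Pub/Sub subscription."""
    name: str
    topic: str
    has_filter: bool = False
    has_dead_letter: bool = False
    needs_acked_replay: bool = False


def plan_kafka_components(sub: SubscriptionProfile) -> list[str]:
    """Return the Kafka-side pieces this subscription is likely to need."""
    # Every subscription needs at least an independent consumer identity.
    components = [f"consumer group: {sub.name}"]
    if sub.has_filter:
        # Filtering moves out of the subscription and into a processor
        # (or into the consumer itself). The naming here is an assumption.
        components.append(
            f"filter processor -> topic: {sub.topic}.{sub.name}.filtered"
        )
    if sub.has_dead_letter:
        components.append(f"dead-letter topic: {sub.topic}.{sub.name}.dlq")
    if sub.needs_acked_replay:
        # Seek over acknowledged messages becomes a topic retention question.
        components.append("retention review: replay depends on topic retention")
    return components
```

A plain subscription yields only a consumer group, while one with a filter and a dead-letter policy yields three entries. That difference is the review workload a spreadsheet row tends to hide.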
Acknowledgement vs Offsets
Pub/Sub acknowledgement tells the service that a subscriber has processed a message and that the message does not need to be redelivered to that subscription. Kafka offset commits tell the Kafka cluster where a consumer group should resume reading in each partition. Both represent progress, but they behave differently during failure recovery: an ack applies to one message, while a committed offset is a single position per partition that marks every earlier record as done. "Rewind the consumer group to offset 814220 on a partition" is not the same runbook as "seek this subscription to a timestamp or snapshot." If SREs cannot describe the new replay command, duplicate-processing behavior, and rollback boundary, the migration is not ready for production traffic.
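One consequence is easy to miss when porting subscriber code that acks messages individually. A consumer that finishes records out of order can only safely commit up to the first unprocessed offset. The helper below is a minimal sketch of that rule for a single partition; `next_commit_offset` is a hypothetical name, not part of any Kafka client.

```python
def next_commit_offset(last_committed: int, processed: set[int]) -> int:
    """Return the offset that is safe to commit for one partition.

    A committed offset is the position of the next record to read, so
    everything below it is treated as done. We therefore advance only
    through a contiguous run of processed offsets.
    """
    offset = last_committed
    while offset in processed:
        offset += 1
    return offset


# Offsets 100, 101, and 103 are done, but 102 is still in flight.
safe = next_commit_offset(100, {100, 101, 103})  # 102
```

After a restart from 102, offset 103 is read and processed a second time. In Pub/Sub only the unacknowledged message would be redelivered, so this is one place where duplicate-processing behavior changes with the move.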
Ordering Key vs Partition Key
Ordering is another trap. Pub/Sub supports ordered delivery when message ordering is enabled and publishers use ordering keys. Kafka preserves order within a partition, and the partition key determines which records share that ordering lane. If the Pub/Sub ordering key was account_id, the Kafka partition key will often be the same field, but the team still needs to test hot keys, partition skew, retry behavior, and replay behavior. Happy-path ordering is easy; failure-path ordering is where migrations earn their scars.
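A hot-key test can start offline, before any broker exists. The sketch below buckets a sample of keys into partitions and reports how far the busiest partition sits above the mean. It deliberately uses an MD5-based stand-in hash, which is an assumption for illustration and not the hash used by Kafka's default partitioner, so the exact placement will differ from a real cluster. Skew caused by one dominant key shows up either way.

```python
import hashlib
from collections import Counter


def stand_in_partition(key: str, num_partitions: int) -> int:
    """Deterministic stand-in partitioner (NOT Kafka's default hash)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions


def partition_load(keys, num_partitions: int) -> list[int]:
    """Count how many sampled records land on each partition."""
    counts = Counter(stand_in_partition(k, num_partitions) for k in keys)
    return [counts.get(p, 0) for p in range(num_partitions)]


def skew_ratio(load: list[int]) -> float:
    """Busiest partition divided by the mean load (1.0 means perfectly even)."""
    mean = sum(load) / len(load)
    return max(load) / mean if mean else 0.0
```

Feeding it a sample where one `account_id` produces most of the traffic gives a ratio far above 1.0 regardless of partition count. That is the signal that adding partitions will not help and the key design needs another look.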
Migration Architecture Patterns
There are three common ways to move from Pub/Sub to Kafka. None is universally correct, and the wrong choice usually comes from optimizing for speed while ignoring rollback.
The first pattern is a bridge. A service, connector, or Dataflow pipeline reads from Pub/Sub and writes to Kafka. This is useful when producers cannot change immediately or when the team wants Kafka consumers to start validating data before the write path moves. A bridge still needs careful mapping for keys, headers, schemas, timestamps, retries, and duplicate handling.
The second pattern is producer migration. Producers gradually switch from publishing to Pub/Sub to producing to Kafka. This can produce a cleaner long-term architecture, but it requires application changes, authentication changes, and a coordinated rollout.
The third pattern is dual write. Producers write to both Pub/Sub and Kafka for a validation window. Dual write is attractive because it gives downstream teams time to compare streams, but it is not a free safety net. If the producer can fail after writing to one system and before writing to the other, the two streams can diverge. Write the failure model down before trusting dual write.
For most production migrations, the bridge is the least disruptive starting point, and producer migration is the cleaner ending point. The bridge gives the Kafka side data early. Producer migration removes the bridge after confidence builds. Dual write sits between them when comparison is worth the extra consistency work.
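The dual-write failure model can be written down as code before it is written down as a runbook. The sketch below assumes the producer writes to Pub/Sub first and Kafka second, and it records any event that reached only the first system. `publish_pubsub` and `produce_kafka` are hypothetical callables standing in for real client calls, and the ordering of the two writes is itself an assumption a team would need to decide.

```python
from dataclasses import dataclass, field


@dataclass
class DualWriteLog:
    """In-memory record of what each system accepted (illustration only)."""
    pubsub: list = field(default_factory=list)
    kafka: list = field(default_factory=list)
    divergent: list = field(default_factory=list)


def dual_write(event_id, log, publish_pubsub, produce_kafka) -> bool:
    """Write to Pub/Sub, then Kafka. Return True only if both succeeded."""
    # If this first call raises, neither system has the event.
    publish_pubsub(event_id)
    log.pubsub.append(event_id)
    try:
        produce_kafka(event_id)
    except Exception:
        # Broad catch is for the sketch only: Pub/Sub has the event and
        # Kafka does not, so the streams have diverged for this event.
        log.divergent.append(event_id)
        return False
    log.kafka.append(event_id)
    return True
```

Whatever lands in `divergent` needs an explicit answer: retry into Kafka and accept a possible duplicate, or reconcile later from the Pub/Sub side. The sketch does not pick one; it only makes the gap visible.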
Designing the Target Kafka Model
The target Kafka cluster is not the final step. It is the design center. Topic naming, partition count, retention, schemas, ACLs, quotas, and observability should be set before application teams begin testing.
A good Kafka target design answers five questions:
- What is the partition key? It should preserve required ordering while distributing traffic enough to avoid hot partitions.
- What is the retention contract? Retention should reflect replay and backfill needs, not default cluster settings.
- What is the schema boundary? Pub/Sub message attributes, payloads, and schemas need a consistent Kafka representation.
- What is the dead-letter strategy? Pub/Sub retry and dead-letter behavior may need Kafka topics, consumer logic, or connector rules.
- What is the ownership model? Topic creation, ACLs, quota changes, and incident response should have a clear platform owner.
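The five questions above can be captured as a small design record that is reviewed before a topic is created. This is a minimal sketch: the `TopicDesign` fields and the `unanswered_questions` helper are hypothetical, and the only real Kafka name involved is the `retention.ms` topic configuration, which takes milliseconds.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TopicDesign:
    """Hypothetical design record; one field per design question."""
    name: str
    partition_key: Optional[str] = None
    retention_days: Optional[int] = None
    schema_subject: Optional[str] = None
    dead_letter_topic: Optional[str] = None
    owner: Optional[str] = None


def unanswered_questions(design: TopicDesign) -> list[str]:
    """List the design questions that still have no answer."""
    answers = {
        "partition key": design.partition_key,
        "retention contract": design.retention_days,
        "schema boundary": design.schema_subject,
        "dead-letter strategy": design.dead_letter_topic,
        "ownership model": design.owner,
    }
    return [q for q, a in answers.items() if a is None or a == ""]


def retention_ms(design: TopicDesign) -> int:
    """Convert the retention contract into a value for retention.ms."""
    if design.retention_days is None:
        raise ValueError("retention contract has not been decided")
    return design.retention_days * 24 * 60 * 60 * 1000
```

A design with only a name returns all five questions, which is a reasonable gate for keeping a topic out of the test environment until the answers exist.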
This is also where teams often reconsider the infrastructure behind Kafka. Google Cloud offers Managed Service for Apache Kafka for teams that want a managed Kafka service on GCP. Self-managed Kafka on GKE or Compute Engine still gives deep control, but it brings broker sizing, disk planning, partition reassignment, and upgrade work back to the platform team. AutoMQ fits a different target profile: Kafka-compatible cloud-native streaming with compute and storage separated. Application teams keep Kafka clients and ecosystem expectations, while durable log data is backed by cloud object storage and brokers are less tied to local disk state. That matters when the migration goal is Kafka compatibility without making traditional Kafka operations the new bottleneck.
Data, Schema, and Connector Mapping
Record shape is where small assumptions become production bugs. Pub/Sub messages include data and attributes, plus a service-assigned message ID and publish time and an optional ordering key. Kafka records include key, value, headers, timestamp, and partition metadata. The most important decision is the Kafka key because it controls partition placement, ordering, and sometimes compaction. If the Pub/Sub message has an ordering key, the migration often uses it as the Kafka record key. Schema mapping deserves the same discipline: Pub/Sub schema features and Kafka registry workflows may both validate data, but they enforce compatibility at different points.
A practical mapping document should include:
| Field | Pub/Sub source | Kafka target | Validation |
|---|---|---|---|
| Message body | data | Record value | Decode, checksum, and schema validation. |
| Entity key | Ordering key or attribute | Record key | Per-key ordering and partition distribution test. |
| Attributes | Message attributes | Headers or value fields | Downstream filtering and routing test. |
| Event time | Attribute or payload field | Timestamp or value field | Late-event and windowing test. |
| Retry state | Delivery attempt / DLQ context | Header or DLQ topic | Failure replay and alert test. |
Connector configuration should implement the mapping; it should not be the only place the mapping exists.
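Writing the mapping as a pure function is one way to keep it reviewable and testable outside the connector. The sketch below follows the table above, with several assumptions called out: the `PubSubMessage` and `KafkaRecord` classes are simplified stand-ins rather than client library types, the fallback key attribute `entity_id` and the `pubsub_message_id` header name are hypothetical, and the Pub/Sub publish time is used as the record timestamp only because no event-time field is defined here.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class PubSubMessage:
    """Simplified stand-in for a received Pub/Sub message."""
    data: bytes
    attributes: dict          # str -> str
    message_id: str
    publish_time: datetime    # timezone-aware
    ordering_key: str = ""


@dataclass
class KafkaRecord:
    """Simplified stand-in for a Kafka producer record."""
    key: Optional[bytes]
    value: bytes
    headers: list             # list of (str, bytes) tuples
    timestamp_ms: int


def to_kafka_record(msg: PubSubMessage, key_attribute: str = "entity_id") -> KafkaRecord:
    """Map one Pub/Sub message to a Kafka record under the stated assumptions."""
    # Entity key: prefer the ordering key, fall back to an attribute.
    key_source = msg.ordering_key or msg.attributes.get(key_attribute, "")
    key = key_source.encode("utf-8") if key_source else None
    # Attributes become headers; sorting keeps the output deterministic.
    headers = [(k, v.encode("utf-8")) for k, v in sorted(msg.attributes.items())]
    # Carry the source message ID so consumers can detect duplicates.
    headers.append(("pubsub_message_id", msg.message_id.encode("utf-8")))
    timestamp_ms = int(msg.publish_time.timestamp() * 1000)
    return KafkaRecord(key=key, value=msg.data, headers=headers, timestamp_ms=timestamp_ms)
```

A record with no ordering key and no fallback attribute ends up with a `None` key. That is worth catching in validation, because unkeyed records get no per-entity ordering in Kafka.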
Validation and Cutover Checklist
Migration confidence comes from repeated evidence, not from a single successful test message. You want to prove that the Kafka stream is equivalent where it must be equivalent and intentionally different where the architecture changed.
Start with shadow consumption. Let the bridge or migrated producers write to Kafka while existing Pub/Sub consumers continue serving production. Kafka consumers should run in observe-only mode, compare payloads, check ordering, measure lag, and validate schemas.
Cutover should include these checks:
- Completeness. Every Pub/Sub message in the validation window has a corresponding Kafka record, accounting for intentional filters.
- Ordering. Events with the same ordering key arrive in the expected order under normal load, retry, and backfill.
- Replay. A Kafka consumer group can replay from the required point without disrupting live consumers.
- Latency and lag. Kafka producer latency, consumer lag, and bridge backlog stay within the workload's SLO.
- Duplicate handling. Consumers tolerate duplicates or the pipeline enforces idempotency where required.
- Rollback. The team can move traffic back to Pub/Sub or continue the bridge without data loss beyond the documented failure model.
The cleanest cutovers have boring dashboards. Pub/Sub backlog drains as expected. Kafka consumer lag stays explainable. Error topics stay quiet. Rollback remains unused because everyone knows exactly how it would work.
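The completeness, ordering, and duplicate checks can be automated as a comparison over a validation window. The sketch below assumes each event has been reduced to a `(message_id, ordering_key, sequence)` tuple, where `sequence` is a hypothetical per-key counter set by the producer. Real pipelines would need to derive an equivalent from the payload, and intentional filters would have to be applied to the Pub/Sub side first.

```python
from collections import Counter


def compare_streams(pubsub_events, kafka_events) -> dict:
    """Compare a Pub/Sub sample with the Kafka records read in arrival order.

    Each event is a (message_id, ordering_key, sequence) tuple.
    """
    source_ids = {message_id for message_id, _, _ in pubsub_events}
    kafka_counts = Counter(message_id for message_id, _, _ in kafka_events)

    missing = sorted(source_ids - set(kafka_counts))
    unexpected = sorted(set(kafka_counts) - source_ids)
    duplicates = sorted(i for i, n in kafka_counts.items() if n > 1)

    # Ordering: within one key, sequence numbers must never go backwards.
    highest_seen = {}
    out_of_order = []
    for message_id, key, sequence in kafka_events:
        if key in highest_seen and sequence < highest_seen[key]:
            out_of_order.append(message_id)
        highest_seen[key] = max(sequence, highest_seen.get(key, sequence))

    return {
        "missing": missing,
        "unexpected": unexpected,
        "duplicates": duplicates,
        "out_of_order": out_of_order,
    }
```

An exact repeat of a record counts as a duplicate but not as an ordering violation, which keeps the two findings separate when deciding whether consumers need idempotency or the key design needs rework.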
Choosing the Right Kafka Target on GCP
Once the team decides to move to Kafka, the next question is what kind of Kafka should receive the traffic. On Google Cloud, the practical choices are managed Kafka, self-managed Kafka, and Kafka-compatible cloud-native systems.
| Target | Best fit | Trade-off to test |
|---|---|---|
| Google Managed Service for Apache Kafka | Teams that want Kafka on GCP with managed cluster lifecycle. | Check feature coverage, networking, pricing, quotas, and operational boundaries. |
| Self-managed Kafka on GKE or Compute Engine | Teams that need maximum control over brokers, storage, versions, and plugins. | Plan for upgrades, disks, replication, rebalancing, incident response, and capacity. |
| AutoMQ | Teams that need Kafka compatibility but want cloud-native storage and elastic operations. | Validate workload behavior, deployment model, object storage configuration, and ecosystem compatibility. |
This comparison should stay grounded in the migration goal. If Kafka is only needed for one downstream connector, a small managed target may be enough. If Kafka becomes the platform event backbone, storage architecture, scale operations, replay economics, and cross-cloud portability deserve more attention. The target is the new contract your application teams will depend on.
References
- Google Cloud Pub/Sub message ordering
- Google Cloud Pub/Sub replay and seek overview
- Google Cloud Pub/Sub schemas
- Google Cloud Managed Service for Apache Kafka overview
- Apache Kafka documentation
- Kafka Connect documentation
- AutoMQ architecture overview
- AutoMQ deployment on Google Cloud GKE
- AutoMQ GitHub repository
FAQ
Can Pub/Sub be migrated to Kafka with no application changes?
Sometimes a bridge can keep publishers unchanged for a while, but consumers usually need some level of change. Pub/Sub subscriptions, acknowledgements, ordering, retry behavior, and filtering do not map perfectly to Kafka consumer groups, offsets, partitions, and topics.
Is Pub/Sub ordering the same as Kafka partition ordering?
No. Pub/Sub ordered delivery uses ordering keys when message ordering is enabled, while Kafka ordering is guaranteed within a partition. The same entity key may work in both systems, but you still need to test hot keys, retry behavior, replay, and partition distribution.
Should each Pub/Sub subscription become a Kafka consumer group?
Often, but not always. A simple subscription may map to a consumer group. A subscription with filtering, custom retry behavior, dead-lettering, or different retention expectations may require a processor, a separate Kafka topic, or additional consumer logic.
What is the safest migration pattern?
A bridge followed by shadow validation is often the safest starting point because it lets Kafka consumers observe production-like traffic before cutover. The final state is usually cleaner when producers write directly to Kafka and the bridge is removed after validation.
Where does AutoMQ fit in a Pub/Sub to Kafka migration?
AutoMQ fits when the team has decided Kafka compatibility is required but does not want broker-local storage operations to define the new platform. It keeps the Kafka protocol and ecosystem orientation while using a cloud-native, object-storage-backed architecture.
If your migration is really about keeping Kafka semantics while escaping disk-bound operations, review the AutoMQ architecture overview and compare the target architecture against managed Kafka and self-managed Kafka on GCP before cutover.