Overview

AutoMQ Table Topic provides seamless Iceberg integration, allowing streaming data to flow into the data lake for analysis and querying. This article introduces the technical architecture, principles, and core concepts related to the Table Topic feature.

Architecture and Benefits

The AutoMQ Table Topic feature uses a built-in stream-table architecture to ingest streaming data into the data lake in real time and make it available for query and analysis in one place. The technical architecture is outlined as follows:

Compared to traditional ETL data lake solutions, Table Topic offers the following advantages:

  • Out-of-the-box: AutoMQ Table Topic can be activated with a single click, after which data streams continuously into Iceberg tables for real-time analysis.

  • ETL-Free (Extract, Transform, Load): Traditional data lake ingestion methods often require tools like Kafka Connect or Flink. Table Topic eliminates the need for such ETL pipelines, significantly cutting costs and reducing operational complexity.

  • Auto-Scaling: AutoMQ features a stateless and elastic architecture, enabling brokers to effortlessly scale up or down with dynamic partition reassignment. Table Topic utilizes this framework to efficiently manage data ingestion rates ranging from hundreds of MiB/s to several GiB/s.

  • Seamless Integration with AWS S3 Table: Table Topic integrates with S3 Table and relies on its Data Catalog and maintenance functions, such as compaction, snapshot management, and unreferenced file deletion. This integration also enables large-scale data analytics via AWS Athena.
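
As an illustration of the Athena path mentioned above, the following Python sketch builds the request for Athena's StartQueryExecution API and submits it with boto3. It is a minimal sketch under stated assumptions, not AutoMQ's own tooling: the database name, table name, and result location are hypothetical placeholders, and how a Table Topic maps to a database and table in your catalog depends on your deployment.

```python
import re

# Conservative identifier pattern (an assumption for this sketch) so that
# names can be placed safely inside double-quoted SQL identifiers.
_IDENTIFIER = re.compile(r"^[A-Za-z0-9_]+$")


def build_athena_request(database, table, output_location, limit=10):
    """Build keyword arguments for Athena's StartQueryExecution call."""
    for name in (database, table):
        if not _IDENTIFIER.match(name):
            raise ValueError(f"unsupported identifier: {name!r}")
    if limit <= 0:
        raise ValueError("limit must be positive")
    query = f'SELECT * FROM "{database}"."{table}" LIMIT {limit}'
    return {
        "QueryString": query,
        "QueryExecutionContext": {"Database": database},
        "ResultConfiguration": {"OutputLocation": output_location},
    }


def run_query(request):
    """Submit the query and return its execution ID (needs AWS credentials)."""
    import boto3  # imported lazily so the builder works without boto3 installed

    client = boto3.client("athena")
    response = client.start_query_execution(**request)
    return response["QueryExecutionId"]


if __name__ == "__main__":
    # Hypothetical names: replace with the database, table, and bucket
    # that exist in your own account.
    request = build_athena_request(
        "lake_db", "orders", "s3://example-bucket/athena-results/", limit=5
    )
    print(request["QueryString"])
```

Keeping the request builder separate from the boto3 call makes the query text easy to inspect before anything is sent to AWS.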

Constraints and Limitations

To utilize AutoMQ's Table Topic functionality, the following conditions must be met:

  • Version Constraint: Requires AutoMQ version >= 1.5.

  • Feature Constraint: The Table Topic feature must be configured when the AutoMQ cluster is deployed. It cannot be enabled afterwards on a cluster that was deployed without it.

  • Catalog Requirements: To use Table Topic, users must provide an externally accessible Data Catalog service. Currently, AutoMQ supports the following Catalog types:

    • AWS S3 Table Catalog: AWS S3 offers a new Table Bucket with integrated Catalog management and data lake storage.

    • AWS Glue Catalog: AWS Glue provides cloud-based unified Catalog management, supporting integration with query tools such as Athena.

    • Hive Catalog: Users can either set up their own Hive Metastore Catalog within the Hadoop ecosystem or choose a managed EMR HMS service offered by a cloud provider.
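
The conditions above can be expressed as a small pre-flight check. The Python sketch below is illustrative only: the function names and the catalog labels ("s3table", "glue", "hive") are hypothetical and are not AutoMQ configuration values. Only the rules themselves (version >= 1.5, feature configured at deployment, one of the three listed Catalog types) come from the list above.

```python
MIN_VERSION = (1, 5)

# Hypothetical labels for the three Catalog types listed above.
SUPPORTED_CATALOGS = {"s3table", "glue", "hive"}


def parse_version(text):
    """Turn a string such as 'v1.6.1-rc1' into a tuple of ints, (1, 6, 1)."""
    core = text.strip().lstrip("v").split("-", 1)[0]
    return tuple(int(part) for part in core.split("."))


def check_prerequisites(version, enabled_at_deploy, catalog_type):
    """Return a list of unmet conditions; an empty list means all are met."""
    problems = []
    # Compare major and minor only, so 1.5.0 and 1.5 are treated alike.
    if parse_version(version)[:2] < MIN_VERSION:
        problems.append(f"AutoMQ {version} is older than 1.5")
    if not enabled_at_deploy:
        problems.append("Table Topic was not configured at cluster deployment")
    if catalog_type not in SUPPORTED_CATALOGS:
        problems.append(f"unsupported catalog type: {catalog_type}")
    return problems


if __name__ == "__main__":
    for issue in check_prerequisites("1.4.2", False, "nessie"):
        print("unmet:", issue)
```

Returning a list of problems rather than failing on the first one lets an operator see every unmet condition in a single pass.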

Procedure

To use the AutoMQ Table Topic feature, users should follow these configuration steps: