Broker and Controller Configuration
This document explains the configuration parameters involved in deploying AutoMQ, covering their definitions, descriptions, value ranges, and constraints, to help developers make custom adjustments in production environments.
AutoMQ implements storage-compute separation on top of object storage and is fully compatible with Apache Kafka. For Kafka's functional configurations (e.g., ACL, network), refer to the official Apache Kafka configuration documentation. This document lists only the configuration parameters for the new storage module that AutoMQ adds.
Public Configuration
elasticstream.enable
Item | Description |
---|---|
Configuration Description | Whether to enable AutoMQ. This parameter must be set to true. |
Value Type | boolean |
Default Value | false |
Valid Input Range | N/A |
Importance Level | High, requires careful configuration |
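A minimal sketch of this setting in a broker's server.properties:

```properties
# AutoMQ requires the elastic stream storage engine to be enabled
elasticstream.enable=true
```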
automq.zonerouter.channels
By enabling cross-AZ (Availability Zone) request routing, you can reduce cross-AZ data transmission and decrease traffic costs. For more details, please refer to Overview▸.
Item | Description |
---|---|
Configuration Description | Configuration for Inter-Zone channel. By configuring Inter-Zone routing components, the cost caused by cross-zone traffic can be significantly reduced. Currently, only object storage is supported. The format is: 0@s3://$bucket?region=$region[&batchInterval=250][&maxBytesInBatch=8388608] |
Value Type | string |
Default Value | null |
Valid Input Range | N/A |
Importance Level | High, requires careful configuration |
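A sketch of the channel URI following the format above; the bucket name and region are placeholders, and the two batch parameters are optional:

```properties
# Route cross-AZ produce traffic through an object storage channel
# (bucket/region are hypothetical; batchInterval/maxBytesInBatch are optional)
automq.zonerouter.channels=0@s3://my-router-bucket?region=us-east-1&batchInterval=250&maxBytesInBatch=8388608
```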
S3Stream Related Configuration
s3.data.buckets
Item | Description |
---|---|
Configuration Description | The URI for data plane object storage. The format is: 0@s3://$bucket?region=$region[&endpoint=$endpoint][&pathStyle=$enablePathStyle][&authType=$authType][&accessKey=$accessKey][&secretKey=$secretKey][&checksumAlgorithm=$checksumAlgorithm]. For configurations from different vendors, see: Object Storage Configuration▸. |
Value Type | string |
Default Value | null |
Valid Input Range | N/A |
Importance Level | High, requires careful configuration |
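An illustrative value following the format above; the bucket, region, and endpoint are placeholders (see Object Storage Configuration▸ for vendor-specific parameters):

```properties
# Data plane bucket (hypothetical bucket, region, and endpoint)
s3.data.buckets=0@s3://automq-data?region=us-east-1&endpoint=https://s3.us-east-1.amazonaws.com
```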
s3.ops.buckets
Item | Description |
---|---|
Configuration Description | The URI for control plane object storage. The format is: 1@s3://$bucket?region=$region[&endpoint=$endpoint][&pathStyle=$enablePathStyle][&authType=$authType][&accessKey=$accessKey][&secretKey=$secretKey][&checksumAlgorithm=$checksumAlgorithm]. For configurations from different vendors, see: Object Storage Configuration▸. |
Value Type | string |
Default Value | null |
Valid Input Range | N/A |
Importance Level | High, requires careful configuration |
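An illustrative value with a placeholder bucket and region:

```properties
# Control plane bucket; note the leading 1@ index, unlike s3.data.buckets
s3.ops.buckets=1@s3://automq-ops?region=us-east-1
```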
s3.wal.path
Item | Description |
---|---|
Configuration Description | The path of the block storage device (or the object storage URI) used for the local WAL. The format is: 0@s3://$bucket?region=$region[&batchInterval=250][&maxBytesInBatch=8388608]. For configurations from different vendors, refer to: Object Storage Configuration▸ |
Value Type | string |
Default Value | null |
Valid Input Range | N/A |
Importance Level | High, requires careful configuration |
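An illustrative value following the format above; the bucket and region are placeholders, and the batch parameters are optional:

```properties
# WAL storage URI (hypothetical bucket/region)
s3.wal.path=0@s3://automq-wal?region=us-east-1
```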
s3.wal.cache.size
Item | Description |
---|---|
Configuration Description | The WAL (Write-Ahead Log) cache is a FIFO (first-in-first-out) queue holding data that has not yet been uploaded to object storage, plus data that has been uploaded but not yet evicted from the cache. When not-yet-uploaded data fills the entire cache capacity, the storage layer applies backpressure to subsequent requests until the upload completes. By default, a reasonable value is chosen based on available memory. |
Value Type | long, measured in bytes |
Default Value | -1, automatically set by the program to a suitable parameter value |
Valid Input Range | [1, ...] |
Importance Level | Low, can be set loosely |
s3.wal.upload.threshold
Item | Description |
---|---|
Configuration Description | The threshold that triggers WAL uploads to object storage. The configured value must be less than s3.wal.cache.size. The larger the value, the better the data aggregation and the lower the metadata storage cost. By default, a reasonable value is chosen based on available memory. |
Value Type | long, measured in bytes |
Default Value | -1, automatically set by the program to a suitable parameter value |
Valid Input Range | [1, ...] |
Importance Level | Low, can be set loosely |
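A sketch of explicit sizing for the two WAL settings above; the byte values are illustrative, and the defaults (-1) normally suffice:

```properties
# 2 GiB WAL cache; the upload threshold must stay below the cache size
s3.wal.cache.size=2147483648
s3.wal.upload.threshold=536870912
```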
s3.block.cache.size
Item | Description |
---|---|
Configuration Description | The size of the block cache. The block cache stores cold data read from object storage. For good cold-read performance, it is recommended to set this to more than 4 MiB × the number of partitions expected to serve cold reads concurrently. By default, a reasonable value is chosen based on available memory. |
Value Type | long, measured in bytes |
Default Value | -1, automatically set by the program to a suitable parameter value |
Valid Input Range | [1, ...] |
Importance Level | Low, can be set loosely |
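Applying the sizing rule above, a broker expecting roughly 50 partitions to serve cold reads concurrently might reserve 50 × 4 MiB (value illustrative):

```properties
# 50 concurrent cold-read partitions * 4 MiB = 200 MiB
s3.block.cache.size=209715200
```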
s3.stream.object.compaction.interval.minutes
Item | Description |
---|---|
Configuration Description | The interval between stream object compactions. The larger the interval, the lower the API call cost, but the larger the metadata storage. |
Value Type | int, measured in minutes |
Default Value | 30 |
Valid Input Range | [1, ...] |
Importance Level | Low, can be set loosely |
s3.stream.object.compaction.max.size.bytes
Item | Description |
---|---|
Configuration Description | The maximum size of an object that stream object compaction is allowed to produce. The larger this value, the higher the API call cost, but the smaller the metadata storage. |
Value Type | long, measured in bytes |
Default Value | 1073741824 |
Valid Input Range | [1, ...] |
Importance Level | Low, can be set loosely |
s3.stream.set.object.compaction.interval.minutes
Item | Description |
---|---|
Configuration Description | The interval between stream set object compactions. The smaller this value, the smaller the metadata storage and the sooner data is compacted; however, the resulting stream objects will undergo stream object compaction more frequently. |
Value Type | int, measured in minutes |
Default Value | 20 |
Valid Input Range | [1, ...] |
Importance Level | Low, can be set loosely |
s3.stream.set.object.compaction.cache.size
Item | Description |
---|---|
Configuration Description | The amount of memory available to the stream set object compaction process. The larger this value, the lower the API call cost. |
Value Type | long, in bytes |
Default Value | 209715200 |
Valid Input Range | [1048576, ...] |
Importance Level | Low, can be set loosely |
s3.stream.set.object.compaction.stream.split.size
Item | Description |
---|---|
Configuration Description | During stream set object compaction, if the data volume of a single stream exceeds this threshold, that stream's data is split out and written into separate stream objects. The smaller this value, the earlier data is split out of the stream set object, lowering the API call cost of subsequent stream object compactions but raising the API call cost of the split itself. |
Value Type | long, measured in bytes |
Default Value | 8388608 |
Valid Input Range | [1, ...] |
Importance Level | Low, can be set loosely |
s3.network.baseline.bandwidth
Item | Description |
---|---|
Configuration Description | The total network bandwidth budget for object storage requests, used to prevent stream set object compaction and catch-up reads from crowding out normal read/write traffic. Produce and consume traffic also draw from the inbound and outbound budgets, respectively. For instance, if this value is 100 MB/s and normal read/write traffic uses 80 MB/s, stream set object compaction can use the remaining 20 MB/s. |
Value Type | long, measured in byte/s |
Default Value | 104857600 |
Valid Input Range | [1, ...] |
Importance Level | Low, can be set loosely |
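As a sketch, a broker with more network headroom than the default might raise the baseline (value illustrative):

```properties
# 200 MB/s total; with 150 MB/s of normal read/write traffic,
# compaction and catch-up reads share the remaining 50 MB/s
s3.network.baseline.bandwidth=209715200
```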
s3.stream.allocator.policy
Item | Description |
---|---|
Configuration Description | S3Stream memory allocator policy. Note that when configured to use DIRECT memory, the heap size in the virtual machine options (e.g., -Xmx) and the direct memory size (e.g., -XX:MaxDirectMemorySize) need to be adjusted. You can set them via the environment variable KAFKA_HEAP_OPTS. |
Value Type | string |
Default Value | POOLED_HEAP |
Valid Input Range | POOLED_HEAP, POOLED_DIRECT |
Importance Level | Low, can be set loosely |
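A sketch of switching to direct memory; the JVM sizes in the comment are illustrative and must be tuned to the instance:

```properties
# Use pooled direct memory; shift memory from heap to direct accordingly,
# e.g. KAFKA_HEAP_OPTS="-Xms4g -Xmx4g -XX:MaxDirectMemorySize=6g"
s3.stream.allocator.policy=POOLED_DIRECT
```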
s3.telemetry.metrics.level
Item | Description |
---|---|
Configuration Description | Sets the level of Metrics logging. The "INFO" level includes metrics that most users should be concerned with, such as throughput and latency of common stream operations. The "DEBUG" level includes detailed metrics useful for diagnostics, such as latencies at various stages when writing to underlying block devices. |
Value Type | string |
Default Value | INFO |
Valid Input Range | INFO, DEBUG |
Importance Level | Low, can be set loosely |
s3.telemetry.exporter.report.interval.ms
Item | Description |
---|---|
Configuration Description | Sets the interval for Metrics export. |
Value Type | int, in milliseconds |
Default Value | 30000 |
Valid Input Range | N/A |
Importance Level | Low, can be set loosely |
s3.telemetry.metrics.base.labels
Item | Description |
---|---|
Configuration Description | Multi-dimensional labels for metrics, used to attach static multi-dimensional labels to all monitoring metrics, enabling categorization, aggregation, and fine-grained analysis of metrics. The format is: key1=value1,key2=value2. |
Value Type | string |
Default Value | null |
Valid Input Range | N/A |
Importance Level | Medium, requires careful configuration |
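An illustrative value following the key1=value1,key2=value2 format above; the label keys and values are placeholders:

```properties
# Static labels attached to every exported metric
s3.telemetry.metrics.base.labels=env=prod,cluster=automq-demo
```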
s3.telemetry.metrics.exporter.uri
Item | Description |
---|---|
Configuration Description | The export URI for Metrics. The format is: $type://?$param1=$value1&$param2=$value2. Supported types are prometheus and otlp. Prometheus format: prometheus://?host=$hostname&port=$port. OTLP format: otlp://?endpoint=$endpoint&protocol=$protocol&compression=$compression. |
Value Type | string |
Default Value | null |
Valid Input Range | N/A |
Importance Level | High, requires careful configuration |
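Two illustrative values following the formats above; the host, port, collector endpoint, and OTLP options are placeholders:

```properties
# Expose a Prometheus pull endpoint (hypothetical host/port)
s3.telemetry.metrics.exporter.uri=prometheus://?host=0.0.0.0&port=9090
# Or push over OTLP instead (hypothetical endpoint/options):
# s3.telemetry.metrics.exporter.uri=otlp://?endpoint=http://otel-collector:4317&protocol=grpc&compression=gzip
```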
Persistent Data Rebalancing Configuration
metric.reporters
Item | Description |
---|---|
Configuration Description | A list of classes to use as metrics reporters. Implementing the org.apache.kafka.common.metrics.MetricsReporter interface allows plugging in classes that are notified of new metric creation. JmxReporter is always included to register JMX statistics. To enable AutoBalancing, metric.reporters must include kafka.autobalancer.metricsreporter.AutoBalancerMetricsReporter. |
Value Type | list |
Default Value | "" |
Valid Input Range | N/A |
Importance Level | Low, can be set loosely |
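As described above, AutoBalancer requires its reporter to be registered:

```properties
# Required for AutoBalancing; additional reporters may be comma-separated
metric.reporters=kafka.autobalancer.metricsreporter.AutoBalancerMetricsReporter
```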
autobalancer.reporter.metrics.reporting.interval.ms
Item | Description |
---|---|
Configuration Description | Interval for reporting data by Metrics Reporter. |
Value Type | long, unit is milliseconds |
Default Value | 10000 |
Valid Input Range | [1000, ...] |
Importance Level | High, requires careful configuration |
autobalancer.controller.enable
Item | Description |
---|---|
Configuration Description | Whether to enable automatic rebalancing. |
Value Type | boolean |
Default Value | false |
Valid Input Range | N/A |
Importance Level | High, requires careful configuration |
autobalancer.controller.anomaly.detect.interval.ms
Item | Description |
---|---|
Configuration Description | The minimum interval for the Controller to check if a data rebalance is needed. The actual time of the next rebalance also depends on the number of partitions that have been reassigned. Reducing the minimum check interval increases the sensitivity of data rebalancing. This value should be greater than the broker metrics reporting interval to prevent the controller from missing recent reassignment results. |
Value Type | long, unit is milliseconds |
Default Value | 60000 |
Valid Input Range | [1, ...] |
Importance Level | High, requires careful configuration |
autobalancer.controller.exclude.topics
Item | Description |
---|---|
Configuration Description | List of Topics to be excluded from data rebalancing. |
Value Type | list |
Default Value | "" |
Valid Input Range | N/A |
Importance Level | High, requires careful configuration |
autobalancer.controller.exclude.broker.ids
Item | Description |
---|---|
Configuration Description | List of Broker Ids that should be excluded from data rebalancing. |
Value Type | list |
Default Value | "" |
Valid Input Range | N/A |
Importance Level | High, requires careful configuration |
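Putting the AutoBalancer settings above together, a minimal sketch; the excluded topic name and broker ID are placeholders:

```properties
metric.reporters=kafka.autobalancer.metricsreporter.AutoBalancerMetricsReporter
autobalancer.controller.enable=true
# Must exceed the metrics reporting interval (default 10000 ms)
autobalancer.controller.anomaly.detect.interval.ms=60000
# Keep latency-sensitive topics and dedicated brokers out of rebalancing
autobalancer.controller.exclude.topics=orders-critical
autobalancer.controller.exclude.broker.ids=0
```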
Table Topic
Table Topic is a core feature designed by AutoMQ for modern data lake architectures, primarily aimed at achieving seamless integration between streaming data and static data lakes. This architectural innovation addresses traditional challenges such as the separation of streaming and batch processing, complex ETL processes, and high costs. For more details, please see Overview▸.
automq.table.topic.enable
Item | Description |
---|---|
Configuration Description | Enables or disables AutoMQ Table Topic. When enabled, an Iceberg table is created to store the Table Topic's data. |
Value Type | boolean |
Default Value | false |
Valid Input Range | N/A |
Importance Level | High, requires careful configuration |
automq.table.topic.commit.interval.ms
Item | Description |
---|---|
Configuration Description | The commit interval for Table Topic data. A shorter interval makes data visible in the table sooner but increases processing cost, and vice versa. |
Value Type | long, in milliseconds |
Default Value | 300000 |
Valid Input Range | Positive integers |
Importance Level | High, requires careful configuration |
automq.table.topic.upsert.enable
Item | Description |
---|---|
Configuration Description | Determines whether the Upsert feature for Table Topic is enabled. When enabled, the system will automatically discern whether to insert a new record or update an existing one based on the primary key. |
Value Type | boolean |
Default Value | false |
Valid Input Range | N/A |
Importance Level | High, requires careful configuration |
automq.table.topic.partition.by
Item | Description |
---|---|
Configuration Description | Defines the partitioning rules for a Table Topic, partitioning data by field or function. |
Value Type | string |
Default Value | null |
Valid Input Range | N/A |
Importance Level | High, requires careful configuration |
automq.table.topic.id.columns
Item | Description |
---|---|
Configuration Description | Specifies the primary key column(s) that uniquely identify a record in the table. |
Value Type | string |
Default Value | null |
Valid Input Range | N/A |
Importance Level | High, requires careful configuration |
automq.table.topic.cdc.field
Item | Description |
---|---|
Configuration Description | Specifies the field name that records the CDC (Change Data Capture) operation type. This is used to identify the type of database change operation, with a value as a single character: I , U , or D , corresponding to insert, update, and delete actions respectively. |
Value Type | string |
Default Value | null |
Valid Input Range | N/A |
Importance Level | High, requires careful configuration |
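Combining the upsert-related settings above into a sketch; the column and field names are placeholders, and the exact syntax for listing multiple id columns should be checked against the AutoMQ documentation:

```properties
automq.table.topic.upsert.enable=true
# Primary key column used to match records for update/delete
automq.table.topic.id.columns=id
# Field whose value (I/U/D) carries the CDC operation type
automq.table.topic.cdc.field=op
```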
automq.table.topic.schema.type
Item | Description |
---|---|
Configuration Description | The schema type. Supports two modes: schemaless (message content is not parsed) and schema (the message value's schema must be predefined in the schema registry and is used to write the data into Iceberg). |
Value Type | string |
Default Value | schemaless |
Valid Input Range | N/A |
Importance Level | High, requires careful configuration |
automq.table.topic.namespace
Item | Description |
---|---|
Configuration Description | Namespace for the Table Topic |
Value Type | string |
Default Value | null |
Valid Input Range | N/A |
Importance Level | High, requires careful configuration |
automq.table.topic.schema.registry.url
Item | Description |
---|---|
Configuration Description | The service URL of the Schema Registry |
Value Type | string |
Default Value | null |
Valid Input Range | N/A |
Importance Level | High, requires careful configuration |
automq.table.topic.catalog.type
Item | Description |
---|---|
Configuration Description | Specifies the type of Catalog. Currently supports five types: rest, glue, nessie, tablebucket, hive |
Value Type | string |
Default Value | null |
Valid Input Range | N/A |
Importance Level | High, requires careful configuration |
AutoMQ Table Topic supports various types of Iceberg Catalogs, each needing distinct configuration parameters. For additional information, refer to:
Rest
automq.table.topic.catalog.uri
Item | Description |
---|---|
Configuration Description | Specifies the Iceberg REST Catalog service address |
Value Type | string |
Default Value | null |
Valid Input Range | N/A |
Importance Level | High, requires careful configuration |
automq.table.topic.catalog.warehouse
Item | Description |
---|---|
Configuration Description | Specifies the S3 path of the Catalog warehouse |
Value Type | string |
Default Value | null |
Valid Input Range | N/A |
Importance Level | High, requires careful configuration |
automq.table.topic.catalog.oauth2-server-uri
Item | Description |
---|---|
Configuration Description | Specifies the address of the OAuth2 authorization server |
Value Type | string |
Default Value | null |
Valid Input Range | N/A |
Importance Level | Optional |
automq.table.topic.catalog.credential
Item | Description |
---|---|
Configuration Description | Specifies the credential used to authenticate with the Catalog |
Value Type | string |
Default Value | null |
Valid Input Range | N/A |
Importance Level | Optional |
automq.table.topic.catalog.token
Item | Description |
---|---|
Configuration Description | Specifies the token for the Catalog |
Value Type | string |
Default Value | null |
Valid Input Range | N/A |
Importance Level | Optional |
automq.table.topic.catalog.scope
Item | Description |
---|---|
Configuration Description | Specifies the Catalog authorization scope |
Value Type | string |
Default Value | null |
Valid Input Range | N/A |
Importance Level | Optional |
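A sketch wiring a REST catalog with the parameters above; the endpoints, warehouse path, and credentials are all placeholders:

```properties
automq.table.topic.enable=true
automq.table.topic.catalog.type=rest
automq.table.topic.catalog.uri=http://iceberg-rest:8181
automq.table.topic.catalog.warehouse=s3://my-warehouse/iceberg
# Optional OAuth2 parameters:
# automq.table.topic.catalog.oauth2-server-uri=http://auth.example.com/oauth/tokens
# automq.table.topic.catalog.credential=client-id:client-secret
# automq.table.topic.catalog.scope=catalog
```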
Glue
automq.table.topic.catalog.warehouse
Item | Description |
---|---|
Configuration Description | Specifies the S3 path for the Iceberg repository |
Value Type | string |
Default Value | null |
Valid Input Range | N/A |
Importance Level | High, requires careful configuration |
Nessie
automq.table.topic.catalog.uri
Item | Description |
---|---|
Configuration Description | Specifies the Iceberg Catalog service address |
Value Type | string |
Default Value | null |
Valid Input Range | N/A |
Importance Level | High, requires careful configuration |
automq.table.topic.catalog.warehouse
Item | Description |
---|---|
Configuration Description | Specifies the S3 path for the Iceberg repository |
Value Type | string |
Default Value | null |
Valid Input Range | N/A |
Importance Level | High, requires careful configuration |