Broker and Controller Configuration
This document explains the configuration parameters involved in deploying AutoMQ, covering their definitions, descriptions, value ranges, and constraints, to help developers make custom adjustments in production environments.
AutoMQ implements storage-compute separation on top of object storage and is fully compatible with Apache Kafka. For Kafka's functional configurations (e.g., ACL, network), refer to the official Apache Kafka configuration documentation. This document lists only the configuration parameters for the new storage module that AutoMQ adds.
Public Configuration
elasticstream.enable
Item | Description |
---|---|
Configuration Description | Whether to enable AutoMQ. This parameter must be set to true. |
Value Type | boolean |
Default Value | false |
Valid Input Range | N/A |
Importance Level | High, requires careful configuration |
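A minimal sketch of this setting in a broker's server.properties:

```properties
# AutoMQ requires the elastic stream storage engine to be enabled
elasticstream.enable=true
```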
automq.zonerouter.channels
By enabling cross-AZ (Availability Zone) request routing, you can reduce cross-AZ data transmission and decrease traffic costs. For more details, please refer to Overview▸.
Item | Description |
---|---|
Configuration Description | Configuration for Inter-Zone channel. By configuring Inter-Zone routing components, the cost caused by cross-zone traffic can be significantly reduced. Currently, only object storage is supported. The format is: 0@s3://$bucket?region=$region[&batchInterval=250][&maxBytesInBatch=8388608] |
Value Type | string |
Default Value | null |
Valid Input Range | N/A |
Importance Level | High, requires careful configuration |
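A sketch of the channel URI following the format above; the bucket name and region are placeholders, and the two batch parameters are optional:

```properties
# Route cross-AZ produce traffic through an object storage channel
# (bucket/region are hypothetical; batchInterval/maxBytesInBatch are optional)
automq.zonerouter.channels=0@s3://my-router-bucket?region=us-east-1&batchInterval=250&maxBytesInBatch=8388608
```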
S3Stream Related Configuration
s3.data.buckets
Item | Description |
---|---|
Configuration Description | The URI for data plane object storage. The format is: 0@s3://$bucket?region=$region[&endpoint=$endpoint][&pathStyle=$enablePathStyle][&authType=$authType][&accessKey=$accessKey][&secretKey=$secretKey][&checksumAlgorithm=$checksumAlgorithm]. For configurations from different vendors, see: Object Storage Configuration▸. |
Value Type | string |
Default Value | null |
Valid Input Range | N/A |
Importance Level | High, requires careful configuration |
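An illustrative value following the format above; the bucket, region, and endpoint are placeholders (see Object Storage Configuration▸ for vendor-specific parameters):

```properties
# Data plane bucket (hypothetical bucket, region, and endpoint)
s3.data.buckets=0@s3://automq-data?region=us-east-1&endpoint=https://s3.us-east-1.amazonaws.com
```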
s3.ops.buckets
Item | Description |
---|---|
Configuration Description | The URI for control plane object storage. The format is: 1@s3://$bucket?region=$region[&endpoint=$endpoint][&pathStyle=$enablePathStyle][&authType=$authType][&accessKey=$accessKey][&secretKey=$secretKey][&checksumAlgorithm=$checksumAlgorithm]. For configurations from different vendors, see: Object Storage Configuration▸. |
Value Type | string |
Default Value | null |
Valid Input Range | N/A |
Importance Level | High, requires careful configuration |
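An illustrative value with a placeholder bucket and region:

```properties
# Control plane bucket; note the leading 1@ index, unlike s3.data.buckets
s3.ops.buckets=1@s3://automq-ops?region=us-east-1
```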
s3.wal.path
Item | Description |
---|---|
Configuration Description | The path of the block storage device (or the object storage URI) used for the local WAL. The format is: 0@s3://$bucket?region=$region[&batchInterval=250][&maxBytesInBatch=8388608]. For configurations from different vendors, refer to: Object Storage Configuration▸ |
Value Type | string |
Default Value | null |
Valid Input Range | N/A |
Importance Level | High, requires careful configuration |
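An illustrative value following the format above; the bucket and region are placeholders, and the batch parameters are optional:

```properties
# WAL storage URI (hypothetical bucket/region)
s3.wal.path=0@s3://automq-wal?region=us-east-1
```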
s3.wal.cache.size
Item | Description |
---|---|
Configuration Description | The WAL (Write-Ahead Log) cache is a FIFO (first-in-first-out) queue holding data that has not yet been uploaded to object storage, plus data that has been uploaded but not yet evicted from the cache. When not-yet-uploaded data fills the entire cache capacity, the storage layer applies backpressure to subsequent requests until the upload completes. By default, a reasonable value is chosen based on available memory. |
Value Type | long, measured in bytes |
Default Value | -1, automatically set by the program to a suitable parameter value |
Valid Input Range | [1, ...] |
Importance Level | Low, can be set loosely |
s3.wal.upload.threshold
Item | Description |
---|---|
Configuration Description | The threshold that triggers WAL uploads to object storage. The configured value must be less than s3.wal.cache.size. The larger the value, the better the data aggregation and the lower the metadata storage cost. By default, a reasonable value is chosen based on available memory. |
Value Type | long, measured in bytes |
Default Value | -1, automatically set by the program to a suitable parameter value |
Valid Input Range | [1, ...] |
Importance Level | Low, can be set loosely |
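A sketch of explicit sizing for the two WAL settings above; the byte values are illustrative, and the defaults (-1) normally suffice:

```properties
# 2 GiB WAL cache; the upload threshold must stay below the cache size
s3.wal.cache.size=2147483648
s3.wal.upload.threshold=536870912
```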
s3.block.cache.size
Item | Description |
---|---|
Configuration Description | The size of the block cache. The block cache stores cold data read from object storage. For good cold-read performance, it is recommended to set this to more than 4 MiB × the number of partitions expected to serve cold reads concurrently. By default, a reasonable value is chosen based on available memory. |
Value Type | long, measured in bytes |
Default Value | -1, automatically set by the program to a suitable parameter value |
Valid Input Range | [1, ...] |
Importance Level | Low, can be set loosely |
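Applying the sizing rule above, a broker expecting roughly 50 partitions to serve cold reads concurrently might reserve 50 × 4 MiB (value illustrative):

```properties
# 50 concurrent cold-read partitions * 4 MiB = 200 MiB
s3.block.cache.size=209715200
```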
s3.stream.object.compaction.interval.minutes
Item | Description |
---|---|
Configuration Description | The interval between stream object compactions. The larger the interval, the lower the API call cost, but the larger the metadata storage. |
Value Type | int, measured in minutes |
Default Value | 30 |
Valid Input Range | [1, ...] |
Importance Level | Low, can be set loosely |
s3.stream.object.compaction.max.size.bytes
Item | Description |
---|---|
Configuration Description | The maximum size of an object that stream object compaction is allowed to produce. The larger this value, the higher the API call cost, but the smaller the metadata storage. |
Value Type | long, measured in bytes |
Default Value | 1073741824 |
Valid Input Range | [1, ...] |
Importance Level | Low, can be set loosely |
s3.stream.set.object.compaction.interval.minutes
Item | Description |
---|---|
Configuration Description | The interval between stream set object compactions. The smaller this value, the smaller the metadata storage and the sooner data is compacted; however, the resulting stream objects will undergo stream object compaction more frequently. |
Value Type | int, measured in minutes |
Default Value | 20 |
Valid Input Range | [1, ...] |
Importance Level | Low, can be set loosely |
s3.stream.set.object.compaction.cache.size
Item | Description |
---|---|
Configuration Description | The amount of memory available to the stream set object compaction process. The larger this value, the lower the API call cost. |
Value Type | long, in bytes |
Default Value | 209715200 |
Valid Input Range | [1048576, ...] |
Importance Level | Low, can be set loosely |
s3.stream.set.object.compaction.stream.split.size
Item | Description |
---|---|
Configuration Description | During stream set object compaction, if the data volume of a single stream exceeds this threshold, that stream's data is split out and written into separate stream objects. The smaller this value, the earlier data is split out of the stream set object, lowering the API call cost of subsequent stream object compactions but raising the API call cost of the split itself. |
Value Type | long, measured in bytes |
Default Value | 8388608 |
Valid Input Range | [1, ...] |
Importance Level | Low, can be set loosely |
s3.network.baseline.bandwidth
Item | Description |
---|---|
Configuration Description | The total network bandwidth budget for object storage requests, used to prevent stream set object compaction and catch-up reads from crowding out normal read/write traffic. Produce and consume traffic also draw from the inbound and outbound budgets, respectively. For instance, if this value is 100 MB/s and normal read/write traffic uses 80 MB/s, stream set object compaction can use the remaining 20 MB/s. |
Value Type | long, measured in byte/s |
Default Value | 104857600 |
Valid Input Range | [1, ...] |
Importance Level | Low, can be set loosely |
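As a sketch, a broker with more network headroom than the default might raise the baseline (value illustrative):

```properties
# 200 MB/s total; with 150 MB/s of normal read/write traffic,
# compaction and catch-up reads share the remaining 50 MB/s
s3.network.baseline.bandwidth=209715200
```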
s3.stream.allocator.policy
Item | Description |
---|---|
Configuration Description | S3Stream memory allocator policy. Note that when configured to use DIRECT memory, the heap size in the virtual machine options (e.g., -Xmx) and the direct memory size (e.g., -XX:MaxDirectMemorySize) need to be adjusted. You can set them via the environment variable KAFKA_HEAP_OPTS. |
Value Type | string |
Default Value | POOLED_HEAP |
Valid Input Range | POOLED_HEAP, POOLED_DIRECT |
Importance Level | Low, can be set loosely |
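A sketch of switching to direct memory; the JVM sizes in the comment are illustrative and must be tuned to the instance:

```properties
# Use pooled direct memory; shift memory from heap to direct accordingly,
# e.g. KAFKA_HEAP_OPTS="-Xms4g -Xmx4g -XX:MaxDirectMemorySize=6g"
s3.stream.allocator.policy=POOLED_DIRECT
```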
s3.telemetry.metrics.level
Item | Description |
---|---|
Configuration Description | Sets the level of Metrics logging. The "INFO" level includes metrics that most users should be concerned with, such as throughput and latency of common stream operations. The "DEBUG" level includes detailed metrics useful for diagnostics, such as latencies at various stages when writing to underlying block devices. |
Value Type | string |
Default Value | INFO |
Valid Input Range | INFO, DEBUG |
Importance Level | Low, can be set loosely |
s3.telemetry.exporter.report.interval.ms
Item | Description |
---|---|
Configuration Description | Sets the interval for Metrics export. |
Value Type | int, in milliseconds |
Default Value | 30000 |
Valid Input Range | N/A |
Importance Level | Low, can be set loosely |
s3.telemetry.metrics.base.labels
Item | Description |
---|---|
Configuration Description | Multi-dimensional labels for metrics, used to attach static multi-dimensional labels to all monitoring metrics, enabling categorization, aggregation, and fine-grained analysis of metrics. The format is: key1=value1,key2=value2. |
Value Type | string |
Default Value | null |
Valid Input Range | N/A |
Importance Level | Medium, requires careful configuration |
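An illustrative value following the key1=value1,key2=value2 format above; the label keys and values are placeholders:

```properties
# Static labels attached to every exported metric
s3.telemetry.metrics.base.labels=env=prod,cluster=automq-demo
```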
s3.telemetry.metrics.exporter.uri
Item | Description |
---|---|
Configuration Description | The export URI for Metrics. The format is: $type://?$param1=$value1&$param2=$value2. Supported types are prometheus and otlp. Prometheus format: prometheus://?host=$hostname&port=$port. OTLP format: otlp://?endpoint=$endpoint&protocol=$protocol&compression=$compression. |
Value Type | string |
Default Value | null |
Valid Input Range | N/A |
Importance Level | High, requires careful configuration |
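Two illustrative values following the formats above; the host, port, collector endpoint, and OTLP options are placeholders:

```properties
# Expose a Prometheus pull endpoint (hypothetical host/port)
s3.telemetry.metrics.exporter.uri=prometheus://?host=0.0.0.0&port=9090
# Or push over OTLP instead (hypothetical endpoint/options):
# s3.telemetry.metrics.exporter.uri=otlp://?endpoint=http://otel-collector:4317&protocol=grpc&compression=gzip
```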
Persistent Data Rebalancing Configuration
metric.reporters
Item | Description |
---|---|
Configuration Description | A list of classes to use as metrics reporters. Implementing the org.apache.kafka.common.metrics.MetricsReporter interface allows plugging in classes that are notified of new metric creation. JmxReporter is always included to register JMX statistics. To enable AutoBalancing, metric.reporters must include kafka.autobalancer.metricsreporter.AutoBalancerMetricsReporter. |
Value Type | list |
Default Value | "" |
Valid Input Range | N/A |
Importance Level | Low, can be set loosely |
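As described above, AutoBalancer requires its reporter to be registered:

```properties
# Required for AutoBalancing; additional reporters may be comma-separated
metric.reporters=kafka.autobalancer.metricsreporter.AutoBalancerMetricsReporter
```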
autobalancer.reporter.metrics.reporting.interval.ms
Item | Description |
---|---|
Configuration Description | Interval for reporting data by Metrics Reporter. |
Value Type | long, unit is milliseconds |
Default Value | 10000 |
Valid Input Range | [1000, ...] |
Importance Level | High, requires careful configuration |
autobalancer.controller.enable
Item | Description |
---|---|
Configuration Description | Whether to enable automatic rebalancing. |
Value Type | boolean |
Default Value | false |
Valid Input Range | N/A |
Importance Level | High, requires careful configuration |
autobalancer.controller.anomaly.detect.interval.ms
Item | Description |
---|---|
Configuration Description | The minimum interval for the Controller to check if a data rebalance is needed. The actual time of the next rebalance also depends on the number of partitions that have been reassigned. Reducing the minimum check interval increases the sensitivity of data rebalancing. This value should be greater than the broker metrics reporting interval to prevent the controller from missing recent reassignment results. |
Value Type | long, unit is milliseconds |
Default Value | 60000 |
Valid Input Range | [1, ...] |
Importance Level | High, requires careful configuration |
autobalancer.controller.exclude.topics
Item | Description |
---|---|
Configuration Description | List of Topics to be excluded from data rebalancing. |
Value Type | list |
Default Value | "" |
Valid Input Range | N/A |
Importance Level | High, requires careful configuration |
autobalancer.controller.exclude.broker.ids
Item | Description |
---|---|
Configuration Description | List of Broker Ids that should be excluded from data rebalancing. |
Value Type | list |
Default Value | "" |
Valid Input Range | N/A |
Importance Level | High, requires careful configuration |
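Putting the AutoBalancer settings above together, a minimal sketch; the excluded topic name and broker ID are placeholders:

```properties
metric.reporters=kafka.autobalancer.metricsreporter.AutoBalancerMetricsReporter
autobalancer.controller.enable=true
# Must exceed the metrics reporting interval (default 10000 ms)
autobalancer.controller.anomaly.detect.interval.ms=60000
# Keep latency-sensitive topics and dedicated brokers out of rebalancing
autobalancer.controller.exclude.topics=orders-critical
autobalancer.controller.exclude.broker.ids=0
```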
Table Topic
Table Topic is a core feature designed by AutoMQ for modern data lake architectures, primarily aimed at achieving seamless integration between streaming data and static data lakes. This architectural innovation addresses traditional challenges such as the separation of streaming and batch processing, complex ETL processes, and high costs. For more details, please see Overview▸.
automq.table.topic.enable
Item | Description |
---|---|
Configuration Description | Enables or disables AutoMQ Table Topic. When enabled, an Iceberg table is created to store the Table Topic's data. |
Value Type | boolean |
Default Value | false |
Valid Input Range | N/A |
Importance Level | High, requires careful configuration |
automq.table.topic.commit.interval.ms
Item | Description |
---|---|
Configuration Description | The commit interval for Table Topic data. A shorter interval makes data visible in the table sooner but increases processing cost, and vice versa. |
Value Type | long, in milliseconds |
Default Value | 300000 |
Valid Input Range | Positive integers |
Importance Level | High, requires careful configuration |
automq.table.topic.upsert.enable
Item | Description |
---|---|
Configuration Description | Determines whether the Upsert feature for Table Topic is enabled. When enabled, the system will automatically discern whether to insert a new record or update an existing one based on the primary key. |
Value Type | boolean |
Default Value | false |
Valid Input Range | N/A |
Importance Level | High, requires careful configuration |
automq.table.topic.partition.by
Item | Description |
---|---|
Configuration Description | Defines the partitioning rules for a Table Topic, partitioning data by field or function. |
Value Type | string |
Default Value | null |
Valid Input Range | N/A |
Importance Level | High, requires careful configuration |
automq.table.topic.id.columns
Item | Description |
---|---|
Configuration Description | Specifies the primary key column(s) that uniquely identify a record in the table. |
Value Type | string |
Default Value | null |
Valid Input Range | N/A |
Importance Level | High, requires careful configuration |
automq.table.topic.cdc.field
Item | Description |
---|---|
Configuration Description | Specifies the field name that records the CDC (Change Data Capture) operation type. This is used to identify the type of database change operation, with a value as a single character: I , U , or D , corresponding to insert, update, and delete actions respectively. |
Value Type | string |
Default Value | null |
Valid Input Range | N/A |
Importance Level | High, requires careful configuration |
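Combining the upsert-related settings above into a sketch; the column and field names are placeholders, and the exact syntax for listing multiple id columns should be checked against the AutoMQ documentation:

```properties
automq.table.topic.upsert.enable=true
# Primary key column used to match records for update/delete
automq.table.topic.id.columns=id
# Field whose value (I/U/D) carries the CDC operation type
automq.table.topic.cdc.field=op
```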
automq.table.topic.schema.type
Item | Description |
---|---|
Configuration Description | The schema type. Supports two modes: schemaless (message content is not parsed) and schema (the message value's schema must be predefined in the schema registry and is used to write the data into Iceberg). |
Value Type | string |
Default Value | schemaless |
Valid Input Range | N/A |
Importance Level | High, requires careful configuration |
automq.table.topic.namespace
Item | Description |
---|---|
Configuration Description | Namespace for the Table Topic |
Value Type | string |
Default Value | null |
Valid Input Range | N/A |
Importance Level | High, requires careful configuration |
automq.table.topic.schema.registry.url
Item | Description |
---|---|
Configuration Description | The service URL of the Schema Registry |
Value Type | string |
Default Value | null |
Valid Input Range | N/A |
Importance Level | High, requires careful configuration |
automq.table.topic.catalog.type
Item | Description |
---|---|
Configuration Description | Specifies the type of Catalog. Currently supports five types: rest, glue, nessie, tablebucket, hive |
Value Type | string |
Default Value | null |
Valid Input Range | N/A |
Importance Level | High, requires careful configuration |
AutoMQ Table Topic supports various types of Iceberg Catalogs, each needing distinct configuration parameters. For additional information, refer to:
Rest
automq.table.topic.catalog.uri
Item | Description |
---|---|
Configuration Description | Specifies the Iceberg REST Catalog service address |
Value Type | string |
Default Value | null |
Valid Input Range | N/A |
Importance Level | High, requires careful configuration |
automq.table.topic.catalog.warehouse
Item | Description |
---|---|
Configuration Description | Specifies the S3 path of the Catalog warehouse |
Value Type | string |
Default Value | null |
Valid Input Range | N/A |
Importance Level | High, requires careful configuration |
automq.table.topic.catalog.oauth2-server-uri
Item | Description |
---|---|
Configuration Description | Specifies the address of the OAuth2 authorization server |
Value Type | string |
Default Value | null |
Valid Input Range | N/A |
Importance Level | Optional |
automq.table.topic.catalog.credential
Item | Description |
---|---|
Configuration Description | Specifies the credential used to authenticate with the Catalog |
Value Type | string |
Default Value | null |
Valid Input Range | N/A |
Importance Level | Optional |
automq.table.topic.catalog.token
Item | Description |
---|---|
Configuration Description | Specifies the token for the Catalog |
Value Type | string |
Default Value | null |
Valid Input Range | N/A |
Importance Level | Optional |
automq.table.topic.catalog.scope
Item | Description |
---|---|
Configuration Description | Specifies the Catalog authorization scope |
Value Type | string |
Default Value | null |
Valid Input Range | N/A |
Importance Level | Optional |
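A sketch wiring a REST catalog with the parameters above; the endpoints, warehouse path, and credentials are all placeholders:

```properties
automq.table.topic.enable=true
automq.table.topic.catalog.type=rest
automq.table.topic.catalog.uri=http://iceberg-rest:8181
automq.table.topic.catalog.warehouse=s3://my-warehouse/iceberg
# Optional OAuth2 parameters:
# automq.table.topic.catalog.oauth2-server-uri=http://auth.example.com/oauth/tokens
# automq.table.topic.catalog.credential=client-id:client-secret
# automq.table.topic.catalog.scope=catalog
```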
Glue
automq.table.topic.catalog.warehouse
Item | Description |
---|---|
Configuration Description | Specifies the S3 path for the Iceberg repository |
Value Type | string |
Default Value | null |
Valid Input Range | N/A |
Importance Level | High, requires careful configuration |
Nessie
automq.table.topic.catalog.uri
Item | Description |
---|---|
Configuration Description | Specifies the Iceberg Catalog service address |
Value Type | string |
Default Value | null |
Valid Input Range | N/A |
Importance Level | High, requires careful configuration |
automq.table.topic.catalog.warehouse
Item | Description |
---|---|
Configuration Description | Specifies the S3 path for the Iceberg repository |
Value Type | string |
Default Value | null |
Valid Input Range | N/A |
Importance Level | High, requires careful configuration |