Introduction
MinIO is a high-performance, distributed object storage system designed to run on standard hardware, offering an excellent cost-performance ratio and broad applicability. It is well suited to high-performance private clouds: its simple yet effective architecture delivers strong performance alongside comprehensive object storage capabilities. MinIO serves traditional workloads such as secondary storage, disaster recovery, and archiving, as well as emerging scenarios like machine learning, big data, private cloud, and hybrid cloud.
Because MinIO is fully compatible with the S3 API, you can deploy an AutoMQ cluster in your private data center to build a streaming system that is fully Kafka-compatible while offering better cost efficiency, extreme scalability, and single-digit-millisecond latency. This article walks you through deploying an AutoMQ cluster on MinIO in your private data center.
Prerequisites
A functioning MinIO environment. If you have not yet set up MinIO, follow the official website guidance for installation.
Prepare five hosts for the AutoMQ cluster. We recommend Linux amd64 hosts with 2 cores and 16GB of RAM, each with two virtual storage volumes. For example:
Role | IP | Node ID | System volume | Data volume |
---|---|---|---|---|
CONTROLLER | 192.168.0.1 | 0 | EBS 20GB | EBS 20GB |
CONTROLLER | 192.168.0.2 | 1 | EBS 20GB | EBS 20GB |
CONTROLLER | 192.168.0.3 | 2 | EBS 20GB | EBS 20GB |
BROKER | 192.168.0.4 | 3 | EBS 20GB | EBS 20GB |
BROKER | 192.168.0.5 | 4 | EBS 20GB | EBS 20GB |
Tips:
Ensure these machines are located within the same subnet and have the capability to communicate with each other.
In non-production settings, it's possible to deploy only one Controller, which will also act as a Broker by default.
Download the latest official binary installation package from AutoMQ Github Releases to install AutoMQ.
Create two object storage buckets on MinIO, named automq-data and automq-ops.
- Configure the AWS CLI with the necessary Access Key and Secret Key by setting environment variables.
export AWS_ACCESS_KEY_ID=minioadmin
export AWS_SECRET_ACCESS_KEY=minio-secret-key-CHANGE-ME
- Create the S3 buckets using the AWS CLI.
aws s3api create-bucket --bucket automq-data --endpoint=http://10.1.0.240:9000
aws s3api create-bucket --bucket automq-ops --endpoint=http://10.1.0.240:9000
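As a quick sanity check, you can list the buckets that now exist on MinIO. The endpoint below is the example address used in this guide; substitute your own:

```shell
# List bucket names on the MinIO endpoint to confirm the buckets exist.
# The endpoint address is the example value from this guide.
aws s3api list-buckets --endpoint-url=http://10.1.0.240:9000 \
  --query 'Buckets[].Name'
```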
Install and initiate the AutoMQ cluster.
Step 1: Generate an S3 URL.
AutoMQ includes the automq-kafka-admin.sh tool for quickly starting AutoMQ. Simply provide an S3 URL containing the required endpoint and authentication information to start AutoMQ in one step, with no need to manually generate a cluster ID or format storage.
bin/automq-kafka-admin.sh generate-s3-url \
--s3-access-key=xxx \
--s3-secret-key=yyy \
--s3-region=cn-northwest-1 \
--s3-endpoint=s3.cn-northwest-1.amazonaws.com.cn \
--s3-data-bucket=automq-data \
--s3-ops-bucket=automq-ops
When using MinIO, the following parameter values can be used to generate the S3 URL.
Parameter Name | Default value | Description |
---|---|---|
--s3-access-key | minioadmin | Environment variable MINIO_ROOT_USER |
--s3-secret-key | minio-secret-key-CHANGE-ME | Environment variable MINIO_ROOT_PASSWORD |
--s3-region | us-west-2 | This parameter has no effect in MinIO and can be assigned any value, such as us-west-2. |
--s3-endpoint | http://10.1.0.240:9000 | The endpoint can be retrieved by executing the command `sudo systemctl status minio.service`. |
--s3-data-bucket | automq-data | - |
--s3-ops-bucket | automq-ops | - |
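Substituting the values from the table into the command above, a MinIO-specific invocation would look like this. The credentials and endpoint are the example values from this guide; replace them with your own:

```shell
bin/automq-kafka-admin.sh generate-s3-url \
  --s3-access-key=minioadmin \
  --s3-secret-key=minio-secret-key-CHANGE-ME \
  --s3-region=us-west-2 \
  --s3-endpoint=http://10.1.0.240:9000 \
  --s3-data-bucket=automq-data \
  --s3-ops-bucket=automq-ops
```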
Output result
Once the command is executed, the process will automatically move through the following stages:
Probe the core S3 features with the supplied accessKey and secretKey to verify compatibility between AutoMQ and the object storage.
Generate an s3url from the credential and endpoint details.
Print the startup command for AutoMQ based on the s3url. In that command, replace `--controller-list` and `--broker-list` with the actual CONTROLLER and BROKER addresses for your deployment.
Here are the outcomes:
############ Ping s3 ########################
[ OK ] Write s3 object
[ OK ] Read s3 object
[ OK ] Delete s3 object
[ OK ] Write s3 object
[ OK ] Upload s3 multipart object
[ OK ] Read s3 multipart object
[ OK ] Delete s3 object
############ String of s3url ################
Your s3url is:
s3://s3.cn-northwest-1.amazonaws.com.cn?s3-access-key=xxx&s3-secret-key=yyy&s3-region=cn-northwest-1&s3-endpoint-protocol=https&s3-data-bucket=automq-data&s3-path-style=false&s3-ops-bucket=automq-ops&cluster-id=40ErA_nGQ_qNPDz0uodTEA
############ Usage of s3url ################
To start AutoMQ, generate the start commandline using s3url.
bin/automq-kafka-admin.sh generate-start-command \
--s3-url="s3://s3.cn-northwest-1.amazonaws.com.cn?s3-access-key=XXX&s3-secret-key=YYY&s3-region=cn-northwest-1&s3-endpoint-protocol=https&s3-data-bucket=automq-data&s3-path-style=false&s3-ops-bucket=automq-ops&cluster-id=40ErA_nGQ_qNPDz0uodTEA" \
--controller-list="192.168.0.1:9093;192.168.0.2:9093;192.168.0.3:9093" \
--broker-list="192.168.0.4:9092;192.168.0.5:9092"
TIPS: Please replace the controller-list and broker-list with your actual IP addresses.
Step 2: Create a list of startup commands
In the command generated in the previous step, replace the --controller-list and --broker-list parameters with your host information: specifically, the IP addresses of the 3 CONTROLLER hosts and 2 BROKER hosts prepared earlier, with the default ports 9093 (CONTROLLER) and 9092 (BROKER).
bin/automq-kafka-admin.sh generate-start-command \
--s3-url="s3://s3.cn-northwest-1.amazonaws.com.cn?s3-access-key=XXX&s3-secret-key=YYY&s3-region=cn-northwest-1&s3-endpoint-protocol=https&s3-data-bucket=automq-data&s3-path-style=false&s3-ops-bucket=automq-ops&cluster-id=40ErA_nGQ_qNPDz0uodTEA" \
--controller-list="192.168.0.1:9093;192.168.0.2:9093;192.168.0.3:9093" \
--broker-list="192.168.0.4:9092;192.168.0.5:9092"
Parameter Description
Parameter Name | Required | Description |
---|---|---|
--s3-url | Yes | Generated by the command-line tool bin/automq-kafka-admin.sh generate-s3-url; it encodes authentication details, the cluster ID, and other parameters |
--controller-list | Yes | At least one address is required, serving as the IP:port list of the CONTROLLER hosts. The format is IP1:PORT1;IP2:PORT2;IP3:PORT3 |
--broker-list | Yes | At least one address is required, serving as the IP:port list of the BROKER hosts. The format is IP1:PORT1;IP2:PORT2;IP3:PORT3 |
--controller-only-mode | No | Determines whether the CONTROLLER node serves only the CONTROLLER role. Defaults to false, meaning the deployed CONTROLLER node also acts as a BROKER. |
Output result
After running the command, it will produce the necessary commands to initiate AutoMQ.
############ Start Commandline ##############
To start an AutoMQ Kafka server, please navigate to the directory where your AutoMQ tgz file is located and run the following command.
Before running the command, make sure that Java 17 is installed on your host. You can verify the Java version by executing 'java -version'.
bin/kafka-server-start.sh --s3-url="s3://s3.cn-northwest-1.amazonaws.com.cn?s3-access-key=XXX&s3-secret-key=YYY&s3-region=cn-northwest-1&s3-endpoint-protocol=https&s3-data-bucket=automq-data&s3-path-style=false&s3-ops-bucket=automq-ops&cluster-id=40ErA_nGQ_qNPDz0uodTEA" --override process.roles=broker,controller --override node.id=0 --override controller.quorum.voters=0@192.168.0.1:9093,1@192.168.0.2:9093,2@192.168.0.3:9093 --override listeners=PLAINTEXT://192.168.0.1:9092,CONTROLLER://192.168.0.1:9093 --override advertised.listeners=PLAINTEXT://192.168.0.1:9092
bin/kafka-server-start.sh --s3-url="s3://s3.cn-northwest-1.amazonaws.com.cn?s3-access-key=XXX&s3-secret-key=YYY&s3-region=cn-northwest-1&s3-endpoint-protocol=https&s3-data-bucket=automq-data&s3-path-style=false&s3-ops-bucket=automq-ops&cluster-id=40ErA_nGQ_qNPDz0uodTEA" --override process.roles=broker,controller --override node.id=1 --override controller.quorum.voters=0@192.168.0.1:9093,1@192.168.0.2:9093,2@192.168.0.3:9093 --override listeners=PLAINTEXT://192.168.0.2:9092,CONTROLLER://192.168.0.2:9093 --override advertised.listeners=PLAINTEXT://192.168.0.2:9092
bin/kafka-server-start.sh --s3-url="s3://s3.cn-northwest-1.amazonaws.com.cn?s3-access-key=XXX&s3-secret-key=YYY&s3-region=cn-northwest-1&s3-endpoint-protocol=https&s3-data-bucket=automq-data&s3-path-style=false&s3-ops-bucket=automq-ops&cluster-id=40ErA_nGQ_qNPDz0uodTEA" --override process.roles=broker,controller --override node.id=2 --override controller.quorum.voters=0@192.168.0.1:9093,1@192.168.0.2:9093,2@192.168.0.3:9093 --override listeners=PLAINTEXT://192.168.0.3:9092,CONTROLLER://192.168.0.3:9093 --override advertised.listeners=PLAINTEXT://192.168.0.3:9092
bin/kafka-server-start.sh --s3-url="s3://s3.cn-northwest-1.amazonaws.com.cn?s3-access-key=XXX&s3-secret-key=YYY&s3-region=cn-northwest-1&s3-endpoint-protocol=https&s3-data-bucket=automq-data&s3-path-style=false&s3-ops-bucket=automq-ops&cluster-id=40ErA_nGQ_qNPDz0uodTEA" --override process.roles=broker --override node.id=3 --override controller.quorum.voters=0@192.168.0.1:9093,1@192.168.0.2:9093,2@192.168.0.3:9093 --override listeners=PLAINTEXT://192.168.0.4:9092 --override advertised.listeners=PLAINTEXT://192.168.0.4:9092
bin/kafka-server-start.sh --s3-url="s3://s3.cn-northwest-1.amazonaws.com.cn?s3-access-key=XXX&s3-secret-key=YYY&s3-region=cn-northwest-1&s3-endpoint-protocol=https&s3-data-bucket=automq-data&s3-path-style=false&s3-ops-bucket=automq-ops&cluster-id=40ErA_nGQ_qNPDz0uodTEA" --override process.roles=broker --override node.id=4 --override controller.quorum.voters=0@192.168.0.1:9093,1@192.168.0.2:9093,2@192.168.0.3:9093 --override listeners=PLAINTEXT://192.168.0.5:9092 --override advertised.listeners=PLAINTEXT://192.168.0.5:9092
TIPS: Start controllers first and then the brokers.
By default, the node.id automatically starts at 0.
Step 3: Start AutoMQ
To start the cluster, execute the list of commands generated in the previous step on the corresponding CONTROLLER or BROKER hosts in sequence. For example, to start the first CONTROLLER process on 192.168.0.1, run the first command from the generated startup command list.
bin/kafka-server-start.sh --s3-url="s3://s3.cn-northwest-1.amazonaws.com.cn?s3-access-key=XXX&s3-secret-key=YYY&s3-region=cn-northwest-1&s3-endpoint-protocol=https&s3-data-bucket=automq-data&s3-path-style=false&s3-ops-bucket=automq-ops&cluster-id=40ErA_nGQ_qNPDz0uodTEA" --override process.roles=broker,controller --override node.id=0 --override controller.quorum.voters=0@192.168.0.1:9093,1@192.168.0.2:9093,2@192.168.0.3:9093 --override listeners=PLAINTEXT://192.168.0.1:9092,CONTROLLER://192.168.0.1:9093 --override advertised.listeners=PLAINTEXT://192.168.0.1:9092
Parameter Description
When using the startup command, unspecified parameters will automatically use the default configuration of Apache Kafka®. For new parameters introduced by AutoMQ, AutoMQ's default values will be applied. To modify these defaults, append --override key=value parameters to the end of the command.
Parameter Name | Mandatory | Instructions |
---|---|---|
s3-url | Yes | Generated by the bin/automq-kafka-admin.sh generate-s3-url command-line tool, which includes information such as identity authentication and cluster ID. |
process.roles | Yes | The options are CONTROLLER or BROKER. If a host serves as both CONTROLLER and BROKER, set the value to CONTROLLER,BROKER. |
node.id | Yes | An integer that uniquely identifies the BROKER or CONTROLLER within the Kafka cluster; it must be unique across the cluster. |
controller.quorum.voters | Yes | The hosts participating in the KRaft election, including node id, IP, and port, for example: 0@192.168.0.1:9093,1@192.168.0.2:9093,2@192.168.0.3:9093. |
listeners | Yes | The IP and port to listen on. |
advertised.listeners | Yes | The access address the BROKER advertises to clients. |
log.dirs | No | Directory for storing KRAFT and BROKER metadata. |
s3.wal.path | No | In production, it is recommended to store AutoMQ WAL data on a separately mounted raw device volume. AutoMQ supports writing data to raw devices, which reduces latency and yields better performance. Make sure to configure the correct path for WAL data. |
autobalancer.controller.enable | No | Defaults to false, meaning traffic rebalancing is disabled. When enabled, AutoMQ's auto balancer component automatically migrates partitions to keep overall traffic balanced. |
Tips: If you need to enable continuous traffic rebalancing or run Example: Self-Balancing When Cluster Nodes Change, it is recommended to explicitly specify the parameter --override autobalancer.controller.enable=true when starting the Controller.
Running in the Background
To operate in background mode, append the following snippet at the end of your command:
command > /dev/null 2>&1 &
Data volume path
Use the Linux `lsblk` command to inspect local data volumes; any unpartitioned block device qualifies as a data volume. In the example below, vdb is an unpartitioned raw block device.
vda 253:0 0 20G 0 disk
├─vda1 253:1 0 2M 0 part
├─vda2 253:2 0 200M 0 part /boot/efi
└─vda3 253:3 0 19.8G 0 part /
vdb 253:16 0 20G 0 disk
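If the host has many devices, a small filter over `lsblk` output can list only the unpartitioned disks, i.e. candidates for the data volume. This is a sketch; it assumes machine-readable `lsblk -rno NAME,TYPE,PKNAME` output on stdin:

```shell
# Print block devices of TYPE "disk" that have no partitions.
# Feed it machine-readable lsblk output: lsblk -rno NAME,TYPE,PKNAME
list_raw_disks() {
  awk '
    $2 == "disk" { disks[$1] = 1 }       # remember every whole disk
    $2 == "part" { has_part[$3] = 1 }    # mark its parent as partitioned
    END { for (d in disks) if (!(d in has_part)) print d }
  '
}

# Usage on a live host:
#   lsblk -rno NAME,TYPE,PKNAME | list_raw_disks
```

Run against the example layout above, this prints only vdb, since vda carries three partitions.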
By default, AutoMQ stores metadata and WAL data in the /tmp directory. It's crucial to recognize that if the /tmp directory is mounted on tmpfs, it is unsuitable for production environments.
For production or formal testing environments, adjust the configuration so that the metadata directory (`log.dirs`) and the WAL data directory (`s3.wal.path`, which can point at a raw device for data writes) are moved to suitable locations, for example:
bin/kafka-server-start.sh ...\
--override s3.telemetry.metrics.exporter.type=prometheus \
--override s3.metrics.exporter.prom.host=0.0.0.0 \
--override s3.metrics.exporter.prom.port=9090 \
--override log.dirs=/root/kraft-logs \
--override s3.wal.path=/dev/vdb \
> /dev/null 2>&1 &
Tips:
Please change s3.wal.path to the actual local raw device name. To set up AutoMQ's Write-Ahead-Log (WAL) on local SSD storage, you need to ensure that the specified file path is on an SSD disk with more than 10GB of available space. For instance, --override s3.wal.path=/home/admin/automq-wal.
When deploying AutoMQ in a private data center for production environments, ensure the reliability of the local SSD. For example, you can use RAID technology.
You have now set up an AutoMQ cluster on MinIO: a cost-effective, low-latency Kafka cluster with near-instant elasticity. To learn more about AutoMQ features such as near-instant partition reassignment and self-balancing, please consult the official examples.
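To confirm the cluster is actually serving traffic, a quick smoke test with the standard Kafka CLI tools bundled with AutoMQ can round-trip a message. The broker address below is an example IP from the preparation phase, and the topic name is arbitrary:

```shell
# Create a topic, produce one message, then consume it back.
bin/kafka-topics.sh --create --topic smoke-test --partitions 1 \
  --bootstrap-server 192.168.0.4:9092
echo "hello automq" | bin/kafka-console-producer.sh --topic smoke-test \
  --bootstrap-server 192.168.0.4:9092
bin/kafka-console-consumer.sh --topic smoke-test --from-beginning \
  --max-messages 1 --bootstrap-server 192.168.0.4:9092
```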