
Deploy a Multi-Node Cluster on Kubernetes

This topic describes how to deploy a multi-node AutoMQ cluster on Kubernetes, so that you can validate cluster-related features such as partition reassignment and automatic data balancing in a development environment.

In addition to the Kubernetes deployment described here, you can refer to the related documentation to explore other deployment options.

Deploying AutoMQ and tuning its parameters for production workloads is relatively complex. You can contact the AutoMQ team through this form to receive assistance and best practices.

If you prefer to avoid installation and deployment altogether, you can try the fully managed cloud service provided by the AutoMQ team through the following link. All cloud marketplaces currently offer a free 2-week trial.

Prerequisites

This document provides examples for deploying a 5-node AutoMQ cluster. In this setup, 3 nodes will run both the Controller and Broker, while the other 2 nodes will run only the Broker.

Please ensure the following conditions are met:

  • Prepare a Kubernetes cluster with at least 5 nodes. Network-optimized virtual machines with 4 cores and 16 GB of RAM are recommended for the Pods created in the following steps.

  • The Helm Chart requires Helm v3.8.0 or later. Refer to the Helm Chart Quickstart.

  • Use the Bitnami Helm repository. AutoMQ is fully compatible with Bitnami's Kafka Helm Chart, so you can customize the AutoMQ Kubernetes cluster through Bitnami's values.yaml.

  • Prepare 2 object storage buckets: one for storing message data and another for storing system logs and metric data.
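
The two buckets can be created in advance with your cloud provider's tooling. A hypothetical example using the AWS CLI (the bucket names and region below are placeholders, not values from this guide):

# Placeholder bucket names and region; adjust to your environment.
aws s3api create-bucket --bucket automq-data --region us-east-1
aws s3api create-bucket --bucket automq-ops --region us-east-1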

Deploy AutoMQ Cluster

Step 1: Edit the Configuration File

Create an empty automq-values.yaml file and add the required parameters. You can refer to demo-values.yaml for an example, and check README.md for details on each parameter.

  • Replace ${ops-bucket}, ${region}, and ${endpoint} with the specific values of your object storage. For details, refer to Object Storage Configuration▸.

  • Replace ${access-key} and ${secret-key} with the actual values. You can also choose alternative authorization methods such as IAM Role.

  • For production-grade deployments, it is recommended to run AutoMQ on dedicated nodes to avoid competing with other Pods for network bandwidth and other resources. It is advisable to match these nodes using node affinity (nodeAffinity) and tolerations.

  • For multi-availability zone deployments, you can use the topologySpreadConstraints parameter to ensure Pods are evenly distributed across the specified availability zones.


controller:
  topologySpreadConstraints:
    - maxSkew: 1
      topologyKey: "topology.kubernetes.io/zone"
      whenUnsatisfiable: DoNotSchedule
      labelSelector:
        matchLabels:
          app: automq

  • To avoid cross-availability-zone traffic, set brokerRackAssignment, which ultimately sets broker.rack on each AutoMQ Broker. Configure the client side accordingly (see Client Configuration▸) so that cross-availability-zone traffic charges are eliminated. On AWS EKS, this can be configured as follows:

brokerRackAssignment: aws-az
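
For reference, standard Apache Kafka consumers declare their zone through the client.rack property so that they fetch from the nearest replica; the complete client-side requirements for AutoMQ are described in Client Configuration▸, and the zone ID below is only a placeholder:

# consumer.properties (placeholder zone ID; see Client Configuration▸ for the
# complete AutoMQ client-side settings)
client.rack=use1-az1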

Step 2: Install AutoMQ

Install or upgrade the AutoMQ Helm Chart using the custom YAML file. It is recommended to use the --version flag to pin a Bitnami Helm Chart version in the 31.x range (31.1.0 to 31.5.0) when installing AutoMQ.


helm install automq-release oci://registry-1.docker.io/bitnamicharts/kafka -f automq-values.yaml --version 31.5.0 --namespace automq --create-namespace

Wait for the AutoMQ cluster to be ready:


kubectl --namespace automq rollout status statefulset --watch

When the AutoMQ cluster is ready, the output should look like the following:


statefulset rolling update complete 2 pods at revision automq-kafka-broker-6c756696dd...
statefulset rolling update complete 3 pods at revision automq-kafka-controller-c574d5fd5...

Check the Pod list:


kubectl --namespace automq get pods
NAME                             READY   STATUS    RESTARTS   AGE
data-automq-kafka-controller-0   1/1     Running   0          16m
data-automq-kafka-controller-1   1/1     Running   0          16m
data-automq-kafka-controller-2   1/1     Running   0          16m
data-automq-kafka-broker-0       1/1     Running   0          13m
data-automq-kafka-broker-1       1/1     Running   0          13m

Test Message Sending and Receiving:

After the Helm installation completes, the cluster's access address and the commands for testing message sending and receiving are displayed. You can then run Topic produce and consume tests with kafka-console-producer.sh and kafka-console-consumer.sh.
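
The exact commands and the bootstrap address come from the Helm output. As a rough sketch, assuming a hypothetical bootstrap service automq-release-kafka.automq.svc.cluster.local:9092 and a client Pod with the Kafka CLI tools, the test looks like this:

# Produce a few test messages (type messages, then Ctrl+C to exit).
kafka-console-producer.sh \
  --bootstrap-server automq-release-kafka.automq.svc.cluster.local:9092 \
  --topic test-topic

# Consume the messages from the beginning of the topic.
kafka-console-consumer.sh \
  --bootstrap-server automq-release-kafka.automq.svc.cluster.local:9092 \
  --topic test-topic --from-beginning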

Stop and Uninstall the AutoMQ Cluster:

  • After completing the tests, the AutoMQ cluster can be stopped and uninstalled with helm uninstall.

helm uninstall automq-release -n automq

  • If the historical data is no longer needed, also delete the cluster's PVCs and the data in the object storage buckets, so that leftover data does not affect the next deployment.
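
A hypothetical cleanup sketch, assuming the release was installed into the automq namespace and the PVCs carry the chart's standard instance label; bucket data is removed separately through your object storage tooling:

# Delete the PersistentVolumeClaims left behind by the release.
kubectl --namespace automq delete pvc -l app.kubernetes.io/instance=automq-release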

Precautions for Production Environment

Lock Chart Version

To avoid unexpected changes during deployment, it is recommended to lock the Helm Chart version, that is, to specify an exact version at deployment time instead of using the latest or an unspecified version. Locking the Helm Chart version helps to:

  • Ensure compatibility: the application behaves in the deployed environment the same way it did during testing, even if new Chart versions are released.

  • Prevent unintended updates: avoid automatic updates that may introduce changes incompatible with your current deployment or operational practices.
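
In practice this means passing the same pinned version to every install and upgrade, as with the command used earlier in this guide:

# Install or upgrade while staying on the pinned chart version.
helm upgrade --install automq-release oci://registry-1.docker.io/bitnamicharts/kafka \
  -f automq-values.yaml --version 31.5.0 --namespace automq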

Name Override

When deploying multiple instances of the same Helm Chart within a Kubernetes cluster, name conflicts may occur. Use nameOverride and fullnameOverride to distinguish between different instances. For example, using distinct names for your production and staging environments can help avoid confusion.

  • Using nameOverride, the StatefulSet name will be <release-name>-<nameOverride>.

  • Using fullnameOverride, the StatefulSet name will be <fullnameOverride>.


nameOverride: 'automq-prod'
fullnameOverride: 'automq-instance-prod'

Docker Image

Bitnami provides the Docker image used by this Chart, and the default is bitnami/kafka:latest. Replace it with a custom AutoMQ image that pins a specific version:


global:
  security:
    allowInsecureImages: true
image:
  registry: automqinc
  repository: automq
  tag: 1.5.0-bitnami
  pullPolicy: Always

Scheduling Strategy

For AutoMQ, a refined Kubernetes scheduling strategy can be implemented with node affinity and tolerations. We suggest running a production-grade AutoMQ cluster exclusively on dedicated nodes, without co-locating it with other applications. Tailor the label matching rules to your node types:

Tolerations

It is recommended to taint the dedicated Kubernetes node group with key: "dedicated", value: "automq", effect: "NoSchedule", and to add the matching tolerations:


controller:
  tolerations:
    - key: "dedicated"
      operator: "Equal"
      value: "automq"
      effect: "NoSchedule"
broker:
  tolerations:
    - key: "dedicated"
      operator: "Equal"
      value: "automq"
      effect: "NoSchedule"
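
The taint itself can be applied to the node group with kubectl. A hypothetical example, assuming the dedicated AutoMQ nodes carry a node-group=automq label (adjust the selector to your environment):

# Taint the dedicated nodes so only Pods with the toleration above are scheduled there.
kubectl taint nodes -l node-group=automq dedicated=automq:NoSchedule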

Node Affinity

Override the default values in the controller/broker configuration to match the node labels (e.g., node-type: m7g.xlarge):


controller:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: "node-type"
                operator: In
                values: ["m7g.xlarge"]

Pod Anti-affinity

Ensure that the controller component and the broker component are not scheduled on the same node by using the podAntiAffinity parameter:


controller:
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchExpressions:
              - key: app.kubernetes.io/instance
                operator: In
                values:
                  - automq
              - key: app.kubernetes.io/component
                operator: In
                values:
                  - controller-eligible
                  - broker
          topologyKey: kubernetes.io/hostname

Scaling

Controller

The number of instances is configured through controller.replicaCount, which supports horizontal scaling. By default, the cluster deploys 3 Controller Pods, but users can customize the number of Controller replicas.

Note: Once the cluster deployment is complete, adjusting the replicas for the Controller is not recommended to avoid unexpected risks.

Broker

The number of instances is configured through broker.replicaCount, which supports horizontal scaling.
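
For example, the Brokers can be scaled out by raising the replica count and re-applying the chart. A sketch reusing the release, namespace, and chart version from this guide (the target count of 4 is only an example):

# Scale the Broker pool from 2 to 4 replicas.
helm upgrade automq-release oci://registry-1.docker.io/bitnamicharts/kafka \
  -f automq-values.yaml --version 31.5.0 --namespace automq \
  --set broker.replicaCount=4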

Autoscaling (HPA)

HPA is disabled by default. To enable it, configure the parameters in controller/broker.autoscaling.hpa:


controller:
  autoscaling:
    hpa:
      enabled: true          # Enable HPA
      minReplicas: "1"       # Minimum replicas
      maxReplicas: "3"       # Maximum replicas
      targetCPU: "60"        # Target CPU utilization (%)
      targetMemory: ""       # Target memory utilization (%, optional)

broker:
  autoscaling:
    hpa:
      enabled: true          # Enable HPA
      minReplicas: "1"       # Minimum replicas
      maxReplicas: "3"       # Maximum replicas
      targetCPU: "60"        # Target CPU utilization (%)
      targetMemory: ""       # Target memory utilization (%, optional)

Resource Configuration

It is recommended to run each AutoMQ Pod with 4 cores and 16 GB of memory. Adjust the resource parameters through the following configuration:


controller:
  replicaCount: 3
  resources:
    requests:
      cpu: "3000m"
      memory: "12Gi"
    limits:
      cpu: "4000m"
      memory: "16Gi"
  heapOpts: -Xmx6g -Xms6g -XX:MaxDirectMemorySize=6g -XX:MetaspaceSize=96m


broker:
  replicaCount: 2
  resources:
    requests:
      cpu: "3000m"
      memory: "12Gi"
    limits:
      cpu: "4000m"
      memory: "16Gi"
  heapOpts: -Xmx6g -Xms6g -XX:MaxDirectMemorySize=6g -XX:MetaspaceSize=96m

Security and Authentication

Each listener configured in Kafka can have a different authentication protocol. For instance, you can use sasl_tls authentication for client communications and tls for inter-Controller and Broker communications. The table below lists available protocols and their security features (see more details in Kafka Security Authentication):

Method      Authentication Method          Encrypted via TLS
plaintext   None                           No
tls         None                           Yes
mtls        Yes (Mutual Authentication)    Yes
sasl        Yes (via SASL)                 No
sasl_tls    Yes (via SASL)                 Yes
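
A minimal sketch of per-listener protocols following the Bitnami Kafka chart's listener parameters (verify the exact parameter names against the chart README for your version, and configure the corresponding SASL credentials and TLS certificates separately):

listeners:
  client:
    protocol: SASL_TLS       # clients authenticate via SASL over TLS
  interbroker:
    protocol: TLS            # inter-broker traffic encrypted via TLS
  controller:
    protocol: TLS            # controller traffic encrypted via TLS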

External Access

Additional listeners and advertised listeners must be configured, and a specific service must be created for each Kafka Pod.

There are three ways to configure external access: by using LoadBalancer services, NodePort services, or ClusterIP services. For more information, refer to the Kafka External Access section.
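
A hypothetical LoadBalancer-based sketch following the Bitnami Kafka chart's externalAccess parameters (confirm the option names against the chart README for your version):

externalAccess:
  enabled: true
  autoDiscovery:
    enabled: true            # let the chart discover the LoadBalancer addresses
  controller:
    service:
      type: LoadBalancer
  broker:
    service:
      type: LoadBalancer
rbac:
  create: true               # needed so auto-discovery can read Service addresses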

Monitoring

The primary focus is on the integration of this Chart with Prometheus. For more details, please refer to the section Enable Prometheus Metrics.
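
A minimal sketch, assuming a Prometheus Operator is already running in the cluster; the parameter names follow the Bitnami Kafka chart, so confirm them against the chart README for your version:

metrics:
  jmx:
    enabled: true            # expose broker metrics through the JMX exporter
  serviceMonitor:
    enabled: true            # create a ServiceMonitor for the Prometheus Operator
    namespace: monitoring    # placeholder: namespace watched by Prometheus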

Table Topic Feature

The Table Topic feature enables seamless integration of streaming data with static data lakes. For instructions on enabling the Table Topic feature in an AutoMQ cluster, please refer to Overview▸.