
Biweekly #1: Cloud Disk Write Performance Optimization and Spot Instance Failover Recovery

AutoMQ officially went open source on November 4 this year, and in less than three weeks it has received significant attention and strong support. "Biweekly Highlights" is published every two weeks and briefly covers recent developments and major events in the community. We hope it helps community members and friends understand the project better and encourages more developers to participate and contribute to building the AutoMQ community!

Overview of this issue

In the first two weeks since going open source, the AutoMQ team and the community worked together on a number of challenging tasks.

  1. AutoMQ for Kafka: Optimizations for cloud disk write performance, catch-up pre-reads, and resilience for forced reclamation of Spot instances.
  2. AutoMQ for RocketMQ: Enhanced observability, performance improvements, and stability enhancements, along with a refined quality assurance system.
  3. Update to AutoMQ for Kafka's quick local trial experience!

List of community contributors

New community contributor this week: Leizhiyuan from Tencent Cloud, who optimized the Docker Compose startup method for AutoMQ RocketMQ.

https://github.com/AutoMQ/automq-for-rocketmq/pull/605

👏 A big thank you to Leizhiyuan for his contributions, and we welcome more developers to join the open-source collaboration!

Featured Updates for AutoMQ for Kafka

Optimizing Write Performance on Cloud Disks

To keep write latency stable while fully using the bandwidth of cloud disks within their IOPS limits, data is now batched by size and flushed to disk periodically. On an AWS GP3 disk with 3000 IOPS and 125 MB/s, the average write latency for 4 KB messages dropped from 4 ms to 2 ms.

https://github.com/AutoMQ/automq-for-rocketmq/pull/645
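
The idea of batching by size with a periodic flush can be sketched roughly as follows. This is only an illustration, not the code from the PR: `BatchingWriter`, `write_block`, `min_io_size_to_saturate`, and the threshold values are all hypothetical names and numbers chosen for the example. The helper at the end shows the arithmetic behind the IOPS constraint: with 3000 IOPS and 125 MB/s, each I/O has to carry roughly 42 KB to use the full bandwidth, whereas one 4 KB message per I/O would top out at about 12 MB/s.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class BatchingWriter:
    """Groups small records into larger disk writes (illustrative sketch only)."""

    write_block: Callable[[bytes], None]  # hypothetical sink, e.g. one disk write
    max_batch_bytes: int = 64 * 1024      # assumed size threshold
    max_delay_s: float = 0.001            # assumed flush interval
    _pending: List[bytes] = field(default_factory=list)
    _pending_bytes: int = 0
    _oldest_ts: float = 0.0

    def append(self, record: bytes, now: float) -> None:
        """Buffer a record; flush immediately once the batch is large enough."""
        if not self._pending:
            self._oldest_ts = now
        self._pending.append(record)
        self._pending_bytes += len(record)
        if self._pending_bytes >= self.max_batch_bytes:
            self.flush()

    def tick(self, now: float) -> None:
        """Called periodically; flushes a small batch that has waited too long."""
        if self._pending and now - self._oldest_ts >= self.max_delay_s:
            self.flush()

    def flush(self) -> None:
        """Issue one write for everything buffered so far."""
        if not self._pending:
            return
        self.write_block(b"".join(self._pending))
        self._pending.clear()
        self._pending_bytes = 0


def min_io_size_to_saturate(bandwidth_bytes_per_s: int, iops: int) -> int:
    """Smallest I/O size (rounded up) that reaches the bandwidth cap at the IOPS cap."""
    return -(-bandwidth_bytes_per_s // iops)
```

The size threshold bounds how much data one I/O carries, while the time threshold bounds how long a lightly loaded writer makes a message wait, which is the trade-off between throughput and latency described above.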

Catch-up Read Optimization

Inspired by the Linux page cache mechanism, the prefetching system of S3Stream has been optimized to improve catch-up read performance. Next week, we plan to adjust the prefetch size dynamically based on the read rate to further improve cache utilization.

https://github.com/AutoMQ/automq-for-rocketmq/pull/657
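
For readers unfamiliar with page-cache-style readahead, here is a minimal sketch of the general technique: the prefetch window grows while reads stay sequential and shrinks back when they do not. It is not S3Stream's actual implementation; the class name `ReadaheadWindow`, its parameters, and the doubling policy are assumptions made for illustration.

```python
from typing import Optional, Tuple


class ReadaheadWindow:
    """Readahead sizing in the spirit of the Linux page cache (illustrative sketch)."""

    def __init__(self, initial_bytes: int = 512 * 1024,
                 max_bytes: int = 8 * 1024 * 1024) -> None:
        self.initial_bytes = initial_bytes   # assumed starting window
        self.max_bytes = max_bytes           # assumed upper bound
        self.window = initial_bytes
        # Offset at which a purely sequential reader would issue its next read.
        self.next_expected: Optional[int] = None

    def on_read(self, offset: int, length: int) -> Tuple[int, int]:
        """Record a read and return (prefetch_start, prefetch_size)."""
        if self.next_expected is not None and offset == self.next_expected:
            # Sequential access: double the window, capped at max_bytes.
            self.window = min(self.window * 2, self.max_bytes)
        else:
            # First read or a jump elsewhere: fall back to the small window.
            self.window = self.initial_bytes
        end = offset + length
        self.next_expected = end
        return end, self.window
```

A consumer catching up from an old offset reads sequentially, so its window quickly reaches the cap, while an occasional random read only triggers a small prefetch and wastes little cache space.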

Disaster Recovery for Forced Reclamation of Spot Instances

Spot instances can be up to 90% cheaper than on-demand instances, but they are subject to forced reclamation without notice. With the work tracked in issue 447, when a Spot instance is forcibly reclaimed, AutoMQ for Kafka can still mount its data volumes on surviving machines and reassign the affected partitions within seconds.

This feature is implemented in three parts: the Kafka control plane, S3Stream module disaster recovery, and the AutoScaling multi-cloud operations layer. It is currently about 80% complete and is expected to be finished next week.

https://github.com/AutoMQ/automq-for-kafka/issues/447
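
As a rough illustration of the partition reassignment step only, the sketch below spreads the partitions of a reclaimed broker across the least-loaded survivors. The function name `plan_failover`, the broker and partition labels, and the least-loaded policy are hypothetical; mounting the data volume and recovering unflushed data, which the issue also covers, are outside the scope of this example.

```python
from typing import Dict, List


def plan_failover(assignments: Dict[str, List[str]], failed: str) -> Dict[str, List[str]]:
    """Return a new assignment with the failed broker's partitions moved to survivors.

    Each orphaned partition goes to the survivor that currently holds the fewest
    partitions (ties broken by broker name). The input mapping is not modified.
    """
    survivors = {b: list(p) for b, p in assignments.items() if b != failed}
    if not survivors:
        raise ValueError("no surviving broker to take over partitions")
    for partition in assignments.get(failed, []):
        target = min(survivors, key=lambda b: (len(survivors[b]), b))
        survivors[target].append(partition)
    return survivors


if __name__ == "__main__":
    before = {"broker-1": ["t-0", "t-1"], "broker-2": ["t-2"], "broker-3": []}
    print(plan_failover(before, "broker-1"))
```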

Featured Updates for AutoMQ for RocketMQ

Enhanced Observability

Performance Optimization and Stability Improvements

With improved observability, we identified and resolved a range of stability and performance issues and substantially optimized critical paths such as sending and receiving messages.

https://github.com/AutoMQ/automq-for-rocketmq/issues/591

Enhanced Quality Assurance System

We forked the E2E testing repository from Apache RocketMQ® and adapted it to the new metadata storage in AutoMQ for RocketMQ. We integrated the E2E tests into the development process with GitHub Actions and made them a required check before a PR can be merged, to ensure full compatibility with the RocketMQ protocol.

https://github.com/AutoMQ/automq-for-rocketmq/pull/310

More Things

Quick Local Experience with AutoMQ for Kafka

The quick local trial for AutoMQ for Kafka has been updated! You can now easily set up an AutoMQ for Kafka cluster locally and access it with a client on the host machine. In this cluster, you can try out partition reassignment within seconds and also watch partitions being reassigned automatically across the cluster in response to traffic.

https://docs.automq.com/docs/automq-s3kafka/VKpxwOPvciZmjGkHk5hcTz43nde

That's all for this issue of "Biweekly Highlights." Please follow our public account for regular updates on the progress of the AutoMQ community. We also warmly invite all open-source enthusiasts to keep following our community and join us in building cloud-native messaging middleware!

✨ GitHub address:

https://github.com/AutoMQ/automq-for-kafka

https://github.com/AutoMQ/automq-for-rocketmq