Prerequisites
Prepare StarRocks and Test Data
Make sure a working StarRocks cluster is available. For demonstration purposes, this guide follows Deploy StarRocks with Docker to install a demo cluster on a Linux machine. Create a test database and a test table that uses the Primary Key model:
Prepare AutoMQ and Test Data
Refer to Deploy Multi-Nodes Cluster on Linux to deploy AutoMQ, and make sure AutoMQ and StarRocks can reach each other over the network. Then follow the steps below to create a topic named example_topic in AutoMQ and write a test JSON record to it.
Create Topic
Use the Apache Kafka® command-line tools to create a topic. Make sure you have access to the Kafka environment and that the Kafka service is running. Below is an example command to create a topic:
When executing the command, replace the topic and bootstrap-server values with the actual topic name and Kafka server address.
Generate Test Data
Generate test data in JSON format that matches the schema of the table created earlier.
Writing Test Data
Use Kafka's command-line tools or a client program to write the test data into the topic named example_topic. Here is an example using the command-line tool:
When executing the command, replace the topic and bootstrap-server values with the actual topic name and Kafka server address.
Creating Routine Load Import Job
Create a Routine Load job from the StarRocks command line to continuously import data from the AutoMQ Kafka topic.
When executing the command, replace kafka_broker_list with the actual Kafka server address.
Parameter Description
Data Format
The data format must be specified as JSON by setting "format" = "json" in the PROPERTIES clause.
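To make the PROPERTIES setting concrete, here is a minimal Python sketch that assembles the text of a Routine Load statement with "format" = "json". It is an illustration under stated assumptions, not the exact statement used in this guide: the job name example_job, the table name example_table, and the broker address 127.0.0.1:9092 are hypothetical placeholders, and only the format, kafka_broker_list, and topic name come from the text above. The statement is only printed; submit it through whatever StarRocks client you already use, after selecting the target database.

```python
def quote(value: str) -> str:
    """Wrap a value in double quotes, escaping backslashes and embedded quotes."""
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'


def render_pairs(pairs: dict) -> str:
    """Render key/value pairs as the body of a parenthesised property list."""
    return ",\n".join(f"    {quote(k)} = {quote(v)}" for k, v in pairs.items())


def build_routine_load_sql(job: str, table: str, brokers: str, topic: str) -> str:
    """Build a CREATE ROUTINE LOAD statement that reads JSON messages from Kafka."""
    job_properties = {"format": "json"}  # declares that each message is JSON
    source_properties = {
        "kafka_broker_list": brokers,  # replace with the actual Kafka server address
        "kafka_topic": topic,
    }
    return (
        f"CREATE ROUTINE LOAD {job} ON {table}\n"
        f"PROPERTIES\n(\n{render_pairs(job_properties)}\n)\n"
        f"FROM KAFKA\n(\n{render_pairs(source_properties)}\n);"
    )


sql = build_routine_load_sql(
    job="example_job",         # hypothetical job name
    table="example_table",     # hypothetical target table
    brokers="127.0.0.1:9092",  # placeholder broker address
    topic="example_topic",     # topic name used in this guide
)
print(sql)
```

Keeping the quoting in one helper matters once a property value itself contains double quotes, as a jsonpaths list does.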
Data Extraction and Transformation
If you need to specify how the source data maps to and is converted into the target table columns, configure the COLUMNS and jsonpaths parameters. In COLUMNS, the column names correspond to the column names of the target table, and their order corresponds to the order of the fields in the source data. The jsonpaths parameter extracts the required fields from each JSON object, producing an ordered list of values much like a row of newly generated CSV data. The COLUMNS parameter then temporarily names those fields in the order specified by jsonpaths. For more information on data conversion, refer to Data Conversion Implementation During Import.
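The two-step behaviour described above (jsonpaths extracts an ordered list of values, then COLUMNS names them by position) can be emulated in a few lines of Python. This is only a conceptual sketch, not StarRocks code: it handles simple dotted paths such as $.a.b only, treating a missing key as None is an assumption of the sketch, and the sample record, paths, and column names are hypothetical.

```python
import json


def extract(record: dict, path: str):
    """Resolve a simple path such as "$.profile.name"; only dotted keys are handled."""
    if not path.startswith("$."):
        raise ValueError(f"unsupported path: {path}")
    value = record
    for key in path[2:].split("."):
        if not isinstance(value, dict) or key not in value:
            return None  # assumption of this sketch: a missing key becomes None
        value = value[key]
    return value


def map_record(line: str, jsonpaths: list, columns: list) -> dict:
    """Apply jsonpaths, then name the extracted values in COLUMNS order."""
    record = json.loads(line)
    values = [extract(record, p) for p in jsonpaths]  # step 1: ordered values, like a CSV row
    return dict(zip(columns, values))                 # step 2: positional naming via COLUMNS


# Hypothetical source message whose keys differ from the target column names.
line = '{"uid": 1, "profile": {"name": "test user"}, "state": "active"}'
jsonpaths = ["$.uid", "$.profile.name", "$.state"]
columns = ["id", "name", "status"]

row = map_record(line, jsonpaths, columns)
print(row)  # {'id': 1, 'name': 'test user', 'status': 'active'}
```

Because the naming is positional, reordering jsonpaths without reordering COLUMNS would silently assign values to the wrong columns.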
If each line contains a JSON object whose keys match the columns of the target table in name and number (the order does not need to match), the COLUMNS configuration is not required.
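The condition above can be checked on a sample message before deciding whether to write a COLUMNS clause. The Python sketch below compares the keys of one JSON object with a list of table columns, ignoring order. The column list and sample messages are hypothetical.

```python
import json


def keys_match_columns(line: str, table_columns: list) -> bool:
    """Return True when the JSON keys equal the table columns by name and count."""
    record = json.loads(line)
    # JSON object keys are unique, so comparing sets covers both name and number.
    return set(record) == set(table_columns)


table_columns = ["id", "name", "status"]  # hypothetical target table columns

same_keys = '{"status": "active", "id": 1, "name": "test user"}'
other_keys = '{"uid": 1, "name": "test user", "state": "active"}'

print(keys_match_columns(same_keys, table_columns))   # True: key order does not matter
print(keys_match_columns(other_keys, table_columns))  # False: names differ
```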