This is a Dora program that uses the qwenvl2 model to drive a car. A camera mounted on the car provides the input to the large language model, and the model outputs one of four instructions (forward, backward, turn left, turn right) to drive the car autonomously.
The mickrobot_chassis node is implemented in C/C++. It reads keyboard input and controls the movement of the chassis:
w, x, a, d: forward, backward, left, and right
up, down, left, and right (arrow keys): forward, backward, left, and right
s: stop
Number keys 1, 2, and 3: set the speed to 1 m/s, 2 m/s, and 3 m/s
==Note:== When using this node, you need to turn on the remote control switch in the upper left corner, which sets the chassis to automatic driving mode.
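The key bindings listed above can be sketched as a small lookup. This is only an illustration, not code from mickrobot_chassis: the function name `key_to_action` is hypothetical, and it assumes the conventional layout in which w is forward, x is backward, a is left, and d is right.

```shell
# Hypothetical helper (not part of mickrobot_chassis): maps a key name to the
# action described in the key list. Assumes w/x/a/d = forward/backward/left/right.
key_to_action() {
  case "$1" in
    w|up)    echo "forward" ;;
    x|down)  echo "backward" ;;
    a|left)  echo "left" ;;
    d|right) echo "right" ;;
    s)       echo "stop" ;;
    1|2|3)   echo "speed $1 m/s" ;;   # number keys select the speed
    *)       echo "ignored" ;;        # any other key has no effect in this sketch
  esac
}
```

For example, `key_to_action up` prints `forward` and `key_to_action 2` prints `speed 2 m/s`.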
step1: Check that the dora paths on lines 47, 48, 60, and 61 of "dora_mickrobot/CMakeLists.txt" are correct for the mickrobot_chassis node. Then install the dependency libraries with the following commands:
sudo apt-get install nlohmann-json3-dev
sudo apt-get install clang
ROS2 is also required.
step2: unzip and build the serial library in thridpart_lib
cd dora_mickrobot/thridpart_lib
cd serial
mkdir build
cd build
cmake ..
sudo make install
step3: build the chassis node
cd dora_mickrobot
mkdir build
cd build
cmake ..
make
step4: Grant permissions to the serial port
sudo chmod 777 /dev/ttyUSB0
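Before starting the node, it can help to confirm that the serial device exists and is accessible to the current user. The following is a minimal sketch, not part of the project: the function name `check_serial_port` is hypothetical, and the default path simply reuses /dev/ttyUSB0 from the command above.

```shell
# Hypothetical pre-flight check: reports whether a serial device exists and
# is readable and writable by the current user.
check_serial_port() {
  dev="${1:-/dev/ttyUSB0}"   # default to the device used in step4
  if [ ! -e "$dev" ]; then
    echo "missing: $dev"
    return 1
  fi
  if [ -r "$dev" ] && [ -w "$dev" ]; then
    echo "ok: $dev"
  else
    echo "no access: $dev"
    return 2
  fi
}
```

Calling `check_serial_port` with no argument checks /dev/ttyUSB0; pass another path if the chassis is attached to a different port.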
step5: start mickrobot_chassis node with Dora
dora start mickrobot_dataflow.yml --name test
dora logs test mickrobot_chassis
The mickrobot_chassis node receives the JSON string stream from CmdVelTwist and uses the data in each JSON string to control the chassis of the car.
At the same time, the mickrobot_chassis node publishes the chassis status (x speed, y speed, and angular velocity) at 100 Hz. The published JSON string data stream is named "Odometry".
// Publish the Odometry JSON string
// Chassis position
j_odom_pub["pose"]["position"]["x"] = position_x;
j_odom_pub["pose"]["position"]["y"] = position_y;
j_odom_pub["pose"]["position"]["z"] = 0;
j_odom_pub["pose"]["orientation"]["x"] = 0;
j_odom_pub["pose"]["orientation"]["y"] = 0;
j_odom_pub["pose"]["orientation"]["z"] = 0;
j_odom_pub["pose"]["orientation"]["w"] = 1;
// Chassis speed
j_odom_pub["twist"]["linear"]["x"] = linear_x;
j_odom_pub["twist"]["linear"]["y"] = linear_y;
j_odom_pub["twist"]["linear"]["z"] = 0;
j_odom_pub["twist"]["angular"]["x"] = 0;
j_odom_pub["twist"]["angular"]["y"] = 0;
j_odom_pub["twist"]["angular"]["z"] = linear_w;
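To make the layout of the "Odometry" message easier to picture, the sketch below prints a JSON object with the same fields that the snippet above assigns. It is an illustration only: the function name `odometry_json` and its argument order are hypothetical, and the key order in the real stream may differ from what is printed here.

```shell
# Hypothetical illustration: prints a JSON object with the fields assigned
# to j_odom_pub. Arguments: position x, position y, linear x, linear y, angular z.
odometry_json() {
  printf '{"pose":{"position":{"x":%s,"y":%s,"z":0},' "$1" "$2"
  printf '"orientation":{"x":0,"y":0,"z":0,"w":1}},'
  printf '"twist":{"linear":{"x":%s,"y":%s,"z":0},' "$3" "$4"
  printf '"angular":{"x":0,"y":0,"z":%s}}}\n' "$5"
}
```

A consumer of the stream would read the planar speed from `twist.linear.x` and `twist.linear.y`, and the rotation rate from `twist.angular.z`.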
First try:
dora up
dora build keyboard_teleop_no_recording.yml
dora start keyboard_teleop_no_recording.yml
# Press Ctrl-C twice to stop
This starts a keyboard teleoperation session without the camera.
# Clone llama factory
cd ..
git clone --depth 1
cd ../dora_mickrobot
dora build qwenvl2_recorder.yml
dora start qwenvl2_recorder.yml
Within the llama factory folder:
- Install all dependencies of llama factory.
- Edit examples/train_lora/qwen2vl_lora_sft.yaml so that the dataset is the one you want to use:
- dataset: mllm_demo,identity # video: mllm_video_demo
+ dataset: dora_demo_107,dora_demo_108,identity
- You can also choose the 2B model instead of the 7B model with:
- model_name_or_path: Qwen/Qwen2-VL-7B-Instruct
+ model_name_or_path: Qwen/Qwen2-VL-2B-Instruct
- Then start finetuning with:
llamafactory-cli train examples/train_lora/qwen2vl_lora_sft.yaml
You can then build and start the qwenvl2 dataflow:
# Clone llama factory
cd ..
git clone --depth 1
cd ../dora_mickrobot
dora build qwenvl2.yml
dora start qwenvl2.yml
- For different camera devices, modify the parameters of the camera node in the xxx.yml file.
- The LLAMA_FACTORY_ROOT_PATH parameter in the yml file must be an absolute path.
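One way to catch a relative LLAMA_FACTORY_ROOT_PATH before starting the dataflow is a small shell check. This is a sketch under stated assumptions: the function name `is_absolute_dir` is hypothetical, and it assumes the value has been copied from the yml file into a shell variable of the same name.

```shell
# Hypothetical check: succeeds only when the argument starts with "/" and
# names an existing directory.
is_absolute_dir() {
  case "$1" in
    /*) [ -d "$1" ] ;;   # absolute: succeed only if the directory exists
    *)  return 1 ;;      # relative or empty: reject
  esac
}
```

It could be used as `is_absolute_dir "$LLAMA_FACTORY_ROOT_PATH" || echo "LLAMA_FACTORY_ROOT_PATH must be an absolute path to an existing directory"`.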