ROS 2 inference for whisper.cpp.
- Install pyaudio, see install instructions.
- Build this repository:
mkdir -p ros-ai/src && cd ros-ai/src && \
git clone https://github.com/gribes02/voice_command_robot.git && cd .. && \
colcon build --symlink-install --cmake-args -DWHISPER_CUDA=On --no-warn-unused-cli
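Since the install step depends on pyaudio, it can help to confirm that it imports and sees at least one input device before launching anything. The following is a minimal sketch, not part of this repository: the function name `list_input_devices` is hypothetical, and it assumes pyaudio is what captures the microphone audio.

```python
from typing import List, Optional


def list_input_devices() -> Optional[List[str]]:
    """Return names of audio devices with input channels.

    Returns None if pyaudio is not installed or cannot be initialised.
    """
    try:
        import pyaudio  # imported lazily so the script still runs without it
    except ImportError:
        return None

    try:
        audio = pyaudio.PyAudio()
    except Exception:
        # No usable audio backend on this machine.
        return None

    try:
        names = []
        for index in range(audio.get_device_count()):
            info = audio.get_device_info_by_index(index)
            # Devices that can record report at least one input channel.
            if info.get("maxInputChannels", 0) > 0:
                names.append(str(info.get("name")))
        return names
    finally:
        audio.terminate()  # always release the PortAudio resources


if __name__ == "__main__":
    devices = list_input_devices()
    if devices is None:
        print("pyaudio is not available; check the install step.")
    elif not devices:
        print("pyaudio works, but no input device was found.")
    else:
        print("Input devices:", ", ".join(devices))
```

An empty list usually points at a missing or muted microphone rather than at the build itself.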
Run the inference action server (this will download models to $HOME/.cache/whisper.cpp):
ros2 launch whisper_bringup bringup.launch.py
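To check whether the launch has already downloaded a model, the cache directory mentioned above can be inspected. This is a small sketch using only the standard library; the helper name `list_cached_models` is hypothetical, and the file names it prints depend on which model is configured.

```python
from pathlib import Path
from typing import List, Optional


def list_cached_models(cache_dir: Optional[Path] = None) -> List[str]:
    """Return the sorted names of files found in the whisper.cpp cache directory."""
    # Default location taken from the text above: $HOME/.cache/whisper.cpp
    if cache_dir is None:
        cache_dir = Path.home() / ".cache" / "whisper.cpp"
    if not cache_dir.is_dir():
        # Nothing downloaded yet, or a different cache location is in use.
        return []
    return sorted(p.name for p in cache_dir.iterdir() if p.is_file())


if __name__ == "__main__":
    models = list_cached_models()
    if models:
        print("Cached files:", ", ".join(models))
    else:
        print("No cached files found; the model may not have been downloaded yet.")
```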
Run a client node (activated on space bar press):
ros2 run whisper_demos whisper_on_key
Make sure turtlebot3 is installed before running the following command:
ros2 launch turtlebot3_gazebo turtlebot3_world.launch.py
Configure whisper parameters in whisper.yaml.
The action server is available under the topic inference and is of type Inference.action.
- Encoder inference time: ggerganov/whisper.cpp#10 (comment)
- Compile with GPU support (might differ between platforms): https://github.com/ggerganov/whisper.cpp#nvidia-gpu-support-via-cublas (WHISPER_CUBLAS=On)