- Add VoskSTT class to audio_utils.py: offline Vosk STT backend as
low-latency CPU alternative to Whisper for Jetson deployments
- Update audio_pipeline_node.py: stt_backend param ("whisper"/"vosk"),
Vosk loading with Whisper fallback, CPU auto-detection for Whisper,
dual-backend _process_utterance dispatch, STT/<backend> log prefix
- Update audio_pipeline_params.yaml: add stt_backend and vosk_model_path
- Add test/test_audio_pipeline.py: 40 unit tests covering EnergyVAD,
PCM conversion, AudioBuffer, UtteranceSegmenter, VoskSTT, JabraAudioDevice,
AudioMetrics, AudioState
- Integrate into full_stack.launch.py: audio_pipeline at t=5s with
enable_audio_pipeline and audio_stt_backend args
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Audio Pipeline (Issue #503)
Comprehensive audio pipeline for Salty Bot with full voice interaction support.
Features
- Hardware: Jabra SPEAK 810 USB audio device integration
- Wake Word: openwakeword "Hey Salty" detection
- STT: whisper.cpp running on Jetson GPU (small/base/medium/large models)
- TTS: Piper synthesis with voice switching
- State Machine: listening → processing → speaking
- MQTT: Real-time status reporting
- Metrics: Latency tracking and performance monitoring
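The listening → processing → speaking cycle can be sketched as a small transition table. This is an illustrative sketch only; the class name, state values, and allowed transitions are assumptions, not the node's actual AudioState implementation:

```python
from enum import Enum

class AudioState(Enum):
    """Hypothetical mirror of the pipeline's audio states (illustrative)."""
    LISTENING = "listening"
    PROCESSING = "processing"
    SPEAKING = "speaking"

# Assumed legal transitions: the cycle from the feature list, with
# speaking returning to listening once TTS playback finishes.
TRANSITIONS = {
    AudioState.LISTENING: {AudioState.PROCESSING},
    AudioState.PROCESSING: {AudioState.SPEAKING},
    AudioState.SPEAKING: {AudioState.LISTENING},
}

def next_state(current: AudioState, target: AudioState) -> AudioState:
    """Move to `target` if the transition is legal, else stay in `current`."""
    if target in TRANSITIONS[current]:
        return target
    return current

state = AudioState.LISTENING
state = next_state(state, AudioState.PROCESSING)  # wake word fired
state = next_state(state, AudioState.SPEAKING)    # response synthesized
```

Guarding transitions this way keeps a stray callback (e.g. a late STT result arriving mid-playback) from driving the pipeline into an inconsistent state.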
ROS2 Topics
Published:
- /saltybot/speech/transcribed_text (String): Final STT output
- /saltybot/audio/state (String): Current audio state
- /saltybot/audio/status (String): JSON metrics with latencies
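The status topic carries its metrics as a JSON string inside a String message. A minimal sketch of decoding such a payload in a subscriber callback; the field names (`state`, `stt_latency_ms`, `tts_latency_ms`) are assumptions for illustration, not the node's actual schema:

```python
import json

# Example payload as it might arrive on /saltybot/audio/status.
# Field names here are hypothetical, for illustration only.
raw = '{"state": "processing", "stt_latency_ms": 412.5, "tts_latency_ms": 96.0}'

def parse_status(msg_data: str) -> dict:
    """Decode the JSON carried in the String message's data field."""
    status = json.loads(msg_data)
    # Use .get() defaults so a missing field doesn't raise inside a callback.
    return {
        "state": status.get("state", "unknown"),
        "stt_latency_ms": float(status.get("stt_latency_ms", 0.0)),
        "tts_latency_ms": float(status.get("tts_latency_ms", 0.0)),
    }

status = parse_status(raw)
```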
MQTT Topics
- saltybot/audio/state: Current state
- saltybot/audio/status: Complete status JSON
Launch
ros2 launch saltybot_audio_pipeline audio_pipeline.launch.py
Configuration
See config/audio_pipeline_params.yaml for tuning:
- device_name: Jabra device
- wake_word_threshold: 0.5 (0.0-1.0)
- whisper_model: small/base/medium/large
- mqtt_enabled: true/false
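A plausible shape for config/audio_pipeline_params.yaml, combining the keys above with the stt_backend and vosk_model_path additions from the changelog. The values and nesting are illustrative assumptions, not the shipped configuration:

```yaml
audio_pipeline_node:
  ros__parameters:
    device_name: "Jabra SPEAK 810"
    wake_word_threshold: 0.5   # 0.0-1.0; higher = fewer false triggers
    stt_backend: "whisper"     # "whisper" or "vosk"
    whisper_model: "small"     # small/base/medium/large
    vosk_model_path: ""        # used when stt_backend is "vosk"
    mqtt_enabled: true
```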