sl-jetson 14164089dc feat: Audio pipeline end-to-end (Issue #503)
- Add VoskSTT class to audio_utils.py: offline Vosk STT backend as a
  low-latency CPU alternative to Whisper for Jetson deployments
- Update audio_pipeline_node.py: stt_backend param ("whisper"/"vosk"),
  Vosk loading with Whisper fallback, CPU auto-detection for Whisper,
  dual-backend _process_utterance dispatch, STT/<backend> log prefix
- Update audio_pipeline_params.yaml: add stt_backend and vosk_model_path
- Add test/test_audio_pipeline.py: 40 unit tests covering EnergyVAD,
  PCM conversion, AudioBuffer, UtteranceSegmenter, VoskSTT, JabraAudioDevice,
  AudioMetrics, AudioState
- Integrate into full_stack.launch.py: audio_pipeline at t=5s with
  enable_audio_pipeline and audio_stt_backend args

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-07 10:03:31 -05:00

Audio Pipeline (Issue #503)

Comprehensive audio pipeline for Salty Bot with full voice interaction support.

Features

  • Hardware: Jabra SPEAK 810 USB audio device integration
  • Wake Word: openwakeword "Hey Salty" detection
  • STT: whisper.cpp on the Jetson GPU (small/base/medium/large models), with
    an offline Vosk backend as a low-latency CPU alternative
  • TTS: Piper synthesis with voice switching
  • State Machine: listening → processing → speaking
  • MQTT: Real-time status reporting
  • Metrics: Latency tracking and performance monitoring
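The commit wires up a dual-backend STT dispatch (`stt_backend` param, Vosk loading with Whisper fallback). A minimal sketch of that selection logic is below; the class names mirror the commit message, but the constructor signatures, error type, and fallback details are illustrative assumptions, not the actual audio_pipeline_node.py API.

```python
# Hedged sketch of dual-backend STT selection with Whisper fallback.
# WhisperSTT/VoskSTT here are stand-in stubs, not the real implementations.

class SttBackendError(RuntimeError):
    """Raised when a backend cannot initialize (e.g. missing Vosk model)."""


class WhisperSTT:
    """Stub for the whisper.cpp-backed transcriber (GPU or CPU)."""
    name = "whisper"

    def transcribe(self, pcm: bytes) -> str:
        return "<whisper transcript>"


class VoskSTT:
    """Stub for the offline Vosk CPU transcriber."""
    name = "vosk"

    def __init__(self, model_path: str):
        if not model_path:  # simulate a missing model directory
            raise SttBackendError("vosk model not found")
        self.model_path = model_path

    def transcribe(self, pcm: bytes) -> str:
        return "<vosk transcript>"


def load_stt_backend(stt_backend: str, vosk_model_path: str = ""):
    """Return the requested backend; fall back to Whisper if Vosk fails,
    matching the 'Vosk loading with Whisper fallback' behavior in the commit."""
    if stt_backend == "vosk":
        try:
            return VoskSTT(vosk_model_path)
        except SttBackendError:
            pass  # fall through to the Whisper default
    return WhisperSTT()
```

With this shape, `_process_utterance` only needs to call `backend.transcribe(pcm)` and can log under an `STT/<backend>` prefix regardless of which backend loaded.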

ROS2 Topics

Published:

  • /saltybot/speech/transcribed_text (String): Final STT output
  • /saltybot/audio/state (String): Current audio state
  • /saltybot/audio/status (String): JSON metrics with latencies
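`/saltybot/audio/status` carries JSON metrics. The exact schema is not documented here, so the field names in this consumer sketch (`state`, `latencies`, `stt_ms`, `tts_ms`) are hypothetical examples of how such a payload might be read:

```python
# Hedged example of consuming the status JSON from a String message's data
# field. Field names are assumptions; check the node's actual payload.
import json


def summarize_status(payload: str) -> str:
    """Render a one-line summary from a status JSON string."""
    status = json.loads(payload)
    state = status.get("state", "unknown")
    latencies = status.get("latencies", {})
    parts = ", ".join(f"{k}={v:.0f}ms" for k, v in sorted(latencies.items()))
    return f"{state} ({parts})" if parts else state


sample = '{"state": "speaking", "latencies": {"stt_ms": 420.0, "tts_ms": 95.0}}'
print(summarize_status(sample))
```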

MQTT Topics

  • saltybot/audio/state: Current state
  • saltybot/audio/status: Complete status JSON

Launch

ros2 launch saltybot_audio_pipeline audio_pipeline.launch.py

Configuration

See config/audio_pipeline_params.yaml for tuning:

  • device_name: Jabra device
  • wake_word_threshold: 0.5 (0.0-1.0)
  • whisper_model: small/base/medium/large
  • mqtt_enabled: true/false
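A possible shape for the file, combining the parameters listed above with the `stt_backend` and `vosk_model_path` keys added in this commit. Values and the model path are illustrative, not the shipped defaults:

```yaml
# Illustrative audio_pipeline_params.yaml fragment (example values only)
audio_pipeline_node:
  ros__parameters:
    device_name: "Jabra SPEAK 810"
    wake_word_threshold: 0.5        # 0.0-1.0
    stt_backend: "whisper"          # "whisper" or "vosk"
    vosk_model_path: "/opt/models/vosk"  # used only when stt_backend is "vosk"
    whisper_model: "small"          # small/base/medium/large
    mqtt_enabled: true
```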