feat: Audio pipeline — wake word + STT + TTS on Jabra SPEAK 810 (Issue #503) #543
Summary
Implements end-to-end audio pipeline for SaltyBot on Jetson (Issue #503).
- Add `VoskSTT` class to `audio_utils.py`
- `saltybot/audio/{state,status}`
- Integrate into `full_stack.launch.py` with `enable_audio_pipeline` and `audio_stt_backend` args

Files changed
- `saltybot_audio_pipeline/audio_utils.py`: add `VoskSTT` class
- `saltybot_audio_pipeline/audio_pipeline_node.py`: `stt_backend` param (`whisper`/`vosk`), Vosk loading, CPU fallback
- `saltybot_audio_pipeline/config/audio_pipeline_params.yaml`: add `stt_backend`, `vosk_model_path`
- `saltybot_audio_pipeline/test/test_audio_pipeline.py`: 40 unit tests
- `saltybot_bringup/launch/full_stack.launch.py`: audio pipeline at t=5s

Test plan
- `pytest jetson/ros2_ws/src/saltybot_audio_pipeline/test/`: all 40 tests pass offline
- `ros2 launch saltybot_audio_pipeline audio_pipeline.launch.py`: verifies Jabra device opens
- `ros2 launch saltybot_bringup full_stack.launch.py audio_stt_backend:=vosk`: Vosk backend
- `ros2 topic echo /saltybot/speech/transcribed_text`: STT output after "hey salty"
- `ros2 topic echo /saltybot/audio/state`: FSM transitions: idle→listening→wake_detected→processing→speaking→listening

🤖 Generated with Claude Code
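
The FSM transition sequence in the test plan can be checked mechanically when reading `/saltybot/audio/state` output. The following is a minimal sketch, not the node's actual code: the enum members and the `ALLOWED` table are assumptions derived only from the sequence listed above, and `is_valid_sequence` is a hypothetical helper. The real `AudioState` may define further states or transitions (for example error or timeout paths).

```python
from enum import Enum


class AudioState(Enum):
    # Member names are assumed from the transition list in the test plan.
    IDLE = "idle"
    LISTENING = "listening"
    WAKE_DETECTED = "wake_detected"
    PROCESSING = "processing"
    SPEAKING = "speaking"


# Allowed transitions, taken only from the sequence in the test plan.
# This table is an assumption; the real node may permit more edges.
ALLOWED = {
    AudioState.IDLE: {AudioState.LISTENING},
    AudioState.LISTENING: {AudioState.WAKE_DETECTED},
    AudioState.WAKE_DETECTED: {AudioState.PROCESSING},
    AudioState.PROCESSING: {AudioState.SPEAKING},
    AudioState.SPEAKING: {AudioState.LISTENING},
}


def is_valid_sequence(names):
    """Return True if each consecutive pair of state names is in ALLOWED."""
    states = [AudioState(n) for n in names]
    return all(b in ALLOWED[a] for a, b in zip(states, states[1:]))
```

Feeding the observed state strings from a topic echo into `is_valid_sequence` gives a quick pass/fail check on the ordering without needing the hardware attached.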
- Add VoskSTT class to audio_utils.py: offline Vosk STT backend as low-latency CPU alternative to Whisper for Jetson deployments
- Update audio_pipeline_node.py: stt_backend param ("whisper"/"vosk"), Vosk loading with Whisper fallback, CPU auto-detection for Whisper, dual-backend _process_utterance dispatch, STT/<backend> log prefix
- Update audio_pipeline_params.yaml: add stt_backend and vosk_model_path
- Add test/test_audio_pipeline.py: 40 unit tests covering EnergyVAD, PCM conversion, AudioBuffer, UtteranceSegmenter, VoskSTT, JabraAudioDevice, AudioMetrics, AudioState
- Integrate into full_stack.launch.py: audio_pipeline at t=5s with enable_audio_pipeline and audio_stt_backend args

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
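
The backend selection described in the commit message (Vosk loading with Whisper fallback, CPU auto-detection for Whisper) might look roughly like the sketch below. This is an illustration under stated assumptions, not the code in `audio_pipeline_node.py`: `select_stt_backend` and its boolean arguments are hypothetical, and the actual node presumably probes for the Vosk model and a GPU itself.

```python
def select_stt_backend(requested, vosk_available, cuda_available):
    """Return (backend, device) for a requested stt_backend value.

    Hypothetical helper mirroring the behaviour in the commit message:
    fall back to Whisper when Vosk cannot be loaded, and run Whisper
    on the CPU when no GPU is detected.
    """
    if requested not in ("whisper", "vosk"):
        raise ValueError(f"unknown stt_backend: {requested!r}")
    if requested == "vosk" and vosk_available:
        # Vosk is described as the CPU alternative to Whisper.
        return ("vosk", "cpu")
    # Either Whisper was requested or Vosk could not be loaded.
    device = "cuda" if cuda_available else "cpu"
    return ("whisper", device)


def log_prefix(backend):
    """Build the STT/<backend> log prefix mentioned in the commit message."""
    return f"STT/{backend}"
```

Keeping the decision in a pure function like this makes the fallback paths testable offline, without a microphone, a Vosk model, or a GPU.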