feat(perception): MFCC nearest-centroid audio scene classifier (Issue #353) #358

Merged
sl-jetson merged 1 commit from sl-perception/issue-353-audio-scene into main 2026-03-03 14:32:36 -05:00

Summary

  • Classifies ambient audio into indoor / outdoor / traffic / park at 1 Hz
  • Pure-numpy 16-d feature vector: 13 MFCC coefficients + spectral centroid + spectral rolloff (85%) + zero-crossing rate
  • Normalised nearest-centroid classifier; class centroids computed deterministically from seeded synthetic prototypes at import time
  • ROS2 node subscribes /audio/audio (audio_common_msgs/AudioData) with configurable sample rate, channel count, and clip duration; publishes saltybot_scene_msgs/AudioScene at 1 Hz
  • Gracefully handles missing audio_common_msgs (logs warning, no crash)
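The feature and classifier bullets above can be illustrated with a minimal numpy sketch. This is not the code from `_audio_scene.py`: the names `spectral_features` and `SketchNearestCentroid` are hypothetical, the 13 MFCC coefficients are left out for brevity, and both the z-score normalisation (derived from the centroids themselves) and the confidence formula are assumptions, since the PR does not spell them out.

```python
import numpy as np


def spectral_features(x, sr):
    """Return (centroid_hz, rolloff_hz, zcr) for a mono float signal."""
    x = np.asarray(x, dtype=np.float64)
    if x.size < 2:  # too short to analyse
        return 0.0, 0.0, 0.0
    mag = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(x.size, d=1.0 / sr)
    total = mag.sum()
    if total <= 0.0:  # silence: avoid dividing by zero
        return 0.0, 0.0, 0.0
    # Spectral centroid: magnitude-weighted mean frequency.
    centroid = float((freqs * mag).sum() / total)
    # Rolloff: lowest frequency below which 85% of the magnitude lies.
    cum = np.cumsum(mag)
    idx = min(int(np.searchsorted(cum, 0.85 * total)), freqs.size - 1)
    rolloff = float(freqs[idx])
    # Zero-crossing rate: fraction of adjacent sample pairs that change sign.
    signs = np.signbit(x)
    zcr = float(np.mean(signs[1:] != signs[:-1]))
    return centroid, rolloff, zcr


class SketchNearestCentroid:
    """Hypothetical normalised nearest-centroid classifier (needs >= 2 classes)."""

    def __init__(self, labels, centroids):
        c = np.asarray(centroids, dtype=np.float64)
        self.labels = list(labels)
        # Assumption: normalise each feature with statistics of the centroids.
        self.mean = c.mean(axis=0)
        self.std = c.std(axis=0)
        self.std[self.std == 0.0] = 1.0  # constant features stay unscaled
        self.centroids = (c - self.mean) / self.std

    def predict(self, features):
        z = (np.asarray(features, dtype=np.float64) - self.mean) / self.std
        dist = np.linalg.norm(self.centroids - z, axis=1)
        order = np.argsort(dist)
        best, second = dist[order[0]], dist[order[1]]
        # Assumed confidence: 1.0 on an exact match, 0.5 on a tie.
        denom = best + second
        confidence = 1.0 - best / denom if denom > 0.0 else 1.0
        return self.labels[order[0]], float(confidence)
```

A feature vector close to one centroid returns that centroid's label with a confidence near 1.0; a vector equidistant from the two nearest centroids returns 0.5.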

New files

| File | Purpose |
|------|---------|
| `saltybot_scene_msgs/msg/AudioScene.msg` | `label` + `confidence` + `features[16]` |
| `_audio_scene.py` | Feature extraction + `NearestCentroidClassifier` |
| `audio_scene_node.py` | ROS2 node, publishes `/saltybot/audio_scene` |
| `test/test_audio_scene.py` | 53 tests, all passing |
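The summary says the node tolerates a missing `audio_common_msgs` package by logging a warning instead of crashing. One common way to get that behaviour is an import guard, sketched below. This is an illustration only: the flag `HAVE_AUDIO_MSGS`, the helper `can_subscribe`, and the use of the standard `logging` module in place of the node's own logger are all assumptions, not taken from `audio_scene_node.py`.

```python
import logging

logger = logging.getLogger("audio_scene_node")

try:
    # Optional ROS 2 dependency; may be absent on a development machine.
    from audio_common_msgs.msg import AudioData
    HAVE_AUDIO_MSGS = True
except ImportError:
    AudioData = None
    HAVE_AUDIO_MSGS = False
    logger.warning("audio_common_msgs not found; audio subscription disabled")


def can_subscribe() -> bool:
    """Hypothetical helper: True only when the message type was importable."""
    return HAVE_AUDIO_MSGS
```

A node built this way would check `can_subscribe()` before creating its `/audio/audio` subscription and otherwise stay idle.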

Test plan

  • python3 -m pytest test/test_audio_scene.py -v — 53/53 passed
  • 80 Hz sine → traffic, 440 Hz → indoor, 1+2 kHz → outdoor, 3.2+4.8 kHz → park
  • Each prototype signal self-classifies correctly
  • Silence and short clips do not crash

🤖 Generated with Claude Code

sl-perception added 1 commit 2026-03-03 14:03:33 -05:00
Classifies ambient audio into indoor/outdoor/traffic/park at 1 Hz using
a 16-d feature vector (13 MFCC + spectral centroid + rolloff + ZCR) with
a normalised nearest-centroid classifier. Centroids are computed at import
time from seeded synthetic prototypes, ensuring deterministic behaviour.

Changes
-------
- saltybot_scene_msgs/msg/AudioScene.msg  — label + confidence + features[16]
- saltybot_scene_msgs/CMakeLists.txt      — register AudioScene.msg
- _audio_scene.py  — pure-numpy feature extraction + NearestCentroidClassifier
- audio_scene_node.py  — subscribes /audio/audio, publishes /saltybot/audio_scene
- test/test_audio_scene.py  — 53 tests (all passing) with synthetic audio
- setup.py  — add audio_scene entry point

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
sl-jetson merged commit ae76697a1c into main 2026-03-03 14:32:36 -05:00
Reference: seb/saltylab-firmware#358