feat(perception): MFCC nearest-centroid audio scene classifier (Issue #353) #358

sl-perception · 2026-03-03T14:03:32-05:00

Summary

- Classifies ambient audio into indoor / outdoor / traffic / park at 1 Hz
- Pure-numpy 16-d feature vector: 13 MFCC coefficients + spectral centroid + spectral rolloff (85%) + zero-crossing rate
- Normalised nearest-centroid classifier; class centroids are computed deterministically from seeded synthetic prototypes at import time
- ROS2 node subscribes to `/audio/audio` (`audio_common_msgs/AudioData`) with configurable sample rate, channel count, and clip duration; publishes `saltybot_scene_msgs/AudioScene` at 1 Hz
- Gracefully handles a missing `audio_common_msgs` package (logs a warning, no crash)
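
For readers unfamiliar with the three non-MFCC features, here is a minimal pure-numpy sketch of how spectral centroid, 85% rolloff, and zero-crossing rate are commonly computed. The function name `spectral_features`, the Hann window, and the silence threshold are assumptions made for illustration; the code in `_audio_scene.py` may differ.

```python
import numpy as np


def spectral_features(signal, sample_rate, rolloff=0.85):
    """Return (spectral centroid in Hz, rolloff frequency in Hz, zero-crossing rate).

    Hypothetical helper for illustration; not taken from the PR.
    """
    signal = np.asarray(signal, dtype=np.float64)
    if signal.size < 2:
        # Too short to analyse: return neutral values instead of raising.
        return 0.0, 0.0, 0.0

    # Magnitude spectrum of the Hann-windowed clip.
    magnitude = np.abs(np.fft.rfft(signal * np.hanning(signal.size)))
    freqs = np.fft.rfftfreq(signal.size, d=1.0 / sample_rate)

    total = magnitude.sum()
    if total <= 1e-12:
        # Silence: there is no spectral energy to weight by.
        return 0.0, 0.0, 0.0

    # Centroid: magnitude-weighted mean frequency.
    centroid = float((freqs * magnitude).sum() / total)

    # Rolloff: lowest frequency below which `rolloff` of the total magnitude lies.
    cumulative = np.cumsum(magnitude)
    idx = int(np.searchsorted(cumulative, rolloff * total))
    rolloff_hz = float(freqs[min(idx, freqs.size - 1)])

    # ZCR: fraction of adjacent sample pairs whose sign differs.
    signs = np.signbit(signal)
    zcr = float(np.mean(signs[1:] != signs[:-1]))

    return centroid, rolloff_hz, zcr


# Usage: a 1 s, 440 Hz tone at 16 kHz should give a centroid close to 440 Hz.
t = np.arange(16_000) / 16_000
print(spectral_features(np.sin(2 * np.pi * 440 * t), 16_000))
```

The early returns for very short or silent input mirror the "do not crash" requirement in the test plan, although the values returned in those cases are a choice made for this sketch.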

New files

| File | Purpose |
|------|---------|
| `saltybot_scene_msgs/msg/AudioScene.msg` | `label` + `confidence` + `features[16]` |
| `_audio_scene.py` | Feature extraction + `NearestCentroidClassifier` |
| `audio_scene_node.py` | ROS2 node, publishes `/saltybot/audio_scene` |
| `test/test_audio_scene.py` | 53 tests, all passing |
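
The graceful handling of a missing `audio_common_msgs` package is typically done with a guarded import. The sketch below shows one common pattern, assuming the stdlib `logging` module as a stand-in for the ROS2 node logger; the helper name `audio_input_available` and the warning text are hypothetical, and `audio_scene_node.py` may be structured differently.

```python
import logging

# Stand-in for the ROS2 node logger (assumption for this sketch).
logger = logging.getLogger("audio_scene_node")

try:
    # Optional dependency that provides the AudioData message type.
    from audio_common_msgs.msg import AudioData
except ImportError:
    AudioData = None


def audio_input_available() -> bool:
    """Report whether AudioData could be imported, logging a warning if not."""
    if AudioData is None:
        logger.warning(
            "audio_common_msgs not found; audio scene classification is disabled"
        )
        return False
    return True
```

A node built this way would check `audio_input_available()` before creating its subscription, so the process still starts on machines without the package installed.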

Test plan

- [x] `python3 -m pytest test/test_audio_scene.py -v`: 53/53 passed
- [x] 80 Hz sine → `traffic`, 440 Hz → `indoor`, 1+2 kHz → `outdoor`, 3.2+4.8 kHz → `park`
- [x] Each prototype signal self-classifies correctly
- [x] Silence and short clips do not crash
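
The checks above can be reproduced in miniature with the sketch below: a z-score-normalised nearest-centroid classifier whose centroids come from seeded synthetic prototypes. Several things here are assumptions rather than the PR's code: the 16 kHz sample rate, the use of the test-plan tones plus seeded noise as prototypes, the four-band log-energy features (a stand-in for the 16-d vector), and all names.

```python
import numpy as np

SAMPLE_RATE = 16_000  # assumed; the node makes this configurable
# Hypothetical prototype recipes: tone frequencies (Hz) borrowed from the test plan.
PROTOTYPE_TONES = {
    "traffic": (80.0,),
    "indoor": (440.0,),
    "outdoor": (1000.0, 2000.0),
    "park": (3200.0, 4800.0),
}
BAND_EDGES = np.array([0, 200, 700, 2500, 8000])  # stand-in feature bands (Hz)


def tones(freqs, duration=1.0, noise=0.0, rng=None):
    """Sum of equal-amplitude sines, optionally with Gaussian noise from `rng`."""
    t = np.arange(int(duration * SAMPLE_RATE)) / SAMPLE_RATE
    x = sum(np.sin(2 * np.pi * f * t) for f in freqs)
    if noise > 0.0:
        x = x + noise * rng.standard_normal(t.size)
    return x


def features(x):
    """Stand-in features: log energy in four bands (not the PR's 16-d vector)."""
    x = np.asarray(x, dtype=np.float64)
    if x.size == 0:
        x = np.zeros(1)  # keep the FFT well-defined for empty clips
    power = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(x.size, d=1.0 / SAMPLE_RATE)
    bands = [power[(freqs >= lo) & (freqs < hi)].sum()
             for lo, hi in zip(BAND_EDGES[:-1], BAND_EDGES[1:])]
    return np.log10(np.asarray(bands) + 1e-10)


class NearestCentroid:
    def __init__(self, seed=0, per_class=8):
        rng = np.random.default_rng(seed)  # fixed seed: identical centroids every run
        labels, rows = [], []
        for label, freqs in PROTOTYPE_TONES.items():
            for _ in range(per_class):
                rows.append(features(tones(freqs, noise=0.05, rng=rng)))
                labels.append(label)
        rows, labels = np.array(rows), np.array(labels)
        # Z-score normalisation so that no single feature dominates the distance.
        self.mean = rows.mean(axis=0)
        self.std = rows.std(axis=0) + 1e-9
        normed = (rows - self.mean) / self.std
        self.labels = list(PROTOTYPE_TONES)
        self.centroids = np.array(
            [normed[labels == name].mean(axis=0) for name in self.labels]
        )

    def predict(self, x):
        """Return the label of the centroid closest to the clip's features."""
        z = (features(x) - self.mean) / self.std
        distances = np.linalg.norm(self.centroids - z, axis=1)
        return self.labels[int(np.argmin(distances))]


clf = NearestCentroid()
for name, freqs in PROTOTYPE_TONES.items():
    print(name, "->", clf.predict(tones(freqs)))
```

Building `clf` at module level, as in the last lines, gives the import-time centroid computation described in the Summary; silence and empty clips still return one of the four labels rather than raising.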

🤖 Generated with Claude Code

sl-perception added 1 commit 2026-03-03 14:03:33 -05:00

feat(perception): MFCC nearest-centroid audio scene classifier (Issue #353) 677e6eb75e

Classifies ambient audio into indoor/outdoor/traffic/park at 1 Hz using
a 16-d feature vector (13 MFCC + spectral centroid + rolloff + ZCR) with
a normalised nearest-centroid classifier. Centroids are computed at import
time from seeded synthetic prototypes, ensuring deterministic behaviour.

Changes
-------
- saltybot_scene_msgs/msg/AudioScene.msg  — label + confidence + features[16]
- saltybot_scene_msgs/CMakeLists.txt      — register AudioScene.msg
- _audio_scene.py  — pure-numpy feature extraction + NearestCentroidClassifier
- audio_scene_node.py  — subscribes /audio/audio, publishes /saltybot/audio_scene
- test/test_audio_scene.py  — 53 tests (all passing) with synthetic audio
- setup.py  — add audio_scene entry point

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

sl-jetson merged commit ae76697a1c into main

2026-03-03 14:32:36 -05:00

sl-jetson referenced this issue from a commit

2026-03-03 14:32:38 -05:00

Merge pull request 'feat(perception): MFCC nearest-centroid audio scene classifier (Issue #353)' (#358) from sl-perception/issue-353-audio-scene into main

Reference: seb/saltylab-firmware#358