Implement state machine for detecting and enrolling unknown persons.
Manages workflow: DETECT → GREET → ASK_NAME → SMALL_TALK → ENROLL → FAREWELL
Features:
- Subscribes to /saltybot/person_tracker for unknown face detection
- Unknown person threshold configurable (default: 30% confidence)
- State machine with Piper TTS triggers for each state
- Captures STT responses for name and conversation context
- Publishes /social/orchestrator/state for coordination with other nodes
- Handles interruptions gracefully (e.g. the person walks away mid-conversation)
- Auto-enrolls the person into the face gallery (configurable)
- Stores encounter data as JSON in /home/seb/encounter-queue/
- Tracks duration, responses, interests, and enrollment success
Encounter data structure:
{
person_id, timestamp, state, name, context, greeting_response,
interests[], enrollment_success, duration_sec, notes
}
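
Illustrative sketch of the workflow and encounter record described above, not the committed code. Field types, the `next_state` helper, and the choice to jump straight to FAREWELL when the person leaves are assumptions; only the state names and field names come from this message.

```python
import json
from dataclasses import asdict, dataclass, field
from enum import Enum
from typing import List


class State(Enum):
    DETECT = "DETECT"
    GREET = "GREET"
    ASK_NAME = "ASK_NAME"
    SMALL_TALK = "SMALL_TALK"
    ENROLL = "ENROLL"
    FAREWELL = "FAREWELL"


# Happy-path order taken from the workflow line; Enum preserves definition order.
_ORDER = list(State)


def next_state(current: State, person_present: bool = True) -> State:
    """Advance one step, or bail out to FAREWELL if the person left (assumed policy)."""
    if not person_present:
        return State.FAREWELL
    idx = _ORDER.index(current)
    return _ORDER[min(idx + 1, len(_ORDER) - 1)]


@dataclass
class Encounter:
    # Field names follow the structure above; the types are guesses.
    person_id: str
    timestamp: float
    state: str = State.DETECT.value
    name: str = ""
    context: str = ""
    greeting_response: str = ""
    interests: List[str] = field(default_factory=list)
    enrollment_success: bool = False
    duration_sec: float = 0.0
    notes: str = ""

    def to_json(self) -> str:
        return json.dumps(asdict(self))
```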
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
Creates log-mel spectrogram template for 'hey salty' wake word detection
using synthetic speech generation. Template generated from 5 synthetic
audio samples with varying pitch to improve robustness.
- generate_wake_word_template.py: Script to synthesize and generate template
- hey_salty.npy: 40-band log-mel template, shape (40, 61)
- wake_word_params.yaml: Updated template_path
- README.md: Documentation for template usage and retraining procedures
The template is used by wake_word_node.py via cosine similarity matching
against incoming audio. Configurable sensitivity via match_threshold.
Future work: Collect real training recordings to improve accuracy.
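
Minimal sketch of cosine-similarity matching between a template and an incoming window, assuming both have already been flattened to equal-length sequences. It is written with the standard library only so it runs anywhere; the function names and the 0.8 default for match_threshold are hypothetical, and wake_word_node.py may differ.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two flattened feature vectors."""
    if len(a) != len(b):
        raise ValueError("template and window must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as "no match"
    return dot / (norm_a * norm_b)


def is_match(template: Sequence[float], window: Sequence[float],
             match_threshold: float = 0.8) -> bool:
    # 0.8 is a placeholder default, not the value shipped in wake_word_params.yaml.
    return cosine_similarity(template, window) >= match_threshold
```

Raising match_threshold makes detection stricter (fewer false triggers, more misses); lowering it does the opposite.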
Polls /proc/stat (CPU delta), /proc/meminfo (RAM), os.statvfs (disk),
/sys/devices/gpu.0/load (GPU), and thermal zone sysfs paths; publishes
JSON payload on /saltybot/system_resources at 1 Hz.
Pure helpers (parse_proc_stat, cpu_percent_from_stats, parse_meminfo,
compute_ram_stats, read_disk_usage, read_gpu_load, read_thermal_zones)
are all unit-tested offline. Injectable I/O on SysmonNode allows full
node tick tests without /proc or /sys. 67/67 tests passing.
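
Sketch of the CPU delta computation, for illustration. The helper names match the ones listed above, but the signatures and return types here are assumptions. The field order (user, nice, system, idle, iowait, irq, softirq, steal) is the standard layout of the aggregate `cpu` line in /proc/stat.

```python
from typing import Tuple


def parse_proc_stat(text: str) -> Tuple[int, int]:
    """Return (idle_ticks, total_ticks) from the aggregate 'cpu' line."""
    for line in text.splitlines():
        parts = line.split()
        if parts and parts[0] == "cpu":
            values = [int(v) for v in parts[1:]]
            # idle + iowait count as idle time; guest fields are already
            # included in user/nice, so only the first eight are summed.
            idle = values[3] + (values[4] if len(values) > 4 else 0)
            total = sum(values[:8])
            return idle, total
    raise ValueError("no aggregate 'cpu' line found")


def cpu_percent_from_stats(prev: Tuple[int, int], curr: Tuple[int, int]) -> float:
    """Busy percentage between two (idle, total) samples."""
    d_idle = curr[0] - prev[0]
    d_total = curr[1] - prev[1]
    if d_total <= 0:
        return 0.0  # no ticks elapsed (or counter reset)
    return 100.0 * (1.0 - d_idle / d_total)
```

Because both helpers take plain strings and tuples, they can be tested with canned /proc/stat text and no real filesystem access.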
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Polls /dev/video* at 2 Hz, drives a three-state machine
(connected/disconnected/restarting) and publishes to
/saltybot/camera_status (std_msgs/String). If the device reconnects within
restart_grace_s (5 s), 'restarting' is held for restart_hold_s (2 s)
to signal downstream capture pipelines to restart. Scan function
is injected for offline testing. 82/82 tests pass.
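
Rough sketch of the three-state machine with an injected scan function and an explicit clock. The class name, the initial state, and the behaviour for a reconnect after the grace window (going straight to 'connected') are assumptions; this message does not specify them.

```python
from typing import Callable, Optional, Sequence


class CameraMonitor:
    """Hypothetical stand-in for the node's state logic."""

    def __init__(self, scan: Callable[[], Sequence[str]],
                 restart_grace_s: float = 5.0, restart_hold_s: float = 2.0):
        self._scan = scan  # returns the list of /dev/video* paths found
        self._grace = restart_grace_s
        self._hold = restart_hold_s
        self.state = "disconnected"  # assumed initial state
        self._lost_at: Optional[float] = None
        self._restart_at: Optional[float] = None

    def tick(self, now: float) -> str:
        present = bool(self._scan())
        if self.state == "connected":
            if not present:
                self.state, self._lost_at = "disconnected", now
        elif self.state == "disconnected":
            if present:
                recent = (self._lost_at is not None
                          and now - self._lost_at <= self._grace)
                if recent:
                    self.state, self._restart_at = "restarting", now
                else:
                    self.state = "connected"  # assumption: no restart signal needed
        else:  # restarting
            if not present:
                self.state, self._lost_at = "disconnected", now
            elif now - self._restart_at >= self._hold:
                self.state = "connected"
        return self.state
```

Passing `now` in explicitly, like injecting the scan function, keeps tick tests deterministic.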
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Energy-gated log-mel + cosine-similarity wake-word node. Subscribes to
/social/speech/audio_raw (PCM-16 UInt8MultiArray), maintains a 1.5 s
sliding ring buffer, runs detection every 100 ms; fires Bool(True) on
/saltybot/wake_word_detected with 2 s cooldown. Template loaded from
.npy file; passive (no detections) when template_path is empty.
91/91 tests pass.
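
Sketch of the sliding buffer and cooldown logic only (feature extraction and matching omitted). The class name and the 16 kHz sample rate are assumptions; the 1.5 s window and 2 s cooldown come from the description above.

```python
from collections import deque
from typing import Iterable, Optional


class WakeWordGate:
    """Hypothetical helper: keeps the newest window of samples and rate-limits fires."""

    def __init__(self, sample_rate: int = 16000, window_s: float = 1.5,
                 cooldown_s: float = 2.0):
        # deque(maxlen=...) silently drops the oldest samples, giving a ring buffer.
        self.buffer = deque(maxlen=int(sample_rate * window_s))
        self._cooldown_s = cooldown_s
        self._last_fire: Optional[float] = None

    def push(self, samples: Iterable[int]) -> None:
        self.buffer.extend(samples)

    def should_fire(self, matched: bool, now: float) -> bool:
        """True only for a match that falls outside the cooldown window."""
        if not matched:
            return False
        if self._last_fire is not None and now - self._last_fire < self._cooldown_s:
            return False
        self._last_fire = now
        return True
```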
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds face_track_servo_node to saltybot_social:
- Subscribes to /social/faces/detected (FaceDetectionArray)
- Picks closest face by largest bbox area (proximity proxy)
- Computes pan/tilt error from bbox centre vs image centre using
configurable FOV (fov_h_deg=60°, fov_v_deg=45°)
- Independent PID controllers for pan and tilt (velocity/incremental
output with anti-windup); servo position integrates velocity*dt
- Clamps commands to ±pan_limit_deg / ±tilt_limit_deg
- Returns to centre at return_rate_deg_s when face lost >lost_timeout_s
- Dead zone suppresses jitter for small errors
- Publishes Float32 on /saltybot/head_pan and /saltybot/head_tilt
- 81/81 tests passing
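
Sketch of the error computation, dead zone, and clamped velocity integration from the list above. It assumes a pinhole camera model for the pixel-to-angle conversion (the node may use a simpler linear mapping); all function names are hypothetical and the PID step is left out.

```python
import math


def angular_error_deg(bbox_centre_px: float, image_size_px: float,
                      fov_deg: float) -> float:
    """Angle between the image centre and the bbox centre along one axis."""
    half = image_size_px / 2.0
    focal_px = half / math.tan(math.radians(fov_deg) / 2.0)
    return math.degrees(math.atan((bbox_centre_px - half) / focal_px))


def apply_dead_zone(error_deg: float, dead_zone_deg: float) -> float:
    """Zero out small errors so the servo does not jitter."""
    return 0.0 if abs(error_deg) < dead_zone_deg else error_deg


def clamp(value: float, limit: float) -> float:
    return max(-limit, min(limit, value))


def integrate_servo(position_deg: float, velocity_deg_s: float,
                    dt: float, limit_deg: float) -> float:
    """Servo position integrates velocity*dt, clamped to the travel limit."""
    return clamp(position_deg + velocity_deg_s * dt, limit_deg)
```

With fov_h_deg=60, a face at the right edge of a 640 px wide image yields a pan error of about 30 degrees.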
Closes #279
Adds greeting_trigger_node to saltybot_social:
- Subscribes to /social/faces/detected (FaceDetectionArray) for face arrivals
- Subscribes to /social/person_states (PersonStateArray) to cache face_id→distance
- Fires greeting when face_id is within proximity_m (default 2m) and
not in per-face_id cooldown window (default 300s)
- Publishes JSON on /saltybot/greeting_trigger:
{face_id, person_name, distance_m, ts}
- unknown_distance param controls assumed distance for faces with no PersonState yet
- Thread-safe distance cache and greeted map
- 50/50 tests passing
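
Sketch of the proximity and cooldown check with a lock-protected cache. The class and method names are hypothetical; the payload keys, the 2 m proximity, and the 300 s cooldown follow the list above. unknown_distance has no default here because this message does not state one.

```python
import json
import threading
from typing import Dict, Optional


class GreetingGate:
    def __init__(self, unknown_distance: float, proximity_m: float = 2.0,
                 cooldown_s: float = 300.0):
        self._unknown_distance = unknown_distance
        self._proximity_m = proximity_m
        self._cooldown_s = cooldown_s
        self._distances: Dict[int, float] = {}  # face_id -> last known distance
        self._greeted: Dict[int, float] = {}    # face_id -> time of last greeting
        self._lock = threading.Lock()

    def update_distance(self, face_id: int, distance_m: float) -> None:
        with self._lock:
            self._distances[face_id] = distance_m

    def maybe_greet(self, face_id: int, person_name: str,
                    now: float) -> Optional[str]:
        """Return the JSON payload to publish, or None if no greeting is due."""
        with self._lock:
            distance = self._distances.get(face_id, self._unknown_distance)
            if distance > self._proximity_m:
                return None
            last = self._greeted.get(face_id)
            if last is not None and now - last < self._cooldown_s:
                return None
            self._greeted[face_id] = now
        return json.dumps({"face_id": face_id, "person_name": person_name,
                           "distance_m": distance, "ts": now})
```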
Closes #270
Adds ambient_sound_node to saltybot_social:
- Accumulates 1 s of PCM-16 audio from /social/speech/audio_raw
- Extracts mel-spectrogram feature vector (energy_db, zcr, mel_centroid,
mel_flatness, low_ratio, high_ratio) using pure numpy (no torch/onnx)
- Priority-cascade classifier: silence → music → speech → crowd → outdoor → alarm
- Publishes label as std_msgs/String on /saltybot/ambient_sound on each buffer fill
- All 11 thresholds exposed as ROS parameters (yaml + launch file)
- numpy-free energy-only fallback for edge environments
- 77/77 tests passing
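
Sketch of two of the listed features computed from a vector of mel band energies. The definitions used here (centroid as the energy-weighted mean band index, flatness as geometric mean over arithmetic mean) are the common textbook ones and are assumed, not taken from the node. Plain Python is used for portability; the node itself uses numpy.

```python
import math
from typing import Sequence


def mel_centroid(band_energies: Sequence[float]) -> float:
    """Energy-weighted mean band index (0 = lowest band)."""
    total = sum(band_energies)
    if total <= 0.0:
        return 0.0
    return sum(i * e for i, e in enumerate(band_energies)) / total


def mel_flatness(band_energies: Sequence[float], eps: float = 1e-12) -> float:
    """Geometric mean / arithmetic mean: near 1 for noise-like, near 0 for tonal."""
    n = len(band_energies)
    if n == 0:
        return 0.0
    log_mean = sum(math.log(e + eps) for e in band_energies) / n
    arith_mean = sum(band_energies) / n + eps
    return math.exp(log_mean) / arith_mean
```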
Closes #252
Add vad_node to saltybot_social: subscribes to /social/speech/audio_raw
(UInt8MultiArray PCM-16), computes RMS energy (dBFS) and zero-crossing
rate per chunk, applies onset/offset hysteresis (VadStateMachine), and
publishes /social/speech/is_speaking (Bool) and /social/speech/energy
(Float32 linear RMS). All thresholds configurable via ROS params:
rms_threshold_db=-35.0, zcr_min=0.01, zcr_max=0.40, onset_frames=2,
offset_frames=8, audio_topic. 69/69 tests passing.
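
Sketch of the per-chunk features and onset/offset hysteresis, using the default parameter values listed above. Little-endian sample order, the helper names, and the internals of VadStateMachine are assumptions.

```python
import math
import struct
from typing import List, Sequence


def pcm16_from_bytes(data: bytes) -> List[int]:
    """Decode a UInt8 byte stream into signed 16-bit samples (little-endian assumed)."""
    n = len(data) // 2
    return list(struct.unpack("<%dh" % n, data[:2 * n]))


def rms_dbfs(samples: Sequence[int]) -> float:
    if not samples:
        return float("-inf")
    rms = math.sqrt(sum(s * s for s in samples) / len(samples)) / 32768.0
    return 20.0 * math.log10(rms) if rms > 0.0 else float("-inf")


def zero_crossing_rate(samples: Sequence[int]) -> float:
    if len(samples) < 2:
        return 0.0
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(samples) - 1)


def frame_is_active(samples: Sequence[int], rms_threshold_db: float = -35.0,
                    zcr_min: float = 0.01, zcr_max: float = 0.40) -> bool:
    return (rms_dbfs(samples) >= rms_threshold_db
            and zcr_min <= zero_crossing_rate(samples) <= zcr_max)


class VadStateMachine:
    """Speaking starts after onset_frames active frames, stops after offset_frames quiet ones."""

    def __init__(self, onset_frames: int = 2, offset_frames: int = 8):
        self._onset, self._offset = onset_frames, offset_frames
        self._active_run = self._inactive_run = 0
        self.is_speaking = False

    def update(self, frame_active: bool) -> bool:
        if frame_active:
            self._active_run, self._inactive_run = self._active_run + 1, 0
        else:
            self._active_run, self._inactive_run = 0, self._inactive_run + 1
        if not self.is_speaking and self._active_run >= self._onset:
            self.is_speaking = True
        elif self.is_speaking and self._inactive_run >= self._offset:
            self.is_speaking = False
        return self.is_speaking
```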
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
saltybot_social_msgs:
- Add PointingTarget.msg: origin (INDEX_MCP), direction (unit vec), target,
range_m, person_id, confidence, coarse_direction, is_active
- Register in CMakeLists.txt
saltybot_social:
- _pointing_ray.py (pure Python, no rclpy): unproject(), sample_depth()
(median with outlier rejection), compute_pointing_ray() — reprojects
INDEX_MCP and INDEX_TIP into 3-D using D435i depth; falls back to image-
plane direction when both depths are equal; gracefully handles one-sided
missing depth
- pointing_node.py: subscribes to /social/gestures + synced D435i colour+depth;
re-runs MediaPipe Hands when a 'point' gesture is cached (within
gesture_timeout_s); picks closest hand to gesture anchor; publishes
PointingTarget on /saltybot/pointing_target at 5 Hz
- setup.py: adds pointing_node entry point
- 18/18 unit tests pass
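
Sketch of the geometry in _pointing_ray.py, for illustration only. It assumes a pinhole model with intrinsics fx, fy, cx, cy. The signatures, the 0.10 m outlier tolerance, and the pointing_direction helper are hypothetical, and the fallback paths for equal or missing depth are not shown.

```python
import math
import statistics
from typing import Optional, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def unproject(u: float, v: float, depth_m: float,
              fx: float, fy: float, cx: float, cy: float) -> Vec3:
    """Back-project pixel (u, v) at the given depth into camera coordinates."""
    return ((u - cx) * depth_m / fx, (v - cy) * depth_m / fy, depth_m)


def sample_depth(patch_m: Sequence[float], tolerance_m: float = 0.10) -> Optional[float]:
    """Median of a depth patch, ignoring invalid (<= 0) readings and outliers."""
    valid = [d for d in patch_m if d > 0.0]
    if not valid:
        return None
    med = statistics.median(valid)
    inliers = [d for d in valid if abs(d - med) <= tolerance_m]
    return statistics.median(inliers) if inliers else med


def pointing_direction(mcp: Vec3, tip: Vec3) -> Vec3:
    """Unit vector from INDEX_MCP towards INDEX_TIP."""
    diff = tuple(t - m for m, t in zip(mcp, tip))
    norm = math.sqrt(sum(c * c for c in diff))
    if norm == 0.0:
        raise ValueError("MCP and TIP coincide; direction undefined")
    return tuple(c / norm for c in diff)
```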
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
New MeshPeer.msg (1 Hz DDS heartbeat: robot_id, social_state, active persons,
greeted names) and MeshHandoff.msg (person context transfer on STATE_LEAVING).
mesh_comms_node subscribes to person_states and orchestrator/state, publishes
announce heartbeat, triggers handoff on LEAVING, tracks peers with timeout
cleanup, and propagates mesh-wide greeting deduplication via /social/mesh/greeted.
73/73 tests passing.
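
Sketch of peer tracking with timeout cleanup and greeting deduplication. The class and method names are hypothetical, and timeout_s is a required argument because no default is stated in this message.

```python
from typing import Dict, List, Set


class PeerTable:
    def __init__(self, timeout_s: float):
        self._timeout_s = timeout_s
        self._last_seen: Dict[str, float] = {}  # robot_id -> last heartbeat time
        self._greeted: Set[str] = set()         # names greeted anywhere in the mesh

    def heard(self, robot_id: str, now: float, greeted_names: List[str] = ()) -> None:
        """Record a heartbeat and merge the peer's greeted names."""
        self._last_seen[robot_id] = now
        self._greeted.update(greeted_names)

    def prune(self, now: float) -> List[str]:
        """Drop peers whose last heartbeat is older than timeout_s; return their ids."""
        stale = sorted(rid for rid, t in self._last_seen.items()
                       if now - t > self._timeout_s)
        for rid in stale:
            del self._last_seen[rid]
        return stale

    def peers(self) -> List[str]:
        return sorted(self._last_seen)

    def already_greeted(self, name: str) -> bool:
        return name in self._greeted
```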
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add SpeechTranscript.language (BCP-47), ConversationResponse.language fields
- speech_pipeline_node: whisper_language param (""=auto-detect via Whisper LID);
detected language published in every transcript
- conversation_node: track per-speaker language; inject "[Please respond in X.]"
hint for non-English speakers; propagate language to ConversationResponse.
_LANG_NAMES: 24 BCP-47 codes -> English names. Also adds Issue #161 emotion
context plumbing (co-located in same branch for clean merge)
- tts_node: voice_map_json param (JSON BCP-47->ONNX path); lazy voice loading
per language; playback queue now carries (text, lang) tuples for voice routing
- speech_params.yaml, tts_params.yaml: new language params with docs
- 47/47 tests pass (test_multilang.py)
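
Sketch of the language-hint injection in conversation_node. The table below is a four-entry illustrative subset, not the real 24-entry _LANG_NAMES. The function name, the prefix placement of the hint, and the handling of region subtags are assumptions.

```python
# Illustrative subset only; the real table maps 24 BCP-47 codes.
_LANG_NAMES = {"en": "English", "fr": "French", "de": "German", "es": "Spanish"}


def with_language_hint(user_text: str, lang: str) -> str:
    """Prefix a response-language hint for non-English speakers."""
    base = lang.split("-")[0].lower()  # "en-US" -> "en"
    if not base or base == "en":
        return user_text
    name = _LANG_NAMES.get(base)
    if name is None:
        return user_text  # unknown code: leave the prompt unchanged
    return f"[Please respond in {name}.] {user_text}"
```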
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>