History

sl-jetson 71d6ce610b feat(social): speech pipeline + LLM conversation + TTS + orchestrator (#81 #83 #85 #89 )

Issue #81 — Speech pipeline:
- speech_pipeline_node.py: OpenWakeWord "hey_salty" → Silero VAD → faster-whisper
  STT (Orin GPU, <500ms wake-to-transcript) → ECAPA-TDNN speaker diarization
- speech_utils.py: pcm16↔float32, EnergyVad, UtteranceSegmenter (pre-roll, max-
  duration), cosine speaker identification — all pure Python, no ROS2/GPU needed
- Publishes /social/speech/transcript (SpeechTranscript) + /social/speech/vad_state

Issue #83 — Conversation engine:
- conversation_node.py: llama-cpp-python GGUF (Phi-3-mini Q4_K_M, 20 GPU layers),
  streaming token output, per-person sliding-window context (4K tokens), summary
  compression, SOUL.md system prompt, group mode
- llm_context.py: PersonContext, ContextStore (JSON persistence), build_llama_prompt
  (ChatML format), context compression via LLM summarization
- Publishes /social/conversation/response (ConversationResponse, partial + final)

Issue #85 — Streaming TTS:
- tts_node.py: Piper ONNX streaming synthesis, sentence-by-sentence first-chunk
  streaming (<200ms to first audio), sounddevice USB speaker playback, volume control
- tts_utils.py: split_sentences, pcm16_to_wav_bytes, chunk_pcm, apply_volume, strip_ssml

Issue #89 — Pipeline orchestrator:
- orchestrator_node.py: IDLE→LISTENING→THINKING→SPEAKING state machine, GPU memory
  watchdog (throttle at <2GB free), rolling latency stats (p50/p95 per stage),
  VAD watchdog (alert if speech pipeline hangs), /social/orchestrator/state JSON pub
- social_bot.launch.py: brings up all 4 nodes with TimerAction delays

New messages: SpeechTranscript.msg, VadState.msg, ConversationResponse.msg
Config YAMLs: speech_params, conversation_params, tts_params, orchestrator_params
Tests: 58 tests (28 speech_utils + 30 llm_context/tts_utils), all passing

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

2026-03-02 08:17:35 -05:00

config

feat: 4x IMX219 surround vision + Nav2 camera obstacle layer (Phase 2c)

2026-02-28 23:19:23 -05:00

docs

feat: Orin Nano Super platform update + 4x IMX219 CSI cameras

2026-02-28 22:59:13 -05:00

ros2_ws/src

feat(social): speech pipeline + LLM conversation + TTS + orchestrator (#81 #83 #85 #89 )

2026-03-02 08:17:35 -05:00

scripts

feat: Orin Nano Super platform update + 4x IMX219 CSI cameras

2026-02-28 22:59:13 -05:00

docker-compose.yml

feat(safety): remote e-stop over 4G MQTT (Issue #63 )

2026-03-01 04:55:54 -05:00

Dockerfile

feat: update SLAM stack for Jetson Orin Nano Super (67 TOPS, JetPack 6)

2026-02-28 21:46:27 -05:00

README.md

feat: Jetson Nano platform setup and Docker env (bd-1hcg)

2026-02-28 12:46:14 -05:00

README.md

Jetson Nano — AI/SLAM Platform Setup

Self-balancing robot: Jetson Nano dev environment for ROS2 Humble + SLAM stack.

Stack

Component	Version / Part
Platform	Jetson Nano 4GB
JetPack	4.6 (L4T R32.6.1, CUDA 10.2)
ROS2	Humble Hawksbill
DDS	CycloneDDS
SLAM	slam_toolbox
Nav	Nav2
Depth camera	Intel RealSense D435i
LiDAR	RPLIDAR A1M8
MCU bridge	STM32F722 (USB CDC @ 921600)

Quick Start

# 1. Host setup (once, on fresh JetPack 4.6)
sudo bash scripts/setup-jetson.sh

# 2. Build Docker image
bash scripts/build-and-run.sh build

# 3. Start full stack
bash scripts/build-and-run.sh up

# 4. Open ROS2 shell
bash scripts/build-and-run.sh shell

Docs

docs/pinout.md — GPIO/I2C/UART pinout for all peripherals
docs/power-budget.md — 10W power envelope analysis

Files

jetson/
├── Dockerfile              # L4T base + ROS2 Humble + SLAM packages
├── docker-compose.yml      # Multi-service stack (ROS2, RPLIDAR, D435i, STM32)
├── README.md               # This file
├── docs/
│   ├── pinout.md           # GPIO/I2C/UART pinout reference
│   └── power-budget.md     # Power budget analysis (10W envelope)
└── scripts/
    ├── entrypoint.sh       # Docker container entrypoint
    ├── setup-jetson.sh     # Host setup (udev, Docker, nvpmodel)
    └── build-and-run.sh    # Build/run helper

Power Budget (Summary)

Scenario	Total
Idle	2.9W
Nominal (SLAM active)	~10.2W
Peak	15.4W

Target: 10W (MAXN nvpmodel). Use RPLIDAR standby + 640p D435i for compliance. See docs/power-budget.md for full analysis.