feat: VESC CAN health monitor (Issue #651) #666

Merged
sl-jetson merged 1 commit from sl-jetson/issue-651-vesc-health into main 2026-03-18 08:03:32 -04:00

Summary

  • recovery_fsm.py — pure state machine (no ROS2/CAN deps; unit-tested in isolation)

    • States: HEALTHY → DEGRADED (>500 ms) → ESTOP (>2 s unresponsive) / BUS_OFF
    • Recovery: emits SEND_ALIVE (GET_VALUES) frames every 200 ms while DEGRADED; escalates to TRIGGER_ESTOP after 2 s without a response
    • on_frame() resets any fault state back to HEALTHY; on_bus_off()/on_bus_ok() handle CAN bus-off entry and recovery
    • HealthFsm wrapper manages both VESCs together
  • health_monitor_node.py — ROS2 node

    • Subscribes /vesc/left/state + /vesc/right/state (JSON from vesc_telemetry)
    • Sends GET_VALUES alive frames via SocketCAN during recovery
    • Publishes /vesc/health (JSON, 10 Hz): state, elapsed_s, bus_off, estop_active, recent faults
    • Publishes /diagnostics (DiagnosticArray OK/WARN/ERROR per VESC)
    • Publishes /estop (JSON event) + zero /cmd_vel on e-stop trigger and clear
    • Polls ip link for bus-off (1 Hz)
    • 200-entry in-memory fault event log
  • test/test_vesc_health.py — 39 unit tests, all passing, no hardware needed
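
The timeout-driven behaviour described for recovery_fsm.py can be sketched roughly as follows. This is a minimal illustration, not the merged code: the state names, the SEND_ALIVE / TRIGGER_ESTOP actions, the method names, and the 500 ms / 2 s / 200 ms figures come from the summary above, while the `Action` enum, the `now` argument, the attribute names, and the choice to restart timing in `on_bus_ok()` are assumptions made for the example.

```python
from enum import Enum
from typing import Optional


class VescHealthState(Enum):
    HEALTHY = "healthy"
    DEGRADED = "degraded"
    ESTOP = "estop"
    BUS_OFF = "bus_off"


class Action(Enum):  # hypothetical container for the actions named in the PR
    NONE = "none"
    SEND_ALIVE = "send_alive"
    TRIGGER_ESTOP = "trigger_estop"


class VescMonitor:
    """Illustrative monitor for one VESC; internals are assumed, not the PR's."""

    DEGRADED_AFTER_S = 0.5  # PR: DEGRADED after >500 ms of silence
    ESTOP_AFTER_S = 2.0     # PR: ESTOP after >2 s unresponsive
    ALIVE_INTERVAL_S = 0.2  # PR: alive frames at 200 ms intervals

    def __init__(self) -> None:
        self.state = VescHealthState.HEALTHY
        self._last_frame: Optional[float] = None  # unknown until first tick/frame
        self._last_alive: Optional[float] = None

    def on_frame(self, now: float) -> None:
        # Any received frame clears the fault state.
        self._last_frame = now
        self._last_alive = None
        self.state = VescHealthState.HEALTHY

    def on_bus_off(self) -> None:
        # Bus-off overrides the timeout-based states.
        self.state = VescHealthState.BUS_OFF

    def on_bus_ok(self, now: float) -> None:
        # Assumption: leaving bus-off restarts the silence timer.
        self.on_frame(now)

    def tick(self, now: float) -> Action:
        if self.state is VescHealthState.BUS_OFF:
            return Action.NONE
        if self._last_frame is None:
            # Startup-safe: begin timing at the first tick instead of faulting.
            self._last_frame = now
            return Action.NONE
        silent = now - self._last_frame
        if silent > self.ESTOP_AFTER_S:
            if self.state is not VescHealthState.ESTOP:
                self.state = VescHealthState.ESTOP
                return Action.TRIGGER_ESTOP  # emitted once on escalation
            return Action.NONE
        if silent > self.DEGRADED_AFTER_S:
            self.state = VescHealthState.DEGRADED
            due = (self._last_alive is None
                   or now - self._last_alive >= self.ALIVE_INTERVAL_S)
            if due:
                self._last_alive = now
                return Action.SEND_ALIVE
        return Action.NONE
```

Passing the timestamp in explicitly, rather than reading a clock inside the class, is one way to keep such a state machine free of ROS2 and CAN dependencies so it can be unit-tested without hardware, which matches the stated goal of the module.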

Test plan

  • python3 -m pytest jetson/ros2_ws/src/saltybot_vesc_health/test/ — 39 passed
  • On Jetson with VESCs powered: ros2 topic echo /vesc/health — state=healthy
  • Disconnect one VESC: confirm WARN at 500 ms, ESTOP at 2 s, zero /cmd_vel
  • Reconnect VESC: confirm state returns to healthy, /estop event estop_cleared
  • ros2 topic echo /diagnostics — OK/WARN/ERROR levels match VESC states
  • journalctl -f — fault events logged correctly
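
The steps above do not exercise the bus-off path. One way to check it without hardware is to keep the `ip link` parsing separate from the subprocess call, as in the sketch below. The PR only says the node polls `ip link` at 1 Hz; the `-details` flag, the `can0` interface name, and the function names here are assumptions for illustration.

```python
import re
import subprocess
from typing import Optional

# Matches the CAN detail line, e.g.
# "    can state ERROR-ACTIVE (berr-counter tx 0 rx 0) restart-ms 0"
_CAN_STATE_RE = re.compile(r"^\s*can\s.*?\bstate\s+([A-Z-]+)")


def parse_can_state(ip_output: str) -> Optional[str]:
    """Return the controller state (e.g. 'BUS-OFF') or None if not found."""
    for line in ip_output.splitlines():
        match = _CAN_STATE_RE.search(line)
        if match:
            return match.group(1)
    return None


def is_bus_off(ip_output: str) -> bool:
    return parse_can_state(ip_output) == "BUS-OFF"


def read_ip_link(iface: str = "can0") -> str:
    """Run `ip -details link show <iface>`; the interface name is hypothetical."""
    result = subprocess.run(
        ["ip", "-details", "link", "show", iface],
        capture_output=True, text=True, check=False, timeout=1.0,
    )
    return result.stdout
```

With the parser isolated, a test can feed it captured `ip` output for the healthy and bus-off cases, and only `read_ip_link()` needs a real interface.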

Closes #651

🤖 Generated with Claude Code (https://claude.com/claude-code)

sl-jetson added 1 commit 2026-03-17 11:45:24 -04:00
New package: saltybot_vesc_health

- recovery_fsm.py: pure state machine (no ROS2/CAN deps; fully unit-tested)
  - VescHealthState: HEALTHY → DEGRADED (>500 ms) → ESTOP (>2 s) / BUS_OFF
  - VescMonitor.tick(): drives recovery sequence per VESC; startup-safe
  - VescMonitor.on_frame(): resets state on CAN frame arrival
  - VescMonitor.on_bus_off/on_bus_ok(): bus-off override + recovery
  - HealthFsm: dual-VESC wrapper aggregating both monitors

- health_monitor_node.py: ROS2 node
  - Subscribes /vesc/left/state + /vesc/right/state (JSON from vesc_telemetry)
  - Sends GET_VALUES alive frames via SocketCAN on DEGRADED state
  - Publishes /vesc/health (JSON, 10 Hz) — state, elapsed, recent faults
  - Publishes /diagnostics (DiagnosticArray, OK/WARN/ERROR per VESC)
  - Publishes /estop (JSON event) + zero /cmd_vel on e-stop trigger/clear
  - Polls ip link for bus-off state (1 Hz)
  - 200-entry fault event log included in /vesc/health

- test/test_vesc_health.py: 39 unit tests, all passing, no hardware needed
- config/vesc_health_params.yaml, launch/vesc_health.launch.py

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
sl-jetson force-pushed sl-jetson/issue-651-vesc-health from eab26c35c5 to d57c0bd51d 2026-03-18 08:03:27 -04:00
sl-jetson merged commit 9e8ea3c411 into main 2026-03-18 08:03:32 -04:00
Reference: seb/saltylab-firmware#666