[P1] ROS2 system health monitor — node heartbeats + auto-restart #408

Closed
opened 2026-03-04 15:46:39 -05:00 by sl-jetson · 0 comments

Goal

Central health monitoring node that tracks all SaltyBot ROS2 nodes and auto-restarts crashed ones.

Requirements

  • Subscribe to all node heartbeat topics (convention: /saltybot/<node>/heartbeat)
  • Track expected nodes list from config (health_monitor_params.yaml)
  • If a node's heartbeat is missing for >5s, mark the node as DEAD
  • Attempt auto-restart via ros2 launch for dead nodes
  • Publish /saltybot/system_health (JSON: node states, uptime, restart counts)
  • Face display: show Alert expression if any critical node is down
  • Log all state transitions for debugging
  • Configurable critical vs non-critical nodes (critical = stop robot if down)
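
The timeout and bookkeeping requirements above can be sketched without any ROS2 dependency, which keeps the logic unit-testable. This is a minimal sketch under stated assumptions, not the implemented node: the class `HealthTracker`, the state names `UNKNOWN`/`ALIVE`, the JSON field names, and the example node names are all hypothetical. In a real node, `on_heartbeat` would presumably be called from the heartbeat subscriptions, `check` from a periodic timer, and the `restart` callback would wrap the `ros2 launch` invocation.

```python
import json
import time
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

HEARTBEAT_TIMEOUT_S = 5.0  # from the requirement: missing for >5s means DEAD


def heartbeat_topic(node_name: str) -> str:
    """Topic name following the /saltybot/<node>/heartbeat convention."""
    return f"/saltybot/{node_name}/heartbeat"


@dataclass
class NodeStatus:
    critical: bool
    last_heartbeat: Optional[float] = None  # clock value of the last beat
    state: str = "UNKNOWN"  # hypothetical states: UNKNOWN, ALIVE, DEAD
    restart_count: int = 0


class HealthTracker:
    """Tracks heartbeats for an expected node list (hypothetical helper)."""

    def __init__(
        self,
        expected: Dict[str, bool],  # node name -> is_critical
        clock: Callable[[], float] = time.monotonic,
        restart: Optional[Callable[[str], None]] = None,
    ) -> None:
        self._clock = clock
        self._restart = restart or (lambda name: None)
        self._start = clock()
        self.nodes = {n: NodeStatus(critical=c) for n, c in expected.items()}
        self.transitions: List[Tuple[str, str, str]] = []  # (node, old, new)

    def _set_state(self, name: str, new_state: str) -> None:
        node = self.nodes[name]
        if node.state != new_state:
            # Record every state transition so it can be logged.
            self.transitions.append((name, node.state, new_state))
            node.state = new_state

    def on_heartbeat(self, name: str) -> None:
        if name not in self.nodes:
            return  # ignore nodes that are not in the expected list
        self.nodes[name].last_heartbeat = self._clock()
        self._set_state(name, "ALIVE")

    def check(self) -> None:
        """Mark silent nodes DEAD and request one restart per transition."""
        now = self._clock()
        for name, node in self.nodes.items():
            # A node that never reported is timed from tracker start.
            last = self._start if node.last_heartbeat is None else node.last_heartbeat
            if node.state != "DEAD" and now - last > HEARTBEAT_TIMEOUT_S:
                self._set_state(name, "DEAD")
                node.restart_count += 1
                self._restart(name)

    def critical_down(self) -> bool:
        return any(n.critical and n.state == "DEAD" for n in self.nodes.values())

    def health_json(self) -> str:
        """Payload sketch for /saltybot/system_health (field names assumed)."""
        return json.dumps({
            "uptime_s": self._clock() - self._start,
            "critical_down": self.critical_down(),
            "nodes": {
                name: {
                    "state": n.state,
                    "critical": n.critical,
                    "restart_count": n.restart_count,
                }
                for name, n in self.nodes.items()
            },
        })
```

Injecting the clock and the restart callback is a design choice for testability: the timeout can be exercised with a fake clock, and the restart action can be swapped for a recorder. Note that this sketch attempts a restart only once per transition into DEAD; a retry or backoff policy is not specified in the requirements and is left open here.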
Reference: seb/saltylab-firmware#408