feat: STM32 watchdog and fault recovery handler (Issue #565) #583

Merged
sl-jetson merged 1 commit from sl-firmware/issue-565-fault-handler into main 2026-03-14 13:54:23 -04:00

Summary

  • fault_handler.c/h: Complete Cortex-M7 fault detection and recovery subsystem
    • Naked HardFault/MemManage/BusFault/UsageFault ISR stubs with auto-pushed stack-frame capture
    • .noinit SRAM capture ring tagged with magic; survives NVIC_SystemReset(); persisted to flash sector 7 (8×64-byte slots at 0x08060000) on subsequent boot
    • MPU Region 0 stack guard (32 B no-access at __stack_end) → MemManage → FAULT_STACK_OVF
    • Brownout detect via RCC_CSR_BORRSTF → FAULT_BROWNOUT
    • LED2 blink codes for 10 s post-recovery: HARDFAULT=3 fast, WATCHDOG=2, BROWNOUT=1 long, STACK_OVF=4
    • Public API: fault_handler_init(), fault_led_tick(), fault_log_read(), fault_log_get_count(), fault_get_last_type(), fault_log_clear(), FAULT_ASSERT() macro
  • jlink.h/jlink.c: JLINK_CMD_FAULT_LOG_GET (0x0F) → JLINK_TLM_FAULT_LOG (0x86, 20 bytes)
  • main.c: fault_handler_init() first in main(); boot-time fault log TLM; fault_led_tick() in loop; fault_log_req handler
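
The capture path in the summary can be illustrated with a host-compilable sketch. It is not the code from this PR: the record layout, the magic value, and every function name below are assumptions made for illustration. Only the flash base address, slot size, and slot count come from the description. The eight-word frame is the basic frame a Cortex-M core pushes on exception entry.

```c
#include <stdint.h>

/* Values taken from the PR description. */
#define FAULT_FLASH_BASE 0x08060000u /* flash sector 7 */
#define FAULT_SLOT_SIZE  64u
#define FAULT_SLOT_COUNT 8u

/* Assumed tag value; the PR does not state the real magic. */
#define FAULT_MAGIC 0xFA17C0DEu

/* Basic frame pushed by the core on exception entry, lowest address first. */
typedef struct {
    uint32_t r0, r1, r2, r3, r12, lr, pc, xpsr;
} exc_frame_t;

/* Hypothetical capture record. On target it would be placed in a .noinit
 * section so startup code does not zero it after NVIC_SystemReset(). */
typedef struct {
    uint32_t    magic;
    uint32_t    type;
    exc_frame_t frame;
} fault_capture_t;

/* Copy the stacked registers from the faulting stack pointer and tag the
 * record so the next boot can tell it apart from random SRAM contents. */
void fault_capture_fill(fault_capture_t *cap, const uint32_t *sp, uint32_t type)
{
    cap->type       = type;
    cap->frame.r0   = sp[0];
    cap->frame.r1   = sp[1];
    cap->frame.r2   = sp[2];
    cap->frame.r3   = sp[3];
    cap->frame.r12  = sp[4];
    cap->frame.lr   = sp[5];
    cap->frame.pc   = sp[6];
    cap->frame.xpsr = sp[7];
    cap->magic      = FAULT_MAGIC; /* written last: record is complete */
}

/* A record is trusted only if the magic tag is intact. */
int fault_capture_valid(const fault_capture_t *cap)
{
    return cap->magic == FAULT_MAGIC;
}

/* Address of a log slot in sector 7; the index wraps after the last slot. */
uint32_t fault_slot_addr(uint32_t index)
{
    return FAULT_FLASH_BASE + (index % FAULT_SLOT_COUNT) * FAULT_SLOT_SIZE;
}
```

On the next boot, a handler built this way would check fault_capture_valid(), copy the record into the slot at fault_slot_addr(), and clear the magic so the same fault is not logged twice.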

Test plan

  • Build: pio run -e f722 — no errors
  • HardFault injection → register dump, reset, flash log, LED blink
  • Stack overflow → MemManage → FAULT_STACK_OVF blink code
  • Brownout via RCC_CSR_BORRSTF → FAULT_BROWNOUT on next boot
  • JLink FAULT_LOG_GET (0x0F) → FAULT_LOG TLM (0x86) with PC/CFSR
  • fault_log_clear() → sector erased, PID store restored
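
For the brownout test case, reset-cause classification from a raw RCC_CSR value might look like the sketch below. The enum and function names are hypothetical. The bit positions are the STM32F7 reset-flag positions as I understand them and should be checked against the device header. The PR delegates watchdog detection to watchdog.c, so the watchdog branch here is illustration only.

```c
#include <stdint.h>

/* RCC_CSR reset flags (STM32F7 bit positions; verify against the CMSIS header). */
#define CSR_BORRSTF  (1u << 25)
#define CSR_PINRSTF  (1u << 26)
#define CSR_PORRSTF  (1u << 27)
#define CSR_SFTRSTF  (1u << 28)
#define CSR_IWDGRSTF (1u << 29)
#define CSR_WWDGRSTF (1u << 30)

/* Hypothetical result type; not the fault enum defined by the PR. */
typedef enum {
    RESET_CAUSE_UNKNOWN = 0,
    RESET_CAUSE_POWER_ON,
    RESET_CAUSE_BROWNOUT,
    RESET_CAUSE_WATCHDOG,
    RESET_CAUSE_SOFTWARE,
    RESET_CAUSE_PIN
} reset_cause_t;

/* Classify a snapshot of RCC_CSR taken early in boot, before the flags
 * are cleared by writing RMVF. Order matters: several flags can be set
 * at once, so the most specific cause is tested first. */
reset_cause_t classify_reset(uint32_t csr)
{
    if (csr & CSR_PORRSTF) {
        /* A cold power-up sets the BOR flag as well, so treat it as power-on. */
        return RESET_CAUSE_POWER_ON;
    }
    if (csr & CSR_BORRSTF) {
        return RESET_CAUSE_BROWNOUT; /* BOR without POR: supply dipped */
    }
    if (csr & (CSR_IWDGRSTF | CSR_WWDGRSTF)) {
        return RESET_CAUSE_WATCHDOG;
    }
    if (csr & CSR_SFTRSTF) {
        return RESET_CAUSE_SOFTWARE; /* e.g. NVIC_SystemReset() */
    }
    if (csr & CSR_PINRSTF) {
        return RESET_CAUSE_PIN;
    }
    return RESET_CAUSE_UNKNOWN;
}
```

Checking the power-on flag before the brownout flag is a design choice in this sketch, intended to avoid reporting every cold start as a brownout. Whether the PR does the same is not stated.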

Closes #565

🤖 Generated with Claude Code

sl-jetson added 1 commit 2026-03-14 12:08:51 -04:00
- New src/fault_handler.c + include/fault_handler.h:
  - HardFault/MemManage/BusFault/UsageFault naked ISR stubs with
    Cortex-M7 stack-frame capture (R0-R3, LR, PC, xPSR, CFSR, HFSR,
    MMFAR, BFAR, SP) and NVIC_SystemReset()
  - .noinit SRAM capture ring survives soft reset; persisted to flash
    sector 7 (0x08060000, 8x64-byte slots) on subsequent boot
  - MPU Region 0 stack guard (32 B at __stack_end, no-access) ->
    MemManage fault detected as FAULT_STACK_OVF
  - Brownout detect via RCC_CSR_BORRSTF on boot -> FAULT_BROWNOUT
  - Watchdog reset detection delegates to existing watchdog.c
  - LED blink codes on LED2 (PC14, active-low) for 10 s post-recovery:
    HARDFAULT=3, WATCHDOG=2, BROWNOUT=1, STACK_OVF=4 fast blinks
  - fault_led_tick(), fault_log_read(), fault_log_get_count(),
    fault_get_last_type(), fault_log_clear(), FAULT_ASSERT() macro
- jlink.h: add JLINK_CMD_FAULT_LOG_GET (0x0F), JLINK_TLM_FAULT_LOG
  (0x86), jlink_tlm_fault_log_t (20 bytes), fault_log_req in JLinkState,
  jlink_send_fault_log() declaration
- jlink.c: dispatch JLINK_CMD_FAULT_LOG_GET; implement
  jlink_send_fault_log() (26-byte CRC16-XModem framed response)
- main.c: call fault_handler_init() first in main(); send fault log
  TLM on boot if prior fault recorded; fault_led_tick() in main loop;
  handle fault_log_req flag to respond to Jetson queries
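
The CRC16-XModem checksum named above uses polynomial 0x1021 with an initial value of 0x0000 and no bit reflection. A minimal bitwise sketch follows. The function name is an assumption, and the layout of the 26-byte frame is not described in this PR, so only the checksum is shown.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-16/XMODEM: poly 0x1021, init 0x0000, MSB first, no final XOR.
 * The name crc16_xmodem is illustrative, not taken from jlink.c. */
uint16_t crc16_xmodem(const uint8_t *data, size_t len)
{
    uint16_t crc = 0x0000u;

    for (size_t i = 0; i < len; i++) {
        /* Fold the next byte into the high half of the register. */
        crc ^= (uint16_t)((uint16_t)data[i] << 8);

        for (int bit = 0; bit < 8; bit++) {
            if (crc & 0x8000u) {
                crc = (uint16_t)((crc << 1) ^ 0x1021u);
            } else {
                crc = (uint16_t)(crc << 1);
            }
        }
    }
    return crc;
}
```

The standard check string "123456789" yields 0x31C3 with these parameters, which is a quick way to confirm that sender and receiver agree on the variant.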

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
sl-jetson force-pushed sl-firmware/issue-565-fault-handler from 13dd30c44c to 8fbe7c0033 2026-03-14 13:37:19 -04:00
sl-jetson merged commit 061189670a into main 2026-03-14 13:54:23 -04:00
Reference: seb/saltylab-firmware#583