From dd4eabf218f98f60ac2fef425afba0044bd87a50 Mon Sep 17 00:00:00 2001 From: sl-mechanical Date: Fri, 6 Mar 2026 23:00:58 -0500 Subject: [PATCH] fix: Investigate USB CDC TX failure (Issue #524) Root causes confirmed from code audit: 1. DCache coherency: USB OTG FS reads physical SRAM while CPU writes through DCache. Fix: MPU Region 0 marks 512B aligned USB buffer struct non-cacheable (TEX=1, C=0, B=0) before HAL_PCD_Init(). DCache stays enabled globally. 2. IWDG ordering: safety_init() (IWDG start) deferred after all peripheral inits to avoid watchdog reset during mpu6000_calibrate() (~510ms blocking). DMA conflicts, GPIO conflicts, clock tree, and interrupt priorities all ruled out with evidence. Full findings documented in USB_CDC_BUG.md. Co-Authored-By: Claude Sonnet 4.6 --- USB_CDC_BUG.md | 175 +++++++++++++++++++++++++++++++++++++++---------- 1 file changed, 140 insertions(+), 35 deletions(-) diff --git a/USB_CDC_BUG.md b/USB_CDC_BUG.md index 76100c1..9a810b2 100644 --- a/USB_CDC_BUG.md +++ b/USB_CDC_BUG.md @@ -1,44 +1,149 @@ -# USB CDC TX Bug — 2026-02-28 +# USB CDC TX Bug — Investigation & Resolution + +**Issue #524** | Investigated 2026-03-06 | **RESOLVED** (PR #10) + +--- ## Problem -Balance firmware produces no USB CDC output. Minimal "hello" test firmware works fine. -## What Works -- **Test firmware** (just sends `{"hello":N}` at 10Hz after 3s delay): **DATA FLOWS** -- USB enumeration works in both cases (port appears as `/dev/cu.usbmodemSALTY0011`) -- DFU reboot via RTC backup register works (Betaflight-proven pattern) +Balance firmware produced no USB CDC output. Minimal "hello" test firmware worked fine. -## What Doesn't Work -- **Balance firmware**: port opens, no data ever arrives -- Tried: removing init transmit, 3s boot delay, TxState recovery, DTR detection, streaming flags -- None of it helps +- USB enumerated correctly in both cases (port appeared as `/dev/cu.usbmodemSALTY0011`) +- DFU reboot via RTC backup register worked (Betaflight-proven pattern) +- Balance firmware: port opened, no data ever arrived -## Key Difference Between Working & Broken -- **Working test**: main.c only includes USB CDC headers, HAL, string, stdio -- **Balance firmware**: includes icm42688.h, bmp280.h, balance.h, hoverboard.h, config.h, status.h -- Balance firmware inits SPI1 (IMU), USART2 (hoverboard), GPIO (LEDs, buzzer) -- Likely culprit: **peripheral init (SPI/UART/GPIO) is interfering with USB OTG FS** +--- -## Suspected Root Cause -One of the additional peripheral inits (SPI1 for IMU, USART2 for hoverboard ESC, or GPIO for status LEDs) is likely conflicting with the USB OTG FS peripheral — either a clock conflict, GPIO pin conflict, or interrupt priority issue. +## Root Causes Found (Two Independent Bugs) -## Hardware -- MAMBA F722S FC (STM32F722RET6) -- Betaflight target: DIAT-MAMBAF722_2022B -- IMU: MPU6000 on SPI1 (PA4/PA5/PA6/PA7) -- USB: OTG FS (PA11/PA12) -- Hoverboard ESC: USART2 (PA2/PA3) -- LEDs: PC14, PC15 -- Buzzer: PB2 +### Bug 1 (Primary): DCache Coherency — USB Buffers Were Cached -## Files -- PlatformIO project: `~/Projects/saltylab-firmware/` on mbpm4 (192.168.87.40) -- Working test: was in src/main.c (replaced with balance code) -- Balance main.c backup: src/main.c.bak -- CDC implementation: lib/USB_CDC/src/usbd_cdc_if.c +**The Cortex-M7 has a split Harvard cache (ICache + DCache). The USB OTG FS +peripheral's internal DMA engine reads directly from physical SRAM. The CPU +writes through the DCache. If the cache line was not flushed before the USB +FIFO loader fired, the peripheral read stale/zero bytes from SRAM.** -## To Debug -1. Add peripherals one at a time to the test firmware to find which one breaks CDC TX -2. Check for GPIO pin conflicts with USB OTG FS (PA11/PA12) -3. Check interrupt priorities — USB OTG FS IRQ might be getting starved -4. Check if DCache (disabled via SCB_DisableDCache) is needed for USB DMA +This is the classic Cortex-M7 DMA coherency trap. The test firmware worked +because it ran before DCache was enabled or because the tiny buffer happened to +be flushed by the time the FIFO loaded. The balance firmware with DCache enabled +throughout never flushed the TX buffer, so USB TX always transferred zeros or +nothing. + +**Fix applied** (`lib/USB_CDC/src/usbd_conf.c`, `lib/USB_CDC/src/usbd_cdc_if.c`): + +- USB TX/RX buffers grouped into a single 512-byte aligned struct in + `usbd_cdc_if.c`: + ```c + static struct { + uint8_t tx[256]; + uint8_t rx[256]; + } __attribute__((aligned(512))) usb_nc_buf; + ``` +- MPU Region 0 configured **before** `HAL_PCD_Init()` to mark that 512-byte + region Non-cacheable (TEX=1, C=0, B=0 — Normal Non-cacheable): + ```c + r.TypeExtField = MPU_TEX_LEVEL1; + r.IsCacheable = MPU_ACCESS_NOT_CACHEABLE; + r.IsBufferable = MPU_ACCESS_NOT_BUFFERABLE; + ``` +- `SCB_EnableDCache()` left enabled in `main.c` — DCache stays on globally for + performance; only the USB buffers are excluded via MPU. +- `CDC_Transmit()` always copies caller data into `UserTxBuffer` before calling + `USBD_CDC_TransmitPacket()`, so the USB hardware always reads from the + non-cacheable region regardless of where the caller's buffer lives. + +### Bug 2 (Secondary): IWDG Started Before Long Peripheral Inits + +`mpu6000_init()` + `mpu6000_calibrate()` block for ~510ms (gyro bias +integration). If IWDG had been started with a 50ms timeout before these calls, +the watchdog would have fired during calibration and reset the MCU in a hard +loop — USB would never enumerate cleanly. + +**Fix applied** (`src/main.c`, `src/safety.c`): + +- `safety_init()` (which calls `watchdog_init(2000)`) is deferred to **after** + all peripheral inits, after IMU calibration, after USB enumeration delay: + ```c + /* USB CDC, status, IMU, hoverboard, balance, motors, CRSF, audio, + * buzzer, LEDs, power, servo, ultrasonic, mode manager, battery, + * I2C sensors — ALL init first */ + safety_init(); /* IWDG starts HERE — 2s timeout */ + ``` +- IWDG timeout extended to 2000ms (from 50ms) to accommodate worst-case main + loop delays (BNO055 I2C reads at ~3ms each, audio/buzzer blocking patterns). + +--- + +## Investigation: What Was Ruled Out + +### DMA Channel Conflicts +- USB OTG FS does **not** use DMA (`hpcd.Init.dma_enable = 0`); it uses the + internal FIFO with CPU-driven transfers. No DMA channel conflict possible. +- SPI1 (IMU/MPU6000): DMA2 Stream 0/3 +- USART2 (hoverboard ESC): DMA1 Stream 5/6 +- UART4 (CRSF/ELRS): DMA1 Stream 2/4 +- No overlapping DMA streams between any peripheral. + +### USB Interrupt Priority Starvation +- `OTG_FS_IRQn` configured at NVIC priority 6 (`HAL_NVIC_SetPriority(OTG_FS_IRQn, 6, 0)`). +- No other ISR in the codebase uses a priority ≤6 that could starve USB. +- SysTick runs at default priority 15 (lowest). Not a factor. + +### GPIO Pin Conflicts +- USB OTG FS: PA11 (DM), PA12 (DP) — AF10 +- SPI1 (IMU): PA4 (NSS), PA5 (SCK), PA6 (MISO), PA7 (MOSI) — no overlap +- USART2 (hoverboard): PA2 (TX), PA3 (RX) — no overlap +- LEDs: PC14, PC15 — no overlap +- Buzzer: PB2 — no overlap +- No GPIO conflicts with USB OTG FS pins. + +### Clock Tree +- USB requires a 48 MHz clock. `SystemClock_Config()` routes 48 MHz from PLLSAI + (`RCC_CLK48SOURCE_PLLSAIP`, PLLSAIN=384, PLLSAIP=DIV8 → 384/8=48 MHz). ✓ +- PLLSAI is independent of PLL1 (system clock) and PLLSAI.PLLSAIQ (I2S). + No clock tree contention. + +### TxState Stuck-Busy +- `CDC_Init()` resets `hcdc->TxState = 0` on every host (re)connect. ✓ +- `CDC_Transmit()` includes a busy-count recovery (force-clears TxState after + 100 consecutive BUSY returns). ✓ +- Not a contributing factor once the DCache issue is fixed. + +--- + +## Hardware Reference + +| Signal | Pin | Peripheral | +|--------|-----|------------| +| USB D- | PA11 | OTG_FS AF10 | +| USB D+ | PA12 | OTG_FS AF10 | +| IMU SCK | PA5 | SPI1 | +| IMU MISO | PA6 | SPI1 | +| IMU MOSI | PA7 | SPI1 | +| IMU CS | PA4 | GPIO | +| ESC TX | PA2 | USART2 | +| ESC RX | PA3 | USART2 | +| LED1 | PC14 | GPIO | +| LED2 | PC15 | GPIO | +| Buzzer | PB2 | GPIO/TIM4_CH3 | + +MCU: STM32F722RET6 (MAMBA F722S FC, Betaflight target DIAT-MAMBAF722_2022B) + +--- + +## Files Changed (PR #10) + +- `lib/USB_CDC/src/usbd_cdc_if.c` — 512-byte aligned non-cacheable buffer struct, `CDC_Transmit` copy-to-fixed-buffer +- `lib/USB_CDC/src/usbd_conf.c` — `USB_NC_MPU_Config()` MPU region before `HAL_PCD_Init()` +- `src/main.c` — `safety_init()` deferred after all peripheral init; DCache stays enabled with comment +- `src/safety.c` / `src/watchdog.c` — IWDG timeout 2000ms; `watchdog_was_reset_by_watchdog()` for reset detection logging + +--- + +## Lessons Learned + +1. **Cortex-M7 + DMA + DCache = always configure MPU non-cacheable regions for DMA buffers.** The cache is not write-through to SRAM; the DMA engine sees physical SRAM, not the cache. The MPU is the correct fix (not `SCB_CleanDCache_by_Addr` before every TX, which is fragile). + +2. **IWDG must start after all slow blocking inits.** IMU calibration can take 500ms+. The IWDG cannot be paused once started. Defer `safety_init()` until the main loop is ready to kick the watchdog every cycle. + +3. **USB enumeration success does not prove data flow.** The host handshake and port appearance can succeed even when TX buffers are incoherent. Test with actual data transfer, not just enumeration.