13 changes: 13 additions & 0 deletions .wolfssl_known_macro_extras
@@ -1,4 +1,6 @@
AES_CR_CCFC
AES_GCM_GMULT_NCT
AES_ICR_CCF
AFX_RESOURCE_DLL
AFX_TARG_ENU
ALLOW_BINARY_MISMATCH_INTROSPECTION
@@ -265,7 +267,11 @@ HARDWARE_CACHE_COHERENCY
HASH_AlgoMode_HASH
HASH_AlgoMode_HMAC
HASH_BYTE_SWAP
HASH_CR_ALGO_1
HASH_CR_DATATYPE_0
HASH_CR_DATATYPE_1
HASH_CR_LKEY
HASH_CR_MODE
HASH_DIGEST
HASH_DataType_8b
HASH_IMR_DCIE
@@ -495,6 +501,12 @@ PTHREAD_STACK_MIN
QAT_ENABLE_HASH
QAT_ENABLE_RNG
QAT_USE_POLLING_CHECK
RCC_AHB1ENR_PKAEN
RCC_AHB2ENR1_AESEN
RCC_AHB2ENR1_HASHEN
RCC_AHB2ENR1_PKAEN
RCC_AHB2ENR_HASHEN
RCC_AHB2ENR_PKAEN
RC_NO_RNG
REDIRECTION_IN3_KEYELMID
REDIRECTION_IN3_KEYID
@@ -917,6 +929,7 @@ WOLFSSL_SP_INT_SQR_VOLATILE
WOLFSSL_STACK_CHECK
WOLFSSL_STM32F427_RNG
WOLFSSL_STM32U5_DHUK
WOLFSSL_STM32_BARE
WOLFSSL_STRONGEST_HASH_SIG
WOLFSSL_STSAFE_TAKES_SLOT
WOLFSSL_TELIT_M2MB
201 changes: 201 additions & 0 deletions STM32_BARE_BOARD_STATUS.md
@@ -0,0 +1,201 @@
# STM32 Bare-Metal (`WOLFSSL_STM32_BARE`) Board Status

Generated 2026-05-04. Tracks the boards exercised by the `STM32_Bare_Test`
multi-board example in `wolfssl-examples-stm32` against the corresponding
direct-register support in `wolfssl/wolfcrypt/src/port/st/stm32.c`.

Columns:

- **HASH HW** — chip has a HASH peripheral (MD5/SHA-1/SHA-2/...). "yes" =
the BARE driver routes `wc_Sha*` to the HASH IP. "-" = no HASH silicon;
SHA falls back to software in all configs.
- **AES HW** — chip has an AES/CRYP peripheral. "CRYP" = FIFO-based AES
on F4/F7/H7/MP13 (`wc_Stm32_Aes_*` -> CRYP HW). "TinyAES" = single-reg
AES on L4/L5/U3/U5/H5/G4/WB/WL/G0/WBA. "-" = no AES; software path.
- **PKA HW** — chip has a public-key accelerator. "yes" + Tested = the
bare-metal PKA driver (added 2026-05-04) is wired up and validated end
to end. "yes" + Untested = silicon present but not flashed and validated
in this session. "yes" + Compile = the driver builds but has not been
validated at runtime. "-" = no PKA silicon.
- **Status** — Validated = `make BOARD=<x> CONFIG=bare TARGET=test` runs
the full `wolfcrypt_test` and exits with `Result: 0 (PASS)` on real
hardware in this session.

## Validated boards

| BOARD | Chip | Cortex / Clock | HASH HW | AES HW | PKA HW | Status |
|--------|---------------|--------------------|---------|----------|----------------|------------|
| `h7` | STM32H753ZI | M7F / 480 MHz PLL | yes | CRYP | - | Validated |
| `f439` | STM32F439ZI | M4F / 144 MHz PLL | yes | CRYP | - | Validated |
| `wb55` | STM32WB55RG | M4F / 64 MHz PLL | - | TinyAES | yes (Tested V1)| Validated |
| `u3` | STM32U385RG | M33 / 96 MHz | yes | TinyAES | yes (Tested V2)| Validated |
| `u5` | STM32U575ZI | M33 / 160 MHz | yes | - | -\*\* | Validated |
| `h5` | STM32H563ZI | M33 / 250 MHz | yes | - | yes (Compile) | Build OK\* |
| `g491` | STM32G491RE | M4F / 170 MHz PLL | - | - | - | Validated |

\* H5 PKA driver is enabled for `BUILD_BARE` and **builds cleanly**.
**Runtime validation is blocked by a flash ECC fault.** See the
"H5 reproduction steps" section below for the full repro recipe.

\*\* U5 is the STM32U575 NUCLEO -- that silicon does **not** have PKA
(only U585+ does). The HASH + RNG bare-metal paths are validated.
For PKA validation on U5 we'd need a NUCLEO-U585AI-Q.

## Benchmark results

These results are from `make BOARD=<x> CONFIG=bare TARGET=bench`, using
the wolfcrypt `benchmark.c` default block size of 1024 bytes. Bold marks
the best result in each column.

| Board | Clock | AES-128-CBC enc (BARE) | SHA-256 (BARE) | ECDHE secp256r1 (BARE) |
|--------|---------|------------------------|----------------|------------------------|
| h7 | 480 MHz | **19.165 MiB/s** | **25.928 MiB/s**| (no PKA HW; SP-SW) |
| f439 | 144 MHz | 11.401 MiB/s | 25.757 MiB/s | (no PKA HW; SP-SW) |
| g491 | 170 MHz | 1.017 MiB/s (sw) | 3.037 MiB/s | 11.8 ops/s (sw) |
| wb55 | 64 MHz | 7.237 MiB/s | 1.243 MiB/s sw | 4.83 ops/s (PKA HW)\** |
| u3 | 96 MHz | (TinyAES BARE -- prior)| HASH HW (prior)| 1.115 ops/s (PKA HW)\**|

\** WB55 and U3 PKA HW perform similarly to (or slightly slower than)
the SP-ECC software path at P-256 at these clock speeds. Both ST docs
and direct measurement (U3 SW = 1.106 vs PKA = 1.115 ops/s) indicate
that the PKA HW gives no meaningful P-256 speedup on these specific
chips; here it is validated for correctness, not performance. Larger
curves (P-384/521), where SP-ECC scales worse, and faster-clocked PKA
parts (H5 at 250 MHz, eventual U585) should let PKA pull meaningfully
ahead. The driver covers the V1 (WB) and V2 (U3 / H5 / U5 / WBA / G4A1)
register layouts; the V2 path is exercised end-to-end on U3.
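
To put "similarly" into a number, the relative gain of hardware over
software can be computed from the two ops/s figures. The helper below is
a minimal sketch; `hw_gain_percent` is a hypothetical name used only for
illustration and is not part of wolfSSL or the benchmark.

```c
/* Relative gain of a hardware path over a software path, in percent.
 * Hypothetical helper for reading the table above, not wolfSSL API.
 * Returns 0.0 when the software figure is not positive. */
static double hw_gain_percent(double hw_ops, double sw_ops)
{
    if (sw_ops <= 0.0) {
        return 0.0;
    }
    return (hw_ops / sw_ops - 1.0) * 100.0;
}
```

With the U3 figures quoted above, `hw_gain_percent(1.115, 1.106)` comes
out at roughly 0.8 percent, which is too small a gap to count as a
speedup.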

## TODO -- not yet wired up

| BOARD candidate | Chip | Cortex / Clock max | What lights up | Notes |
|-----------------|---------------|--------------------|------------------------|----------------------------------------------|
| `f437` | STM32F437IIHx | M4F / 168 MHz | CRYP + HASH + RNG | STM32439I-EVAL. Parity check vs F439 |
| `f767` / `f779` | STM32F767ZI | M7F / 216 MHz | CRYP + HASH + RNG | NUCLEO-F767ZI |
| `mp135` | STM32MP135F | A7 / 650 MHz | CRYP + HASH + RNG + PKA| STM32MP135F-DK. Linux/bare-metal split |
| `l4r5` | STM32L4R5ZI | M4F / 120 MHz | TinyAES + HASH + RNG | NUCLEO-L4R5ZI |
| `l552` | STM32L552ZE | M33 / 110 MHz | TinyAES + HASH + RNG + SAES | NUCLEO-L552ZE-Q |
| `h573` / `h533` | STM32H573ZI | M33 / 250 MHz | TinyAES + HASH + RNG + SAES | NUCLEO-H573ZI -- H5 with AES added |
| `u585` | STM32U585AI | M33 / 160 MHz | TinyAES + HASH + RNG + SAES + PKA | NUCLEO-U585AI-Q |
| `wba` | STM32WBA52CG | M33 / 100 MHz | TinyAES + HASH + RNG + PKA | NUCLEO-WBA52CG. Same V2 PKA layout |
| `wl55` | STM32WL55JC | M4F / 48 MHz | TinyAES + RNG | NUCLEO-WL55JC. Sub-GHz radio |
| `g0b1` | STM32G0B1RE | M0+ / 64 MHz | TinyAES + RNG | NUCLEO-G0B1RE |
| `g474` / `g484` | STM32G474RE | M4F / 170 MHz | TinyAES + RNG | NUCLEO-G474RE -- G4 sibling that DOES have AES |
| `g4a1` | STM32G4A1RE | M4F / 170 MHz | TinyAES + RNG + PKA + AES | G491 sibling that has the full crypto block |
| `c5a3` | STM32C5A3ZG | M0+ / ~48 MHz | - | NUCLEO-C5A3ZG -- entry-level; software only |

The bare-metal PKA driver in `wolfcrypt/src/port/st/stm32.c` already
covers the V1 (WB) and V2 (H5/U3/U5/G4/WBA) PKA register layouts. New
boards that have PKA need only board bring-up files (startup, linker,
hw_init, system_*.c) plus `WOLFSSL_STM32_PKA` in `user_settings.h` --
no driver changes.
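
As an illustration, the settings side of a new PKA board might look like
the fragment below. This is a sketch under assumptions: only the two
macro names come from this PR, and whether `WOLFSSL_STM32_BARE` belongs
in `user_settings.h` or is injected by the `CONFIG=bare` build is not
stated here.

```c
/* user_settings.h (fragment, illustrative only) */

/* Select the direct-register STM32 port. Assumption: the CONFIG=bare
 * build may already define this on the compiler command line. */
#ifndef WOLFSSL_STM32_BARE
#define WOLFSSL_STM32_BARE
#endif

/* Route public-key math to the PKA peripheral. */
#ifndef WOLFSSL_STM32_PKA
#define WOLFSSL_STM32_PKA
#endif
```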

## Repository checkpoints (this session)

`wolfssl@stm32_bare`:
- `7a8ee7d` H7 PLL bring-up to 480 MHz
- `06530195b` WB55 AES1 + CCF macro abstraction
- `8e838294b` G4 family clock-enable maps
- `112e7f929` PKA BARE driver (V1+V2 register layouts; WB55 validated)
- `8383907c1` H5 HASH digest read fix (`HRA` not `HR`)

`wolfssl-examples-stm32@stm32_bare`:
- H7 480 MHz hw_init + benches in README
- WB55 PLL64 + bench
- G491 board files + bench + README correction (G491RE has no PKA)
- WB55 PKA enable
- H5 cube path wildcard

## H5 reproduction steps (NUCLEO-H563ZI bare-metal flash ECC fault)

### Symptom

After flashing the wolfcrypt test build to NUCLEO-H563ZI, the board
emits zero bytes on USART3 (PD8 / ST-LINK VCP at 115200 8N1) and
the CPU spins inside the default NMI handler (`Infinite_Loop` /
`b .`). Halting via SWD shows xPSR.IPSR = 2 (NMI active).
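
The IPSR reading can be decoded from a raw xPSR value as sketched below.
The field position (bits [8:0]) and the exception numbers for NMI and
HardFault come from the Arm M-profile architecture; the helper names are
hypothetical and exist only for this example.

```c
#include <stdint.h>

/* Exception numbers defined by the Arm M-profile architecture. */
#define EXC_THREAD_MODE 0u  /* no exception active */
#define EXC_NMI         2u
#define EXC_HARDFAULT   3u

/* IPSR occupies bits [8:0] of xPSR. */
static uint32_t xpsr_to_ipsr(uint32_t xpsr)
{
    return xpsr & 0x1FFu;
}

/* Short label for the active exception (hypothetical helper). */
static const char *ipsr_name(uint32_t ipsr)
{
    switch (ipsr) {
    case EXC_THREAD_MODE: return "Thread mode";
    case EXC_NMI:         return "NMI";
    case EXC_HARDFAULT:   return "HardFault";
    default:
        /* Numbers 16 and above are device interrupts. */
        return (ipsr >= 16u) ? "external IRQ" : "other system exception";
    }
}
```

For a halted core whose xPSR ends in `...002`, `xpsr_to_ipsr` yields 2
and `ipsr_name` reports NMI, matching the symptom described above.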

### Root cause

Flash ECC double-bit detection fires on read of flash address
**0x08002000**. The status latches in `FLASH_ECCDETR`:

```
FLASH_ECCDETR = 0x80000200
                  ^          bit 31      ECCD     = 1 (uncorrectable error)
                      ^^^^   bits[15:0]  ADDR_ECC = 0x0200
```

ADDR_ECC is in 16-byte (128-bit quad-word) units: `0x200 * 16 =
0x2000`, so the failing flash word is at `0x08002000`. The H5
flash interface raises NMI on uncorrectable ECC errors.
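
The decoding above can be written as a small helper. This is a sketch
that uses only the fields named in this section (ECCD in bit 31,
ADDR_ECC in bits [15:0], 16-byte units, flash base 0x08000000). It
assumes the fault is in main flash starting at that base and ignores any
other fields in the register; the macro and function names are
hypothetical.

```c
#include <stdint.h>

#define ECC_FLASH_BASE       0x08000000u  /* start of main flash, as used above */
#define ECCDETR_ECCD         (1u << 31)   /* uncorrectable (double-bit) error flag */
#define ECCDETR_ADDR_ECC     0xFFFFu      /* bits [15:0]: quad-word index */
#define FLASH_QUADWORD_BYTES 16u          /* 128-bit flash word */

/* Returns 1 and stores the failing address in *addr when ECCD is set;
 * returns 0 and leaves *addr untouched otherwise. */
static int eccdetr_fault_addr(uint32_t eccdetr, uint32_t *addr)
{
    if ((eccdetr & ECCDETR_ECCD) == 0u) {
        return 0;
    }
    *addr = ECC_FLASH_BASE
          + (eccdetr & ECCDETR_ADDR_ECC) * FLASH_QUADWORD_BYTES;
    return 1;
}
```

Feeding it the latched value `0x80000200` stores `0x08002000`, the
address reported in this section.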

### Reproducer

```sh
cd ~/GitHub/wolfssl-examples-stm32/STM32_Bare_Test
make BOARD=h5 CONFIG=bare TARGET=test
PROG=/opt/st/stm32cubeide_*/plugins/com.st.stm32cube.ide.mcu.externaltools.cubeprogrammer.linux64_*/tools/bin/STM32_Programmer_CLI
$PROG -c port=SWD reset=HWrst -e all -d build/h5-test-bare/app.bin 0x08000000 -v -rst
# UART log will be empty (0 bytes) at /dev/ttyACM<n>
```

To inspect the latched ECC state via OpenOCD:

```sh
OPENOCD=/home/davidgarske/GitHub/OpenOCD/src/openocd
SCRIPTS=/home/davidgarske/GitHub/OpenOCD/tcl
$OPENOCD -s $SCRIPTS -f interface/stlink-dap.cfg \
-f target/stm32h5x.cfg \
-c "init; halt" \
-c "echo {ECCDETR}; mdw 0x40022104" \
-c "echo {ECCCORR}; mdw 0x40022100" \
-c "shutdown"
# Expected output:
# ECCDETR
# 0x40022104: 80000200
# ECCCORR
# 0x40022100: 00000000
```

### What I tried that did NOT help

- Mass erase + reprogram via STM32_Programmer_CLI (`-e all -d ...`)
- Mass erase + reprogram via OpenOCD `flash erase_sector ; flash write_image`
- Padding the `.bin` to 16-byte (128-bit quad-word) alignment
- Two physical NUCLEO-H563ZI boards (different STLINK serials)
- Option-byte verification: TZEN = 0xC3 (TZ disabled), SRAM2/3 ECC
disabled, HDP1_STRT/END set such that no HDP region is configured
(STRT=1 > END=0, i.e. RM-documented "no protected area"), no WRP
- Clearing `FLASH_ECCDETR` with an OpenOCD write to bit 31 -- the value
is re-latched as soon as the CPU runs again
- Building `CONFIG=c` (pure software, no BARE drivers, no PKA, no
wolfssl HW paths) -- same fault, so it is not a wolfssl regression
- Replacing `printf` with direct USART writes inside `main()` --
same fault, so it is not a newlib stdio init issue per se

### What does work on the same hardware

- A standalone direct-USART "Hello %d" program (built with the
same `--specs=nano.specs --specs=nosys.specs`, ~151 KB) boots
and prints. The wolfssl-linked test (~260 KB) does not.
- The programmed image is intact: `STM32_Programmer_CLI -r32 0x08002000 16`
reads back flash content that matches the `.bin` byte-for-byte.

### Hypothesis

Either the H5 flash interface stages an ECC error at a fixed
quad-word in this code-size range that neither programmer is
clearing, or the chip-erase sequence as currently invoked leaves
that quad-word's ECC bits in an inconsistent state that subsequent
programs do not refresh. The same code path validates end-to-end
on STM32U385 (V2 PKA, identical register sequence) so the wolfssl
PKA driver itself is not the cause.

### Suggested next steps

- Test with a different programming tool (J-Link, STM32CubeIDE GUI)
to rule out CLI / OpenOCD behavior
- Try writing the same image to BANK2 (0x08100000) and switching
SWAP_BANK -- if the fault follows the bank, it is silicon; if it
follows the address, it is the programmer
- Try a smaller wolfssl build that does not cross 0x08002000 to
confirm that the fault depends on the physical flash address rather
than on the content stored there
57 changes: 56 additions & 1 deletion wolfcrypt/src/aes.c
@@ -227,6 +227,10 @@ block cipher mechanism that uses n-bit binary string parameter key with 128-bits
static WARN_UNUSED_RESULT int wc_AesEncrypt(
Aes* aes, const byte* inBlock, byte* outBlock)
{
#ifdef WOLFSSL_STM32_BARE
/* Bare-metal driver handles mutex, clock and key/IV internally. */
return wc_Stm32_Aes_Ecb(aes, outBlock, inBlock, WC_AES_BLOCK_SIZE, 1);
#else
int ret = 0;
#ifdef WOLFSSL_STM32_CUBEMX
CRYP_HandleTypeDef hcryp;
@@ -367,6 +371,7 @@ block cipher mechanism that uses n-bit binary string parameter key with 128-bits
wc_Stm32_Aes_Cleanup();

return ret;
#endif /* !WOLFSSL_STM32_BARE */
}
#endif /* WOLFSSL_AES_DIRECT || HAVE_AESGCM || HAVE_AESCCM */

@@ -375,6 +380,9 @@ block cipher mechanism that uses n-bit binary string parameter key with 128-bits
static WARN_UNUSED_RESULT int wc_AesDecrypt(
Aes* aes, const byte* inBlock, byte* outBlock)
{
#ifdef WOLFSSL_STM32_BARE
return wc_Stm32_Aes_Ecb(aes, outBlock, inBlock, WC_AES_BLOCK_SIZE, 0);
#else
int ret = 0;
#ifdef WOLFSSL_STM32_CUBEMX
CRYP_HandleTypeDef hcryp;
@@ -521,6 +529,7 @@ block cipher mechanism that uses n-bit binary string parameter key with 128-bits
wc_Stm32_Aes_Cleanup();

return ret;
#endif /* !WOLFSSL_STM32_BARE */
}
#endif /* WOLFSSL_AES_DIRECT */
#endif /* HAVE_AES_DECRYPT */
@@ -5575,7 +5584,34 @@ int wc_AesSetIV(Aes* aes, const byte* iv)
#ifdef HAVE_AES_CBC
#if defined(STM32_CRYPTO)

#ifdef WOLFSSL_STM32U5_DHUK
#ifdef WOLFSSL_STM32_BARE
int wc_AesCbcEncrypt(Aes* aes, byte* out, const byte* in, word32 sz)
{
#ifdef WOLFSSL_AES_CBC_LENGTH_CHECKS
if (sz % WC_AES_BLOCK_SIZE) {
return BAD_LENGTH_E;
}
#endif
if (sz == 0) {
return 0;
}
return wc_Stm32_Aes_Cbc(aes, out, in, sz, 1);
}
#ifdef HAVE_AES_DECRYPT
int wc_AesCbcDecrypt(Aes* aes, byte* out, const byte* in, word32 sz)
{
#ifdef WOLFSSL_AES_CBC_LENGTH_CHECKS
if (sz % WC_AES_BLOCK_SIZE) {
return BAD_LENGTH_E;
}
#endif
if (sz == 0) {
return 0;
}
return wc_Stm32_Aes_Cbc(aes, out, in, sz, 0);
}
#endif /* HAVE_AES_DECRYPT */
#elif defined(WOLFSSL_STM32U5_DHUK)
int wc_AesCbcEncrypt(Aes* aes, byte* out, const byte* in, word32 sz)
{
int ret = 0;
@@ -6955,6 +6991,11 @@ int wc_AesCbcEncrypt(Aes* aes, byte* out, const byte* in, word32 sz)

int wc_AesCtrEncryptBlock(Aes* aes, byte* out, const byte* in)
{
#ifdef WOLFSSL_STM32_BARE
/* CTR per-block transform: ECB-encrypt the counter (passed in
* 'in'); aes.c handles counter increment and XOR with plaintext. */
return wc_Stm32_Aes_Ecb(aes, out, in, WC_AES_BLOCK_SIZE, 1);
#else
int ret = 0;
#ifdef WOLFSSL_STM32_CUBEMX
CRYP_HandleTypeDef hcryp;
@@ -7065,6 +7106,7 @@ int wc_AesCbcEncrypt(Aes* aes, byte* out, const byte* in, word32 sz)
wolfSSL_CryptHwMutexUnLock();
wc_Stm32_Aes_Cleanup();
return ret;
#endif /* !WOLFSSL_STM32_BARE */
}


@@ -10141,6 +10183,15 @@ int wc_AesGcmEncrypt(Aes* aes, byte* out, const byte* in, word32 sz,
authIn, authInSz);
#endif

#if defined(WOLFSSL_STM32_BARE) && defined(STM32_CRYPTO)
ret = wc_Stm32_Aes_Gcm(aes, out, in, sz, iv, ivSz,
authTag, authTagSz,
authIn, authInSz, 1 /* enc */);
if (ret != WC_NO_ERR_TRACE(CRYPTOCB_UNAVAILABLE))
return ret;
/* fall through to SW GCM (still uses HW AES via wc_AesEncrypt) */
#endif /* WOLFSSL_STM32_BARE && STM32_CRYPTO */

#ifdef STM32_CRYPTO_AES_GCM
return wc_AesGcmEncrypt_STM32(
aes, out, in, sz, iv, ivSz,
@@ -10870,6 +10921,10 @@ int wc_AesGcmDecrypt(Aes* aes, byte* out, const byte* in, word32 sz,

#endif

/* BARE: GCM decrypt always uses SW path (with HW AES blocks via
* wc_AesEncrypt). Encrypt is HW-accelerated above; decrypt + tag
* verification stays in well-tested SW for now. */

#ifdef STM32_CRYPTO_AES_GCM
/* The STM standard peripheral library API's doesn't support partial blocks */
return wc_AesGcmDecrypt_STM32(