
Add bindings/c-ffi + hardening (panic catching, APIError detail, mutex recovery)#40

Merged
Jainakin merged 6 commits into UTEXO-Protocol:feat/external-signer from Jainakin:chore/c-ffi-hardening
May 21, 2026

@Jainakin Jainakin commented May 20, 2026

Adds a bindings/c-ffi/ sub-crate that exposes the SdkNode API over a C ABI, plus the robustness work needed to make it safe to embed in long-lived host processes (RN Bare worklet, Node napi binding, mobile apps).

This PR consolidates what was previously two stacked PRs: #25 (the original c-ffi crate introduction) and the hardening work. Supersedes and closes #25.

What's in the c-ffi crate

bindings/c-ffi/ is a thin extern-"C" shim on top of the existing UniFFI-exposed API in src/uniffi_api/: same async bridge (block_on_sdk in src/uniffi_api/state.rs), same SdkNode struct, no changes to the daemon's core surface. It lets consumers that bypass UniFFI's generated bindings (C/C++, the napi-rs binding, the Bare-runtime addon) link the static lib (librlncffi.a) and call into RLN without depending on uniffi's runtime bindgen layer.

Consumers in this stack:

  • @utexo/rgb-lightning-node-bare (RN Bare worklet)
  • @utexo/rgb-lightning-node-nodejs (napi-rs over the same crate)

Both build against the same bindings/c-ffi/ static lib so the host-platform delta is one addon shim, not a duplicated FFI surface.
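For readers new to the pattern, here is a minimal sketch of what a shim of this kind looks like: an opaque handle, a JSON string handed back to the caller, and matching free functions. Every name below (`rln_example_*`, the `port` field) is hypothetical and not taken from the crate. Real exports would also need a no-mangle attribute, which is omitted here.

```rust
use std::ffi::CString;
use std::os::raw::c_char;

/// Stand-in for the real `SdkNode`; only the handle pattern matters here.
pub struct SdkNode {
    port: u16,
}

/// Allocates a node and hands ownership to the caller as an opaque pointer.
pub extern "C" fn rln_example_node_new(port: u16) -> *mut SdkNode {
    Box::into_raw(Box::new(SdkNode { port }))
}

/// Returns a heap-allocated JSON string, or null if `node` is null.
///
/// # Safety
/// `node` must be null or a live pointer from `rln_example_node_new`.
pub unsafe extern "C" fn rln_example_node_info_json(node: *const SdkNode) -> *mut c_char {
    let node = match unsafe { node.as_ref() } {
        Some(n) => n,
        None => return std::ptr::null_mut(),
    };
    // Hand-rolled JSON for the sketch; a real shim would use a serializer.
    let json = format!("{{\"port\":{}}}", node.port);
    CString::new(json)
        .map(CString::into_raw)
        .unwrap_or(std::ptr::null_mut())
}

/// Frees a string returned by `rln_example_node_info_json`.
///
/// # Safety
/// `s` must be null or a pointer previously returned by this shim.
pub unsafe extern "C" fn rln_example_free_string(s: *mut c_char) {
    if !s.is_null() {
        drop(unsafe { CString::from_raw(s) });
    }
}

/// Frees a node handle.
///
/// # Safety
/// `node` must be null or a pointer from `rln_example_node_new`, freed once.
pub unsafe extern "C" fn rln_example_node_free(node: *mut SdkNode) {
    if !node.is_null() {
        drop(unsafe { Box::from_raw(node) });
    }
}
```

The caller owns everything it receives and must return it through the matching free function, which is why the string and the handle each get their own.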

Hardening on top

Three robustness improvements that surfaced as real failure modes while integrating with the bindings above, plus one crate-type change (4):

(1) Panic catching across the extern "C" boundary

Every rln_* entry point is now wrapped in catch_panic (bindings/c-ffi/src/utils.rs + lib.rs). Without this, a .unwrap() panic from any depth (rgb-lib, LDK, BDK) unwinds across the extern "C" frame and the Rust runtime aborts via panic_cannot_unwind, taking the host process with it.

catch_panic is annotated #[inline(never)] and takes the closure as &mut dyn FnMut() -> _. In earlier iterations the compiler inlined catch_panic + the closure into every entry point and eliminated the __rust_try landing pad as nounwind — the wrapper looked like it was protecting the boundary but real panics still aborted. The two safeguards (#[inline(never)] plus vtable indirection) keep the catch_unwind machinery intact regardless of LLVM optimisation level.

Surfaces as Error::Panic(label, payload) containing the FFI entry label and the formatted panic payload.
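A minimal sketch of the wrapper shape described above, assuming a simplified `Error` type and a hypothetical entry point `rln_example_divide` (neither is copied from the crate). It shows the two safeguards together: `#[inline(never)]` on the wrapper and the closure passed as `&mut dyn FnMut`.

```rust
use std::panic::{catch_unwind, AssertUnwindSafe};

/// Simplified stand-in for the crate's error type.
#[derive(Debug, PartialEq)]
pub enum Error {
    /// (FFI entry label, formatted panic payload)
    Panic(String, String),
}

/// Runs `f` and turns a panic into `Error::Panic` instead of letting it
/// unwind into the `extern "C"` caller.
#[inline(never)]
pub fn catch_panic<T>(label: &str, f: &mut dyn FnMut() -> T) -> Result<T, Error> {
    match catch_unwind(AssertUnwindSafe(|| f())) {
        Ok(value) => Ok(value),
        Err(payload) => {
            // Panic payloads are usually `&str` or `String`.
            let msg = if let Some(s) = payload.downcast_ref::<&str>() {
                (*s).to_string()
            } else if let Some(s) = payload.downcast_ref::<String>() {
                s.clone()
            } else {
                "non-string panic payload".to_string()
            };
            Err(Error::Panic(label.to_string(), msg))
        }
    }
}

/// Hypothetical C-compatible result: `ok` is false if the body panicked.
#[repr(C)]
#[derive(Debug, PartialEq)]
pub struct ExampleResult {
    pub ok: bool,
    pub value: i32,
}

/// Hypothetical entry point; integer division panics when `b` is zero.
pub extern "C" fn rln_example_divide(a: i32, b: i32) -> ExampleResult {
    let mut body = || a / b;
    match catch_panic::<i32>("rln_example_divide", &mut body) {
        Ok(value) => ExampleResult { ok: true, value },
        Err(_) => ExampleResult { ok: false, value: 0 },
    }
}
```

With this shape, `rln_example_divide(1, 0)` returns a result with `ok == false` rather than aborting the host process. The default panic hook still prints the panic message to stderr.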

(2) APIError detail propagation

src/uniffi_api/state.rs: per-thread LAST_API_ERROR_DETAIL slot. When map_api_error collapses a rich APIError into the coarse RlnError enum that crosses the FFI boundary, the original detail string is stashed in the thread-local. The c-ffi's Result -> CResult conversion drains the slot and appends the detail so consumers see e.g.

Rln(Conflict): Unsupported in external signer mode: asset issuance is not supported in external signer mode

instead of just Rln(Conflict).

Thread-local rather than process-global: map_api_error runs synchronously on the caller thread, and the C-FFI conversion drains the slot on the same thread. A Mutex<Option<String>> would let concurrent calls clobber each other's detail between stash and drain.

The new take_last_api_error_detail symbol is re-exported through src/uniffi_api/mod.rs so c-ffi can consume it.
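The stash-and-drain flow can be sketched as follows. The slot and function names follow the description above, but the `APIError` and `RlnError` variants and the `render_error` helper are assumptions made for illustration, not the crate's actual definitions.

```rust
use std::cell::RefCell;

thread_local! {
    /// One slot per thread, so concurrent calls cannot clobber each other.
    static LAST_API_ERROR_DETAIL: RefCell<Option<String>> = RefCell::new(None);
}

/// Hypothetical rich error; the real `APIError` has many more variants.
pub enum APIError {
    UnsupportedInExternalSignerMode(String),
    Other(String),
}

/// Hypothetical coarse error that crosses the FFI boundary.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum RlnError {
    Conflict,
    Internal,
}

/// Collapses the rich error and stashes its detail in the thread-local.
pub fn map_api_error(err: APIError) -> RlnError {
    let (coarse, detail) = match err {
        APIError::UnsupportedInExternalSignerMode(d) => (
            RlnError::Conflict,
            format!("Unsupported in external signer mode: {}", d),
        ),
        APIError::Other(d) => (RlnError::Internal, d),
    };
    LAST_API_ERROR_DETAIL.with(|slot| *slot.borrow_mut() = Some(detail));
    coarse
}

/// Drains the slot: returns the detail once, then `None` until the next stash.
pub fn take_last_api_error_detail() -> Option<String> {
    LAST_API_ERROR_DETAIL.with(|slot| slot.borrow_mut().take())
}

/// Hypothetical rendering step, as the `Result -> CResult` conversion might do.
pub fn render_error(err: RlnError) -> String {
    match take_last_api_error_detail() {
        Some(detail) => format!("Rln({:?}): {}", err, detail),
        None => format!("Rln({:?})", err),
    }
}
```

Because the drain uses `take()`, a stale detail cannot leak into a later, unrelated error on the same thread, and a detail stashed on one thread is never visible from another.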

(3) RGB wallet mutex poisoning recovery

src/rgb.rs::get_rgb_wallet() previously called self.wallet.lock().unwrap(). If any thread panicked while holding the lock, the mutex was left poisoned and every subsequent wallet operation panicked in turn, aborting the entire process: a single panic anywhere cascaded into a permanent outage. It now recovers the inner guard, logs a tracing::error! so the originating panic stays investigable, and continues.

Same hardening on the WalletSource impl for RgbLibWalletWrapper: list_confirmed_utxos, get_change_script, and sign_psbt had bare .unwrap() calls whose panics would poison the wallet mutex on any internal failure. All three now return an early Err(()) with a tracing::error!, so failures stay observable but isolated.
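A minimal sketch of both halves of this change, using only the standard library. `Wallet`, `Wrapper`, and `checked_spend` are hypothetical stand-ins, and `eprintln!` takes the place of `tracing::error!` so the example has no external dependencies.

```rust
use std::sync::{Arc, Mutex, MutexGuard};

/// Stand-in for the real wallet type.
pub struct Wallet {
    pub balance: u64,
}

/// Stand-in for the wrapper that owns the shared wallet.
#[derive(Clone)]
pub struct Wrapper {
    wallet: Arc<Mutex<Wallet>>,
}

impl Wrapper {
    pub fn new(balance: u64) -> Self {
        Wrapper {
            wallet: Arc::new(Mutex::new(Wallet { balance })),
        }
    }

    /// Returns the guard even if an earlier panic poisoned the mutex.
    pub fn get_wallet(&self) -> MutexGuard<'_, Wallet> {
        match self.wallet.lock() {
            Ok(guard) => guard,
            Err(poisoned) => {
                // Log and continue; the PR uses `tracing::error!` here.
                eprintln!("wallet mutex poisoned by an earlier panic; recovering");
                poisoned.into_inner()
            }
        }
    }

    pub fn is_poisoned(&self) -> bool {
        self.wallet.is_poisoned()
    }

    /// Returns an early `Err(())` on failure instead of unwrapping,
    /// so the lock is never held across a panic.
    pub fn checked_spend(&self, amount: u64) -> Result<u64, ()> {
        let mut wallet = self.get_wallet();
        match wallet.balance.checked_sub(amount) {
            Some(rest) => {
                wallet.balance = rest;
                Ok(rest)
            }
            None => {
                eprintln!("insufficient balance for spend of {}", amount);
                Err(())
            }
        }
    }
}
```

Recovering the guard assumes the protected data is still usable after the panic. That is a judgment call per data structure, which is why the recovery path logs rather than staying silent.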

(4) Cargo.toml: add rlib crate-type

bindings/c-ffi/Cargo.toml: crate-type = ["staticlib", "rlib"]. staticlib is what mobile + napi consumers link against. rlib lets Rust-side consumers (the napi-rs crate in rgb-lightning-node-nodejs) call our wrappers without duplicating the extern-"C" surface declaration on their side.

Test coverage

Built on top of feat/external-signer. Validated by running the autonomous E2E suite (54 cases) from three places — iOS sim, Android emulator, and pure Node — against the same regtest stack (bitcoind + electrs + rgb-proxy + a peer RLN daemon). All three return identical reports: 45 pass / 2 fail / 5 skip / 2 expected-fail. The 2 fails are a cosmetic LDK noise-handshake race and an rgb-lib UTXO selector edge case on fresh wallets — neither is a regression from this PR.

@Jainakin Jainakin changed the title from "c-ffi: catch panics, propagate APIError details, harden RGB wallet locks" to "Add bindings/c-ffi + hardening (panic catching, APIError detail, mutex recovery)" May 20, 2026
Jainakin added 5 commits May 20, 2026 15:36
Thin extern-"C" shim on top of the existing UniFFI-exposed SdkNode API
(reuses block_on_sdk; no changes under src/). Mirrors the rgb-lib c-ffi
pattern: opaque COpaqueStruct handle, CResult / CResultString tagged
unions, and JSON strings for all complex inputs/outputs. Serde-friendly
mirror types in json_types.rs convert to/from the UniFFI typed wrappers
so no Serialize/Deserialize derives are needed upstream in
src/uniffi_api/types.rs.

Exposes 61 extern "C" functions covering the full SdkNode surface plus
lifecycle, namespace-level helpers, and free primitives. Build emits
librlncffi.{a,dylib} and a committed cbindgen-generated rln.h / rln.hpp
so consumers (Bare native addons, N-API wrappers, ctypes, etc.) can
embed without running build.rs. Includes a smoke-test example.c.

The sub-crate is its own Cargo workspace and mirrors the parent's
[patch.crates-io] table to keep registry lightning crates unified onto
the in-tree rust-lightning submodule. Cargo.lock is committed and pins
rgb-lib to the same revision as the parent's Cargo.lock so a fresh
checkout builds with --locked.
The c-ffi sub-crate previously declared crate-type = ["staticlib",
"cdylib"]. When cargo builds the rgb-lightning-node lib dep on iOS,
the cdylib link step pulls in aws-lc-sys (transitively via the LDK /
electrum stack) and fails with:

    Undefined symbols for architecture arm64:
      "___chkstk_darwin"

That symbol lives in the iOS runtime and isn't satisfied at static
link time. We don't need the cdylib output for the bare native addon
consumer (which links the staticlib only), so drop cdylib from the
crate-type list. Static lib builds clean for darwin-arm64 and all
three iOS triples.
…ootstrap-only paths)

Adds C wrappers for the external-signer methods Roman's PR UTEXO-Protocol#27 introduced
on `SdkNode` and the `NativeExternalSigner` UniFFI object, so the bare
addon (and any other C consumer) can drive RLN without RLN owning the
seed.

Native signer (recommended path for WDK):
  - rln_native_external_signer_new(seed_hex, network, permissive_policy)
  - rln_native_external_signer_bootstrap(signer)
  - rln_sdk_node_init_with_native_external_signer(node, signer)
  - rln_sdk_node_attach_native_external_signer(node, signer)
  - rln_sdk_node_unlock_with_native_external_signer(node, signer, req)
  - free_native_external_signer(signer)

Bootstrap-only (host-implemented signer placeholder — callback transport
not yet exposed through this C FFI; callers can still pre-init with a
bootstrap dict and unlock against an attached host wired in elsewhere):
  - rln_sdk_node_init_with_external_signer(node, bootstrap_json)
  - rln_sdk_node_detach_external_signer(node)
  - rln_sdk_node_unlock_with_attached_external_signer(node, req)

Mechanics:
  - Adds `vls` to the c-ffi's dep features so `NativeExternalSigner` and
    its convenience methods (`init_with_native_external_signer`,
    `attach_native_external_signer`,
    `unlock_with_native_external_signer`) are in scope.
  - Implements `CReturnType` for `Arc<NativeExternalSigner>` + a
    `require_signer` helper, mirroring the existing type-tagged
    `SdkNode` handle pattern so signer handles can't be accidentally
    passed where node handles are expected (and vice versa).
  - Adds two JSON mirror types in `json_types.rs`:
    `JsonSdkExternalSignerBootstrap` (bidirectional —
    serialize on output from native signer, deserialize on input for
    host-implemented signer) and `JsonSdkExternalUnlockRequest` (same
    as the normal unlock request minus `password`, which doesn't apply
    in external-signer mode).
  - Updates `JsonSdkInitRequest` for the new `lsp_base_url` /
    `lsp_bearer_token` fields landed on `dev`.

Reproducibility:
  - Bumps c-ffi/Cargo.lock with `rgb-lib` pinned to the same commit the
    parent rgb-lightning-node crate uses (5aabef63...). Without this,
    the c-ffi workspace's git resolution drifts past the
    rust-lightning submodule's compatible API surface
    (Wallet::go_online signature change in newer rgb-lib).
Refine bindings/c-ffi `[patch.crates-io]` so registry crates resolve to
the parent's submodule revisions; update signer-external to the current
`feat/rgb-compatibility` tip (b89d44c7) which adds the
`FindDerivationMatches` / `DerivedAddressMatch` surface the parent now
references.
Three independent robustness improvements driven by integrating the c-ffi
crate with the napi-rs Node binding (`@utexo/rgb-lightning-node-nodejs`)
and the React-Native Bare addon (`@utexo/rgb-lightning-node-bare`).

(1) Panic catching across the extern "C" boundary
--------------------------------------------------
`bindings/c-ffi/src/utils.rs` + `lib.rs`: every `rln_*` FFI entry point
is now wrapped in `catch_panic`. Without this, a `.unwrap()` panic from
any depth (rgb-lib, LDK, BDK) unwinds across the `extern "C"` frame and
the Rust runtime aborts via `panic_cannot_unwind`, taking the entire
host process with it — a non-recoverable mode that was masking real bugs.

`catch_panic` is annotated `#[inline(never)]` and takes the closure as
`&mut dyn FnMut() -> _` for a deliberate reason: in earlier iterations
the compiler inlined `catch_panic` AND the closure into every `rln_*`
entry point and then eliminated the `__rust_try` landing pad as
nounwind. Net effect: the wrapper appeared to be protecting the
boundary but real panics still aborted. The two safeguards
(`#[inline(never)]` plus vtable indirection) keep the `catch_unwind`
machinery intact regardless of LLVM optimisation level.

Surfaces as `Error::Panic(label, payload)` containing the FFI entry
label and the formatted panic payload.

(2) APIError detail propagation
-------------------------------
`src/uniffi_api/state.rs`: introduce a per-thread `LAST_API_ERROR_DETAIL`
slot. When `map_api_error` collapses a rich `APIError` into the coarse
`RlnError` enum that crosses the FFI boundary, the original detail
string is stashed in the thread-local. The c-ffi's
`Result -> CResult` conversion drains the slot and appends the detail
so consumers see e.g.
  `Rln(Conflict): Unsupported in external signer mode: ...`
instead of just `Rln(Conflict)`.

Thread-local rather than process-global: `map_api_error` runs
synchronously on the caller thread, and the C-FFI conversion drains
the slot on the same thread. A `Mutex<Option<String>>` would let
concurrent calls clobber each other's detail between stash and drain.

The new `take_last_api_error_detail` symbol is re-exported through
`src/uniffi_api/mod.rs` so c-ffi can consume it.

(3) RGB wallet mutex poisoning recovery
---------------------------------------
`src/rgb.rs::get_rgb_wallet()`: previously called
`self.wallet.lock().unwrap()`. If any thread panicked while holding the
lock, the mutex was left poisoned and every subsequent operation on
the wallet aborted the entire process. Now recover the inner guard,
log a `tracing::error!` so the originating panic stays investigable
in trace output, and continue.

Same hardening applied to the `WalletSource` impl on
`RgbLibWalletWrapper`: `list_confirmed_utxos`, `get_change_script`,
`sign_psbt` previously had bare `.unwrap()` calls that would poison
the wallet mutex on any internal failure. All paths now return an
early `Err(())` with a `tracing::error!` so failures stay observable
but isolated.

(4) Cargo.toml: add `rlib` crate-type
-------------------------------------
`bindings/c-ffi/Cargo.toml`: `crate-type = ["staticlib", "rlib"]`.
`staticlib` is what mobile + napi consumers link against. `rlib` lets
Rust-side consumers (the napi-rs crate in `rgb-lightning-node-nodejs`)
call our wrappers without duplicating the extern-"C" surface
declaration on their side.
@Jainakin Jainakin force-pushed the chore/c-ffi-hardening branch from d747d7b to 478bc3a May 20, 2026 10:07
@Jainakin Jainakin merged commit 53655f7 into UTEXO-Protocol:feat/external-signer May 21, 2026