Summary
bootc install to-disk --block-setup tpm2-luks hangs indefinitely at cryptsetup luksOpen. The root cause is a SysV semaphore deadlock between libdevmapper and udevd across the container's isolated IPC namespace.
This affects all tpm2-luks installs regardless of TPM2 hardware, PCR configuration, or token state. The hang occurs on any LUKS device-mapper activation inside bootc's container environment when the container uses a separate IPC namespace (the podman default).
Users who happened to pass --ipc=host to podman would not have hit this bug, which may explain why some users reported TPM2-related issues (#421, #476, #477, #561) without mentioning the install hang -- they may have bypassed the semaphore deadlock without realizing it.
Root cause
libdevmapper uses SysV semaphores ("udev cookies") to synchronize device-mapper operations with udevd. When bootc runs inside a container (the standard podman run --privileged --pid=host invocation), the container has an isolated IPC namespace by default. udevd runs on the host in the host's IPC namespace.
The sequence:
1. cryptsetup luksFormat creates a LUKS volume. libdevmapper creates a SysV semaphore in the container's IPC namespace.
2. cryptsetup luksOpen tries to activate a dm-crypt mapping. libdevmapper attempts to acquire the udev cookie semaphore and waits for udevd to signal completion.
3. udevd (on the host) cannot see the semaphore because it is in a different IPC namespace. The semaphore is never released.
4. luksOpen blocks forever on semop().
Kernel stack trace of the hanging process:
[<0>] __do_semtimedop+0x3a8/0xd50
[<0>] do_semtimedop+0x15e/0x1a0
[<0>] do_syscall_64+0x7e/0x6b0
[<0>] entry_SYSCALL_64_after_hwframe+0x76/0x7e
SysV semaphore visible inside the container's IPC namespace:
$ nsenter -t <cryptsetup_pid> --ipc ipcs -s
key semid owner perms nsems
0x0d4d3b80 0 root 600 1
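The namespace mismatch can also be checked programmatically. The following is a minimal sketch, not code from bootc: the function names are hypothetical, and it assumes that with --pid=host the container's /proc/1 is the host's init and that udevd shares init's IPC namespace.

```rust
use std::fs;
use std::io;

/// Reads the IPC namespace identifier (e.g. "ipc:[4026531839]") of a process.
/// `pid` may be a numeric PID or "self".
fn ipc_namespace_of(pid: &str) -> io::Result<String> {
    let link = fs::read_link(format!("/proc/{pid}/ns/ipc"))?;
    Ok(link.to_string_lossy().into_owned())
}

/// Two processes can see each other's SysV semaphores only if these
/// identifiers are equal.
fn same_ipc_namespace(ours: &str, theirs: &str) -> bool {
    ours == theirs
}

/// Assumption: PID 1 stands in for the host udevd (requires --pid=host and
/// enough privilege to read /proc/1/ns/ipc).
fn shares_ipc_namespace_with_host_init() -> io::Result<bool> {
    let ours = ipc_namespace_of("self")?;
    let host = ipc_namespace_of("1")?;
    Ok(same_ipc_namespace(&ours, &host))
}
```

If shares_ipc_namespace_with_host_init() returns false, a udev cookie semaphore created by libdevmapper in this process would not be visible to the host udevd, which is the condition for the hang described above.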
Reproduction
# This hangs (default IPC namespace):
podman run --rm --privileged --pid=host \
--security-opt label=type:unconfined_t \
-v /dev:/dev \
-v /var/lib/containers:/var/lib/containers \
quay.io/fedora/fedora-bootc:42 \
bootc install to-disk --wipe --block-setup tpm2-luks \
--filesystem xfs /dev/sdX
Fix
Set DM_DISABLE_UDEV=1 in the environment before cryptsetup device-mapper operations. This tells libdevmapper to skip udev synchronization.
What DM_DISABLE_UDEV=1 does:
- Tells libdevmapper to skip creating/waiting on udev cookie semaphores for dm operations
- Device nodes are still created without udev: the kernel creates /dev/dm-N via devtmpfs, and libdevmapper creates /dev/mapper/root itself when udev synchronization is disabled
- udev rules for dm events do not fire, so /dev/disk/by-uuid/ and /dev/disk/by-id/dm-uuid-* symlinks for the dm-crypt device are not created during install
Why this is safe during installation:
- bootc references the dm device directly as /dev/mapper/root (baseline.rs line 380), never via udev symlinks
- No other code in the install path depends on udev symlinks for the dm-crypt device
- udev_settle() for partition discovery (line 332) is unaffected -- it operates on partition devices, not device-mapper
- There are no concurrent device-mapper consumers during install
- The env var is scoped to the Tpm2Luks code path only, not set globally for all bootc operations
What DM_DISABLE_UDEV=1 does NOT affect:
- Partition device nodes (/dev/sdb1-4) -- created by kernel, managed by udev in the host namespace, unrelated to dm cookies
- udev_settle() after partitioning -- still works normally
- The installed system's boot-time LUKS unlock -- DM_DISABLE_UDEV is only set in the install container's environment, not persisted to the installed system
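One way to keep the variable scoped to the cryptsetup calls is to set it on the child process only, rather than in bootc's own environment. This is a rough sketch of that idea, not the actual patch: the helper name is hypothetical, and the arguments in the usage note are placeholders.

```rust
use std::process::Command;

/// Builds (without running) a `cryptsetup` invocation whose environment has
/// DM_DISABLE_UDEV=1. The variable is attached to this child only; the
/// parent process environment is left untouched.
fn cryptsetup_without_udev_sync(args: &[&str]) -> Command {
    let mut cmd = Command::new("cryptsetup");
    cmd.args(args);
    // Tells libdevmapper in the child to skip udev cookie semaphores.
    cmd.env("DM_DISABLE_UDEV", "1");
    cmd
}
```

A caller would then run something like cryptsetup_without_udev_sync(&["luksOpen", "/dev/sdX3", "root"]).status(), where /dev/sdX3 is a placeholder partition. Because Command::env affects only the spawned process, other subprocesses started by the installer keep normal udev synchronization.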
Two workarounds confirm the diagnosis:
# Workaround 1: DM_DISABLE_UDEV=1 (skips udev sync)
podman run ... -e DM_DISABLE_UDEV=1 ... bootc install to-disk ...
# Result: Installation complete!
# Workaround 2: --ipc=host (shares IPC namespace with host)
podman run ... --ipc=host ... bootc install to-disk ...
# Result: Installation complete!
Note on architecture
Running cryptsetup and device-mapper operations from inside a container is inherently fragile due to namespace isolation issues like this one. As @jmpolom noted in #421, an external workflow that prepares disks (including LUKS) before invoking bootc install to-filesystem would avoid this entire class of problems. See also #542 for the broader proposal of bootc-as-library for installer applications. This fix addresses the immediate hang for users of the existing install to-disk path.
Note on systemd 258 PCR default change
Separately from this hang, systemd-cryptenroll changed its default --tpm2-pcrs from PCR 7 (systemd <=257) to no PCRs (systemd >=258, commit 4b840414). bootc calls systemd-cryptenroll --tpm2-device=auto without specifying --tpm2-pcrs, so the PCR binding behavior depends on which systemd version is in the container image. This is tracked in #476 and #561 but worth noting as a separate concern.
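To illustrate how the version dependence could be removed, the PCR list can be passed explicitly so the binding no longer follows the systemd default. This is only a sketch under assumptions: the helper name is hypothetical, the device path in the usage note is a placeholder, and nothing here reflects what bootc currently does or what #476 will settle on.

```rust
use std::process::Command;

/// Builds (without running) a `systemd-cryptenroll` invocation with an
/// explicit `--tpm2-pcrs=` list. PCR indices are joined with '+', the
/// separator systemd-cryptenroll accepts; an empty slice yields
/// `--tpm2-pcrs=`, which binds to no PCRs.
fn cryptenroll_with_pcrs(device: &str, pcrs: &[u32]) -> Command {
    let list = pcrs
        .iter()
        .map(|p| p.to_string())
        .collect::<Vec<_>>()
        .join("+");
    let mut cmd = Command::new("systemd-cryptenroll");
    cmd.arg("--tpm2-device=auto");
    cmd.arg(format!("--tpm2-pcrs={list}"));
    cmd.arg(device);
    cmd
}
```

For example, cryptenroll_with_pcrs("/dev/sdX3", &[7]) reproduces the systemd <=257 default regardless of which systemd version ships in the container image.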
Test environment
- Fedora 42 (Cloud Edition), kernel 6.19.7-100.fc42.x86_64
- systemd 257.11-1.fc42
- cryptsetup 2.8.4
- podman 5.8.0
- GCP n2-standard-8 with 20GB PD-SSD target disk
- Tested with stock bootc from quay.io/fedora/fedora-bootc:42
Related issues
- install to-disk with LUKS + TPM broken #421 -- original LUKS+TPM2 bug report (the hang described there is this same semaphore deadlock, not a TPM2 issue)
- Add config option to configure systemd-cryptenroll PCRs #476 -- PCR configuration (separate boot-time concern, still valid)
- LUKS volumes need configurable password and/or recovery keys #477 -- recovery keys / passwords (separate boot-time concern, still valid)
- check shim version before installing with LUKS root #561 -- shim version check (separate boot-time concern, still valid)