lkl: per-thread irqs_enabled fixes IRQ-leak hangs under real drivers#638
Open
josephnef wants to merge 3 commits into
added 3 commits
May 27, 2026 15:34
arch/lkl/kernel/irq.c keeps `irqs_enabled` as a single `static bool` global, with no save/restore in __switch_to. Any kernel path that does spin_lock_irqsave and schedules before the matching restore leaks the DISABLED value to whatever thread runs next. The concrete consequence is that lkl_trigger_irq, observing the leaked DISABLED, silently pends every IRQ, including the timer tick, and jiffies stops advancing. Real drivers (e.g. rtw88_8812au's chip-init code path) trip this within ~50 USB control transfers and hang the kernel forever; the problem isn't theoretical.

Add a small KUnit suite, gated on a new CONFIG_LKL_IRQ_KUNIT_TEST, that reproduces the kernel-side leg of the bug without needing a real driver: take a spin_lock_irqsave in one kernel thread, yield via schedule_timeout while holding the lock, and from a sibling kernel thread that runs during the yield, observe its own arch_local_save_flags. With the current global, the observer reads ARCH_IRQ_DISABLED, the lock holder's leaked state. With a per-thread irqs_enabled (the subsequent patch in this series), the observer reads its own ARCH_IRQ_ENABLED.

This commit only adds the test. It fails on master; it passes after the per-thread irqs_enabled patch that follows.

Wire-up:
- arch/lkl/kernel/irq_test.c: the KUnit suite (.name = "lkl_irq")
- arch/lkl/kernel/Makefile: build it when CONFIG_LKL_IRQ_KUNIT_TEST=y
- arch/lkl/Kconfig: new boolean, depends on KUNIT
- tools/lkl/Makefile.autoconf: kunit=yes enables it alongside the existing LKL_PCI_KUNIT_TEST
- tools/lkl/tests/boot.c: lkl_test_kunit_irq parses the boot log for "ok N lkl_irq", mirroring lkl_test_kunit_pci

Signed-off-by: Joseph <joseph@josephnef.dev>
irqs_enabled lived in arch/lkl/kernel/irq.c as a single `static bool` global, with no save/restore in __switch_to. Any kernel path that did spin_lock_irqsave and scheduled before the matching restore (the canonical pattern around wait_event_*) leaked the DISABLED value to whichever thread ran next. lkl_trigger_irq saw the leaked DISABLED and silently pended every IRQ, including the timer tick. jiffies stopped advancing; every msleep-based kthread hung.

The KUnit test for irqs_enabled added in the previous commit reproduces the failure on this commit's parent: an observer kthread, scheduled in while a sibling holds spin_lock_irqsave, reads the spilled DISABLED state.

This commit moves irqs_enabled into struct thread_info and accesses it via current_thread_info(). The fix doesn't add any explicit per-thread save/restore code: the existing

    _current_thread_info = task_thread_info(next);

in __switch_to (arch/lkl/kernel/threads.c) is the entire mechanism. Each thread's irqs_enabled travels with its thread_info, the same way a real CPU's register file follows the thread.

- arch/lkl/include/asm/thread_info.h: add `unsigned long irqs_enabled` field; INIT_THREAD_INFO sets it to 1 (ARCH_IRQ_ENABLED) so the init task starts with IRQs enabled.
- arch/lkl/kernel/irq.c: drop the `static bool irqs_enabled` global. arch_local_save_flags, arch_local_irq_restore, and the lkl_trigger_irq pending-check all go through current_thread_info().
- arch/lkl/kernel/threads.c: init_ti sets ti->irqs_enabled = ARCH_IRQ_ENABLED for freshly-allocated kernel threads.

With this commit applied on top of the previous one, the LKL_IRQ_KUNIT_TEST suite passes (ok 1 lkl_irq).

Tested against:
- The lkl-wifi-poc project (https://github.com/josephnef/lkl-wifi-poc), where the rtw88_8812au USB Wi-Fi driver running under LKL hung within ~50 control transfers on the global-flag kernel; under this patch it completes ~8000 transfers and brings wlan0 up cleanly.
- The new KUnit suite (CONFIG_LKL_IRQ_KUNIT_TEST=y), which is what the lkl_irq test in boot.c covers.

Signed-off-by: Joseph <joseph@josephnef.dev>
lkl_trigger_irq is documented as callable "from arbitrary host threads" (the comment block above the function). True host pthreads (libusb completion threads, glibc SIGEV_THREAD timer callbacks, anything created by a backend library that ends up posting an IRQ into the LKL kernel) acquire the LKL CPU via lkl_cpu_get but never go through __switch_to. _current_thread_info therefore still points at whichever kernel task last ran (often the idle task), and that task's irqs_enabled field may be ARCH_IRQ_DISABLED.

Honoring that stale flag for host-thread callers was a silent IRQ-pending hang: the IRQ got marked pending, but nothing on the kernel side would notice until the matching irqrestore in the original kernel context, which often never came because the kernel had already moved on. Drivers that post IRQs from libusb's event thread (rtw88's USB completion path is the original report; any virtio backend with a thread-based notification scheme has the same exposure) hung the kernel within tens of operations.

Detect host-thread callers by comparing thread_self() to the thread_info owner's tid via lkl_ops->thread_equal. When they differ, the caller is not the kernel thread that owns this thread_info; the stale irqs_enabled field has no claim on us, and we should deliver the IRQ. Kernel-thread callers, including the recursive "lkl_trigger_irq -> IRQ -> softirq -> lkl_trigger_irq" path called out in the original comment, continue to honor their own per-thread irqs_enabled (set by the previous patch).

Tested against the lkl-wifi-poc project: rtw88_8812au's libusb-fed URB completion path ran ~50 transfers before this fix and ~8000 after, with no observed regression in the existing test suite.

Signed-off-by: Joseph <joseph@josephnef.dev>
Summary
There are two related correctness bugs in `arch/lkl/kernel/irq.c`'s handling of IRQ-enable state. Both manifest as silent timer-IRQ stalls (jiffies stops advancing, msleep-based kthreads freeze) under workloads that the existing test suite doesn't exercise but that real Linux drivers running in LKL hit immediately.
Bug 1: global flag, no per-thread save/restore. `irqs_enabled` is a single `static bool` global; `__switch_to` doesn't save/restore it across context switches. Any kernel path that does `spin_lock_irqsave` and schedules before the matching `_irqrestore` (the canonical pattern around `wait_event_*`) leaks the DISABLED value to whatever thread runs next. `lkl_trigger_irq` sees the leaked DISABLED and silently pends every IRQ, including the timer.
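The leak in Bug 1 can be illustrated with a minimal userspace model. This is a sketch, not LKL source: every `model_*` name is hypothetical, the "context switch" is just a comment because the point is that nothing happens to the flag when one occurs, and the replay of pending IRQs on restore is an assumption about the general shape of such code.

```c
#include <stdbool.h>

/* Hypothetical model of the pre-patch behaviour; not LKL code. */
static bool model_irqs_enabled = true; /* one flag shared by all threads */
static int model_pending;              /* IRQs deferred while the flag is off */
static int model_delivered;            /* IRQs that actually ran */

static unsigned long model_irq_save(void)
{
	unsigned long flags = model_irqs_enabled;

	model_irqs_enabled = false;
	return flags;
}

static void model_irq_restore(unsigned long flags)
{
	model_irqs_enabled = flags != 0;
	if (model_irqs_enabled) {
		/* Assumed behaviour: pending IRQs are replayed on re-enable. */
		model_delivered += model_pending;
		model_pending = 0;
	}
}

static void model_trigger_irq(void)
{
	if (model_irqs_enabled)
		model_delivered++;
	else
		model_pending++;
}

/*
 * Thread A disables IRQs and sleeps; thread B runs while `ticks` timer
 * IRQs arrive. Returns how many ticks were pended during B's time slice.
 */
int model_ticks_pended_while_b_runs(int ticks)
{
	unsigned long flags;
	int pended, i;

	model_irqs_enabled = true;
	model_pending = 0;
	model_delivered = 0;

	flags = model_irq_save();       /* thread A: spin_lock_irqsave */
	/* context switch A -> B: the global flag is neither saved nor reset */
	for (i = 0; i < ticks; i++)
		model_trigger_irq();    /* B never disabled IRQs, yet ticks pend */
	pended = model_pending;

	model_irq_restore(flags);       /* only helps if A ever runs again */
	return pended;
}
```

In this model every tick that arrives while B runs is pended. If A is itself waiting on a timeout, the restore that would replay them never comes, which matches the stall described above.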
Bug 2: host-thread callers honoring stale thread_info. `lkl_trigger_irq` is documented as callable "from arbitrary host threads." Host pthreads (libusb completion callbacks, glibc `SIGEV_THREAD` timer callbacks) acquire the LKL CPU via `lkl_cpu_get` but never go through `__switch_to`, so `_current_thread_info` still points at whichever kernel task last ran (often the idle task, with IRQs DISABLED). Checking that stale field for host callers silently pends every host-injected IRQ.
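The host-caller check for Bug 2 can be sketched with plain pthreads. This is an illustrative model only: `struct model_thread_info`, `model_current`, and `model_should_deliver` are hypothetical names, and `pthread_equal` / `pthread_self` stand in for the `lkl_ops->thread_equal` / `thread_self` host operations named in the commit message.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-thread state; not LKL code. */
struct model_thread_info {
	pthread_t tid;      /* host thread that backs this kernel task */
	bool irqs_enabled;  /* that task's own IRQ-enable state */
};

static struct model_thread_info model_idle_ti;
static struct model_thread_info *model_current = &model_idle_ti;

/* Decide whether an IRQ raised by the calling thread runs immediately. */
static bool model_should_deliver(void)
{
	struct model_thread_info *ti = model_current;

	/* Caller does not own `ti`: its flag says nothing about the caller. */
	if (!pthread_equal(ti->tid, pthread_self()))
		return true;
	return ti->irqs_enabled;
}

static void *model_host_caller(void *out)
{
	*(bool *)out = model_should_deliver();
	return NULL;
}

/* A separate host thread raises an IRQ while the "idle task" has IRQs off. */
bool model_host_thread_delivers(void)
{
	pthread_t t;
	bool delivered = false;

	model_idle_ti.tid = pthread_self();
	model_idle_ti.irqs_enabled = false;
	if (pthread_create(&t, NULL, model_host_caller, &delivered) != 0)
		return false;
	pthread_join(t, NULL);
	return delivered;
}

/* The owning thread itself raises an IRQ with its own IRQs off. */
bool model_owner_thread_delivers(void)
{
	model_idle_ti.tid = pthread_self();
	model_idle_ti.irqs_enabled = false;
	return model_should_deliver();
}
```

The owner thread still has its IRQ pended, while the foreign host thread gets delivery, which is the split the third commit describes.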
Context where this was hit
The bugs were hit in a proof-of-concept that runs the mainline `rtw88_8812au` USB Wi-Fi driver entirely in userspace via LKL. The host program links `liblkl.a`, registers a virtual USB host controller into the in-process kernel (a ~470-line HCD shim in the style of `drivers/usb/host/` that translates `struct urb` to a flat view and back), and forwards URBs to libusb running in the host process. The kernel side sees a normal USB Wi-Fi device; `rtw88_8812au` probes verbatim, downloads firmware, and brings `wlan0` up; the host then calls `lkl_if_up(wlan0)` and opens an `AF_PACKET` socket to capture 802.11 frames with radiotap headers. No kernel module on the host, no CAP_NET_ADMIN: the kernel driver source is reused as-is.
This is heavily threaded: libusb runs its own event thread that posts URB completions back into the LKL kernel (Bug 2's path), kernel-side polling kthreads drain those completions, and the rtw88 driver uses the standard `spin_lock_irqsave` + `wait_event_*` / `msleep` patterns throughout firmware download and chip init (Bug 1's path). Both bugs reproduce reliably: probe hangs in ~50 USB control transfers on master; with this series applied, the same probe runs ~8000 control transfers and `wlan0` comes up cleanly.
What the series does
Three commits, in order:
1. lkl: add KUnit test for per-thread `irqs_enabled` isolation: New `arch/lkl/kernel/irq_test.c` KUnit suite, gated on a new `CONFIG_LKL_IRQ_KUNIT_TEST`, wired into the existing `kunit=yes` build config in `tools/lkl/Makefile.autoconf` and into `tools/lkl/tests/boot.c` as `kunit_irq` (mirrors the existing `kunit_pci` / `kunit_mmu`). The test takes `spin_lock_irqsave` in one kthread, yields via `schedule_timeout` while the lock is held, and from a sibling kthread observes `arch_local_save_flags`. Fails on master; passes after commit 2.
2. lkl: make `irqs_enabled` per-thread via `current_thread_info()`: Move `irqs_enabled` from a `static bool` global in irq.c into a field on `struct thread_info`; access via `current_thread_info()` from `arch_local_save_flags` / `arch_local_irq_restore` / `lkl_trigger_irq`. No explicit save/restore code is added; the existing `_current_thread_info = task_thread_info(next)` line in `__switch_to` is the entire mechanism. IRQ-enable state moves with its thread, the same way a real CPU's register file does.
3. lkl: deliver IRQs from host-thread callers regardless of `irqs_enabled`: In `lkl_trigger_irq`, detect host-thread callers by `lkl_ops->thread_equal(ti->tid, lkl_ops->thread_self())` and deliver unconditionally; kernel-thread callers continue to honor their own per-thread `irqs_enabled`.

The split is for review tractability: commit 2 is a pure refactor that changes a global into a per-thread field; commit 3 is the semantic change for host callers (commented inline as the root cause of the host-thread mode).
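The mechanism behind commit 2 can be modelled in a few lines of standalone C. This is a sketch under stated assumptions, not the patch itself: `struct model_ti`, `model_cur`, and the `model_*` helpers are hypothetical, and the only thing taken from the description above is the idea that swapping the current-thread pointer on a context switch is enough to carry the flag along.

```c
#include <stddef.h>

#define MODEL_IRQ_ENABLED  1UL
#define MODEL_IRQ_DISABLED 0UL

/* Hypothetical per-thread state; stands in for struct thread_info. */
struct model_ti {
	unsigned long irqs_enabled;
};

static struct model_ti *model_cur; /* models _current_thread_info */

/* The whole "save/restore": switching threads swaps the pointer. */
static void model_switch_to(struct model_ti *next)
{
	model_cur = next;
}

static unsigned long model_save_flags(void)
{
	return model_cur->irqs_enabled;
}

static void model_restore_flags(unsigned long flags)
{
	model_cur->irqs_enabled = flags;
}

/*
 * The holder disables IRQs and sleeps; the observer then samples its own
 * flags. Returns what the observer saw.
 */
unsigned long model_observer_flags(void)
{
	struct model_ti holder = { MODEL_IRQ_ENABLED };
	struct model_ti observer = { MODEL_IRQ_ENABLED };
	unsigned long saved, seen;

	model_switch_to(&holder);
	saved = model_save_flags();
	model_restore_flags(MODEL_IRQ_DISABLED); /* irqsave in the holder */

	model_switch_to(&observer);              /* holder yields with lock held */
	seen = model_save_flags();               /* observer's own state */

	model_switch_to(&holder);
	model_restore_flags(saved);              /* irqrestore in the holder */

	model_cur = NULL;                        /* locals go out of scope */
	return seen;
}
```

Here the observer reads `MODEL_IRQ_ENABLED` even though the holder disabled IRQs before yielding, which is the property the KUnit suite in commit 1 checks against the real kernel.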
Test plan
- `kunit` lane (`tools/lkl/Makefile.autoconf` defines `kunit_test_enable`; this series extends it to also set `LKL_IRQ_KUNIT_TEST=y`). On the patched branch, the `lkl_test_kunit_irq` entry in `boot.c`'s `tests[]` should pass via the standard `ok N lkl_irq` line in the boot log.
- `linux` / `windows-2022` / `clang-build` / `mmu_kasan` lanes: no behavioural changes from this series with `kunit=no`. The `current_thread_info()->irqs_enabled` field is initialized in `INIT_THREAD_INFO` and `init_ti`; pre-existing tests should be unaffected.
- `kunit` lane with only commit 1 applied: the `kunit_irq` test should report `not ok` (proves the test exercises the bug, not just a tautology against the fix).
- described in the Summary: rtw88-family driver (aircrack-ng's `88XXau` covers all three chips) validated locally on RTL8812AU (`0bda:8812`), RTL8821AU (`2357:0120`), and RTL8814AU (`0bda:8813`). `wlan0` appears in 30-60 s, AF_PACKET capture returns frames. All hang on the un-patched kernel.
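The `ok N lkl_irq` check that the `kunit` lane relies on can be approximated by a small log scanner. This is a hypothetical helper written for illustration, not the code in `tools/lkl/tests/boot.c`; the function name and the tolerance for a timestamp prefix are assumptions.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: returns true if `log` contains a result of the
 * form "ok <N> <suite>" that is not part of a "not ok <N> <suite>" line.
 */
bool log_has_kunit_ok(const char *log, const char *suite)
{
	size_t n = strlen(suite);
	const char *p = log;

	while ((p = strstr(p, "ok ")) != NULL) {
		const char *hit = p;
		const char *q = hit + 3;
		bool at_word, negated;
		char end;

		p = q; /* resume the search after this occurrence */

		/* "ok" must start a word, e.g. after a timestamp prefix. */
		at_word = hit == log || isspace((unsigned char)hit[-1]);
		negated = hit - log >= 4 && strncmp(hit - 4, "not ", 4) == 0;
		if (!at_word || negated || !isdigit((unsigned char)*q))
			continue;

		while (isdigit((unsigned char)*q))
			q++;
		if (*q != ' ' || strncmp(q + 1, suite, n) != 0)
			continue;

		/* Reject longer suite names that merely share the prefix. */
		end = q[1 + n];
		if (end == '\0' || end == '\n' || end == '\r' || end == ' ')
			return true;
	}
	return false;
}
```

With a helper like this, the patched branch is expected to yield a match for `lkl_irq`, while a branch carrying only the test commit would produce a `not ok` line and no match.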
Why no automated test for commit 3
The KUnit suite in commit 1 reproduces the kernel-thread mode of the bug. The host-thread mode (commit 3 fix) requires a real host pthread to acquire the LKL CPU and invoke `lkl_trigger_irq`. That is easily exposed by a libusb-style backend that posts URB completions from its event thread, but not appropriate to wire into an in-tree test. If maintainers want a KUnit-level test for commit 3, one could be added that uses `lkl_host_ops->thread_create` to spawn a host pthread, has it `lkl_cpu_get` + `lkl_trigger_irq`, and checks that an IRQ handler runs synchronously. Happy to add this if requested.
Notes
The patch series is small (~70 lines net of code, plus the KUnit test and its wiring). Commit messages on each patch are self-contained and explain the rationale. Four save/restore variants were tried before settling on the `current_thread_info()` shape used here: v1 (save outgoing + load incoming), v2 (load incoming only), v3 (force-enable on switch), and v4 (this one, no explicit save/restore; the thread_info field handles it). Variants v1-v3 either regressed the existing test suite or only partially fixed the bug. v4 is the shape that is both minimal and complete.