Lock down benchmark manager memory during kernel compile #38
Pull request overview
This PR hardens the benchmark runner against in-process tampering during kernel import/compilation by moving kernel import into a separate thread and temporarily making the BenchmarkManager memory inaccessible, backed by a per-thread seccomp filter to prevent re-mapping/protection changes to that memory. It also adds an optional “exploit canary” and several exploit scripts to validate the hardening.
Changes:
- Add `BenchmarkManager::get_kernel()`, which `mprotect(PROT_NONE)`s the manager's pages during kernel import in a separate thread and installs a thread-specific seccomp filter for that page range.
- Introduce `csrc/seccomp.cpp`, implementing a small BPF builder + `seccomp_protect_page_range()`.
- Add the `ENABLE_EXPLOIT_TARGET` build option, a canary field, and exploit submissions to test writeability/overwrite attempts.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| `csrc/manager.cpp` | Runs kernel import in a separate thread while `BenchmarkManager` pages are `PROT_NONE`; adds exploit-canary-driven output behavior. |
| `csrc/manager.h` | Declares `get_kernel()` and (optionally) an exploit canary member. |
| `csrc/seccomp.cpp` | New seccomp BPF filter generator to block memory syscalls targeting a protected address range. |
| `CMakeLists.txt` | Adds the `ENABLE_EXPLOIT_TARGET` option and compiles `csrc/seccomp.cpp`. |
| `exploits/submission_test_canary_direct.py` | New exploit script attempting direct overwrite of the canary by scanning writable mappings. |
| `exploits/submission_test_canary_make_writeable.py` | New exploit script attempting `mprotect` then overwrite of the canary. |
| `exploits/submission_event_array.py` | New exploit script attempting event-handle tampering by scanning memory for vectors and overwriting handles. |
```cpp
nb::callable kernel;
// make the BenchmarkManager inaccessible
protect_range(reinterpret_cast<void*>(lo), hi - lo, PROT_NONE);
// TODO make stack inaccessible (may be impossible) or read-only during the call
// call the python kernel generation function from a different thread.

std::thread make_kernel_thread([&]() {
    // new thread, new seccomp.
    seccomp_protect_page_range(lo, hi - lo);
    nb::gil_scoped_acquire guard;
    kernel = kernel_from_qualname(qualname);
});

make_kernel_thread.join();
// make it accessible again. This is in the original thread, so the tightened seccomp
// policy does not apply here.
protect_range(reinterpret_cast<void*>(lo), hi - lo, PROT_READ | PROT_WRITE);
return kernel;
```
get_kernel() is not exception-safe w.r.t. memory protections: after mprotect(..., PROT_NONE), any failure (e.g., std::thread constructor throwing, or an early exception path you add later) can leave the BenchmarkManager pages permanently PROT_NONE, leading to hard-to-debug crashes. Use an RAII guard/scope-exit to always restore the original protections (and consider restoring before rethrowing thread errors).
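The suggested fix can be sketched as an RAII guard; names here are illustrative, not the PR's code, and `protect_range` is assumed to wrap `mprotect` as in the diff above:

```cpp
#include <sys/mman.h>
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical scope guard: makes a page range PROT_NONE on construction and
// always restores PROT_READ | PROT_WRITE on destruction, including during
// stack unwinding, so the pages can never be left permanently inaccessible.
struct ProtectionGuard {
    void*  addr;
    size_t len;
    ProtectionGuard(void* a, size_t l) : addr(a), len(l) {
        mprotect(addr, len, PROT_NONE);
    }
    ~ProtectionGuard() {
        // runs even if an exception propagates out of the protected scope
        mprotect(addr, len, PROT_READ | PROT_WRITE);
    }
    ProtectionGuard(const ProtectionGuard&) = delete;
    ProtectionGuard& operator=(const ProtectionGuard&) = delete;
};
```

In `get_kernel()`, constructing such a guard before spawning `make_kernel_thread` would restore the original protections even if the `std::thread` constructor throws.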
```cpp
// prevent messing with the protected page range
b.block_syscall_if_arg_in_range(__NR_mprotect, 0, lo, hi);
b.block_syscall_if_arg_in_range(__NR_mmap, 0, lo, hi);
b.block_syscall_if_arg_in_range(__NR_mremap, 0, lo, hi);
b.block_syscall_if_arg_in_range(__NR_munmap, 0, lo, hi);
b.block_syscall_if_arg_in_range(__NR_madvise, 0, lo, hi);
b.block_syscall_if_arg_in_range(__NR_remap_file_pages, 0, lo, hi);
```
The seccomp checks only test whether the start address argument falls in [lo, hi). For range-based syscalls like mprotect(addr, len, ...), munmap(addr, len), madvise(addr, len, ...), an attacker can bypass this by choosing addr < lo with a len large enough to overlap into the protected pages (e.g., mprotect(lo - 4096, 8192, ...)). Update the filter logic to block any syscall whose affected range overlaps [lo, hi), i.e., (addr < hi) && (addr+len > lo), using BPF ALU ops / additional arg checks as needed.
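The overlap condition the reviewer proposes can be expressed as a plain predicate before being translated into BPF arg checks; this is a sketch of the logic only (the function name is made up), since encoding `addr + len` in classic BPF additionally requires loading both argument words and an ALU add:

```cpp
#include <cassert>
#include <cstdint>

// Block a syscall iff the range it affects, [addr, addr + len), intersects
// the protected range [lo, hi). A start-address-only check misses calls such
// as mprotect(lo - 4096, 8192, ...), which starts below lo but reaches in.
bool overlaps_protected(uint64_t addr, uint64_t len, uint64_t lo, uint64_t hi) {
    return addr < hi && addr + len > lo;
}
```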
```cpp
auto prog = b.build();
prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0);
syscall(__NR_seccomp, SECCOMP_SET_MODE_FILTER, 0, &prog);
}
```
seccomp_protect_page_range() ignores failures from prctl(PR_SET_NO_NEW_PRIVS, ...) and the seccomp syscall. If either call fails, the thread will continue without the intended protection. Check return codes and throw/abort with a clear error so the caller can fail closed rather than silently running unprotected.
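A fail-closed version might look like the following sketch (function name and error handling are assumptions, not the PR's code):

```cpp
#include <linux/filter.h>
#include <linux/seccomp.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <cassert>
#include <cerrno>
#include <cstring>
#include <stdexcept>
#include <string>

// Check both return codes and throw, so a thread never silently continues
// running without the filter it asked for.
void install_filter_or_throw(const sock_fprog* prog) {
    if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0)
        throw std::runtime_error(std::string("PR_SET_NO_NEW_PRIVS failed: ")
                                 + std::strerror(errno));
    if (syscall(__NR_seccomp, SECCOMP_SET_MODE_FILTER, 0, prog) != 0)
        throw std::runtime_error(std::string("seccomp(SET_MODE_FILTER) failed: ")
                                 + std::strerror(errno));
}
```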
```cpp
}

struct sock_fprog build() {
    return { .len = (unsigned short)Instructions.size(), .filter = Instructions.data() };
}
```
BpfBuilder::build() uses C++20 designated initializers ({ .len = ..., .filter = ... }), but the project is configured for C++17 (CMAKE_CXX_STANDARD 17). This will fail to compile on conforming C++17 toolchains. Use aggregate initialization without designators (in field order) or explicitly assign into a sock_fprog local before returning.
Suggested change:
```diff
- return { .len = (unsigned short)Instructions.size(), .filter = Instructions.data() };
+ return { static_cast<unsigned short>(Instructions.size()), Instructions.data() };
```
This moves kernel compilation to a separate thread. While that thread is running, we can make the BenchmarkManager's memory completely inaccessible, and use seccomp to ensure that it stays that way. Because seccomp is per-thread, the original thread (with no user code running yet) can unprotect the memory before it is used.
We should be able to set up a similar scheme for warmup as well. For the actual measurements, I suspect we need to be less strict, because we do need to record information to memory.
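The core of the scheme (protect in the original thread, run other code in a second thread, restore in the original thread afterwards) can be demonstrated without the seccomp half; the following is a minimal sketch with made-up names, using an anonymous mapping as a stand-in for the BenchmarkManager's pages:

```cpp
#include <sys/mman.h>
#include <cstddef>
#include <thread>

// Returns the byte stored before the lockdown, read back after access is
// restored; -1 on mmap failure. The per-thread seccomp filter from the PR
// is omitted here for brevity.
int run_lockdown_demo() {
    const size_t page = 4096;
    void* raw = mmap(nullptr, page, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (raw == MAP_FAILED) return -1;
    char* mgr = static_cast<char*>(raw);  // stand-in for the manager's pages
    mgr[0] = 7;

    // original thread: lock the pages down before other code runs
    mprotect(mgr, page, PROT_NONE);

    std::thread untrusted([] {
        // in the PR, seccomp_protect_page_range() would be installed here so
        // this thread cannot mprotect/mmap the range writable again; any
        // direct access to the pages from this thread now faults
    });
    untrusted.join();

    // back in the original thread (no seccomp filter): restore access
    mprotect(mgr, page, PROT_READ | PROT_WRITE);
    int value = mgr[0];  // contents survived the PROT_NONE window
    munmap(mgr, page);
    return value;
}
```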