Bpftool sync 2025-12-17 #231
Merged: qmonnet merged 7 commits into libbpf:main from qmonnet:bpftool-sync-2025-12-17T00-49-32.899Z on Dec 17, 2025
Conversation
ERR_get_error_all()[1] is an OpenSSL v3 API, so to keep the code compatible
with OpenSSL v1, use ERR_get_error_line_data() instead. Since OpenSSL is
already a build requirement for the kernel (minimum version 1.0.0), this
allows bpftool to compile where OpenSSL v3 is not available.
Signing-related BPF selftests pass with OpenSSL v1.

[1] https://docs.openssl.org/3.4/man3/ERR_get_error/

Fixes: 40863f4d6ef2 ("bpftool: Add support for signing BPF programs")
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Acked-by: Song Liu <song@kernel.org>
Acked-by: Quentin Monnet <qmo@kernel.org>
Link: https://lore.kernel.org/r/20251120084754.640405-2-alan.maguire@oracle.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
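The change itself lives in src/sign.c; as a rough illustration of the kind
of compatibility guard this implies (the wrapper name and the exact version
check below are assumptions for this sketch, not the actual patch):

/*
 * Illustrative sketch only: keep OpenSSL error reporting working on both
 * 1.x and 3.x. get_next_error() is a made-up name for this example.
 */
#include <openssl/err.h>
#include <openssl/opensslv.h>

static unsigned long get_next_error(const char **file, int *line,
				    const char **data, int *flags)
{
#if OPENSSL_VERSION_NUMBER >= 0x30000000L
	const char *func;

	/* OpenSSL 3.x: also reports the function name; unused here. */
	return ERR_get_error_all(file, line, &func, data, flags);
#else
	/* OpenSSL 1.x: older API with the same file/line/data output. */
	return ERR_get_error_line_data(file, line, data, flags);
#endif
}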
If a socket has sk->sk_bypass_prot_mem flagged, the socket opts out
of the global protocol memory accounting.
This is easily controlled by net.core.bypass_prot_mem sysctl, but it
lacks flexibility.
Let's support flagging (and clearing) sk->sk_bypass_prot_mem via
bpf_setsockopt() at the BPF_CGROUP_INET_SOCK_CREATE hook.
int val = 1;
bpf_setsockopt(ctx, SOL_SOCKET, SK_BPF_BYPASS_PROT_MEM,
&val, sizeof(val));
As with net.core.bypass_prot_mem, this is inherited by child sockets,
and BPF always takes precedence over the sysctl at socket(2) and accept(2).
SK_BPF_BYPASS_PROT_MEM is only supported at BPF_CGROUP_INET_SOCK_CREATE
and not on other hooks, for the following reasons:
1. UDP charges memory under sk->sk_receive_queue.lock instead
of lock_sock()
2. Modifying the flag after skb is charged to sk requires such
adjustment during bpf_setsockopt() and complicates the logic
unnecessarily
We can support other hooks later if a real use case justifies it.
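For reference, a minimal sock_create program along these lines might look
as follows. This is a sketch only: the section name matches the hook above,
but the layout of the real sk_bypass_prot_mem.bpf.o selftest may differ,
and SK_BPF_BYPASS_PROT_MEM must be provided by the UAPI headers in use.

// SPDX-License-Identifier: GPL-2.0
/* Sketch of a BPF_CGROUP_INET_SOCK_CREATE program that opts new sockets
 * out of global protocol memory accounting.
 */
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>

#ifndef SOL_SOCKET
#define SOL_SOCKET 1
#endif

SEC("cgroup/sock_create")
int bypass_prot_mem(struct bpf_sock *ctx)
{
	int val = 1;

	/* Flag sk->sk_bypass_prot_mem for this socket and its children. */
	bpf_setsockopt(ctx, SOL_SOCKET, SK_BPF_BYPASS_PROT_MEM,
		       &val, sizeof(val));
	return 1; /* allow the socket to be created */
}

char _license[] SEC("license") = "GPL";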
Most changes are inline and hard to trace, but a microbenchmark on
__sk_mem_raise_allocated() during neper/tcp_stream showed that more
samples completed faster with sk->sk_bypass_prot_mem == 1. This will
be more visible under tcp_mem pressure (but it's not a fair comparison).
# bpftrace -e 'kprobe:__sk_mem_raise_allocated { @start[tid] = nsecs; }
kretprobe:__sk_mem_raise_allocated /@start[tid]/
{ @EnD[tid] = nsecs - @start[tid]; @times = hist(@EnD[tid]); delete(@start[tid]); }'
# tcp_stream -6 -F 1000 -N -T 256
Without bpf prog:
[128, 256) 3846 | |
[256, 512) 1505326 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[512, 1K) 1371006 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ |
[1K, 2K) 198207 |@@@@@@ |
[2K, 4K) 31199 |@ |
With bpf prog in the next patch:
(must be attached before tcp_stream)
# bpftool prog load sk_bypass_prot_mem.bpf.o /sys/fs/bpf/test type cgroup/sock_create
# bpftool cgroup attach /sys/fs/cgroup/test cgroup_inet_sock_create pinned /sys/fs/bpf/test
[128, 256) 6413 | |
[256, 512) 1868425 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[512, 1K) 1101697 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ |
[1K, 2K) 117031 |@@@@ |
[2K, 4K) 11773 | |
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
Link: https://patch.msgid.link/20251014235604.3057003-6-kuniyu@google.com
Add a new BPF command BPF_PROG_ASSOC_STRUCT_OPS to allow associating a BPF
program with a struct_ops map. This command takes a file descriptor of a
struct_ops map and a BPF program, and sets prog->aux->st_ops_assoc to the
kdata of the struct_ops map. The command accepts neither a struct_ops
program nor a non-struct_ops map.

Programs of a struct_ops map are automatically associated with the map
during map update. If a program is shared between two struct_ops maps,
prog->aux->st_ops_assoc will be poisoned to indicate that the associated
struct_ops is ambiguous. The pointer, once poisoned, cannot be reset since
we have lost track of the associated struct_ops.

For other program types, the associated struct_ops map, once set, cannot be
changed later. This restriction may be lifted in the future if there is a
use case.

A kernel helper bpf_prog_get_assoc_struct_ops() can be used to retrieve the
associated struct_ops pointer. The returned pointer, if not NULL, is
guaranteed to be valid and to point to a fully updated struct_ops struct.
For a struct_ops program reused in multiple struct_ops maps, the return
value will be NULL.

prog->aux->st_ops_assoc is protected by bumping the refcount for
non-struct_ops programs and by RCU for struct_ops programs. Since it would
be inefficient to track programs associated with a struct_ops map, every
non-struct_ops program bumps the refcount of the map to make sure
st_ops_assoc stays valid. For a struct_ops program, it is protected by RCU,
as map_free will wait for an RCU grace period before disassociating the
program from the map. The helper must be called in BPF program context or
in an RCU read-side critical section.

struct_ops implementers should note that the struct_ops returned may not be
initialized or attached yet. The struct_ops implementer is responsible for
tracking and checking the state of the associated struct_ops map if the use
case expects an initialized or attached struct_ops.

Signed-off-by: Amery Hung <ameryhung@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/bpf/20251203233748.668365-3-ameryhung@gmail.com
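As a rough sketch of how a struct_ops implementer might consult the
association (only the helper name comes from the commit message above; its
exact signature, assumed here to take prog->aux, is an assumption):

/* Kernel-side sketch; the argument type of the helper is an assumption. */
#include <linux/bpf.h>
#include <linux/rcupdate.h>

static void check_assoc_struct_ops(struct bpf_prog *prog)
{
	void *st_ops;

	rcu_read_lock();
	/* NULL means no association, or the program was shared between
	 * several struct_ops maps (poisoned pointer).
	 */
	st_ops = bpf_prog_get_assoc_struct_ops(prog->aux);
	if (st_ops) {
		/* The struct_ops map may not be initialized or attached
		 * yet; the implementer must track and check its state.
		 */
	}
	rcu_read_unlock();
}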
Pull latest libbpf from mirror.

Libbpf version: 1.7.0
Libbpf commit:  afb8b17bc50b0b7606ad4ea468cbc9f5aede8dae

Signed-off-by: Quentin Monnet <qmo@kernel.org>
Add support for deferred userspace unwind to perf. Perf currently relies on
in-place stack unwinding, from NMI context and all that; this moves the
userspace part of the unwind to right before the return to userspace.

This has two distinct benefits. The bigger one is that it moves the unwind
to a faultable context: it becomes possible to fault in debug info
(.eh_frame, SFrame, etc.) that might not otherwise be readily available.
Secondly, it de-duplicates the user callchain where multiple samples happen
during the same kernel entry.

To facilitate this, the perf interface is extended with a new record type:

  PERF_RECORD_CALLCHAIN_DEFERRED

and two new attribute flags:

  perf_event_attr::defer_callchain - request that the user unwind be deferred
  perf_event_attr::defer_output    - request PERF_RECORD_CALLCHAIN_DEFERRED records

The existing PERF_RECORD_SAMPLE callchain section gets a new context type:

  PERF_CONTEXT_USER_DEFERRED

After this will come a single entry denoting the 'cookie' of the deferred
callchain that should be attached here, matching the 'cookie' field of the
above-mentioned PERF_RECORD_CALLCHAIN_DEFERRED.

The 'defer_callchain' flag is expected on all events with
PERF_SAMPLE_CALLCHAIN. The 'defer_output' flag is expected on the event
responsible for collecting side-band events (like mmap, comm, etc.).
Setting 'defer_output' on multiple events will get you duplicated
PERF_RECORD_CALLCHAIN_DEFERRED records.

Based on earlier patches by Josh and Steven.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20251023150002.GR4067720@noisy.programming.kicks-ass.net
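A minimal user-space sketch of how these flags might be requested, assuming
installed perf_event.h headers that already expose the
defer_callchain/defer_output bitfields (everything else below is ordinary
perf_event_attr boilerplate):

/* Sketch: request deferred user callchains on a sampling event. The
 * defer_callchain/defer_output bitfields are taken from the commit message
 * above; their availability depends on the kernel headers in use.
 */
#include <linux/perf_event.h>
#include <string.h>

static void init_deferred_callchain_attr(struct perf_event_attr *attr)
{
	memset(attr, 0, sizeof(*attr));
	attr->size = sizeof(*attr);
	attr->type = PERF_TYPE_HARDWARE;
	attr->config = PERF_COUNT_HW_CPU_CYCLES;
	attr->sample_type = PERF_SAMPLE_CALLCHAIN;
	attr->defer_callchain = 1;	/* defer the user part of the unwind */
	/* Set defer_output on only one event (the side-band collector) to
	 * avoid duplicated PERF_RECORD_CALLCHAIN_DEFERRED records.
	 */
	attr->defer_output = 1;
}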
The kernel is now built with -fms-extensions. Anonymous structs or
unions permitted by these extensions have been used in several places,
and can end up in the generated vmlinux.h file, for example:
struct ns_tree {
[...]
};
[...]
struct ns_common {
[...]
union {
struct ns_tree;
struct callback_head ns_rcu;
};
};
Trying to include this header when compiling a tool may result in build
warnings if the compiler does not expect these extensions. This is the
case, for example, with bpftool:
In file included from skeleton/pid_iter.bpf.c:3:
.../tools/testing/selftests/bpf/tools/build/bpftool/vmlinux.h:64057:3:
warning: declaration does not declare anything
[-Wmissing-declarations]
64057 | struct ns_tree;
| ^~~~~~~~~~~~~~
Fix these build warnings in bpftool by turning on Microsoft extensions
when compiling the two BPF programs that rely on vmlinux.h.
Reported-by: Alexei Starovoitov <ast@kernel.org>
Closes: https://lore.kernel.org/bpf/CAADnVQK9ZkPC7+R5VXKHVdtj8tumpMXm7BTp0u9CoiFLz_aPTg@mail.gmail.com/
Signed-off-by: Quentin Monnet <qmo@kernel.org>
Link: https://lore.kernel.org/r/20251208130748.68371-1-qmo@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Syncing latest bpftool commits from kernel repository.

Baseline bpf-next commit:   f8c67d8550ee69ce684c7015b2c8c63cda24bbfb
Checkpoint bpf-next commit: 6f0b824a61f212e9707ff68abcabfdfa4724b811
Baseline bpf commit:        e427054ae7bc8b1268cf1989381a43885795616f
Checkpoint bpf commit:      1d528e794f3db5d32279123a89957c44c4406a09

Alan Maguire (1):
  bpftool: Allow bpftool to build with openssl < 3

Amery Hung (1):
  bpf: Support associating BPF program with struct_ops

Kuniyuki Iwashima (1):
  bpf: Introduce SK_BPF_BYPASS_PROT_MEM.

Peter Zijlstra (1):
  perf: Support deferred user unwind

Quentin Monnet (1):
  bpftool: Fix build warnings due to MS extensions

 include/uapi/linux/bpf.h        | 18 ++++++++++++++++++
 include/uapi/linux/perf_event.h | 21 ++++++++++++++++++++-
 src/Makefile                    |  2 ++
 src/sign.c                      |  6 ++++++
 4 files changed, 46 insertions(+), 1 deletion(-)

Signed-off-by: Quentin Monnet <qmo@kernel.org>
Pull latest libbpf from mirror and sync bpftool repo with kernel, up to the commits used for libbpf sync. This is an automatic update performed by calling the sync script from this repo: