Conversation

@akerouanton

This flag should be used to indicate to libkrun that the downstream network backend wants to receive and transmit the virtio-net header along with Ethernet frames.

Network backends using this flag can then forward unmodified headers to another VM or build a sensible virtio_net_hdr (e.g. with GSO fields correctly set) such that the receiving VM handles GSO'd frames properly.
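For reference, the header in question is the standard virtio_net_hdr defined by the virtio spec. Below is a rough Rust sketch of what a backend opting into this flag sees in front of each Ethernet frame, plus an example of the kind of "sensible" header it could build for a GSO'd TCP/IPv4 frame. Values are illustrative only, and the 10-byte layout shown here omits the num_buffers field added by the mergeable-rx-buffers variant:

// virtio-net header as defined by the virtio spec; all multi-byte fields are
// little-endian. With NET_FLAG_INCLUDE_VNET_HEADER set, the backend receives
// and transmits this header in front of every Ethernet frame.
#[repr(C, packed)]
#[derive(Clone, Copy)]
struct VirtioNetHdr {
    flags: u8,        // e.g. VIRTIO_NET_HDR_F_NEEDS_CSUM
    gso_type: u8,     // e.g. VIRTIO_NET_HDR_GSO_NONE / _GSO_TCPV4 / _GSO_TCPV6
    hdr_len: u16,     // Ethernet + IP + L4 header length
    gso_size: u16,    // MSS to use when resegmenting the frame
    csum_start: u16,  // offset where partial checksumming starts
    csum_offset: u16, // checksum field offset, relative to csum_start
}

const VIRTIO_NET_HDR_F_NEEDS_CSUM: u8 = 0x1;
const VIRTIO_NET_HDR_GSO_TCPV4: u8 = 0x1;

// Illustrative header for a GSO'd TCP/IPv4 frame carrying a partial checksum:
// plain Ethernet (14) + IPv4 without options (20) + TCP without options (20).
fn example_tso4_header(mss: u16) -> VirtioNetHdr {
    VirtioNetHdr {
        flags: VIRTIO_NET_HDR_F_NEEDS_CSUM,
        gso_type: VIRTIO_NET_HDR_GSO_TCPV4,
        hdr_len: 14 + 20 + 20,
        gso_size: mss,
        csum_start: 14 + 20, // TCP header offset within the frame
        csum_offset: 16,     // TCP checksum offset within the TCP header
    }
}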

Collaborator

@mtjhrc left a comment


Thanks for the PR, it certainly makes sense to expose the header to interested backends.
I am curious though, could you share more detail about what backend you would use with this?

Please fix the mentioned confusion between stream/datagram sockets and features/flags. Also, I guess it would be nice to implement this for stream sockets too, but it's not strictly necessary.

Comment on lines 973 to 975
if (features & !(NET_ALL_FEATURES | NET_FLAG_INCLUDE_VNET_HEADER)) != 0 {
return -libc::EINVAL;
}
Collaborator


This check is incorrect on two levels:

  • as it stands, this PR only adds support for this feature for unix datagram sockets, not streams (which is what this function configures).
  • NET_FLAG_INCLUDE_VNET_HEADER is a flag, not a feature (as handled correctly in krun_add_net_unixgram).

return -libc::EINVAL;
}

let include_vnet_header: bool = flags & NET_FLAG_INCLUDE_VNET_HEADER != 0;
Collaborator


As implied above, include_vnet_header should always be false here, because it's not implemented for the stream socket.
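For example, something along these lines in the stream-socket entry point (just a sketch, adjust to the actual parameter names; the point is to validate features and flags separately and to reject the flag until it's implemented here):

if (features & !NET_ALL_FEATURES) != 0 {
    return -libc::EINVAL;
}
if (flags & NET_FLAG_INCLUDE_VNET_HEADER) != 0 {
    // Not implemented for stream sockets (yet).
    return -libc::EINVAL;
}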

Comment on lines 1031 to 1035
let backend = if let Some(path) = path {
VirtioNetBackend::UnixgramPath(path, send_vfkit_magic)
} else {
VirtioNetBackend::UnixgramFd(fd)
};
Collaborator


If you only implement the flag for the unixgram sockets, I think it would be better to put the bool inside the enum (like send_vfkit_magic is).

Note: I am not sure why send_vfkit_magic is not passed into VirtioNetBackend::UnixgramFd; that seems like a bug.
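Roughly what I have in mind (the exact variant shape here is just an illustration):

use std::os::unix::io::RawFd;
use std::path::PathBuf;

enum VirtioNetBackend {
    // Per-backend options travel inside the variant, next to send_vfkit_magic.
    UnixgramPath(PathBuf, /* send_vfkit_magic */ bool, /* include_vnet_header */ bool),
    UnixgramFd(RawFd, /* send_vfkit_magic */ bool, /* include_vnet_header */ bool),
    // ... other variants unchanged
}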

@akerouanton
Author

I am curious though, could you share more detail about what backend you would use with this?

I tried for a couple of hours to get a kernel stacktrace to pinpoint exactly where the issue is and why this PR fixes it, but to no avail. Let me describe it in plain words instead, but let me know if you want something more concrete.

I don't plan to use this flag with any existing backend right now. I'm working on a container runtime that leverages https://github.com/containerd/nerdbox and has its own virtual switch that connects multiple VMs. Every VM gets one or more virtio-net devices, and each interface is connected to a bridge interface (i.e. ip link set master). That bridge has a veth connected to it too (and at some point it'll get multiple of them, but nerdbox doesn't support multiple containers per VM yet).

With TSO enabled, VM-to-VM connections work correctly: packets are GSO'd on the sending side and GRO'd on the receiving side. However, container-to-container connections are broken. After some debugging, I realized that disabling GSO on the sending veth fixed the issue, so the breakage was probably caused by invalid GSO fields in the virtio-net headers.

I believe that with the above topology, GRO is supposed to happen at the veth level (although I might be wrong), and missing GSO fields would leave some skb fields unset, preventing GRO from happening.

With this change, my virtual switch forwards the virtio-net headers produced by the sending side without recomputing L4 checksums (which was previously required for VM-to-VM scenarios to work), and it fixes container-to-container communications too. I also think this should help with host-to-[VM|container] communications, as it would allow sending large frames with partial checksums.

Please fix the mentioned confusion between stream/datagram sockets and features/flags. Also, I guess it would be nice to implement this for stream sockets too, but it's not strictly necessary.

Sorry for the nasty mistakes, let me fix them!

@akerouanton force-pushed the pass-virtionet-hdr-to-backend branch 2 times, most recently from db5988b to 420da29 on December 23, 2025 at 17:38
@akerouanton force-pushed the pass-virtionet-hdr-to-backend branch from 420da29 to ab2fe8b on December 23, 2025 at 17:56
@akerouanton
Author

@mtjhrc I added include_vnet_header to all network backends, but to keep the bespoke 'frame length' header used by the unixstream backend, I had to extend NetWorker's TX buffer by 4 bytes. The vnet header is now located at tx_frame_buf[4..14].

Unixstream writes the frame length either to 0..4 or to 10..14, depending on whether the vnet header is included. Other backends have been modified to ignore the first four bytes.
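As a rough sketch of the resulting layout (helper names are illustrative, not the actual code):

// TX buffer layout for the unixstream backend after this change:
//   [0..4]   frame-length prefix, used when the vnet header is sent
//   [4..14]  virtio-net header
//   [10..14] frame-length prefix, used when the vnet header is NOT sent
//   [14..]   Ethernet frame
const LEN_PREFIX_SIZE: usize = 4;
const VNET_HDR_SIZE: usize = 10;
const FRAME_OFFSET: usize = LEN_PREFIX_SIZE + VNET_HDR_SIZE; // 14

// Byte range of tx_frame_buf that actually goes on the wire.
fn unixstream_tx_range(frame_len: usize, include_vnet_header: bool) -> std::ops::Range<usize> {
    if include_vnet_header {
        0..FRAME_OFFSET + frame_len // [len][vnet hdr][frame]
    } else {
        FRAME_OFFSET - LEN_PREFIX_SIZE..FRAME_OFFSET + frame_len // [len][frame]
    }
}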

The Tap backend also takes an include_vnet_header parameter to keep it aligned with the other backends. But since vnet headers have always been included there, there's no way to make that configurable without introducing a breaking change in the API. Maybe something to fix in a major API version, or via a new API method, or maybe it's not worth it (in which case I can remove my change if you prefer).

Anyway, this should be in better shape now. PTAL.
