
Add minimal plumbing to serialize sysfs information for ib_write_bw #13297

Open
atoniolo76 wants to merge 1 commit into google:master from modal-labs:alessio/serialize-rdma-sysfs-boot-minimal

@atoniolo76 atoniolo76 commented May 27, 2026

Based on #13114, this PR provides the minimal devices + sysfs information necessary to get ib_write_bw working between two gVisor containers (relies on later rdmaproxy PRs):

[Screenshot, 2026-05-27 12:21:09 PM]

The serialized JSON fields per device contain basic device identity information: name, abi_version, node_type, node_guid, dev, ibdev, fw_ver, and sys_image_guid. modalias provides the PCI information that libibverbs uses to match a device to a user-space driver plugin (e.g. mlx5). PCI attributes of the NIC include pci_slot_name, pci_driver, pci_class, pci_vendor, pci_device, pci_subsys_vendor, and pci_subsys_device. numa_node and local_cpulist identify the NUMA node and CPUs closest to the NIC. Finally, the ports array holds per-port information.
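As a rough illustration of the per-device record described above, the fields could map onto a Go struct like the one below. This is only a sketch: the type names `RDMADevice` and `RDMAPort`, the choice of `string` for every attribute, the `num` port field, and the sample values are assumptions made for the example, not the types defined in pkg/sentry/fsimpl/sys/rdma.go. Only the JSON key names come from the description.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// RDMAPort is a hypothetical placeholder. The description does not list
// the per-port fields, so only an assumed port number is modeled here.
type RDMAPort struct {
	Num int `json:"num"` // assumed field, for illustration only
}

// RDMADevice mirrors the JSON keys named in the description. Keeping every
// sysfs attribute as a raw string is an assumption of this sketch.
type RDMADevice struct {
	Name            string     `json:"name"`
	ABIVersion      string     `json:"abi_version"`
	NodeType        string     `json:"node_type"`
	NodeGUID        string     `json:"node_guid"`
	Dev             string     `json:"dev"`
	IBDev           string     `json:"ibdev"`
	FWVer           string     `json:"fw_ver"`
	SysImageGUID    string     `json:"sys_image_guid"`
	Modalias        string     `json:"modalias"`
	PCISlotName     string     `json:"pci_slot_name"`
	PCIDriver       string     `json:"pci_driver"`
	PCIClass        string     `json:"pci_class"`
	PCIVendor       string     `json:"pci_vendor"`
	PCIDevice       string     `json:"pci_device"`
	PCISubsysVendor string     `json:"pci_subsys_vendor"`
	PCISubsysDevice string     `json:"pci_subsys_device"`
	NUMANode        string     `json:"numa_node"`
	LocalCPUList    string     `json:"local_cpulist"`
	Ports           []RDMAPort `json:"ports"`
}

// roundTrip serializes a device to JSON and parses it back, which is the
// serialize -> deserialize step in miniature.
func roundTrip(d RDMADevice) (RDMADevice, error) {
	var out RDMADevice
	data, err := json.Marshal(d)
	if err != nil {
		return out, err
	}
	err = json.Unmarshal(data, &out)
	return out, err
}

func main() {
	// Sample values are made up for the example.
	dev := RDMADevice{Name: "uverbs0", IBDev: "mlx5_0", PCIDriver: "mlx5_core", Ports: []RDMAPort{{Num: 1}}}
	got, err := roundTrip(dev)
	if err != nil {
		fmt.Println("round trip failed:", err)
		return
	}
	fmt.Printf("%+v\n", got)
}
```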

verbs_abi_version tells libibverbs which version of the kernel uverbs ABI is supported. It is a single global value, not a per-device one, read from /sys/class/infiniband_verbs/abi_version.
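A minimal sketch of reading that global value on the host might look like the following. The helper names `parseABIVersion` and `readVerbsABIVersion` are hypothetical, and treating the file content as a single decimal integer is an assumption of this example.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strconv"
	"strings"
)

// parseABIVersion converts the raw file content (assumed to be a decimal
// integer followed by a newline) into an int.
func parseABIVersion(raw string) (int, error) {
	return strconv.Atoi(strings.TrimSpace(raw))
}

// readVerbsABIVersion reads class/infiniband_verbs/abi_version under the
// given sysfs root. Taking the root as a parameter keeps it easy to point
// at a fake tree.
func readVerbsABIVersion(sysRoot string) (int, error) {
	raw, err := os.ReadFile(filepath.Join(sysRoot, "class", "infiniband_verbs", "abi_version"))
	if err != nil {
		return 0, err
	}
	return parseABIVersion(string(raw))
}

func main() {
	v, err := readVerbsABIVersion("/sys")
	if err != nil {
		fmt.Println("no uverbs ABI version available:", err)
		return
	}
	fmt.Println("uverbs ABI version:", v)
}
```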

Files added:

  • pkg/sentry/fsimpl/sys/rdma.go contains the RDMA-specific data types, host sysfs collection, and the interface for serializing, deserializing, and reconstructing the virtual sysfs tree.
  • runsc/specutils/rdma.go gates the entire serialization process on the presence of /dev/infiniband/uverbs* in the OCI spec.
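The gating check in the second item could be approximated as below. This sketch works on a plain list of device paths instead of the OCI spec types, and the function name `specHasUverbs` is hypothetical; it is not the helper added in runsc/specutils/rdma.go.

```go
package main

import (
	"fmt"
	"path"
)

// uverbsPattern matches device nodes such as /dev/infiniband/uverbs0.
const uverbsPattern = "/dev/infiniband/uverbs*"

// specHasUverbs reports whether any of the device paths listed in the spec
// looks like a uverbs device node.
func specHasUverbs(devicePaths []string) bool {
	for _, p := range devicePaths {
		// path.Match's '*' does not cross '/' boundaries, so only direct
		// children of /dev/infiniband match.
		if ok, err := path.Match(uverbsPattern, p); err == nil && ok {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(specHasUverbs([]string{"/dev/null", "/dev/infiniband/uverbs0"}))
	fmt.Println(specHasUverbs([]string{"/dev/infiniband/rdma_cm"}))
}
```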

Files modified:

  • runsc/cmd/chroot.go collects RDMA information from sysfs before pivot_root and serializes it to /var/lib/gvisor/rdma_data.json inside the chroot.
  • runsc/cmd/boot.go deserializes the collected data from /var/lib/gvisor/rdma_data.json and passes it into the boot args as RDMADevices.
  • runsc/boot/loader.go stores the RDMADevices args on the Loader struct.
  • runsc/boot/vfs.go passes rdmaDevices into sys.InternalData.
  • pkg/sentry/fsimpl/sys/sys.go reads sys.InternalData and calls rdma.go to build the /sys/class/infiniband_verbs/ and /sys/class/infiniband/ trees.
  • runsc/container/container.go tells the gofer to create RDMA devices when the OCI spec lists uverbs devices.
  • runsc/cmd/sandboxsetup/gofer_mount.go adds ShouldExposeRDMADevice to the gofer's device filter so that the uverbs device nodes are bind-mounted when /dev/infiniband/uverbs* is included in the OCI spec.
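The handoff between chroot setup and boot in the list above amounts to writing a JSON file under the future root and reading it back once that directory has become "/". A minimal sketch, assuming a pared-down `rdmaDevice` type and hypothetical helpers `writeRDMAData`, `readRDMAData`, and `handoffViaTempDir` (none of these names are taken from the PR):

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"path/filepath"
)

// rdmaDataPath is the location named in the description, relative to the
// chroot that later becomes the sandbox root.
const rdmaDataPath = "/var/lib/gvisor/rdma_data.json"

// rdmaDevice is a reduced stand-in for the real per-device record.
type rdmaDevice struct {
	Name string `json:"name"`
}

// writeRDMAData serializes devs under chrootDir, as the chroot setup step
// would do before pivot_root.
func writeRDMAData(chrootDir string, devs []rdmaDevice) error {
	dst := filepath.Join(chrootDir, rdmaDataPath)
	if err := os.MkdirAll(filepath.Dir(dst), 0o755); err != nil {
		return err
	}
	data, err := json.Marshal(devs)
	if err != nil {
		return err
	}
	return os.WriteFile(dst, data, 0o644)
}

// readRDMAData deserializes the file relative to root, as the boot step
// would do with root == "/" after pivot_root.
func readRDMAData(root string) ([]rdmaDevice, error) {
	data, err := os.ReadFile(filepath.Join(root, rdmaDataPath))
	if err != nil {
		return nil, err
	}
	var devs []rdmaDevice
	if err := json.Unmarshal(data, &devs); err != nil {
		return nil, err
	}
	return devs, nil
}

// handoffViaTempDir simulates both steps using a temporary directory in
// place of the chroot.
func handoffViaTempDir(devs []rdmaDevice) ([]rdmaDevice, error) {
	dir, err := os.MkdirTemp("", "rdma-chroot-")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(dir)
	if err := writeRDMAData(dir, devs); err != nil {
		return nil, err
	}
	return readRDMAData(dir)
}

func main() {
	got, err := handoffViaTempDir([]rdmaDevice{{Name: "mlx5_0"}})
	if err != nil {
		fmt.Println("handoff failed:", err)
		return
	}
	fmt.Printf("%+v\n", got)
}
```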

google-cla Bot commented May 27, 2026

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@trantoji trantoji self-requested a review May 27, 2026 18:38
@trantoji
Contributor

Very Nice, thank you.

@trantoji trantoji requested a review from manninglucas May 27, 2026 18:41
… ib_write_bw and reconstruct in container. Gate on RDMAProxy config flag.
@atoniolo76 atoniolo76 force-pushed the alessio/serialize-rdma-sysfs-boot-minimal branch from 27977f3 to d34105a Compare May 27, 2026 20:10
@atoniolo76 atoniolo76 marked this pull request as ready for review May 29, 2026 19:33
2 participants