dkms: remove CONFIG_DMABUF_MOVENOTIFY gate for P2P enablement #210

Open
dbsanfte wants to merge 1 commit into ROCm:master from dbsanfte:fix/pex-pcie-switch-p2p-enable

@dbsanfte dbsanfte commented Mar 29, 2026

Motivation

This PR removes a config gate that blocks P2P enablement for GPUs behind PCIe switches on mainline kernels (e.g. Ubuntu).

I found this was necessary to enable P2P communication between my MI50s on Ubuntu 24.04.

Technical Details

NOTE: Summary by Opus 4.6

The CONFIG_DMABUF_MOVENOTIFY kernel config option was an AMD out-of-tree config that existed in older patched kernels. Since mainline kernel ~5.12, the DMA-buf move_notify callback is built-in unconditionally (part of struct dma_buf_ops) with no separate Kconfig gate.

On mainline kernels (tested on 6.14), CONFIG_DMABUF_MOVENOTIFY is never defined, which causes CONFIG_HSA_AMD_P2P to be disabled even when CONFIG_PCI_P2PDMA=y. This prevents GPU-to-GPU P2P access through PCIe switches (e.g., Broadcom PEX88096) because the IOMMU remap check in amdgpu_device_is_peer_accessible() is compiled out entirely.

Without CONFIG_HSA_AMD_P2P, the driver falls back to raw DMA mask address checking, which fails for GPUs behind PCIe switches whose BAR addresses (e.g., 62 TiB) exceed the range covered by the GPU's 44-bit DMA mask (16 TiB), even though IOMMU remapping would make P2P work correctly.

The fix removes the inner CONFIG_DMABUF_MOVENOTIFY check, keeping only the CONFIG_PCI_P2PDMA gate, which is the actual functional requirement for PCIe peer-to-peer DMA support.

Test Plan

Tested on:

  • 2x AMD Instinct MI50 32GB behind Broadcom PEX88096 Gen4 switch
  • Kernel 6.14.0-37-generic (Ubuntu mainline)
  • ROCm 6.4.2 with amdgpu DKMS 6.12.12

Test Result

Verified: KFD p2p_links, hipDeviceCanAccessPeer, P2P memcpy, and rocm-bandwidth-test bidirectional P2P are all functional after the fix.

Signed-off-by: Daniel Sanfte <dbsanfte@users.noreply.github.com>

@kentrussell kentrussell (Collaborator) commented

Looks like the fix got dropped during one of the kernel rebases. Amazing that no one has noticed it since it got missed back in 6.10. It should be MOVE_NOTIFY. I've reached out to the KCL team to get this fix brought back in. You can try to apply it yourself by changing CONFIG_DMABUF_MOVENOTIFY to CONFIG_DMABUF_MOVE_NOTIFY, and that should get you unblocked.

I'll leave this open until we get a release done that fixes it.
