libnvme: NVMe-oF controller ownership registry#3400
Conversation
|
just a quick thoughs as discussed on the matrix channel:
|
|
@igaw - Thanks for the review! On the context/owner naming: agreed, they are the same concept. I already have a follow-up planned that addresses this, so I'd like to handle it there rather than here to avoid a mid-flight rename. On the parallel access test and plugin conversion: both are good ideas. I'd like to keep this PR focused on the core feature introduction and handle those in a follow-up as well, if that's acceptable to you. The concurrent registry access should be safe, but I agree that a test that proves it is worth having. I see that there are conflicts with the PR that I need to resolve. Once I'm done fixing the conflicts, would you be OK merging this as-is with those items tracked as follow-up work? |
Introduce a cooperative registry that records which orchestrator (e.g. nvme-stas, nbft, nvme-discoverd) owns each connected NVMe-oF controller. The registry lives under /run/nvme/registry/ as one JSON file per live controller (e.g. nvme3.json). New public API (fabrics + json-c only): libnvmf_ctx_set_owner() - set the owner name on the global context libnvmf_registry_retrieve() - read a field from a controller's entry libnvmf_registry_update() - write/update a field atomically libnvmf_registry_delete() - remove a controller's entry Internal API (to be called from the connect path in fabrics.c): libnvmf_registry_create() - create a fresh entry on connect Writes are atomic: a temp file is created with mkstemp(), synced, then renamed over the destination in a single syscall. registry_dir() is a weak symbol so test binaries can redirect I/O to a mkdtemp() directory without any runtime conditionals in production code. The owner field on struct libnvme_global_ctx is guarded by CONFIG_FABRICS and freed in libnvme_free_global_ctx(). Signed-off-by: Martin Belanger <martin.belanger@dell.com> Assisted-by: Claude:claude-sonnet-4-6 [Claude Code]
When an owner name is set in the global context, a registry entry is automatically written to /run/nvme/registry/ after every successful fabrics connect. Signed-off-by: Martin Belanger <martin.belanger@dell.com> Assisted-by: Claude:claude-sonnet-4-6 [Claude Code]
Add four new built-in commands for managing the NVMeoF controller ownership registry: registry-list List all live registry entries registry-retrieve Read a field from a controller's entry registry-update Write a field to a controller's entry registry-delete Remove a controller's entry Modify disconnect-all to be ownership-aware by default: only controllers with no registry entry are disconnected. --owner NAME limits disconnection to controllers owned by NAME. --force restores the original unconditional behavior. --force and --owner are mutually exclusive. Signed-off-by: Martin Belanger <martin.belanger@dell.com> Assisted-by: Claude:claude-sonnet-4-6 [Claude Code]
Extend GlobalCtx with an owner parameter and read-only property. Add four standalone module-level functions: registry_retrieve(device) read entry as a dict; None if absent registry_update(device, key, value) write a field (creates entry if absent) registry_delete(device) remove an entry registry_entries() generator yielding live entries as dicts Signed-off-by: Martin Belanger <martin.belanger@dell.com> Assisted-by: Claude:claude-sonnet-4-6 [Claude Code]
Add an Orchestrator column to the controllers section of 'nvme list -v'. The column shows which orchestrator manages each controller: kernel PCIe and apple-nvme controllers (kernel-enumerated) stas/nbft/… Fabrics controllers registered in the ownership registry - Unowned fabrics controllers (not in the registry) Signed-off-by: Martin Belanger <martin.belanger@dell.com> Assisted-by: Claude:claude-sonnet-4-6 [Claude Code]
When the kernel removes an NVMe-oF controller, we need a way to remove entries from the registry to prevent accumulation of stale entries. The rule is gated on want_fabrics: it is absent in no-fabrics builds since only NVMe-oF controllers ever have registry entries. Signed-off-by: Martin Belanger <martin.belanger@dell.com> Assisted-by: Claude:claude-sonnet-4-6 [Claude Code]
Add man pages for the four new registry subcommands: nvme-registry-list(1) nvme-registry-retrieve(1) nvme-registry-update(1) nvme-registry-delete(1) Update nvme-disconnect-all(1) to document the new ownership-aware default behavior and the --owner, --force, and --transport options. The --transport option was already implemented but undocumented. Signed-off-by: Martin Belanger <martin.belanger@dell.com> Assisted-by: Claude:claude-sonnet-4-6 [Claude Code]
|
yes, I don't have an issue to go ahead as soon the core concept and API are sound. Adding tests cases later shouldn't really be a problem. |
This PR addresses: #2913
Multiple independent orchestrators (nvme-stas, NBFT, nvme-discoverd, manual nvme-cli) can establish NVMe-oF controller connections on the same host with no coordination.
nvme disconnect-allis indiscriminate and dangerous in such environments.This series introduces a cooperative userspace ownership registry so orchestrators can declare and respect ownership boundaries. Full design rationale and decisions are documented in: nvme-registry-plan.md.
What's new
libnvmf_registry_retrieve/update/delete/for_each), automatic registration on connect whenlibnvmf_context_set_owner()is set, Python bindings, unit tests70-nvmf-registry.rulesremoves stale entries on controller removal--ownerand--forceoptions"nbft"Commits
libnvme: add NVMe controller ownership registry(78c9055)libnvmf: register controller ownership on successful connect(6892470)nvme: add registry CLI commands and ownership-aware disconnect-all(1c4dbdf)libnvme: expose registry API and owner in Python bindings(43818fb)nvme: show orchestrator ownership in verbose list output(dc0823e)libnvme: install udev rule to clean up registry on controller removal(fa9e34b)nvme: add man pages for registry commands and update disconnect-all(e5f500c)