Add support for MutableCSINodeAllocatableCount #1066
/kind enhancement
/hold waiting for IaaS to clarify some API endpoints
The CSI driver lists all PCIe devices that are not of type VIRTIO_BLOCK_DEVICE and subtracts them from the theoretical maximum, so Kubernetes can report a correct, dynamic maximum number of volumes that can be attached to each node.
Signed-off-by: Niclas Schad <niclas.schad@stackit.cloud>
breuerfelix left a comment
This is a really complex PR: only a few lines of code, but you really have to understand what happens here and why.
Does it make sense to simulate this setup in an integration test with multiple different flavors?
klog.V(4).Infof("Determined node to support %d volumes", maxVolumesPerNode)

// always subtract one for every SKE node, because they always have a root partition
maxVolumesPerNode -= 1
Previously we subtracted 2: one for the root disk and one for the configDrive/spare.
Why are we no longer subtracting one for the configDrive/spare?
		}
	}
} else {
	klog.V(4).Infof("skipping class %s: path: %s", class, devPath)
Invert the check above and `continue` early when it does not have the prefix, so the else branch is not needed.
How to categorize this PR?
/kind enhancement
What this PR does / why we need it:
This PR updates the CSI driver to dynamically calculate the maximum allocatable volume count when the MutableCSINodeAllocatableCount feature gate is enabled.
Instead of relying on a static limit, the driver now scans the system for existing PCIe devices and subtracts them from the instance type's theoretical maximum PCIe lane capacity. This ensures that hardware overhead, such as additional Network Interface Cards (NICs), is accounted for, preventing volume attachment failures due to exhausted root ports.
This will also lead to better scheduling decisions by the kube-scheduler.
Which issue(s) this PR fixes:
Fixes #
Special notes for your reviewer:
Breaking changes: