Skip to content

Conversation

@welteki
Copy link
Member

@welteki welteki commented Dec 8, 2025

What type of PR is this?

/kind feature

What this PR does / why we need it:

Add support for custom CA bundles when communicating with Slicer APIs
to enable secure connections with self-signed certificates or internal
certificate authorities.

alexellis and others added 2 commits November 27, 2025 17:06
Slicer launches microVMs on KVM-enabled hosts with K3s and a
join token to add and remove new nodes dynamically, which are
fully isolated through Firecracker.

Signed-off-by: Alex Ellis (OpenFaaS Ltd) <alexellis2@gmail.com>
Add support for custom CA bundles when communicating with Slicer APIs
to enable secure connections with self-signed certificates or internal
certificate authorities.

Signed-off-by: Han Verstraete (OpenFaaS Ltd) <han@openfaas.com>
@reviewfn
Copy link

reviewfn bot commented Dec 8, 2025

AI Pull Request Overview

Summary

  • Adds support for Slicer cloud provider to cluster-autoscaler
  • Implements NodeGroup and CloudProvider interfaces for Slicer microVMs
  • Adds custom CA bundle configuration for secure API communication
  • Includes documentation and Helm chart updates
  • Introduces dependency on github.com/slicervm/sdk
  • Updates build tags and provider registration
  • Adds debug logging and temporary code that should be removed

Summary per file

Summary per file
File path Summary
cluster-autoscaler/README.md Adds Slicer to list of supported cloud providers
cluster-autoscaler/charts/cluster-autoscaler/Chart.yaml Bumps chart version to 9.52.2
cluster-autoscaler/charts/cluster-autoscaler/templates/clusterrole.yaml Adds slicer to providers requiring node delete permissions
cluster-autoscaler/cloudprovider/builder/builder_all.go Includes slicer in all-build with exclusion tag
cluster-autoscaler/cloudprovider/builder/builder_slicer.go New builder for slicer-only builds
cluster-autoscaler/cloudprovider/cloud_provider.go Adds SlicerProviderName constant
cluster-autoscaler/cloudprovider/slicer/README.md New documentation for Slicer cloud provider
cluster-autoscaler/cloudprovider/slicer/slicer_cloud_provider.go Main implementation of CloudProvider interface
cluster-autoscaler/cloudprovider/slicer/slicer_config.go Configuration parsing for cloud config file
cluster-autoscaler/cloudprovider/slicer/slicer_node_group.go Implementation of NodeGroup interface
cluster-autoscaler/core/static_autoscaler.go Adds debug print statement for target size
cluster-autoscaler/go.mod Adds github.com/slicervm/sdk dependency
cluster-autoscaler/go.sum Updates go.sum with new dependency

Overall Assessment

Overall Assessment

The PR successfully adds a new cloud provider for Slicer, enabling autoscaling on microVMs. The implementation follows the established patterns in the codebase. However, there are several issues that need addressing: inconsistent node sizing between creation and template, excessive logging at Info level, debug code left in production, and lack of tests. The CA bundle feature is a good security addition but should validate the file path. Overall, the PR is functional but requires cleanup and fixes before merging.

Detailed Review

Detailed Review

cluster-autoscaler/cloudprovider/slicer/slicer_cloud_provider.go

  • Logging verbosity: NodeGroupForNode (lines 161-196) and HasInstance methods contain excessive Info-level logging that should be changed to V(4) for debug purposes to avoid log spam in production.
  • CA bundle validation: In createHTTPClientWithCABundle, add validation to check if the CA bundle file exists and is readable before attempting to read it, to provide better error messages.
  • Architecture precedence: The arch override logic (lines 98-101) should be documented - config value takes precedence over API value, but this isn't clear from comments.

cluster-autoscaler/cloudprovider/slicer/slicer_node_group.go

  • Node sizing inconsistency: IncreaseSize hardcodes RAMGB=8 and CPUs=4, but TemplateNodeInfo fetches dynamic values from the API. This inconsistency could lead to mismatched resource allocation. Use the same sizing logic in both places.
  • API call efficiency: TemplateNodeInfo fetches hostgroup info on every call - consider caching this data since hostgroup details don't change frequently.
  • Hostname fallback: In DeleteNodes, using node.Name as fallback for slicerHostname is incorrect since node.Name is dynamic. Always require the slicer/hostname label and fail if missing.
  • Hardcoded constants: slicerRAMGB and slicerCPUs are hardcoded - these should be configurable or at least match the API values.

cluster-autoscaler/core/static_autoscaler.go

  • Debug code: Remove the debug print statement added (lines 300-305) as it's not suitable for production code.

Testing

  • Missing tests: No unit tests added for the new provider. At minimum, add tests for config parsing in slicer_config.go and basic functionality.

Security

  • Path validation: Ensure CA bundle path doesn't allow directory traversal attacks by validating it's an absolute path or within expected directories.

Performance

  • API call frequency: Nodes() method makes API calls for each node group - ensure this is acceptable for the expected scale.

Documentation

  • Configuration clarity: In slicer/README.md, clarify that ca-bundle applies globally to all nodegroups, not per-nodegroup.

AI agent details.

Agent processing time: 1m29.252s
Environment preparation time: 11.932s
Total time from webhook: 2m10.558s

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants