Skip to content

Adding AAA TACACS Support#176

Open
weneghawi wants to merge 4 commits intomainfrom
feature/aaa-tacacs-support
Open

Adding AAA TACACS Support#176
weneghawi wants to merge 4 commits intomainfrom
feature/aaa-tacacs-support

Conversation

@weneghawi
Copy link
Contributor

@weneghawi weneghawi commented Feb 4, 2026

Summary

Restructured the core AAA API to align with the OpenConfig system/aaa YANG model, making it vendor-agnostic and suitable for multi-vendor support (Nokia, Juniper, Arista, etc.). All Cisco NX-OS specific configuration has been moved to a dedicated AAAConfig provider CRD. RADIUS server group support has been added alongside the existing TACACS+ implementation.

Core API Changes (api/core/v1alpha1/aaa_types.go)

  • Server Groups: Replaced flat TACACSServers + TACACSGroup with ServerGroups []AAAServerGroup — protocol-agnostic containers with nested servers, following OpenConfig /system/aaa/server-groups/server-group. Supports both TACACS and RADIUS group types.
  • RADIUS Support: Added AAAServerRADIUS struct with authPort (default 1812), acctPort (default 1813), and keySecretRef
  • Method Lists: Flattened Authentication, Authorization, and Accounting to simple method lists (removed NX-OS specific nesting like Login.Default/Console and ConfigCommands)
  • Field Renames: VRF -> VrfName, SourceInterface -> SourceInterfaceName (leaves room for future object references)
  • Removed Cisco-specific fields: KeyEncryption, LoginErrorEnable moved to Cisco AAAConfig CRD
  • Added XValidation rules:
    • At least one of serverGroups, authentication, authorization, or accounting must be set
    • Servers in a TACACS group must have tacacs config
    • Servers in a RADIUS group must have radius config
    • groupName is required when method type is Group
    • deviceRef is immutable

Cisco AAAConfig CRD (api/cisco/nx/v1alpha1/aaaconfig_types.go)

  • Added ConsoleAuthentication *NXOSMethodList — NX-OS: aaa authentication login console
  • Added ConfigCommandsAuthorization *NXOSMethodList — NX-OS: aaa authorization config-commands default
  • Added RADIUSKeyEncryption type (Type6/Type7/Clear) with radiusKeyEncryption field (default Type7)
  • Retains KeyEncryption (Type6/Type7/Clear) and LoginErrorEnable

Controller (internal/controller/core/aaa_controller.go)

  • Updated secret loading to iterate nested ServerGroups[].Servers[].TACACS.KeySecretRef and ServerGroups[].Servers[].RADIUS.KeySecretRef
  • Updated secretToAAA watch mapping to trigger reconciliation on changes to both TACACS and RADIUS key secrets

NX-OS Provider (internal/provider/cisco/nxos/)

  • aaa.go: Added RadiusProvider, RadiusProviderGroup, RadiusProviderRef NX-OS DME structs. Added MapRADIUSKeyEncryption helper. Added groupTypeByName and MapRealmFromGroup to correctly resolve realm as "radius" or "tacacs" based on the referenced server group type. Removed read-only Name and Realm fields from AAADefaultAuthor (NX-OS rejects writes to these). Added MapNXOSRealm, MapNXOSLocal, MapNXOSFallback helpers. Note: RADIUS on NX-OS requires no feature flag (unlike TACACS+ which requires feature tacacs+).
  • provider.go: Rewrote EnsureAAA to iterate ServerGroups with a switch on group type covering both TACACS and RADIUS. Rewrote DeleteAAA with batched resets and RADIUS group/server cleanup. Changed from Patch to Update (netconf replace).

Sample YAML (config/samples/networking_v1alpha1_aaa.yaml)

  • Updated to demonstrate the new structure with serverGroups, nested servers, flat method lists, and separate Cisco AAAConfig with console/config-commands authorization.

Test Plan

  • go build ./... — compiles cleanly
  • go test ./api/... ./internal/provider/... ./internal/clientutil/... — all pass
  • make run-golangci-lint — 0 issues
  • make generate — CRDs and deepcopy regenerated
  • Verified full AAA lifecycle on NX-OS switch via gNMI (gnmic):
    • Enable TACACS+ feature, create servers + server group with VRF, source interface, server refs
    • Set default authentication, console authentication, config-commands authorization, accounting
    • Read back and verify all TACACS+ configuration
    • Reset all auth/authz/acct to local, delete group and servers, disable feature, verify clean state
    • Create RADIUS server + server group, set default auth realm to radius
    • Read back and verify all RADIUS configuration (confirmed: no feature flag needed on NX-OS)
    • Delete RADIUS group and server, verify clean state

@weneghawi weneghawi requested a review from a team as a code owner February 4, 2026 16:43
@hardikdr hardikdr added the area/metal-automation Automation processes within the Metal project. label Feb 5, 2026
@hardikdr hardikdr added this to Roadmap Feb 5, 2026
Copy link
Contributor

@felix-kaestner felix-kaestner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@weneghawi Please have a look at the openconfig-system:system/aaa yang model. Only configurations that are part of this or are otherwise commonly found on all vendors (Nokia, Juniper, Arista & Co.) should be part of the core api. All Cisco NX-OS specific configuration should be refactored into a vendor specific provider config, see e.g. the ManagementAccess resource on how this is done. There is a separate api package for cisco specific CRDs.

//
// It models the Authentication, Authorization, and Accounting (AAA) configuration on a network device,
// including TACACS+ server configuration and AAA group/method settings.
type AAASpec struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add some XValidation rules to verify that at least one of the fields in this spec is set? Otherwise, I'd be able to create a resource without any configuration.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Added XValidation on AAASpec:

rule="has(self.serverGroups) || has(self.authentication) || has(self.authorization) || has(self.accounting)"

Also added additional XValidation rules:

  • AAAServerGroup: servers in a TACACS group must have tacacs config
  • AAAMethod: groupName required when type is Group
  • DeviceRef: immutable

Comment on lines 74 to 80
// KeyEncryption specifies the encryption type for the key.
// Type7 is the Cisco Type 7 encryption (reversible).
// Type6 is the AES encryption (more secure).
// Clear means the key is sent in cleartext (not recommended).
// +optional
// +kubebuilder:validation:Enum=Type6;Type7;Clear
// +kubebuilder:default=Type7
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't want to include cisco specific stuff in the core api (we want this to be vendor agnostic). So at a minimum, we have to remove "Type7". Is Type6/AES universally supported across vendors (Nokia, Juniper, Arista etc.)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Removed KeyEncryption entirely from the core API. It now lives in the Cisco AAAConfig CRD (api/cisco/nx/v1alpha1/aaaconfig_types.go) where it belongs, since Type6/Type7/Clear are all Cisco-specific encryption types. The core API only references secrets via Kubernetes SecretKeySelector — the provider handles encryption.

// VRF is the VRF to use for communication with the TACACS+ servers.
// +optional
// +kubebuilder:validation:MaxLength=63
VRF string `json:"vrf,omitempty"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
VRF string `json:"vrf,omitempty"`
VrfName string `json:"vrfName,omitempty"`

Let's use a string name here, to make clear that we are talking about the name of the VRF directly. This would leave us the possibility have "VRF" field in the future, which could then be a resource reference to a "VRF" resource.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Renamed to VrfName string json:"vrfName,omitempty".

// SourceInterface is the source interface to use for communication with the TACACS+ servers.
// +optional
// +kubebuilder:validation:MaxLength=63
SourceInterface string `json:"sourceInterface,omitempty"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
SourceInterface string `json:"sourceInterface,omitempty"`
SourceInterfaceName string `json:"sourceInterfaceName,omitempty"`

Same reasoning as with the VRF, lets append the "Name" suffix to have the possibility for an "SourceInterface" object reference to a "Interface" resource.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Renamed to SourceInterfaceName string json:"sourceInterfaceName,omitempty".

Comment on lines 136 to 138
// LoginErrorEnable enables login error messages.
// +optional
LoginErrorEnable bool `json:"loginErrorEnable,omitempty"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This configuration seems cisco nx-os specific and should go into a provider config crd.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Moved to Cisco AAAConfig CRD. The provider reads it from cfg.Spec.LoginErrorEnable when building the NX-OS authentication config.

)

// TacacsPlusProvider represents a TACACS+ server host configuration.
// Path: System/userext-items/tacacsext-items/tacacsplusprovider-items/TacacsPlusProvider-list[name=<address>]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would leave out these comments as those are already clear from the XPath() functions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Removed all // Path: ... comments from the NX-OS model structs in internal/provider/cisco/nxos/aaa.go.

}

// Configure TACACS+ server hosts
for _, server := range req.AAA.Spec.TACACSServers {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With this logic, we are leaving over tacacs servers that were once part of the resource, but got removed in a later reconcile, as we only make sure to have those that are in the spec but never clean up any previous ones. We should do this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Partially addressed. By switching from Patch (merge) to Update (replace), the server group's ProviderRef-list is fully replaced on each reconcile — so removed servers are dropped from the group. However, the TacacsPlusProvider host entries themselves are individual list items and would need explicit deletion if removed from the spec. This could be addressed in a follow-up by reading current state from the device and diffing against the desired spec.

}

// MapRealmFromMethodType maps the API method type to NX-OS realm.
func MapRealmFromMethodType(method v1alpha1.AAAMethodType, groupName string) string {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The groupName parameter is never used.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Removed the groupName parameter from MapRealmFromMethodType.

return p.Patch(ctx, conf...)
}

func (p *Provider) DeleteAAA(ctx context.Context, req *provider.DeleteAAARequest) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please batch all the updates into a single call, similar to the EnsureAAA.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. DeleteAAA now batches all resets (auth, console auth, authorization, accounting) into a single p.Update() call, followed by group/server deletions.

conf = append(conf, acct)
}

return p.Patch(ctx, conf...)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is a Patch (netconf merge) really the operation we'd like to perform here? Or could this lead to leftover configurations that the operator didn't intent to and is unaware of? In general we prefer to do a Update (netconf replace) to make sure that the operator is in charge of all the configuration.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Changed EnsureAAA from p.Patch() to p.Update() (netconf replace). This ensures the operator fully owns the configuration and prevents leftover/unintended settings from persisting.

@weneghawi
Copy link
Contributor Author

@weneghawi Please have a look at the openconfig-system:system/aaa yang model. Only configurations that are part of this or are otherwise commonly found on all vendors (Nokia, Juniper, Arista & Co.) should be part of the core api. All Cisco NX-OS specific configuration should be refactored into a vendor specific provider config, see e.g. the ManagementAccess resource on how this is done. There is a separate api package for cisco specific CRDs.

Done. The core API (api/core/v1alpha1/aaa_types.go) has been completely restructured to follow the OpenConfig system/aaa YANG model:

  • ServerGroups with nested Servers maps to /system/aaa/server-groups/server-group
  • AAAServerGroupType (TACACS/RADIUS) maps to server-group/config/type
  • Flat Authentication, Authorization, Accounting method lists map to /system/aaa/authentication, /authorization, /accounting

All Cisco NX-OS specific config has been moved to the AAAConfig CRD in api/cisco/nx/v1alpha1/, including: KeyEncryption, LoginErrorEnable, ConsoleAuthentication, and ConfigCommandsAuthorization.

@weneghawi weneghawi force-pushed the feature/aaa-tacacs-support branch from 14c783c to c4937c1 Compare February 17, 2026 23:35
@weneghawi weneghawi force-pushed the feature/aaa-tacacs-support branch from 715fa10 to baa51de Compare February 18, 2026 14:41
@weneghawi weneghawi force-pushed the feature/aaa-tacacs-support branch from baa51de to 1d90839 Compare February 18, 2026 14:47
@weneghawi weneghawi force-pushed the feature/aaa-tacacs-support branch from ce3748b to 9b58290 Compare February 18, 2026 15:53
@github-actions
Copy link

Merging this branch will decrease overall coverage

Impacted Packages Coverage Δ 🤖
github.com/ironcore-dev/network-operator/api/cisco/nx/v1alpha1 0.00% (ø)
github.com/ironcore-dev/network-operator/api/core/v1alpha1 0.00% (ø)
github.com/ironcore-dev/network-operator/cmd 0.00% (ø)
github.com/ironcore-dev/network-operator/hack/provider 0.00% (ø)
github.com/ironcore-dev/network-operator/internal/controller/core 63.41% (-3.46%) 👎
github.com/ironcore-dev/network-operator/internal/deviceutil 40.79% (ø)
github.com/ironcore-dev/network-operator/internal/provider 48.00% (ø)
github.com/ironcore-dev/network-operator/internal/provider/cisco/nxos 10.15% (-0.76%) 👎
github.com/ironcore-dev/network-operator/internal/provisioning 77.36% (ø)
github.com/ironcore-dev/network-operator/test/e2e 0.00% (ø)
github.com/ironcore-dev/network-operator/test/lab 0.00% (ø)

Coverage by file

Changed files (no unit tests)

Changed File Coverage Δ Total Covered Missed 🤖
github.com/ironcore-dev/network-operator/api/cisco/nx/v1alpha1/aaaconfig_types.go 0.00% (ø) 0 0 0
github.com/ironcore-dev/network-operator/api/cisco/nx/v1alpha1/zz_generated.deepcopy.go 0.00% (ø) 0 0 0
github.com/ironcore-dev/network-operator/api/core/v1alpha1/aaa_types.go 0.00% (ø) 0 0 0
github.com/ironcore-dev/network-operator/api/core/v1alpha1/zz_generated.deepcopy.go 0.00% (ø) 0 0 0
github.com/ironcore-dev/network-operator/cmd/main.go 0.00% (ø) 0 0 0
github.com/ironcore-dev/network-operator/hack/provider/main.go 0.00% (ø) 0 0 0
github.com/ironcore-dev/network-operator/internal/controller/core/aaa_controller.go 0.00% (ø) 137 (+137) 0 137 (+137)
github.com/ironcore-dev/network-operator/internal/deviceutil/deviceutil.go 40.79% (ø) 76 31 45
github.com/ironcore-dev/network-operator/internal/provider/cisco/nxos/aaa.go 0.00% (ø) 52 (+52) 0 52 (+52)
github.com/ironcore-dev/network-operator/internal/provider/cisco/nxos/ascii.go 100.00% (ø) 7 7 0
github.com/ironcore-dev/network-operator/internal/provider/cisco/nxos/provider.go 0.06% (-0.00%) 1589 (+110) 1 1588 (+110) 👎
github.com/ironcore-dev/network-operator/internal/provider/cisco/nxos/user.go 40.00% (ø) 40 16 24
github.com/ironcore-dev/network-operator/internal/provider/provider.go 48.00% (ø) 25 12 13
github.com/ironcore-dev/network-operator/internal/provisioning/http.go 77.36% (ø) 265 205 60
github.com/ironcore-dev/network-operator/test/e2e/util.go 0.00% (ø) 0 0 0

Please note that the "Total", "Covered", and "Missed" counts above refer to code statements instead of lines of code. The value in brackets refers to the test coverage of that file in the old version of the code.

Changed unit test files

  • github.com/ironcore-dev/network-operator/test/lab/main_test.go

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/metal-automation Automation processes within the Metal project.

Projects

Status: No status

Development

Successfully merging this pull request may close these issues.

3 participants

Comments