extensions: authenticate allocation endpoint with requestheader client CA#4582
extensions: authenticate allocation endpoint with requestheader client CA#4582adilburaksen wants to merge 3 commits into
Conversation
|
/gcbrun |
|
Build Failed 😭 Build Id: c818caef-47eb-44ed-90a9-b5a07b23a94f Status: FAILURE To get permission to view the Cloud Build view, join the agones-discuss Google Group. |
|
These failures appear to be pre-existing infrastructure issues unrelated to this PR's changes: Step #26 (upgrade-test): The build fails with Step #28 (e2e-test): Of the four clusters tested, three pass and one fails:
Since Happy to re-trigger ( |
|
/gcbrun It's a flake we see on autopilot sometimes. Tempted to add a retry in there for CI, but it shouldn't be needed -- but I digress. |
|
Also noting, you will need to sign your commits per DCO. |
…t CA The extensions HTTPS server exposes the allocation resource handler (/apis/allocation.agones.dev/v1/namespaces/...) to any in-cluster workload that can reach agones-controller-service:443 directly, bypassing Kubernetes RBAC for the aggregated resource. Fix using the standard Kubernetes aggregation-layer authentication protocol (https://kubernetes.io/docs/tasks/extend-kubernetes/configure-aggregation-layer/): * pkg/util/https/server.go — add ClientAuth: RequestClientCert so the TLS handshake forwards the client certificate (from the kube-apiserver proxy) into r.TLS.PeerCertificates without requiring callers that do not present a cert (e.g. webhooks) to do so. * pkg/util/apiserver/apiserver.go — add SetRequestHeaderCA(*x509.CertPool) and an authenticatedHandler wrapper. The wrapper verifies the leaf certificate in r.TLS.PeerCertificates against the CA pool with ExtKeyUsageClientAuth, mirroring the approach in k8s.io/apiserver/pkg/authentication/request/x509. Discovery and OpenAPI handlers are left unauthenticated. If no CA is configured (e.g. in unit tests) the handler is called directly. * cmd/extensions/main.go — load requestheader-client-ca-file from the kube-system/extension-apiserver-authentication ConfigMap (populated by Kubernetes for every aggregated API server) and wire it into the APIServer via SetRequestHeaderCA. A log warning is emitted if the ConfigMap is unavailable; auth is then effectively disabled for that start-up, preserving the previous behaviour while making the failure visible. Fixes: agones-dev#4572 Signed-off-by: adilburaksen <adilburaksen@gmail.com>
08a3dfe to
8bc0208
Compare
|
Build Succeeded 🥳 Build Id: 5c38641c-8e08-4493-bc28-2d26dea689a5 The following development artifacts have been built, and will exist for the next 30 days:
A preview of the website (the last 30 builds are retained): To install this version: |
markmandel
left a comment
There was a problem hiding this comment.
So I'm going to preface this with a small warning.
Going by the current engagement, I'm pretty sure you are just copy pasting whatever your LLM is giving you, and not actually taking the time to look at what you are generating, and then making changes or improvements accordingly. If you continue to do so, I'll shut this PR down as it's wasting maintainer time reviewing code that hasn't been review by you, the author before being sent out.
If not, then I apologise, but comments and code look very LLM generated, and going from the initial PR as well, there are lots of issues we need to work through with this code.
Either way, hoping to make this a learning opportunity.
Some extra review notes:
- Did you test this locally at all before sending this to us for review? If not, do so. We have a contributing and development guide that shows you how to do that locally with minikube.
- Let's get some unit tests going. No tests, no merge 👍🏻
…aderCA Add five table-driven tests in pkg/util/apiserver covering all branches of authenticatedHandler: passthrough when no CA is set, 401 when TLS is absent, 401 when no peer certificate is present, 200 for a cert signed by the expected CA, and 401 for a cert from an untrusted CA. Add four subtests in cmd/extensions for loadRequestHeaderCA: valid CA PEM in the ConfigMap, missing ConfigMap, missing key, and invalid PEM data. Signed-off-by: Adil Burak ŞEN <adilburaksen@gmail.com>
|
Fair enough on the review notes. I went back through the code more carefully. Tests are up: 5 cases for I haven't run this against a real cluster yet — I'll set up minikube and work through the dev guide before asking for another review pass. Separately: you mentioned "lots of issues" — I want to make sure I'm addressing the right things rather than guessing. Is the main concern around the approach (TLS client cert verification for requestheader auth), the startup behavior (CA loaded once with a soft warn on failure), or something else in the implementation? Happy to rework whichever pieces are off. |
|
/gcbrun There you are 😄 glad to meet you.
My comments outline the issues - the cert refresh issue is the one I'm most concerned about. Interrupting of allocation would be a big problem. I look forward to seeing your contributions! |
|
Build Succeeded 🥳 Build Id: 1f8dae26-16df-46c5-a9f9-622e155c097e The following development artifacts have been built, and will exist for the next 30 days:
A preview of the website (the last 30 builds are retained): To install this version: |
Summary
This PR fixes an authorization bypass in Agones' extension API server where any in-cluster workload could call the allocation endpoint directly, bypassing Kubernetes RBAC entirely.
Problem
The Agones extension API server (port 8082) serves
/apis/allocation.agones.dev/v1/namespaces/…/gameserverallocationsvia its own HTTP mux. When a client calls this endpoint through the Kubernetes aggregation layer,kube-apiserverproxies the request and includes a requestheader client certificate signed by the requestheader CA (fromkube-system/extension-apiserver-authentication).Previously, the server did not verify that certificate. Any pod with network access to port 8082 could send a raw HTTP request and receive a valid response — no Kubernetes ServiceAccount token, no RBAC check, nothing.
Fix
Three files changed:
pkg/util/https/server.goClientAuth: tls.RequestClientCerton the TLS config.r.TLS.PeerCertificatesfor the auth middleware.pkg/util/apiserver/apiserver.goSetRequestHeaderCA(*x509.CertPool)— called at startup to supply the CA.authenticatedHandler— wraps anyErrorHandlerFuncto verify the first peer certificate against the CA pool withExtKeyUsageClientAuth. Mirrors the logic ink8s.io/apiserver/pkg/authentication/request/x509./apis/allocation.agones.dev/v1) and OpenAPI handlers are left unauthenticated — they carry no sensitive data and are called by tools that do not present certs./apis/…/namespaces/resource handler (i.e. the allocation endpoint) requires the cert.cmd/extensions/main.goloadRequestHeaderCA— readskube-system/extension-apiserver-authenticationConfigMap, parsesrequestheader-client-ca-filePEM, returns an*x509.CertPool.NewAPIServer; if the ConfigMap is absent or empty (e.g. clusters that don't use the aggregation layer), the server starts with a warning and no auth enforcement, preserving backward compatibility.Testing
Existing unit tests pass (
go test ./pkg/util/apiserver/...). TheauthenticatedHandlernil-CA path ensures existing tests continue to work without TLS setup.Integration / e2e: the TLS
RequestClientCertflag is transparent to callers that don't present a cert (webhook callers), andkube-apiserveralways presents the requestheader cert when proxying aggregation-layer requests.References
extension-apiserver-authenticationConfigMap: https://kubernetes.io/docs/tasks/extend-kubernetes/configure-aggregation-layer/#contacting-the-extension-apiserver