-
Notifications
You must be signed in to change notification settings - Fork 428
OCPBUGS-70152: skip HCP router LB when routes use apps domain #7419
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
Skipping CI for Draft Pull Request. |
WalkthroughThis PR introduces domain-aware routing for Hosted Control Plane by adding a Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~22 minutes ✨ Finishing touches
🧪 Generate unit tests (beta)
📜 Recent review detailsConfiguration used: Organization UI Review profile: CHILL Plan: Pro Cache: Disabled due to data retention organization setting Knowledge base: Disabled due to 📒 Files selected for processing (9)
💤 Files with no reviewable changes (1)
🧰 Additional context used📓 Path-based instructions (1)**⚙️ CodeRabbit configuration file
Files:
🧬 Code graph analysis (5)support/util/expose.go (2)
control-plane-operator/controllers/hostedcontrolplane/v2/router/component.go (5)
support/util/visibility_test.go (2)
control-plane-operator/controllers/hostedcontrolplane/hostedcontrolplane_controller.go (4)
support/util/visibility.go (2)
🪛 golangci-lint (2.5.0)support/util/expose_test.go[error] 228-228: : # github.com/openshift/hypershift/sync-global-pullsecret [github.com/openshift/hypershift/sync-global-pullsecret.test] (typecheck) control-plane-operator/controllers/hostedcontrolplane/v2/router/component_test.go[error] 228-228: : # github.com/openshift/hypershift/sync-global-pullsecret [github.com/openshift/hypershift/sync-global-pullsecret.test] (typecheck) ⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
🔇 Additional comments (9)
Comment |
|
@celebdor: This pull request references Jira Issue OCPBUGS-70152, which is valid. The bug has been moved to the POST state. 3 validation(s) were run on this bug
No GitHub users were found matching the public email listed for the QA contact in Jira (heli@redhat.com), skipping review request. The bug has been updated to refer to the pull request using the external bug tracker. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: celebdor The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
217f91f to
eb70031
Compare
When a HostedControlPlane uses Route publishing strategy with hostnames that are subdomains of the management cluster's default ingress domain (e.g., oauth.apps.example.com under apps.example.com), the HCP router LoadBalancer service should not be created. The management cluster's ingress controller can directly handle these routes. This fix introduces DNS label-based subdomain checking via IsSubdomain() to properly detect when route hostnames fall within the apps domain. The UseHCPRouter() function is now exported from the router component and checks: 1. Shared ingress mode (always skip router) 2. IBMCloud platform (always skip router) 3. Private/PublicAndPrivate HCPs (need router for internal routes) 4. Public HCPs with external DNS hostnames (need router) 5. Public HCPs with apps domain hostnames (skip router) The fix also adds DefaultIngressDomain to ControlPlaneContext and WorkloadContext structs to make the domain available to component predicates. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
eb70031 to
58c8de3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
control-plane-operator/controllers/hostedcontrolplane/hostedcontrolplane_controller.go (1)
174-174: EnsureDefaultIngressDomainenvironment variable is configured or add validation.
DefaultIngressDomainis initialized from theconfig.DefaultIngressDomainEnvVarenvironment variable inmain.go:451with no validation. When this environment variable is empty and no explicit hostname is provided,ShortenRouteHostnameIfNeeded(support/util/route.go:23-31) returns an empty string, leaving route hostnames unset. This affects the three route reconciliation functions at lines 1270, 1308, and 1322.Ensure the environment variable is always configured in deployment, or add validation/logging to catch when it's missing.
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting
📒 Files selected for processing (9)
control-plane-operator/controllers/hostedcontrolplane/hostedcontrolplane_controller.gocontrol-plane-operator/controllers/hostedcontrolplane/hostedcontrolplane_controller_test.gocontrol-plane-operator/controllers/hostedcontrolplane/v2/router/component.gocontrol-plane-operator/controllers/hostedcontrolplane/v2/router/component_test.gosupport/controlplane-component/controlplane-component.gosupport/util/expose.gosupport/util/expose_test.gosupport/util/visibility.gosupport/util/visibility_test.go
💤 Files with no reviewable changes (1)
- control-plane-operator/controllers/hostedcontrolplane/hostedcontrolplane_controller_test.go
🧰 Additional context used
📓 Path-based instructions (1)
**
⚙️ CodeRabbit configuration file
-Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity.
Files:
support/util/expose.gosupport/util/expose_test.gosupport/controlplane-component/controlplane-component.gocontrol-plane-operator/controllers/hostedcontrolplane/v2/router/component.gocontrol-plane-operator/controllers/hostedcontrolplane/v2/router/component_test.gosupport/util/visibility_test.gocontrol-plane-operator/controllers/hostedcontrolplane/hostedcontrolplane_controller.gosupport/util/visibility.go
🧬 Code graph analysis (5)
support/util/expose.go (2)
api/hypershift/v1beta1/hostedcluster_types.go (2)
ServiceType(930-930)Route(921-921)hypershift-operator/controllers/sharedingress/sharedingress_controller.go (1)
Hostname(384-390)
control-plane-operator/controllers/hostedcontrolplane/v2/router/component.go (5)
support/controlplane-component/generic-adapter.go (1)
WithPredicate(32-36)support/controlplane-component/controlplane-component.go (1)
WorkloadContext(74-93)hypershift-operator/controllers/sharedingress/sharedingress_controller.go (1)
UseSharedIngress(379-382)api/hypershift/v1beta1/hostedcluster_types.go (1)
IBMCloudPlatform(1217-1217)support/util/visibility.go (2)
IsPrivateHCP(9-20)IsPublicWithExternalDNS(69-74)
support/util/visibility_test.go (2)
hypershift-operator/controllers/sharedingress/sharedingress_controller.go (1)
Hostname(384-390)support/util/visibility.go (1)
IsPublicWithExternalDNS(69-74)
control-plane-operator/controllers/hostedcontrolplane/hostedcontrolplane_controller.go (4)
control-plane-operator/controllers/hostedcontrolplane/v2/router/component.go (1)
UseHCPRouter(63-71)support/util/visibility.go (1)
IsPublicWithExternalDNS(69-74)hypershift-operator/controllers/sharedingress/sharedingress_controller.go (1)
UseSharedIngress(379-382)api/hypershift/v1beta1/hostedcluster_types.go (1)
IBMCloudPlatform(1217-1217)
support/util/visibility.go (2)
support/util/expose.go (1)
UseDedicatedDNSWithExternalDomain(119-126)api/hypershift/v1beta1/hostedcluster_types.go (4)
APIServer(934-934)OAuthServer(940-940)Konnectivity(937-937)Ignition(943-943)
🪛 golangci-lint (2.5.0)
support/util/expose_test.go
[error] 228-228: : # github.com/openshift/hypershift/sync-global-pullsecret [github.com/openshift/hypershift/sync-global-pullsecret.test]
sync-global-pullsecret/sync-global-pullsecret_test.go:228:23: undefined: MockdbusConn
sync-global-pullsecret/sync-global-pullsecret_test.go:234:26: undefined: MockdbusConn
sync-global-pullsecret/sync-global-pullsecret_test.go:247:26: undefined: MockdbusConn
sync-global-pullsecret/sync-global-pullsecret_test.go:257:26: undefined: MockdbusConn
sync-global-pullsecret/sync-global-pullsecret_test.go:270:26: undefined: MockdbusConn
sync-global-pullsecret/sync-global-pullsecret_test.go:283:26: undefined: MockdbusConn
sync-global-pullsecret/sync-global-pullsecret_test.go:296:26: undefined: MockdbusConn
sync-global-pullsecret/sync-global-pullsecret_test.go:309:26: undefined: MockdbusConn
sync-global-pullsecret/sync-global-pullsecret_test.go:327:12: undefined: NewMockdbusConn
(typecheck)
control-plane-operator/controllers/hostedcontrolplane/v2/router/component_test.go
[error] 228-228: : # github.com/openshift/hypershift/sync-global-pullsecret [github.com/openshift/hypershift/sync-global-pullsecret.test]
sync-global-pullsecret/sync-global-pullsecret_test.go:228:23: undefined: MockdbusConn
sync-global-pullsecret/sync-global-pullsecret_test.go:234:26: undefined: MockdbusConn
sync-global-pullsecret/sync-global-pullsecret_test.go:247:26: undefined: MockdbusConn
sync-global-pullsecret/sync-global-pullsecret_test.go:257:26: undefined: MockdbusConn
sync-global-pullsecret/sync-global-pullsecret_test.go:270:26: undefined: MockdbusConn
sync-global-pullsecret/sync-global-pullsecret_test.go:283:26: undefined: MockdbusConn
sync-global-pullsecret/sync-global-pullsecret_test.go:296:26: undefined: MockdbusConn
sync-global-pullsecret/sync-global-pullsecret_test.go:309:26: undefined: MockdbusConn
sync-global-pullsecret/sync-global-pullsecret_test.go:327:12: undefined: NewMockdbusConn
(typecheck)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Red Hat Konflux / control-plane-operator-main-on-pull-request
- GitHub Check: Red Hat Konflux / hypershift-release-mce-211-on-pull-request
- GitHub Check: Red Hat Konflux / hypershift-cli-mce-211-on-pull-request
- GitHub Check: Red Hat Konflux / hypershift-operator-main-on-pull-request
🔇 Additional comments (9)
support/util/visibility.go (1)
65-74: LGTM! Domain-aware routing logic is well-structured.The function correctly checks for external DNS requirements by verifying if any service uses a Route hostname outside the management cluster's default ingress domain. This enables the fix for the regression where router LoadBalancer services were created unnecessarily for apps-domain routes.
support/util/expose.go (2)
87-113: LGTM! DNS subdomain logic is correctly implemented.The function properly performs DNS label-based subdomain matching:
- Handles edge cases (empty inputs, exact matches, partial label matches)
- Uses case-insensitive comparison as required for DNS
- Correctly validates label boundaries to avoid false positives like "oauthapps.example.com" matching "apps.example.com"
115-126: Review the tested behavior whenDefaultIngressDomainis empty.The concern is valid:
DefaultIngressDomaincan be empty when the environment variable is unset or when the ingress capability is missing (hostedcluster_controller.go:4390). When empty, this function returnstrue, treating all hostnames as external.However, this behavior is intentional and explicitly tested (expose_test.go:259-278). The conservative approach—assuming external DNS is needed when the default ingress domain is unknown—is reasonable for determining LoadBalancer service requirements.
Ensure callers (like
IsPublicWithExternalDNSused at hostedcontrolplane_controller.go:1409) handle the case whereDefaultIngressDomainis empty. Defensive checks already exist (v2/controlplaneoperator/deployment.go:159), but document this expected behavior if not already clear.support/util/visibility_test.go (1)
233-382: LGTM! Comprehensive test coverage for external DNS logic.The test cases thoroughly validate the domain-aware routing behavior across multiple scenarios including external DNS, apps domain routes, mixed configurations, and edge cases. This provides strong confidence in the correctness of the new
IsPublicWithExternalDNSfunction.support/util/expose_test.go (1)
94-298: LGTM! Excellent test coverage for DNS helpers.Both test functions provide comprehensive coverage:
TestIsSubdomainvalidates all edge cases including case-insensitivity, label boundaries, and empty inputsTestUseDedicatedDNSWithExternalDomainverifies the external domain detection logic across various service configurationsThe empty default ingress domain test case (lines 259-278) explicitly validates that all hostnames are treated as external when no domain is specified, documenting this important behavior.
support/controlplane-component/controlplane-component.go (1)
62-63: LGTM! Proper context propagation for default ingress domain.The
DefaultIngressDomainfield is correctly added to both context types and properly propagated fromControlPlaneContexttoWorkloadContext, following the established pattern for other context fields.Also applies to: 89-89, 109-109
control-plane-operator/controllers/hostedcontrolplane/v2/router/component.go (1)
52-71: LGTM! Router routing logic correctly implements domain-aware behavior.The
UseHCPRouterfunction properly determines when a dedicated HCP router is needed:
- Skips router for shared ingress and IBMCloud (platform constraints)
- Requires router for private HCPs (always need dedicated routing)
- Requires router only for public HCPs with external DNS hostnames (fixes the regression)
The refactoring from an internal predicate to a public function with explicit parameters improves testability and clarity.
control-plane-operator/controllers/hostedcontrolplane/v2/router/component_test.go (1)
10-255: LGTM! Comprehensive test coverage for router routing decisions.The test suite thoroughly validates
UseHCPRouterbehavior across:
- Multiple platforms (AWS, GCP, IBMCloud) with various endpoint access configurations
- External DNS vs. apps domain hostname scenarios
- Shared ingress mode
- Dynamic test generation for public platforms
This provides strong confidence that the domain-aware routing logic works correctly across all supported configurations.
control-plane-operator/controllers/hostedcontrolplane/hostedcontrolplane_controller.go (1)
1052-1052: LGTM! Consistent integration of domain-aware routing.The
DefaultIngressDomainis properly propagated through the reconciliation flow:
- Added to the control plane context (line 1105)
- Used in router predicate checks (lines 1052, 1380)
- Used in external DNS checks (lines 1409, 1520)
The changes maintain consistency with the new domain-aware routing model introduced by this PR.
Also applies to: 1105-1105, 1380-1380, 1409-1409, 1520-1520
|
@celebdor: The following test failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
bryan-cox
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@celebdor this looks good to me. One nit on test names not using the when/it should format:
Current:
"AWS private - needs router"
"proper subdomain"
Should be:
"When AWS HCP is private, it should need router"
"When hostname is proper subdomain, it should return true"
|
/test hypershift-operator-main-push |
|
@devguyio: The specified target(s) for The following commands are available to trigger optional jobs: Use DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
What this PR does / why we need it:
Fix regression where HCP router LoadBalancer service was created unexpectedly for configurations using Route hostnames under the management cluster's apps domain.
After upgrading HCP clusters, a
routerservice of type LoadBalancer is created unexpectedly. On platforms with limited LoadBalancer IPs (e.g., BareMetal with exhausted MetalLB IPAddressPool), the service stays in<pending>state and blocks the upgrade.The fix for OCPBUGS-56914 (PR #6780) changed the condition for creating the public router service from checking only if KAS uses Route to checking if ANY service uses Route with hostname. This caused the router service to be created for configurations that previously didn't need it.
This PR adds
IsPublicWithExternalDNS()function that checks if any service uses a Route with a hostname that is external to the management cluster's default ingress domain. When hostnames are subdomains of the apps domain (e.g.,oauth.apps.mgmt-cluster.example.com), the management cluster's default router can serve them via wildcard DNS, so no dedicated HCP router LoadBalancer service is needed.This preserves the OCPBUGS-56914 fix for external DNS users while preventing unnecessary router service creation for apps-domain routes.
Which issue(s) this PR fixes:
Fixes OCPBUGS-70152
Checklist
🤖 Generated with Claude Code via
/jira:solve OCPBUGS-70152 celebdor