feat: support upstream health checks in BackendTrafficPolicy#414
feat: support upstream health checks in BackendTrafficPolicy#414AlinsRan wants to merge 8 commits into
Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 WalkthroughWalkthroughThis PR extends BackendTrafficPolicy with upstream health check capabilities by adding API types (active/passive checks and thresholds), updating the CRD schema, generating deepcopy methods, translating policy fields into ADC upstream checks, adding unit and e2e tests, and updating docs. ChangesUpstream Health Check Support for BackendTrafficPolicy
🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks | ✅ 5 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Pull request overview
This PR adds upstream health check configuration support to BackendTrafficPolicy (v1alpha1) so Gateway API users can configure APISIX active/passive upstream health checks through policy attachment during translation to ADC upstreams.
Changes:
- Introduces
HealthCheck(active/passive) types and aHealthCheckfield onBackendTrafficPolicySpec(v1alpha1). - Adds translation logic to map
v1alpha1.HealthCheckinto ADCUpstreamHealthCheckwhen attachingBackendTrafficPolicyto an upstream. - Adds unit coverage for health check translation and updates generated DeepCopy + CRD schema.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
internal/adc/translator/policies.go |
Adds BTP → ADC upstream health check translation and attaches it to upstreams. |
internal/adc/translator/httproute_test.go |
Adds unit tests for BTP health check translation (active/passive + strictTLS). |
api/v1alpha1/backendtrafficpolicy_types.go |
Adds new v1alpha1 health check API types and spec.healthCheck. |
api/v1alpha1/zz_generated.deepcopy.go |
Regenerates DeepCopy methods for the new v1alpha1 types. |
config/crd/bases/apisix.apache.org_backendtrafficpolicies.yaml |
Updates CRD OpenAPI schema for the new healthCheck fields. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@api/v1alpha1/backendtrafficpolicy_types.go`:
- Line 265: The struct field Timeouts in backendtrafficpolicy_types.go has a
mismatched JSON tag (json:"timeout,omitempty"); change the tag to
json:"timeouts,omitempty" so the field and serialized name align (i.e., keep the
field name Timeouts and update its tag), then run code generation to update CRD
manifests and deepcopy code (make generate).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 23738847-ed1e-4850-8bbe-9e4f44fa97ac
📒 Files selected for processing (5)
api/v1alpha1/backendtrafficpolicy_types.goapi/v1alpha1/zz_generated.deepcopy.goconfig/crd/bases/apisix.apache.org_backendtrafficpolicies.yamlinternal/adc/translator/httproute_test.gointernal/adc/translator/policies.go
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
test/e2e/crds/v1alpha1/backendtrafficpolicy.go (1)
66-76: ⚡ Quick winReplace fixed sleeps with condition-based waiting.
The hard-coded
time.Sleepcalls make these e2e specs slow and flaky under variable reconciliation timing. Prefer polling assertions with bounded timeout/interval until the expected state is observed.Also applies to: 235-236, 271-272, 310-311, 327-327
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/e2e/crds/v1alpha1/backendtrafficpolicy.go` around lines 66 - 76, Replace the fixed time.Sleep calls after resource creation with condition-based polling: after calling s.CreateResourceFromString(...) for the GatewayClass (using s.GetGatewayClassYaml()) and for the Gateway (using s.GetGatewayYaml()), remove the time.Sleep(5 * time.Second) and instead wait until the created resource reaches the expected reconciliation state using a polling assertion (e.g., Eventually/Wait with a bounded timeout/interval) that checks the resource readiness/observed status via the test harness helper (or implement a small s.WaitForResourceCondition/Get and assert the expected condition); apply the same replacement for the other occurrences noted (around lines 235-236, 271-272, 310-311, 327) so tests poll for the actual ready state instead of sleeping.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@test/e2e/crds/v1alpha1/backendtrafficpolicy.go`:
- Around line 241-247: The loop picks the first upstream with Checks != nil
(using ups and target of type adctypes.Upstream) which may be for a different
backend; change the selection to find the upstream that belongs to the
deterministic backend (e.g. match the upstream identity to
"httpbin-service-e2e-test" by checking the upstream’s owner/target/service name
field) and only set/assert against that matching upstream (target). Apply the
same scoped match (instead of the any-upstream check) in the other two test
locations mentioned so each assertion operates on the upstream that actually
belongs to the intended backend.
---
Nitpick comments:
In `@test/e2e/crds/v1alpha1/backendtrafficpolicy.go`:
- Around line 66-76: Replace the fixed time.Sleep calls after resource creation
with condition-based polling: after calling s.CreateResourceFromString(...) for
the GatewayClass (using s.GetGatewayClassYaml()) and for the Gateway (using
s.GetGatewayYaml()), remove the time.Sleep(5 * time.Second) and instead wait
until the created resource reaches the expected reconciliation state using a
polling assertion (e.g., Eventually/Wait with a bounded timeout/interval) that
checks the resource readiness/observed status via the test harness helper (or
implement a small s.WaitForResourceCondition/Get and assert the expected
condition); apply the same replacement for the other occurrences noted (around
lines 235-236, 271-272, 310-311, 327) so tests poll for the actual ready state
instead of sleeping.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 629a0dc7-9c53-4fd7-be46-500a27672a13
📒 Files selected for processing (1)
test/e2e/crds/v1alpha1/backendtrafficpolicy.go
conformance test report - apisix modeapiVersion: gateway.networking.k8s.io/v1
date: "2026-05-12T21:45:44Z"
gatewayAPIChannel: experimental
gatewayAPIVersion: v1.3.0
implementation:
contact: null
organization: APISIX
project: apisix-ingress-controller
url: https://github.com/apache/apisix-ingress-controller.git
version: v2.0.0
kind: ConformanceReport
mode: default
profiles:
- core:
result: success
statistics:
Failed: 0
Passed: 12
Skipped: 0
name: GATEWAY-GRPC
summary: Core tests succeeded.
- core:
failedTests:
- HTTPRouteInvalidBackendRefUnknownKind
result: failure
skippedTests:
- HTTPRouteHTTPSListener
statistics:
Failed: 1
Passed: 31
Skipped: 1
extended:
result: partial
skippedTests:
- HTTPRouteRedirectPortAndScheme
statistics:
Failed: 0
Passed: 11
Skipped: 1
supportedFeatures:
- GatewayAddressEmpty
- GatewayPort8080
- HTTPRouteBackendProtocolWebSocket
- HTTPRouteDestinationPortMatching
- HTTPRouteHostRewrite
- HTTPRouteMethodMatching
- HTTPRoutePathRewrite
- HTTPRoutePortRedirect
- HTTPRouteQueryParamMatching
- HTTPRouteRequestMirror
- HTTPRouteResponseHeaderModification
- HTTPRouteSchemeRedirect
unsupportedFeatures:
- GatewayHTTPListenerIsolation
- GatewayInfrastructurePropagation
- GatewayStaticAddresses
- HTTPRouteBackendProtocolH2C
- HTTPRouteBackendRequestHeaderModification
- HTTPRouteBackendTimeout
- HTTPRouteParentRefPort
- HTTPRoutePathRedirect
- HTTPRouteRequestMultipleMirrors
- HTTPRouteRequestPercentageMirror
- HTTPRouteRequestTimeout
name: GATEWAY-HTTP
summary: Core tests failed with 1 test failures. Extended tests partially succeeded
with 1 test skips.
- core:
result: partial
skippedTests:
- TLSRouteSimpleSameNamespace
statistics:
Failed: 0
Passed: 10
Skipped: 1
name: GATEWAY-TLS
summary: Core tests partially succeeded with 1 test skips. |
conformance test report - apisix-standalone modeapiVersion: gateway.networking.k8s.io/v1
date: "2026-05-12T21:46:40Z"
gatewayAPIChannel: experimental
gatewayAPIVersion: v1.3.0
implementation:
contact: null
organization: APISIX
project: apisix-ingress-controller
url: https://github.com/apache/apisix-ingress-controller.git
version: v2.0.0
kind: ConformanceReport
mode: default
profiles:
- core:
result: success
statistics:
Failed: 0
Passed: 12
Skipped: 0
name: GATEWAY-GRPC
summary: Core tests succeeded.
- core:
failedTests:
- HTTPRouteMatchingAcrossRoutes
result: failure
skippedTests:
- HTTPRouteHTTPSListener
statistics:
Failed: 1
Passed: 31
Skipped: 1
extended:
result: partial
skippedTests:
- HTTPRouteRedirectPortAndScheme
statistics:
Failed: 0
Passed: 11
Skipped: 1
supportedFeatures:
- GatewayAddressEmpty
- GatewayPort8080
- HTTPRouteBackendProtocolWebSocket
- HTTPRouteDestinationPortMatching
- HTTPRouteHostRewrite
- HTTPRouteMethodMatching
- HTTPRoutePathRewrite
- HTTPRoutePortRedirect
- HTTPRouteQueryParamMatching
- HTTPRouteRequestMirror
- HTTPRouteResponseHeaderModification
- HTTPRouteSchemeRedirect
unsupportedFeatures:
- GatewayHTTPListenerIsolation
- GatewayInfrastructurePropagation
- GatewayStaticAddresses
- HTTPRouteBackendProtocolH2C
- HTTPRouteBackendRequestHeaderModification
- HTTPRouteBackendTimeout
- HTTPRouteParentRefPort
- HTTPRoutePathRedirect
- HTTPRouteRequestMultipleMirrors
- HTTPRouteRequestPercentageMirror
- HTTPRouteRequestTimeout
name: GATEWAY-HTTP
summary: Core tests failed with 1 test failures. Extended tests partially succeeded
with 1 test skips.
- core:
result: partial
skippedTests:
- TLSRouteSimpleSameNamespace
statistics:
Failed: 0
Passed: 10
Skipped: 1
name: GATEWAY-TLS
summary: Core tests partially succeeded with 1 test skips. |
conformance test reportapiVersion: gateway.networking.k8s.io/v1
date: "2026-05-12T22:06:17Z"
gatewayAPIChannel: experimental
gatewayAPIVersion: v1.3.0
implementation:
contact: null
organization: APISIX
project: apisix-ingress-controller
url: https://github.com/apache/apisix-ingress-controller.git
version: v2.0.0
kind: ConformanceReport
mode: default
profiles:
- core:
failedTests:
- GatewayModifyListeners
result: failure
skippedTests:
- HTTPRouteHTTPSListener
statistics:
Failed: 1
Passed: 31
Skipped: 1
extended:
failedTests:
- HTTPRouteBackendProtocolWebSocket
result: failure
skippedTests:
- HTTPRouteRedirectPortAndScheme
statistics:
Failed: 1
Passed: 10
Skipped: 1
supportedFeatures:
- GatewayAddressEmpty
- GatewayPort8080
- HTTPRouteBackendProtocolWebSocket
- HTTPRouteDestinationPortMatching
- HTTPRouteHostRewrite
- HTTPRouteMethodMatching
- HTTPRoutePathRewrite
- HTTPRoutePortRedirect
- HTTPRouteQueryParamMatching
- HTTPRouteRequestMirror
- HTTPRouteResponseHeaderModification
- HTTPRouteSchemeRedirect
unsupportedFeatures:
- GatewayHTTPListenerIsolation
- GatewayInfrastructurePropagation
- GatewayStaticAddresses
- HTTPRouteBackendProtocolH2C
- HTTPRouteBackendRequestHeaderModification
- HTTPRouteBackendTimeout
- HTTPRouteParentRefPort
- HTTPRoutePathRedirect
- HTTPRouteRequestMultipleMirrors
- HTTPRouteRequestPercentageMirror
- HTTPRouteRequestTimeout
name: GATEWAY-HTTP
summary: Core tests failed with 1 test failures. Extended tests failed with 1 test
failures.
- core:
failedTests:
- GatewayModifyListeners
- TLSRouteSimpleSameNamespace
result: failure
statistics:
Failed: 2
Passed: 9
Skipped: 0
name: GATEWAY-TLS
summary: Core tests failed with 2 test failures.
- core:
failedTests:
- GRPCExactMethodMatching
- GRPCRouteHeaderMatching
- GRPCRouteListenerHostnameMatching
- GatewayModifyListeners
result: failure
statistics:
Failed: 4
Passed: 8
Skipped: 0
name: GATEWAY-GRPC
summary: Core tests failed with 4 test failures. |
- Fix Timeouts field JSON tag: timeout -> timeouts to match ADC type - Enforce minimum 1s interval in translateBTPActiveHealthCheck to prevent sub-second values from truncating to 0 (matching v2/apisixupstream behavior) - Scope e2e upstream assertions to httpbin-service-e2e-test by name - Regenerate CRD manifests and API reference docs Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@test/e2e/crds/v1alpha1/backendtrafficpolicy.go`:
- Around line 331-337: The test currently loops over ups returned by
s.DefaultDataplaneResource().Upstream().List and asserts checks are nil only for
matching names, but if no upstream matches the assertion is never executed; add
a sentinel (e.g., found or matchCount) initialized before the loop, set it
true/increment when strings.Contains(u.Name, "httpbin-service-e2e-test") and
still run Expect(u.Checks).To(BeNil()) inside that branch, then after the loop
add Expect(found).To(BeTrue(), "expected at least one upstream matching
'httpbin-service-e2e-test' to validate deletion") so the test fails when the
target upstream is missing.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 203c29f9-428b-4867-b084-cba98b03feb6
📒 Files selected for processing (5)
api/v1alpha1/backendtrafficpolicy_types.goconfig/crd/bases/apisix.apache.org_backendtrafficpolicies.yamldocs/en/latest/reference/api-reference.mdinternal/adc/translator/policies.gotest/e2e/crds/v1alpha1/backendtrafficpolicy.go
✅ Files skipped from review due to trivial changes (1)
- docs/en/latest/reference/api-reference.md
🚧 Files skipped from review as they are similar to previous changes (2)
- internal/adc/translator/policies.go
- config/crd/bases/apisix.apache.org_backendtrafficpolicies.yaml
| ups, err = s.DefaultDataplaneResource().Upstream().List(context.Background()) | ||
| Expect(err).ToNot(HaveOccurred()) | ||
| for _, u := range ups { | ||
| if strings.Contains(u.Name, "httpbin-service-e2e-test") { | ||
| Expect(u.Checks).To(BeNil(), "upstream should not have health check after policy deletion") | ||
| } | ||
| } |
There was a problem hiding this comment.
Ensure the deletion assertion fails when target upstream is missing.
Right now, if no upstream name matches, the loop body never runs and the test can pass without validating post-delete state.
Proposed fix
ups, err = s.DefaultDataplaneResource().Upstream().List(context.Background())
Expect(err).ToNot(HaveOccurred())
+ foundTarget := false
for _, u := range ups {
if strings.Contains(u.Name, "httpbin-service-e2e-test") {
+ foundTarget = true
Expect(u.Checks).To(BeNil(), "upstream should not have health check after policy deletion")
}
}
+ Expect(foundTarget).To(BeTrue(), "target upstream should exist after policy deletion")📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| ups, err = s.DefaultDataplaneResource().Upstream().List(context.Background()) | |
| Expect(err).ToNot(HaveOccurred()) | |
| for _, u := range ups { | |
| if strings.Contains(u.Name, "httpbin-service-e2e-test") { | |
| Expect(u.Checks).To(BeNil(), "upstream should not have health check after policy deletion") | |
| } | |
| } | |
| ups, err = s.DefaultDataplaneResource().Upstream().List(context.Background()) | |
| Expect(err).ToNot(HaveOccurred()) | |
| foundTarget := false | |
| for _, u := range ups { | |
| if strings.Contains(u.Name, "httpbin-service-e2e-test") { | |
| foundTarget = true | |
| Expect(u.Checks).To(BeNil(), "upstream should not have health check after policy deletion") | |
| } | |
| } | |
| Expect(foundTarget).To(BeTrue(), "target upstream should exist after policy deletion") |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@test/e2e/crds/v1alpha1/backendtrafficpolicy.go` around lines 331 - 337, The
test currently loops over ups returned by
s.DefaultDataplaneResource().Upstream().List and asserts checks are nil only for
matching names, but if no upstream matches the assertion is never executed; add
a sentinel (e.g., found or matchCount) initialized before the loop, set it
true/increment when strings.Contains(u.Name, "httpbin-service-e2e-test") and
still run Expect(u.Checks).To(BeNil()) inside that branch, then after the loop
add Expect(found).To(BeTrue(), "expected at least one upstream matching
'httpbin-service-e2e-test' to validate deletion") so the test fails when the
target upstream is missing.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 6 out of 7 changed files in this pull request and generated 3 comments.
Files not reviewed (1)
- api/v1alpha1/zz_generated.deepcopy.go: Language not supported
Comments suppressed due to low confidence (1)
api/v1alpha1/backendtrafficpolicy_types.go:216
- The interval fields accept any Duration, but translation rounds down to whole seconds and coerces anything < 1s up to 1s. Consider adding CRD validation to constrain intervals to whole seconds and enforce the documented minimum (1s) to avoid surprising truncation/coercion (apply to both healthy and unhealthy intervals).
// Interval defines the time between health check probes.
// Minimum is 1s.
Interval metav1.Duration `json:"interval,omitempty" yaml:"interval,omitempty"`
| "github.com/apache/apisix-ingress-controller/api/v1alpha1" | ||
| ) | ||
|
|
||
| const _minHealthCheckInterval = time.Second |
| // +optional | ||
| Type string `json:"type,omitempty" yaml:"type,omitempty"` | ||
|
|
||
| // Timeout sets health check timeout. |
| interval := config.Healthy.Interval.Duration | ||
| if interval < _minHealthCheckInterval { | ||
| interval = _minHealthCheckInterval | ||
| } | ||
| active.Healthy = adctypes.UpstreamActiveHealthCheckHealthy{ |
Upstream Name is cleared when embedded in a service object (httproute translator sets Name="" for single-backend upstreams). Filtering by name always fails; rely on Checks field instead. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Summary
Closes #402
HealthCheck,ActiveHealthCheck,PassiveHealthCheckand related types toapi/v1alpha1, independent of theapi/v2types used byApisixUpstreamHealthCheck *HealthCheckfield toBackendTrafficPolicySpec, enabling Gateway API users to configure active and passive upstream health checks throughBackendTrafficPolicyv1alpha1.HealthCheckto ADCUpstreamHealthCheckinattachBackendTrafficPolicyToUpstreamChanges
api/v1alpha1/backendtrafficpolicy_types.go— newHealthCheckfield + 7 new typesapi/v1alpha1/zz_generated.deepcopy.go— regenerated DeepCopyinternal/adc/translator/policies.go— health check translation logic (translateBTPHealthCheck,translateBTPActiveHealthCheck,translateBTPPassiveHealthCheck)internal/adc/translator/httproute_test.go— unit tests for health check translationtest/e2e/crds/v1alpha1/backendtrafficpolicy.go— e2e tests for health check lifecycleconfig/crd/bases/apisix.apache.org_backendtrafficpolicies.yaml— updated CRD schemaUsage Examples
Active health check only
Proactively probe
/healthzevery second. A node is marked healthy after 2 consecutive 200 responses, unhealthy after 3 consecutive 5xx responses.Active and passive health checks
Use active probes for periodic detection, and passive checks to react immediately to failures observed in live traffic.
HTTPS backend with strict TLS
For HTTPS upstream backends, enable strict TLS certificate validation on health check probes.
Test Plan
TestAttachBackendTrafficPolicyHealthCheckcovers nil, active-only, passive+active, andstrictTLS=falsecasesgo build ./...passesgo test ./...(unit tests) passesSummary by CodeRabbit
New Features
Documentation
Tests