Better support for non-deterministic external-names by updating critical annotations#850
Better support for non-deterministic external-names by updating critical annotations#850twobiers wants to merge 10 commits intocrossplane:mainfrom
Conversation
f4fb5c6 to
15df832
Compare
|
@twobiers, we are interested in progressing the solution you propose in this PR. Are you currently able to give this attention? If so, would you mind resolving the conflicts and pushing up an updated version? If you aren't currently able to give this attention would you be happy for us to take it over and get it over the line (you'll be credited with the contribution)? |
15df832 to
c14e6da
Compare
|
@jeanduplessis Thanks for your kind comment. I'm able to work on this and have rebased the branch onto master. |
|
Thanks, @twobiers. @erhancagirici will be doing a thorough review in the next few days, with a specific focus on making sure it's compatible with Upjet's async approach. |
|
Closing and reopening to kick the CI |
erhancagirici
left a comment
There was a problem hiding this comment.
Thanks for the PR! I've left some comments regarding the mechanics, otherwise LGTM conceptually. As you mentioned in the doc comments, this will be sort of a band-aid until a more async-aware implementation.
| // after the creation, but later as part of an asynchronous process. | ||
| // When Crossplane supports asynchronous creation of resources natively, this logic | ||
| // might not be needed anymore and can be revisited. | ||
| if err := r.managed.UpdateCriticalAnnotations(ctx, managed); err != nil { |
There was a problem hiding this comment.
a thing to consider (possible nit):
most reconciliation loops will enter this codepath regardless of a need to update critical annotations. I wonder if this one is no-op in terms of k8s API access or brings some extra load on the apiserver.
Especially in async creations with long-running creation times, currently (with no native async support) the external clients return observation.ResourceExists as true to avoid further actions for the observations during creation of the resource.
There was a problem hiding this comment.
Perhaps we should make writing these annotations optional. I'd suggest making them opt-out since that's the safest path. The option would be specified by the provider author - so they can disable these annotations if they know for sure they're not needed (i.e. naming is deterministic, API is strongly consistent). I remember discussing this with someone recently, but can't find a tracking issue.
(If we do make them optional, I think we could do it pretty easily by injecting a no-op implementation of the annotation updater.)
There was a problem hiding this comment.
It should also be straightforward to keep a state in the reconciler whether a critical annotation was added/changed and depending on that update the annotations or perform a no-op.
On the other hand I'm wondering whether this kind of optimization is needed for "Critical" annotations. Shouldn't the priority be here to add those annotations? Otherwise it would leave room for errors.
| err := retry.OnError(retry.DefaultRetry, func(err error) bool { | ||
| return !errors.Is(err, context.Canceled) | ||
| }, func() error { | ||
| err := u.client.Update(ctx, o) |
There was a problem hiding this comment.
I like this update operation is being replaced by SSA.
| return err | ||
| } | ||
|
|
||
| err = u.client.Patch(ctx, o, client.RawPatch(types.MergePatchType, patchData), client.FieldOwner(fieldOwnerAPISimpleRefResolver), client.ForceOwnership) |
There was a problem hiding this comment.
We had better use a different manager name than managed.crossplane.io/api-simple-reference-resolver as the annotations have nothing to do with the API resolver. Maybe something like: managed.crossplane.io/critical-annotation-updater?
There was a problem hiding this comment.
Would it make sense for the MR reconciler to just use one manager name, regardless of operation?
Possibly too late if we're already using api-simple-reference-resolver for some.
| if observation.ResourceExists { | ||
| // When a resource exists or is just created, it might have received | ||
| // a non-deterministic external name after its creation, which we need to persist. | ||
| // We do this by updating the critical annotations. | ||
| // This is needed because some resources might not receive an external-name directly | ||
| // after the creation, but later as part of an asynchronous process. | ||
| // When Crossplane supports asynchronous creation of resources natively, this logic | ||
| // might not be needed anymore and can be revisited. | ||
| if err := r.managed.UpdateCriticalAnnotations(ctx, managed); err != nil { | ||
| log.Debug(errUpdateManagedAnnotations, "error", err) | ||
| record.Event(managed, event.Warning(reasonCannotUpdateManaged, errors.Wrap(err, errUpdateManagedAnnotations))) | ||
| return reconcile.Result{Requeue: true}, errors.Wrap(r.client.Status().Update(ctx, managed), errUpdateManagedAnnotations) | ||
| } | ||
| } |
There was a problem hiding this comment.
As @lsviben explains here, upjet should not be relying on setting ResourceLateInitialized to get the critical annotations updated in the first place. The async mode implemented by upjet breaks the assumptions of the managed reconciler. I believe we need to first address this discrepancy between upjet and the managed reconciler...
|
Warning Rate limit exceeded
Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 49 minutes and 25 seconds. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (3)
📝 WalkthroughWalkthroughRefactors critical-annotation persistence to patch only Changes
Sequence DiagramsequenceDiagram
participant Reconciler
participant UpdateCritical as UpdateCriticalAnnotations
participant Client
participant APIServer
Reconciler->>UpdateCritical: UpdateCriticalAnnotations(obj)
UpdateCritical->>UpdateCritical: Extract critical<br/>annotations into map
alt No critical annotations
UpdateCritical-->>Reconciler: return
else Annotations present
UpdateCritical->>Client: Patch(MergePatchType,<br/>{"metadata":{"annotations":{...}}}, fieldOwner, force)
alt Patch succeeds
Client->>APIServer: apply merge patch
APIServer-->>Client: updated object
Client-->>UpdateCritical: success
UpdateCritical-->>Reconciler: return
else Conflict error
Client-->>UpdateCritical: conflict
UpdateCritical->>Client: Get(latest)
Client->>APIServer: fetch latest
APIServer-->>Client: latest object
Client-->>UpdateCritical: object
UpdateCritical->>UpdateCritical: re-add critical annotations to object
UpdateCritical->>Client: Patch(MergePatchType,...)
Client->>APIServer: retry merge patch
APIServer-->>Client: updated
Client-->>UpdateCritical: success
UpdateCritical-->>Reconciler: return
end
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes If you want, I can run through the specific merge-patch JSON construction and the conflict-retry flow line-by-line — would you like that? Thanks for the change; this looks well scoped. 🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
pkg/reconciler/managed/api_test.go (1)
610-623: The "Success" test case appears to be misconfigured - it expects an error.The test case named "Success" at lines 610-623 is configured with
MockPatch: test.NewMockPatchFn(errBoom)which returns an error, and the expected result iserr: errors.Wrap(errBoom, errUpdateCriticalAnnotations). This contradicts the test name and reason which states "We should return without error if we successfully update our annotations."This appears to be a copy-paste issue from the "UpdateError" test case above it.
Apply this diff to fix the test:
"Success": { reason: "We should return without error if we successfully update our annotations", c: &test.MockClient{ MockGet: test.NewMockGetFn(nil, setAnnotations), - MockPatch: test.NewMockPatchFn(errBoom), + MockPatch: test.NewMockPatchFn(nil), }, args: args{ o: &fake.LegacyManaged{}, }, want: want{ - err: errors.Wrap(errBoom, errUpdateCriticalAnnotations), + err: nil, o: &fake.LegacyManaged{}, }, },
🧹 Nitpick comments (2)
pkg/reconciler/managed/api.go (1)
58-65: Global variable is intentional for configuration, but consider using constants or a function.The
gochecknoglobalslint warning is expected here. Since this list is read-only and defines critical configuration, it's acceptable as a package-level variable. The list correctly matches the annotations defined inpkg/meta/meta.go.The
gofumptformatting issue on line 58 should be addressed - likely just needs a blank line adjustment or formatting fix.Consider running
gofumpt -won this file to fix the formatting, or alternatively convert this to a function to satisfy the linter:-var ( - criticalAnnotations = []string{ - meta.AnnotationKeyExternalCreateFailed, - meta.AnnotationKeyExternalCreatePending, - meta.AnnotationKeyExternalCreateSucceeded, - meta.AnnotationKeyExternalName, - } -) +func criticalAnnotationKeys() []string { + return []string{ + meta.AnnotationKeyExternalCreateFailed, + meta.AnnotationKeyExternalCreatePending, + meta.AnnotationKeyExternalCreateSucceeded, + meta.AnnotationKeyExternalName, + } +}pkg/reconciler/managed/reconciler.go (1)
1411-1424: Approve the critical annotation persistence logic, with a minor error-handling concern.The new block correctly ensures critical annotations are persisted for resources that may receive non-deterministic external names asynchronously. The comment is helpful in explaining the rationale and future considerations.
However, on line 1422, the error wrapping logic has an issue: if the status update fails, the return statement wraps the status update error with
errUpdateManagedAnnotations, but the annotation update error (err) is lost. Consider either:
- Returning the original annotation error and letting the status update be best-effort, or
- Logging both errors
Additionally, the
gciformatter indicates line 1415 has formatting issues.Consider adjusting the error handling to preserve the original error:
if err := r.managed.UpdateCriticalAnnotations(ctx, managed); err != nil { log.Debug(errUpdateManagedAnnotations, "error", err) record.Event(managed, event.Warning(reasonCannotUpdateManaged, errors.Wrap(err, errUpdateManagedAnnotations))) - return reconcile.Result{Requeue: true}, errors.Wrap(r.client.Status().Update(ctx, managed), errUpdateManagedAnnotations) + // Best-effort status update; return the original annotation update error + if statusErr := r.client.Status().Update(ctx, managed); statusErr != nil { + log.Debug(errUpdateManagedStatus, "error", statusErr) + } + return reconcile.Result{Requeue: true}, errors.Wrap(err, errUpdateManagedAnnotations) }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
pkg/reconciler/managed/api.go(2 hunks)pkg/reconciler/managed/api_test.go(5 hunks)pkg/reconciler/managed/reconciler.go(1 hunks)pkg/reconciler/managed/reconciler_legacy_test.go(40 hunks)pkg/reconciler/managed/reconciler_modern_test.go(49 hunks)
🧰 Additional context used
🧠 Learnings (4)
📚 Learning: 2025-10-16T17:11:48.360Z
Learnt from: CR
Repo: crossplane/crossplane-runtime PR: 0
File: coderabbit-custom-pre-merge-checks-unique-id-file-non-traceable-F7F2B60C-1728-4C9A-8889-4F2235E186CA.txt:0-0
Timestamp: 2025-10-16T17:11:48.360Z
Learning: Applies to **/!(*_test).go : Do not change function/method signatures of exported Go APIs without the 'breaking-change' label
Applied to files:
pkg/reconciler/managed/api_test.go
📚 Learning: 2025-10-16T17:11:48.360Z
Learnt from: CR
Repo: crossplane/crossplane-runtime PR: 0
File: coderabbit-custom-pre-merge-checks-unique-id-file-non-traceable-F7F2B60C-1728-4C9A-8889-4F2235E186CA.txt:0-0
Timestamp: 2025-10-16T17:11:48.360Z
Learning: Applies to **/!(*_test).go : Do not rename exported Go APIs without the 'breaking-change' label
Applied to files:
pkg/reconciler/managed/api_test.go
📚 Learning: 2025-10-16T17:11:48.360Z
Learnt from: CR
Repo: crossplane/crossplane-runtime PR: 0
File: coderabbit-custom-pre-merge-checks-unique-id-file-non-traceable-F7F2B60C-1728-4C9A-8889-4F2235E186CA.txt:0-0
Timestamp: 2025-10-16T17:11:48.360Z
Learning: Applies to **/!(*_test).go : Avoid behavior changes to exported Go APIs that could break existing users unless labeled 'breaking-change'
Applied to files:
pkg/reconciler/managed/api_test.go
📚 Learning: 2025-10-16T17:11:48.360Z
Learnt from: CR
Repo: crossplane/crossplane-runtime PR: 0
File: coderabbit-custom-pre-merge-checks-unique-id-file-non-traceable-F7F2B60C-1728-4C9A-8889-4F2235E186CA.txt:0-0
Timestamp: 2025-10-16T17:11:48.360Z
Learning: Applies to **/!(*_test).go : Do not remove exported Go APIs (functions, types, methods, or fields) without the 'breaking-change' label
Applied to files:
pkg/reconciler/managed/api_test.go
🧬 Code graph analysis (5)
pkg/reconciler/managed/api_test.go (3)
pkg/resource/interfaces.go (2)
Object(193-196)LegacyManaged(219-224)pkg/resource/fake/mocks.go (2)
Object(319-322)LegacyManaged(398-405)pkg/test/fake.go (2)
NewMockGetFn(85-95)NewMockPatchFn(163-173)
pkg/reconciler/managed/api.go (2)
pkg/meta/meta.go (4)
AnnotationKeyExternalCreateFailed(56-56)AnnotationKeyExternalCreatePending(42-42)AnnotationKeyExternalCreateSucceeded(51-51)AnnotationKeyExternalName(36-36)pkg/resource/interfaces.go (1)
Object(193-196)
pkg/reconciler/managed/reconciler_legacy_test.go (1)
pkg/test/fake.go (1)
NewMockPatchFn(163-173)
pkg/reconciler/managed/reconciler.go (2)
pkg/event/event.go (2)
Event(39-44)Warning(60-70)pkg/errors/errors.go (1)
Wrap(103-105)
pkg/reconciler/managed/reconciler_modern_test.go (1)
pkg/test/fake.go (1)
NewMockPatchFn(163-173)
🪛 golangci-lint (2.5.0)
pkg/reconciler/managed/api.go
[error] 59-59: criticalAnnotations is a global variable
(gochecknoglobals)
[error] 58-58: File is not properly formatted
(gofumpt)
pkg/reconciler/managed/reconciler.go
[error] 1415-1415: File is not properly formatted
(gci)
🔇 Additional comments (5)
pkg/reconciler/managed/api.go (1)
291-330: LGTM! Patch-based approach is more targeted and addresses the core issue.The refactored implementation correctly:
- Filters to only critical annotations (lines 292-297)
- Short-circuits when no critical annotations exist (lines 299-302)
- Uses JSON merge patch for targeted updates (lines 307-318)
- Handles conflicts by re-fetching and re-applying (lines 319-325)
The field owner name
fieldOwnerAPISimpleRefResolveris reused per the discussion in past review comments where negz suggested using one manager name regardless of operation. This is acceptable.One minor observation: the conflict handling fetches the latest object and calls
meta.AddAnnotations, but then returns the original conflict error. This allows the retry logic to work correctly since the error is propagated up.pkg/reconciler/managed/api_test.go (2)
546-551: LGTM!The helper function correctly renamed from
setLabelstosetAnnotationsto match the new implementation that operates on annotations rather than labels.
562-580: LGTM!Test cases correctly updated to use
MockPatchinstead ofMockUpdate, aligning with the implementation change to use patch-based updates for critical annotations.pkg/reconciler/managed/reconciler_legacy_test.go (1)
333-334: LGTM!The
MockPatchfield additions across all test cases correctly provide the necessary mock infrastructure to support the patch-based critical annotation updates introduced in the implementation. The pattern is consistently applied throughout the test file.pkg/reconciler/managed/reconciler_modern_test.go (1)
100-2697: LGTM! Test scaffolding correctly extended to support patch operations.The mechanical addition of
MockPatch: test.NewMockPatchFn(nil)across the test suite properly supports the new critical annotation update mechanism that uses JSON merge patch operations. The changes are consistent and align with the PR's objective of persisting critical annotations independently of spec late-initialization.Note: The tests "ObserveAndLateInitializePolicy" (lines 2017-2037) and "ObserveUpdateAndLateInitializePolicy" (lines 2042-2066) don't include MockPatch. This may be intentional if these code paths don't exercise patch operations, but you may want to verify this is correct.
|
I just had the idea that a function in the pipeline could also ensure those annotations are persisted. It would not be an ideal solution, but at least a possible hotfix while async is not fully supported. |
|
@jeanduplessis can this get some more attention? |
|
@bobh66 this PR now needs a review and approval from a @crossplane/crossplane-maintainers to move it forward. |
|
Kick the CI |
|
@jeanduplessis I was concerned that this comment indicates a blocking issue that is not being tracked somewhere. If there are fundamental issues with the way upjet interacts with crossplane-runtime that would seem to be a major problem that needs attention. |
|
Crossplane does not currently have enough maintainers to address every issue and pull request. This pull request has been automatically marked as |
|
/fresh |
|
Does this fix crossplane/upjet#632 ? |
|
@twobiers can you check the lint and unit-test errors? I'd like to get this moving if there are no other solutions in the works, and I don't think there are. |
Signed-off-by: twobiers <22715034+twobiers@users.noreply.github.com>
As the UpdateCriticialAnnotations function is now not exclusively called in the creation process, we have to ensure no other fields like the spec are updated, so we don't interfer with the normal LateInitialize logic Signed-off-by: twobiers <22715034+twobiers@users.noreply.github.com>
Signed-off-by: twobiers <22715034+twobiers@users.noreply.github.com>
Signed-off-by: twobiers <22715034+twobiers@users.noreply.github.com>
Signed-off-by: twobiers <22715034+twobiers@users.noreply.github.com>
Signed-off-by: twobiers <22715034+twobiers@users.noreply.github.com>
Signed-off-by: twobiers <22715034+twobiers@users.noreply.github.com>
Signed-off-by: twobiers <22715034+twobiers@users.noreply.github.com>
11e2136 to
5035730
Compare
There was a problem hiding this comment.
🧹 Nitpick comments (1)
pkg/reconciler/managed/api.go (1)
319-328: Minor: the in‑memoryAddAnnotationson conflict looks decorative.Nice touch refreshing
oon conflict so the next retry sees the latest resourceVersion. One small observation: the subsequentmeta.AddAnnotations(o, a)doesn't feed back into the next iteration —patchDatais rebuilt fromadirectly, so the in‑memory mutation ofodoesn't influence the retry. It's only observable to the caller after the method returns.If that caller‑visible side effect is intentional (so callers see the annotations they asked to persist), it might be worth a short comment; if not, the line can go. Just checking which one you had in mind.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/reconciler/managed/api.go` around lines 319 - 328, The meta.AddAnnotations(o, a) call inside the kerrors.IsConflict branch is only mutating the in-memory object and doesn't affect the subsequent retry because patchData is rebuilt from a; either remove the meta.AddAnnotations(o, a) line to avoid a misleading no-op, or if you intended the caller to observe the annotations after this method returns, keep it but add a concise comment above referencing meta.AddAnnotations, patchData, and the client.Get refresh to explain that this in-memory mutation is intentionally for caller-visible side effects despite not influencing retry behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@pkg/reconciler/managed/api.go`:
- Around line 319-328: The meta.AddAnnotations(o, a) call inside the
kerrors.IsConflict branch is only mutating the in-memory object and doesn't
affect the subsequent retry because patchData is rebuilt from a; either remove
the meta.AddAnnotations(o, a) line to avoid a misleading no-op, or if you
intended the caller to observe the annotations after this method returns, keep
it but add a concise comment above referencing meta.AddAnnotations, patchData,
and the client.Get refresh to explain that this in-memory mutation is
intentionally for caller-visible side effects despite not influencing retry
behavior.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 5272e885-bbe5-47d0-ab3d-05564ab31d33
📒 Files selected for processing (5)
pkg/reconciler/managed/api.gopkg/reconciler/managed/api_test.gopkg/reconciler/managed/reconciler.gopkg/reconciler/managed/reconciler_legacy_test.gopkg/reconciler/managed/reconciler_modern_test.go
🚧 Files skipped from review as they are similar to previous changes (4)
- pkg/reconciler/managed/reconciler_modern_test.go
- pkg/reconciler/managed/reconciler.go
- pkg/reconciler/managed/reconciler_legacy_test.go
- pkg/reconciler/managed/api_test.go
Signed-off-by: twobiers <22715034+twobiers@users.noreply.github.com>
Signed-off-by: twobiers <22715034+twobiers@users.noreply.github.com>
|
@bobh66 I've rebased the branch and fixed the issues in the checks. They do pass locally now. Not sure if you have to kick off the workflows here again manually. |
Description of your changes
This PR ensures that critical annotations are stored when a resource is created. Some resources have non-deterministic external-names, which are never updated when the
mangementPoliciesdon't containLateInitialize.So far, there is an implicit contract that an Observation updating the external-name will be eventually stored as part of the LateInitialize process. However, that should only affect updates to the
specand not the annotations like external-name.More Context can be found here: crossplane/crossplane#5918
I'm not sure how a unit test can be expressed, as I'm not really familiar with the setup. If the change is approved, I think it would make sense to backport it aswell.
Fixes:
Maybe fixes aswell:
I have:
earthly +reviewableto ensure this PR is ready for review.Added or updated unit tests.Linked a PR or a docs tracking issue to document this change.backport release-x.ylabels to auto-backport this PR.Need help with this checklist? See the cheat sheet.