HYPERFLEET-895 - feat: Add force delete design#131
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughThis pull request introduces documentation and architectural decisions for a force-deletion feature in HyperFleet. It adds a comprehensive design specification for admin-only synchronous force-delete endpoints that allow recovery of resources stuck in Estimated code review effort🎯 1 (Trivial) | ⏱️ ~5 minutes 🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@hyperfleet/docs/force-deletion-design.md`:
- Around line 72-78: Add "500 Internal Server Error: delete failed due to
unexpected server error" to the Response codes list in the "Response codes:"
block so it matches the sequence diagram's error case; update the Response codes
section (the list currently containing 204, 403, 404, 409) to include the 500
entry and ensure the phrasing mirrors the sequence diagram's "500 Internal
Server Error" wording for consistency.
- Around line 83-86: The Database Impact section currently references the Hard
Delete Design but doesn't explicitly state that force delete uses the identical
bottom-up deletion ordering; update the prose under "Database Impact" (and/or
the "Force delete" description) to explicitly confirm that force delete performs
the same bottom-up ordering as normal hard-delete (e.g., nodepools before
clusters, child resources before parents) despite bypassing the Reconciled=True
gate, and add a short example or clause referencing the Hard Delete Design to
make the guarantee clear.
- Around line 68-79: Mark the missing authz middleware as a hard blocker for the
force-delete feature and update the design to require authz middleware be
implemented and enabled before any force-delete route is added; specifically
note in the document that routes.go's TODO for authz must be completed and
deployment must be gated (either via CI check, feature-flag, or a runtime
startup validation) so the force-delete endpoint cannot be active until authz is
registered, and ensure the handler that performs the force delete only accepts
requests when the resource is in Finalizing (deleted_time set).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Enterprise
Run ID: a29836ee-e833-4547-b7a0-e48699fd4528
📒 Files selected for processing (1)
hyperfleet/docs/force-deletion-design.md
3eebc62 to
9e3bfcc
Compare
ciaranRoche
left a comment
There was a problem hiding this comment.
Nice work, just a small piece of house keeping, i think there is some reference in the existing delete docs to force deletion having a separate spike, so might be worth updating and linking to this doc 🙏
for force-delete audit trail
to force deletion design
gauge for stuck detection metric
into HYPERFLEET-895/force-delete
DB-only scope and update trade-offs
| `POST /admin/clusters/{id}/force-delete` | ||
| `POST /admin/clusters/{cluster_id}/nodepools/{nodepool_id}/force-delete` | ||
|
|
||
| Force delete is a privileged operation exposed under the `/admin/` path prefix, a new pattern outside the standard `/api/hyperfleet/{version}/` convention. The `/admin/` prefix is used because force delete is an operational escape hatch, not a versioned resource API. The HyperFleet API is not directly customer-facing. It sits behind partner APIs (GCP, AWS/ROSA) that provide native authorization. Partners do not expose admin endpoints in their public API. |
There was a problem hiding this comment.
I think it should be enough having the special /force-delete endpoint to control the access to this functionality.
I still think of it as a valid operation on a versioned API resource.
I just want to have more opinions here
There was a problem hiding this comment.
Fair point. I went with /admin/ to make privileged operations structurally distinct at the URL level, but I agree it could live under the versioned path too. Happy to go either way if there's a team preference.
There was a problem hiding this comment.
Do we plan to send a payload where the version of it is controlled?
There was a problem hiding this comment.
The payload is minimal (just a reason string in, 204 out), so nothing to version today. I could see it going either way. What's your preference?
|
/lgtm |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: rafabene The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
7afa0c9
into
openshift-hyperfleet:main
Summary
Jira ticket
Design document for the force deletion mechanism. Force delete is a synchronous, admin-only API action that hard-deletes resources stuck in
Finalizing, bypassing theReconciled=Truegate.Covers:
Summary by CodeRabbit