fix: atomic last-owner guard prevents TOCTOU race on role demotion#1590
fix: atomic last-owner guard prevents TOCTOU race on role demotion#1590rohilsurana wants to merge 8 commits intomainfrom
Conversation
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
📝 WalkthroughSummary by CodeRabbit
WalkthroughAdds a guarded policy-deletion flow to prevent removing the last role-holder for a resource. Introduces repository/service/mocks support for DeleteWithMinRoleGuard, a new sentinel error ErrLastRoleGuard, a Postgres guarded-delete implementation, and membership-service updates/tests to invoke and translate the DB guard into ErrLastOwnerRole. ChangesGuarded Policy Deletion for Owner Role Protection
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks | ✅ 2✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Tip 💬 Introducing Slack Agent: The best way for teams to turn conversations into code.Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.
Built for teams:
One agent for your entire SDLC. Right inside Slack. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Pull request overview
This PR aims to prevent a TOCTOU race in organization owner demotions by moving the “must keep at least 1 owner” check into an (intended) atomic delete path in the policy repository, while keeping the existing application-level validation as a fast-path.
Changes:
- Added
DeleteWithMinRoleGuardto the policy repository/service layer and wired it intoSetOrganizationMemberRole. - Introduced a new policy-layer error (
ErrLastRoleGuard) surfaced when a guarded delete would violate the minimum-role constraint. - Updated membership unit tests/mocks to use the new guarded delete method.
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| internal/store/postgres/policy_repository.go | Adds guarded delete SQL intended to atomically prevent deleting the last policy of a role for a resource. |
| core/policy/service.go | Exposes DeleteWithMinRoleGuard at service level and calls into repository. |
| core/policy/policy.go | Extends repository interface with DeleteWithMinRoleGuard. |
| core/policy/errors.go | Adds ErrLastRoleGuard. |
| core/policy/mocks/repository.go | Updates mocks for the new repository method. |
| core/membership/service.go | Switches org role replacement to guarded policy deletion for owner demotions. |
| core/membership/mocks/policy_service.go | Updates mocks for the new policy service method. |
| core/membership/service_test.go | Updates expectations to use guarded deletion and accounts for extra owner-role lookups. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Coverage Report for CI Build 25422697696Coverage decreased (-0.06%) to 41.905%Details
Uncovered Changes
Coverage Regressions6 previously-covered lines in 1 file lost coverage.
Coverage Stats
💛 - Coveralls |
| if err := s.relationService.Delete(ctx, relation.Relation{ | ||
| Object: relation.Object{ | ||
| ID: id, | ||
| Namespace: schema.RoleBindingNamespace, | ||
| }, | ||
| }); err != nil { | ||
| return err | ||
| } |
There was a problem hiding this comment.
This is still unguarded, right?
There was a problem hiding this comment.
yeah you are right, fixing that
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 8 out of 8 changed files in this pull request and generated 5 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 8 out of 8 changed files in this pull request and generated 1 comment.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Add DeleteWithMinRoleGuard to the policy repository that atomically checks the owner count within the DELETE statement itself. This eliminates the race where two concurrent demotion requests both pass the application-level owner count check then both proceed. The SQL condition ensures the DELETE only executes if the policy being removed is either not the guarded role, or at least one other policy with that role exists for the same resource. If the condition fails (last owner), zero rows are affected and ErrLastRoleGuard is returned. The existing validateMinOwnerConstraint remains as a fast-path optimization to avoid unnecessary DB round-trips.
1. Flip SpiceDB/DB ordering — DB delete first, then SpiceDB relation removal. Prevents inconsistent state if the guard rejects the delete. 2. Use SELECT FOR UPDATE in CTE to serialize concurrent deletions under READ COMMITTED isolation. Two concurrent DELETEs of different owner rows now block on the row lock instead of both proceeding. 3. Use TABLE_POLICIES constant instead of hardcoded "policies" string. 4. Derive resource_id/resource_type from existingPolicy inside the repository instead of trusting caller-supplied values. 5. Fetch owner role once — validateMinOwnerConstraint now returns the resolved ownerRoleID for reuse, eliminating the duplicate Get call.
- Only use guarded delete for owner policies, plain Delete for non-owner (avoids unnecessary locking/serialization) - Apply guarded delete in RemoveOrganizationMember flow too (same TOCTOU) - Add unit test for ErrLastRoleGuard → ErrLastOwnerRole mapping - Fix test expectations for non-owner policy deletions
…OrganizationMember If the guard rejects (last owner), we now return early before touching any SpiceDB relations, preventing partial state modification on rejection.
e796c64 to
4fdd3fb
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 8 out of 8 changed files in this pull request and generated 4 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Restructured RemoveOrganizationMember into 3 phases: 1. Classify policies by type (no deletions) 2. Guarded org owner delete (rejects early if last owner) 3. Remaining org policies, sub-resource policies, then relations This ensures guard rejection leaves no partial state.
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
core/membership/service_test.go (1)
775-789: ⚡ Quick winAdd a happy-path
RemoveOrganizationMembertest that hits the guarded delete branch.The new removal assertions cover the reject path and plain-delete failures, but there is still no case that expects
DeleteWithMinRoleGuardduring a successful org-owner removal. That branch is the new phase-ordering behavior and is currently easy to regress unnoticed.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: bf6c45a2-8320-48a5-94d6-a6ad4db445ee
📒 Files selected for processing (8)
core/membership/mocks/policy_service.gocore/membership/service.gocore/membership/service_test.gocore/policy/errors.gocore/policy/mocks/repository.gocore/policy/policy.gocore/policy/service.gointernal/store/postgres/policy_repository.go
When rowsAffected==0, re-check existence inside the transaction. If the row is gone (concurrent delete), return sql.ErrNoRows which maps to ErrNotExist. Only return ErrLastRoleGuard when the row still exists but the guard condition prevented deletion.
There was a problem hiding this comment.
Actionable comments posted: 1
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: b47f33b1-63ca-4643-aaed-63cdd8eb4b01
📒 Files selected for processing (1)
internal/store/postgres/policy_repository.go
There was a problem hiding this comment.
🧹 Nitpick comments (1)
internal/store/postgres/policy_repository.go (1)
366-442: ⚡ Quick winAdd a Postgres integration test for the FOR UPDATE serialization.
The Coveralls report shows 0% patch coverage on this file (0/64 changed lines). For a guard whose entire correctness argument rests on row-locking under concurrent demotions, the existing
pg_sleep-based manual test described in the PR description should be promoted into the repository's docker-postgres integration suite — otherwise a future refactor (e.g., droppingFOR UPDATE, reordering the CTE, changing the predicate) can silently regress the TOCTOU fix without any signal in CI.Suggested coverage:
- Two concurrent
DeleteWithMinRoleGuardcalls against the only two guarded-role rows → exactly one returnsErrLastRoleGuard, the other succeeds, and the resource ends with exactly one guarded-role policy.- Concurrent delete of the same
idfrom two transactions → one succeeds, the other returnsErrNotExist(verifies the existence re-check path).- Demotion when target's
role_id != guardRoleID→ unconditional delete succeeds even when count of guarded-role rows is 0.- Guard rejection when target is the sole guarded-role holder → returns
ErrLastRoleGuardand row is preserved.Want me to draft the integration test scaffolding using the existing repository test harness?
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 013354a8-2a05-40ef-b061-7cc3d6740636
📒 Files selected for processing (1)
internal/store/postgres/policy_repository.go
Summary
SetOrganizationMemberRolerequests to demote the last two owners can both pass the application-levelvalidateMinOwnerConstraintcheck (each sees 2 owners) and both proceed, leaving the org with zero ownersDeleteWithMinRoleGuardto the policy repository that usesSELECT FOR UPDATEin a CTE to serialize concurrent deletions, then conditionally deletes only if at least one other owner remainsvalidateMinOwnerConstraint→replacePolicyWithOwnerGuard, eliminating duplicate role lookupsvalidateMinOwnerConstraintremains as a fast-path optimizationTest plan
pg_sleepforced race: withoutFOR UPDATEboth delete (0 owners), withFOR UPDATEsecond transaction blocks and is rejected (1 owner)