feat(authz): swap existing-gate handlers to RequirePermission (PR 2c)#305
Conversation
🔐 Codex Security Review
Review SummaryOverall Risk: NONE FindingsNo findings in the scoped diff. NotesReviewed Generated by Codex Security Review | |
4c427ca to
6bec15e
Compare
cd8a669 to
b730354
Compare
ef17a98 to
394d3c1
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 394d3c1898
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
394d3c1 to
052ecd4
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 052ecd474d
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
There was a problem hiding this comment.
Pull request overview
This PR advances the server-side granular RBAC migration by replacing role-based “admin” gates with permission-based gates (RequirePermission) across several Connect-RPC handlers, and updates the RPC permission declaration map and associated unit tests to match.
Changes:
- Swapped multiple handler entry gates from
RequireAdmin/ inline role checks tomiddleware.RequirePermission(...)for sites, buildings, API keys, fleet-node admin, server logs, curtailment termination, and auth user-management endpoints. - Expanded
ProcedurePermissionsinrpc_permissions.goand reducedProceduresPendingMigrationaccordingly. - Updated/added handler-layer tests, including a shared
handlerstest.CtxWithPermissionshelper to construct contexts withEffectivePermissions.
Reviewed changes
Copilot reviewed 18 out of 18 changed files in this pull request and generated 8 comments.
Show a summary per file
| File | Description |
|---|---|
| server/internal/handlers/sites/handler.go | Swap site RPC gates to RequirePermission (site:read/manage). |
| server/internal/handlers/sites/handler_test.go | Update tests to inject permission-based contexts and assert permission gating. |
| server/internal/handlers/serverlog/handler.go | Replace inline role check with RequirePermission(serverlog:read). |
| server/internal/handlers/middleware/rpc_permissions.go | Move migrated procedures into ProcedurePermissions; update migration bookkeeping. |
| server/internal/handlers/handlerstest/permissions.go | New shared helper for building permission-bearing request contexts in handler tests. |
| server/internal/handlers/fleetnodeadmin/handler.go | Replace inline admin-session gate with RequirePermission(fleetnode:read/manage). |
| server/internal/handlers/curtailment/handler.go | Gate AdminTerminateEvent with RequirePermission(curtailment:manage). |
| server/internal/handlers/curtailment/handler_test.go | Update terminate-event gate test from role-based to permission-based. |
| server/internal/handlers/buildings/handler.go | Swap building RPC gates to RequirePermission (site:read/manage). |
| server/internal/handlers/buildings/handler_test.go | Update tests to inject permission-based contexts and assert permission gating. |
| server/internal/handlers/auth/handler.go | Add handler-layer permission gates for user-management RPCs. |
| server/internal/handlers/auth/handler_gate_test.go | New fast-path tests asserting user-management RPCs deny without required permissions. |
| server/internal/handlers/apikey/handler.go | Replace local admin helper with RequirePermission(apikey:manage). |
| server/internal/handlers/apikey/handler_error_sanitization_test.go | Update interceptor test injector to provide EffectivePermissions. |
| server/internal/domain/authz/reconcile_test.go | Update admin seed expectations around user/role permissions. |
| server/internal/domain/authz/builtin.go | Update ADMIN seed-permission documentation and exclusions. |
| server/internal/domain/auth/service.go | Remove checkCanManageUser; add role-hierarchy check for user mutations and cross-org guards. |
| server/internal/domain/auth/service_test.go | Add unit test coverage for the new role-hierarchy helper. |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 4e4bba4dd1
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
Addresses PR review feedback (#305) — Codex P2. ResetUserPassword and DeactivateUser previously mapped every error from GetUserRoleName to InvalidArgument("invalid user_id"). A transient DB failure (timeout, connection drop, etc.) is not bad input; the caller should retry, not assume they typed the wrong ID. Distinguish: sql.ErrNoRows → InvalidArgument (target not in caller's org, preserves the existence-leak guard). Any other error → Internal, matching the rest of the codebase's error-classification pattern.
Re: Codex Security ReviewBoth HIGH findings (site mutations + building handlers using Declined as intentional: the handlers migrated in this PR (sites, buildings, apikey, fleetnodeadmin, serverlog, curtailment Admin*, auth user-management) were previously gated by Also worth noting: today's authz model is additive grants (no "deny-at-site"); the "org grant bypasses site-scoped restriction" scenario only matters when a user holds only site-scoped grants without an org-scope grant, which doesn't apply to org admins. The PR description's "What this PR does NOT do" section now documents the deferral explicitly. |
Addresses PR review feedback (#305) — Codex P2. ResetUserPassword and DeactivateUser previously mapped every error from GetUserRoleName to InvalidArgument("invalid user_id"). A transient DB failure (timeout, connection drop, etc.) is not bad input; the caller should retry, not assume they typed the wrong ID. Distinguish: sql.ErrNoRows → InvalidArgument (target not in caller's org, preserves the existence-leak guard). Any other error → Internal, matching the rest of the codebase's error-classification pattern.
ab78311 to
a4fe6f9
Compare
…tion Addresses PR review feedback (#305) — Codex security review. **Finding 1 — custom-role hijack via password reset (HIGH).** The prior role-name-based hierarchy check treated only ADMIN / SUPER_ADMIN as elevated. An ADMIN with user:manage could reset the password of a user whose custom role grants role:manage (or any other sensitive permission), log in, and escalate. Replace requireCanManageTargetRole(string, string) with requireCallerCanManageTarget(callerKeys, targetKeys []string): caller's effective permissions must be a superset of target's. Role names are no longer consulted; operator-created custom roles get the right treatment automatically. - Plumb *authz.PermissionResolver through auth.NewService so the service can load the target's effective permissions (and the caller's) independent of the request middleware. - Add UserManagementStore.ListPermissionKeysByRoleID for CreateUser, which checks against the role being assigned (currently always ADMIN). - TestRequireCallerCanManageTarget covers super-admin/admin/field-tech/ custom-role combinations including the escalation primitive (custom role with role:manage). **Finding 3 — ADMIN user:manage backfill on upgrade (MEDIUM).** The additive-only reconciler is correct: it preserves operator-revoked permissions across restarts (covered by TestReconcile_OperatorRevokedSensitivePermStaysRevokedOnRestart). Re-asserting the seed every boot would silently restore deliberate revocations — that's the wrong fix. Ship a one-off migration (000060_backfill_admin_user_manage) that inserts user:read and user:manage into every existing ADMIN role row. The INSERT ... ON CONFLICT DO NOTHING makes it idempotent and safe on orgs already holding either key. Custom roles aren't touched (WHERE builtin_key = 'ADMIN'). Update builtin.go to point at the migration for future seed-formula changes that need to propagate to upgraded orgs.
Follow-up to the security reviewThree findings re-assessed; pushed Finding 1 (HIGH custom-role hijack) — fixedReplaced the role-name-based hierarchy check (
Finding 3 (MEDIUM ADMIN backfill) — fixed via migrationConsidered changing Shipping a one-off backfill migration instead: Finding 2 (HIGH site/building scope) — correction postedAcknowledged on each declined thread: my earlier rationale ("the model is additive grants, no deny-at-site") was wrong. cc the inline reviewers. |
… ListUsers gap
Migrates the handlers that already had an admin gate to the catalog-permission
model introduced in PR 2a. Behavior preserved for the existing gates; the
ListUsers handler picks up a real gate for the first time (PR 2b's review
flagged this as a P0 — any authenticated org member could enumerate users).
Swaps:
- buildings/handler.go (5 procedures): RequireAdmin -> RequirePermission with
PermSiteRead for reads, PermSiteManage for writes.
- sites/handler.go (6 procedures): RequireAdmin -> RequirePermission with
PermSiteRead for ListSites, PermSiteManage for mutations.
- apikey/handler.go (3 procedures): drop the local requireAdmin helper, gate
every endpoint on PermAPIKeyManage.
- fleetnodeadmin/handler.go (4 working procedures): drop the local
requireAdminSession helper, gate by PermFleetnodeRead (List) or
PermFleetnodeManage (Create/Confirm/Revoke). The four Unimplemented stubs
stay in ProceduresPendingMigration.
- serverlog/handler.go: drop the local requireAdmin helper, gate by
PermServerlogRead.
- curtailment/handler.go AdminTerminateEvent: swap to
RequirePermission(PermCurtailmentManage). Start/Stop/Preview keep their
conditional inline gates pending the broader curtailment authz redesign.
- auth/service.go: delete checkCanManageUser (only caller was the connect
handler). Gate moves to the handler layer:
- CreateUser/DeactivateUser/ResetUserPassword: PermUserManage.
- ListUsers: PermUserRead (closes the P0 from PR 2b review).
rpc_permissions.go now reflects the swap: 23 procedures move from
ProceduresPendingMigration to ProcedurePermissions; the contract test
catches drift between the map and the actual handler call.
Tests:
- Handler-level gate tests in buildings, sites, and curtailment switched
from role-based assertions ("VIEWER", "ADMIN", "SUPER_ADMIN") to
permission-set assertions wired through middleware.WithEffectivePermissions.
- The adminAuthInjector in the apikey sanitization test now stuffs an
org-scope EffectivePermissions value carrying PermAPIKeyManage onto the
request context.
… harden tests
Code review of PR 2c surfaced one P1 security issue and several testing /
maintainability gaps that warranted closing in the same PR:
1. **Cross-org IDOR (security P1):** with checkCanManageUser deleted, the
service-layer methods resolved the target user via GetUserByExternalID
(a global lookup) without verifying the target belongs to the caller's
org. A SUPER_ADMIN in org A who learned an external user_id from org B
could reset that user's password or deactivate them. Add a
GetUserRoleName(target, callerOrg) probe in ResetUserPassword and
DeactivateUser; an error from the probe means the target isn't in the
caller's org and we return InvalidArgument without leaking existence.
2. **ADMIN locked out of /settings/team (P1 regression):** the new ListUsers
gate uses PermUserRead, which the previous ADMIN seed formula excluded
alongside PermUserManage. Result: any default ADMIN user opening the
Team page got PermissionDenied. Drop user:read from the exclusion set
so ADMIN holds user:read (view team) but still lacks user:manage
(mutate users). Reconcile test updated to assert the new shape.
3. **Test helper duplication (P1 maintainability):** ctxWithPermissions
was copy-pasted between buildings/sites tests. Extract to a new
internal/handlers/handlerstest package; both call sites now import it.
4. **No fast-path unit tests for the auth gates (P1 testing):** the four
new RequirePermission gates on Create/Deactivate/ResetPassword/List
only had DB-backed integration coverage. Add a
handler_gate_test.go that wires EffectivePermissions through the
shared handlerstest helper and asserts PermissionDenied on the deny
path for each gate.
Safe-auto polish applied in the same commit:
- Stale '(Super Admin only)' docstrings on auth service + handler methods
replaced with notes pointing at the handler-layer RequirePermission gate.
- adminCtx test helper renamed to sitePermsCtx since it now wires
permission keys, not a role.
- Stale 'requireAdminFromContext' attribution in curtailment validation-
test comment updated to point at middleware.RequirePermission.
- Dead Role field removed from apikey sanitization-test's
adminAuthInjector now that gate reads EffectivePermissions, not the
role string.
Deferred to follow-up:
- Empty ResourceContext{} on every swapped call site (latent gap for
site-scoped assignments).
- Curtailment Start/Stop/Preview partial-revocation cascade — PR 2d.
- TestHandler_sitePermissionsPassGate PermSiteManage subcase always
also wires PermSiteRead; the case proves nothing about manage alone.
- buildings authGate denial-loop only exercises List + Create (Get,
Update, Delete not covered in the gate test).
…in layer ADMIN previously could not create, reset, or deactivate any user — the seed formula excluded user:manage outright. The intended design is for ADMIN to manage non-elevated users (custom roles, FIELD_TECH) while ADMIN and SUPER_ADMIN targets stay SUPER_ADMIN-only. Wire that up: - adminSeedPermissions(): only exclude role:manage. ADMIN now holds user:manage at the permission level. - requireCanManageTargetRole(): pure helper enforcing the hierarchy — a non-SUPER_ADMIN caller cannot mutate an ADMIN or SUPER_ADMIN target. SUPER_ADMIN is unrestricted. - CreateUser: gate caller against AdminRoleName (the role assigned to every newly-created user today). A non-SUPER_ADMIN caller is blocked until the proto grows a role field. - ResetUserPassword / DeactivateUser: reuse the existing cross-org GetUserRoleName probe as the target-role input — no extra DB roundtrip — and then check the caller's role. role:manage stays SUPER_ADMIN-only because role edits can grant arbitrary permissions and have no comparable per-target hierarchy. Tests: - TestRequireCanManageTargetRole: 10 cases covering each caller/target combination including custom roles. - TestReconcile_FreshInstall_AdminExcludesRoleManagement: renamed + asserts ADMIN now holds user:manage.
Every current RequirePermission caller passes an empty ResourceContext because the migrated handlers are all org-scoped. Flag the first site-scoped migration as the right moment to extract a shared helper (e.g. siteResourceForMiner) instead of inlining the miner_id → site_id lookup at each callsite.
Addresses PR review feedback (#305) — Codex P2. ResetUserPassword and DeactivateUser previously mapped every error from GetUserRoleName to InvalidArgument("invalid user_id"). A transient DB failure (timeout, connection drop, etc.) is not bad input; the caller should retry, not assume they typed the wrong ID. Distinguish: sql.ErrNoRows → InvalidArgument (target not in caller's org, preserves the existence-leak guard). Any other error → Internal, matching the rest of the codebase's error-classification pattern.
Rebase onto main picked up #308 / #311, which added a defense-in-depth requireAdminFromContext call ahead of the curtailment:manage gate. PR 2c's design moves RBAC to the authoritative check, so drop the legacy role gate and reorder so RequirePermission runs before the nil-service check (auth-before-implementation; preserves Unauthenticated on missing session instead of leaking Unimplemented). The TestHandler_AdminTerminateEvent_RejectsNonAdmin case asserted the legacy role gate's defense-in-depth behavior — it's redundant with the renamed TestHandler_AdminTerminateEvent_RejectsCallerWithoutCurtailmentManage which already covers the RBAC denial path. Drop the obsolete test.
…tion Addresses PR review feedback (#305) — Codex security review. **Finding 1 — custom-role hijack via password reset (HIGH).** The prior role-name-based hierarchy check treated only ADMIN / SUPER_ADMIN as elevated. An ADMIN with user:manage could reset the password of a user whose custom role grants role:manage (or any other sensitive permission), log in, and escalate. Replace requireCanManageTargetRole(string, string) with requireCallerCanManageTarget(callerKeys, targetKeys []string): caller's effective permissions must be a superset of target's. Role names are no longer consulted; operator-created custom roles get the right treatment automatically. - Plumb *authz.PermissionResolver through auth.NewService so the service can load the target's effective permissions (and the caller's) independent of the request middleware. - Add UserManagementStore.ListPermissionKeysByRoleID for CreateUser, which checks against the role being assigned (currently always ADMIN). - TestRequireCallerCanManageTarget covers super-admin/admin/field-tech/ custom-role combinations including the escalation primitive (custom role with role:manage). **Finding 3 — ADMIN user:manage backfill on upgrade (MEDIUM).** The additive-only reconciler is correct: it preserves operator-revoked permissions across restarts (covered by TestReconcile_OperatorRevokedSensitivePermStaysRevokedOnRestart). Re-asserting the seed every boot would silently restore deliberate revocations — that's the wrong fix. Ship a one-off migration (000060_backfill_admin_user_manage) that inserts user:read and user:manage into every existing ADMIN role row. The INSERT ... ON CONFLICT DO NOTHING makes it idempotent and safe on orgs already holding either key. Custom roles aren't touched (WHERE builtin_key = 'ADMIN'). Update builtin.go to point at the migration for future seed-formula changes that need to propagate to upgraded orgs.
0cf6a71 to
eb7be2c
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: eb7be2c248
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
Addresses PR review feedback (#305) — Codex P1. The prior parity check compared callerEff.FlatKeys() against targetEff.FlatKeys(), which collapses site-scope and org-scope grants into one set. That lets a caller launder a site-scoped permission (e.g. role:manage at one site via a custom assignment) into apparent authority over an org-scoped target — exactly the escalation primitive the check was meant to block. Replace the flat-key comparison with a scope-aware predicate: - New EffectivePermissions.IsSubsumedBy(other): for every (key, scope) the receiver holds, other.Has(key, that scope) must be true. Reuses the existing narrowing-aware Has(), so site-scope narrowing applies correctly when the caller has any site-scope assignment at that site. - requireCallerCanManageTarget now takes *EffectivePermissions on both sides and delegates to IsSubsumedBy. - ResetUserPassword / DeactivateUser load both caller and target via permResolver.LoadEffective (no more FlatKeys flattening). - CreateUser synthesizes the target as a single org-scope Assignment built from the assigned role's permissions — the new user holds them org-wide, so the synthetic snapshot matches reality. Tests: - TestEffective_IsSubsumedBy covers the org-vs-site mismatch cases directly (including the reviewer's example: caller with site-scoped role:manage must NOT subsume a target with org-scoped role:manage). - TestRequireCallerCanManageTarget refactored to use real Assignment values; adds the laundering case + a site-scoped-target case.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 6d7ea29058
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
Addresses PR review feedback (#305) — Codex P1. After granting ADMIN user:manage, the privilege-parity check passed by equality for ADMIN ↔ ADMIN: target's permission set equals caller's, so "target ⊆ caller" trivially held. That turned CreateUser into a way for any ADMIN to create another ADMIN and walk off with its temp password, and let ADMIN reset / deactivate peer ADMINs — both contrary to the design intent that ADMIN/SUPER_ADMIN mutations stay reserved for SUPER_ADMIN. Tighten the check: subset still required, plus the caller must hold org-scope role:manage OR have a *strictly* larger permission set than the target. Permission-based — no role-name strings — so an operator can authorize a custom role for peer management by granting it role:manage explicitly. Resulting semantics: - SUPER_ADMIN manages anyone (role:manage bypass). - ADMIN manages strictly-fewer-perms targets (e.g. FIELD_TECH, custom). - ADMIN BLOCKED from peer ADMIN, from CreateUser (which assigns ADMIN to the new account), and from any target holding a permission ADMIN doesn't (e.g. a custom role with role:manage). builtin.go docstring updated to explain why role:manage is excluded from the ADMIN seed (it's the peer-management bypass).
Addresses PR review feedback (#305) from @ankitgoswami. - requireCallerCanManageTarget docstring: collapse 20-line two-bullet block into a single sentence that still names the bypass rule and why equality without it is dangerous. - Drop "Privilege parity:" inline preambles at the three call sites (CreateUser / ResetUserPassword / DeactivateUser) — they restated the helper's docstring. Function name carries the meaning. - Tighten the cross-org guard comment in ResetUserPassword / DeactivateUser to one short paragraph naming the existence-leak reason and the Internal-vs-InvalidArgument split. No behavior change.
Addresses PR review feedback (#305) from @ankitgoswami. ResetUserPassword and DeactivateUser duplicated the same five-step flow: GetUserByExternalID → cross-org guard via GetUserRoleName → load target/caller effective permissions → requireCallerCanManageTarget. CreateUser had an analogous but distinct shape (synthesize target from the assigned role's permissions). Extract two service methods so each handler reads as one named step plus its actual business logic. - authorizeCallerForUser(ctx, callerUserID, orgID, targetExternalID) → (stores.User, error): the shared path. Returns the resolved target so callers can use its IDs / username downstream. - authorizeCallerForNewUserWithRole(ctx, callerUserID, orgID, roleName) → (stores.Role, error): the CreateUser variant. Returns the resolved role so the caller can bind the new user to it. Net: 148 lines shifted, −76 / +72; ResetUserPassword and DeactivateUser each lose ~25 lines of duplicated lookup + parity-check scaffolding. No behavior change; existing tests cover the paths.
Addresses PR review feedback (#305) from @ankitgoswami. The privilege-parity check expressed "subset AND (role:manage OR strict subset)" via three sequential conditionals reading IsSubsumedBy twice in opposite directions. Extracting StrictlyDominates(other) — defined as other.IsSubsumedBy(e) && !e.IsSubsumedBy(other) — turns the helper into two readable branches: role:manage holders need subsumes-only, everyone else needs strict dominance. No behavior change; existing TestRequireCallerCanManageTarget cases all still pass. Added TestEffective_StrictlyDominates to cover the predicate directly (proper superset / subset / equality).
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 8baf1a7889
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
Addresses PR review feedback (#305) — Codex P1. The prior implementation iterated the receiver's keys and asked other.Has(key, scope) at each one. For receiver org-scope keys it only asked other.Has(key, ResourceContext{}), which ignored sites where other has a site-scope assignment that narrows away the key. Result: a caller with org-scope SUPER_ADMIN plus a restrictive FIELD_TECH assignment at site 7 would falsely subsume an org ADMIN target whose ADMIN authority is still live at site 7 (no narrowing on the target side). ResetUserPassword would then hand the caller credentials for an account with broader site-7 authority than the caller itself wields. Two-domain check: 1. Org-scope: every org-scope key the receiver holds must be in other's org-scope (unchanged). 2. Site-scope: at every site where either party has a narrowing assignment, verify each key the receiver can perform there is also performable by other at that site. "Each key the receiver can perform at site s" = receiver.bySite[s] if it has one (narrowing applies), otherwise receiver.orgScope (org-scope applies). Uses the narrowing-aware Has() on the other side, so fall-through to org-scope is handled automatically. Sites where neither party narrows remain covered by the org-scope pass alone — there's no per-site differentiation to detect. Tests: - TestEffective_IsSubsumedBy: new case for the caller-narrowing scenario (broad org-scope caller with site-7 narrowing must NOT subsume an org-scope target whose authority is live at site 7). - TestRequireCallerCanManageTarget: end-to-end case mirroring the reviewer's exact framing (org-scope SUPER_ADMIN + FIELD_TECH site-7 narrowing → cannot reset an org-ADMIN target's password).
|
@codex review |
|
Codex Review: Didn't find any major issues. More of your lovely PRs please. ℹ️ About Codex in GitHubYour team has set up Codex to review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. Codex can also answer questions or update the PR. Try commenting "@codex address that feedback". |
Third slice of PR 2 from the granular RBAC plan. Mechanical handler swap plus the privilege-parity check and migration that fell out of code review.
PR 2b (rpc_permissions infra, #303) is merged on main; this PR now targets
maindirectly.What landed
Handler swaps (RequireAdmin → RequirePermission)
buildings/handler.go(5 procedures):site:readfor reads,site:managefor writes.sites/handler.go(6 procedures):site:readfor ListSites,site:managefor the rest.Inline-check swaps (drop local helpers)
apikey/handler.go(3): droprequireAdmin; gate every endpoint onapikey:manage.fleetnodeadmin/handler.go(4 working procedures): droprequireAdminSession; gate byfleetnode:read(List) orfleetnode:manage(Create/Confirm/Revoke). The four Unimplemented stubs stay inProceduresPendingMigration— they don't reach a gate today and won't until the methods are actually implemented.serverlog/handler.go: droprequireAdmin; gate byserverlog:read.curtailment/handler.goAdminTerminateEvent: drop the legacyrequireAdminFromContextdefense-in-depth gate and rely onRequirePermission(PermCurtailmentManage)as the authoritative check. Start/Stop/Preview keep their conditional inline gates pending the broader curtailment authz redesign (PR 2d).Auth domain layer
auth/service.go::checkCanManageUser(sole callers were the Connect handlers; no domain-internal callers).handlers/auth/handler.go:CreateUser,DeactivateUser,ResetUserPassword→RequirePermission(PermUserManage)ListUsers→RequirePermission(PermUserRead). This closes the P0 surfaced in PR 2b's review —ListUserspreviously had no role check at all, so any authenticated org member could enumerate users.GetUserRoleNameerrors:sql.ErrNoRows(target not in caller's org) →InvalidArgument; any other error (transient store failure) →Internal. Avoids treating a backend hiccup as bad input.ADMIN seed + privilege-parity check
user:read+user:manage; onlyrole:manageis excluded from the formula.role:managestays SUPER_ADMIN-only because it can grant arbitrary permissions and is the catalog's ultimate authority key.requireCallerCanManageTarget(inauth/service.go) gates every user mutation (CreateUser/ResetUserPassword/DeactivateUser) on two conditions:(permission key, scope)the target holds must also be held by the caller. NewEffectivePermissions.IsSubsumedByreuses the narrowing-awareHas(), so a site-scoped grant cannot launder into apparent org-scope authority.role:manage. Without this, equality (ADMIN target equals ADMIN caller) would make CreateUser a primitive for ADMIN to mint new ADMINs and walk off with the temp password.role:managebypass).role:manage.Migration: backfill user:read + user:manage to existing ADMIN rows
migrations/000060_backfill_admin_user_manage.up.sql. Migration 000053 seeded ADMIN explicitly excluding both keys; the additive-only reconciler does not re-assert seed permissions onto existing role rows, so upgraded orgs would silently deny ADMIN on every newly-gated user-management endpoint without an explicit backfill.INSERT ... ON CONFLICT DO NOTHING, scoped toWHERE builtin_key = 'ADMIN'. Idempotent; operator-created custom roles untouched. Down migration removes both keys from ADMIN.builtin.godocstring updated to point at this migration as the pattern for future seed-formula changes.rpc_permissions.go bookkeeping
ProceduresPendingMigration→ProcedurePermissions.What this PR does NOT do
RequirePermissioncallers in this PR passauthz.ResourceContext{}, which makes the check org-scoped.EffectivePermissions.Has()does implement narrowing (a site-scope assignment shadows org-scope at that site), so the migrated handlers preserve the pre-existingRequireAdmin(org-wide) semantics rather than introducing narrowing. Site-scope wiring lands in PR 2d, tracked by a TODO atmiddleware/permission.gocalling out the sharedsiteResourceForMiner/siteResourceForBuildinghelper that should accompany the first site-scoped handler migration sobuilding_id → site_idlookups don't get inlined across handlers.user_organization.role_idor add the raising trigger. PR 2d.middleware/admin.go. PR 2d onceProceduresPendingMigrationempties.Test plan
go build ./...cleango vet ./...cleango test ./internal/handlers/middleware/...green (contract test still passes; the map matches every swapped handler).go test ./internal/handlers/buildings/... ./internal/handlers/sites/... ./internal/handlers/curtailment/...green after switching handler-level gate tests from role-based to permission-based assertions.go test ./internal/handlers/apikey/... -run TestAPIKeyHandler_SanitizesInternalErrorsgreen after wiringEffectivePermissionsthrough the test'sadminAuthInjector.go test ./internal/domain/auth/...green. Privilege-parity helper covered byTestRequireCallerCanManageTarget(super_admin/admin/field_tech/custom-role combos, including the equality block, the site-scope laundering case, and the operator-granted-role:manage-custom-role bypass).go test ./internal/domain/authz/...green.TestEffective_IsSubsumedBycovers scope-aware subset semantics directly.TestReconcile_NewCatalogPermissionDoesNotPropagateToExistingBuiltinsconfirms operator-revoked permissions still survive restart (additive contract intact); the migration is what propagates seed-formula changes to upgraded orgs.go test ./...green.Rollback
Reverting this PR reinstates
RequireAdmin,checkCanManageUser, and the local helper functions. Migration000060_backfill_admin_user_managehas a.down.sqlthat removesuser:readanduser:managefrom every ADMIN row; note this also removes the keys from orgs that already held them pre-backfill (provenance isn't tracked, so the down migration over-corrects on first-install orgs — acceptable in dev rollback scenarios but not for prod recovery).