Skip to content

feat(checkpoints): FGAC checkpoint routes, enforce via parent task pe…#260

Open
jenniechung wants to merge 2 commits into
mainfrom
jenniechung/AGX1-302-checkpoint-route-fgac
Open

feat(checkpoints): FGAC checkpoint routes, enforce via parent task pe…#260
jenniechung wants to merge 2 commits into
mainfrom
jenniechung/AGX1-302-checkpoint-route-fgac

Conversation

@jenniechung
Copy link
Copy Markdown

@jenniechung jenniechung commented Jun 1, 2026

Summary

Enforce Spark AuthZ on the checkpoint routes by delegating to the owning task. Checkpoints have no SpiceDB type of their ownthread_id is the owning task id (LangGraph checkpointer convention; the SDK templates set configurable.thread_id = task.id), so each route checks the parent task directly via DAuthorizedBodyId(task, …, field_name="thread_id"). No register_resource, no dual-write — the authz request only ever carries resource type task.

Operation mapping (matches the task SpiceDB schema):

Route Operation Task permission
get-tuple, list read viewer + editor + owner
put, put-writes update editor + owner
delete-thread delete owner

A denied task check collapses to 404 (not 403) so callers cannot probe cross-tenant existence by comparing response codes.

Switches put / put-writes from executeupdate: execute maps to no task permission in the schema (it would lock everyone out), whereas task.update = (editor + owner) & internal_tenant_gate lets the editor role write checkpoints without owner.

Changes

  • src/utils/authorization_shortcuts.pyDAuthorizedBodyId collapses a denied task check into 404 (AuthorizationError → ItemDoesNotExist); non-task resources keep surfacing 403. Carries a TODO to converge on the canonical check_task_or_collapse_to_404 helper from AGX1-275 / feat(AGX1-275): per-RPC task permission rewire and 404/403 wrap #249.
  • src/api/routes/checkpoints.pyput / put-writes: execute → update. (get-tuple / list already read; delete-thread already delete.)
  • tests/integration/api/checkpoints/test_checkpoints_authz_api.py (new) — integration tests covering the auth matrix.

Behavior change beyond the ticket

The 404 collapse lives in DAuthorizedBodyId's task branch, so it also applies to the other body-id task routes (message POST/PUT, state create) — those now collapse a denied check to 404 as well. Same cross-tenant-leak rationale, and aligns with the canonical wrap landing in AGX1-275 (#249).

Out of scope

Coordination

AGX1-275 (#249) introduces the shared check_task_or_collapse_to_404 helper in src/utils/task_authorization.py and the same execute → update swap across checkpoints.py / messages.py / states.py. This PR's inline DAuthorizedBodyId task-wrap (flagged with a TODO) converges on that helper when the two land together — trivial merge regardless of order.

Pre-merge wire dependency: the execute → update swap changes the operation literal hitting agentex-auth. update is already in SGP's AgentexOperation enum on master, so no additional gating is needed for this PR.

Test Plan

  • New: tests/integration/api/checkpoints/test_checkpoints_authz_api.py — 7/7 passing:
    • get-tuple / list authorized → 200 (one task/read check); denied → 404 via collapse.
    • put authorized → 200 (one task/update check); denied → 404.
    • delete-thread authorized → 204 (one task/delete check).
  • Ruff lint + format clean across changed files.
  • Regression: tests/integration/api/{messages,states}/, since the shared DAuthorizedBodyId task-wrap changes their write-route deny code from 403 → 404. Both suites pass; no regressions from the DAuthorizedBodyId task-wrap change.

Linear Issue

resolves https://linear.app/scale-epd/issue/AGX1-302, Parent: AGX1-299.

Greptile Summary

This PR enforces FGAC authorization on the five checkpoint routes by delegating to the parent task (thread_id is the task ID), and fixes put/put-writes from the non-functional execute operation to task.update. A new task-specific branch in DAuthorizedBodyId collapses AuthorizationErrorItemDoesNotExist (404) to prevent cross-tenant probing, with a TODO tracking convergence on the shared check_task_or_collapse_to_404 helper from AGX1-275.

  • checkpoints.py: put and put-writes now check task.update (editor + owner) instead of task.execute (unmapped, would lock everyone out). get-tuple/list remain read; delete-thread remains delete.
  • authorization_shortcuts.py: DAuthorizedBodyId gains a task branch that re-raises AuthorizationError as 404, keeping gateway/service errors (502/503) unchanged. The change also silently affects other body-id task routes (message POST/PUT, state create) as documented in the PR.
  • test_checkpoints_authz_api.py: Seven new integration tests cover the auth matrix; put-writes is not tested (authorized or denied), and the delete-thread denied path is absent.

Confidence Score: 4/5

Safe to merge with minor test gaps; the authorization logic itself is correct and well-scoped.

The core changes in checkpoints.py and authorization_shortcuts.py are correct. The test file omits put-writes entirely and skips the delete-thread denied path, leaving two gaps in the authorization coverage the PR set out to verify.

agentex/tests/integration/api/checkpoints/test_checkpoints_authz_api.py is missing put-writes tests and the delete-thread denied path.

Important Files Changed

Filename Overview
agentex/src/api/routes/checkpoints.py Swaps execute → update on put and put-writes routes; all five routes now enforce the correct task operation via DAuthorizedBodyId.
agentex/src/utils/authorization_shortcuts.py Adds a task-specific branch to DAuthorizedBodyId that catches AuthorizationError and re-raises as ItemDoesNotExist (404). Non-task resources still surface 403; gateway/service errors propagate correctly.
agentex/tests/integration/api/checkpoints/test_checkpoints_authz_api.py New integration test suite covering get-tuple, list, put, and delete-thread authorization, but put-writes is entirely untested and the delete-thread denied path is missing.

Sequence Diagram

sequenceDiagram
    participant Client
    participant CheckpointRoute
    participant DAuthorizedBodyId
    participant AuthzService
    participant CheckpointsUseCase

    Client->>CheckpointRoute: "POST /checkpoints/{route}"
    CheckpointRoute->>DAuthorizedBodyId: "resolve dependency field_name=thread_id"
    DAuthorizedBodyId->>DAuthorizedBodyId: "body[thread_id] = field_value"

    alt "resource_type == task"
        DAuthorizedBodyId->>AuthzService: check(task, field_value, operation)
        alt Authorized
            AuthzService-->>DAuthorizedBodyId: OK
            DAuthorizedBodyId-->>CheckpointRoute: field_value
            CheckpointRoute->>CheckpointsUseCase: execute business logic
            CheckpointsUseCase-->>Client: 200 or 204
        else AuthorizationError
            AuthzService-->>DAuthorizedBodyId: AuthorizationError
            DAuthorizedBodyId->>DAuthorizedBodyId: raise ItemDoesNotExist
            DAuthorizedBodyId-->>Client: 404
        else Gateway or ServiceError
            AuthzService-->>DAuthorizedBodyId: 502 or 503
            DAuthorizedBodyId-->>Client: 502 or 503
        end
    else other resource_type
        DAuthorizedBodyId->>AuthzService: check(resource, operation)
        AuthzService-->>Client: 403 on denial
    end
Loading

Fix All in Cursor Fix All in Claude Code Fix All in Codex

Prompt To Fix All With AI
Fix the following 2 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 2
agentex/tests/integration/api/checkpoints/test_checkpoints_authz_api.py:61
**Missing `put-writes` test coverage**

The `put-writes` endpoint had the same `execute → update` operation change as `put`, but has no tests at all — neither an authorized path (verifying the `update` operation is sent to the authz service) nor a denied path (verifying the 404 collapse). The PR test plan lists 7 tests and explicitly names the other four routes, but `put-writes` is absent. A regression that re-introduced `execute` on this route would go undetected.

### Issue 2 of 2
agentex/tests/integration/api/checkpoints/test_checkpoints_authz_api.py:325-369
**No denied-path test for `delete-thread`**

Every other route with a denied case has a corresponding `_unauthorized_returns_404` test (get-tuple, list, put), but `delete-thread` only has the authorized 204 path. Because `delete` maps to the owner role only — the most restrictive permission — validating that a denied check collapses to 404 rather than 403 is particularly important for this route to preserve the cross-tenant opacity guarantee.

Reviews (1): Last reviewed commit: "move back imports" | Re-trigger Greptile

Greptile also left 2 inline comments on this PR.

@jenniechung jenniechung marked this pull request as ready for review June 1, 2026 23:00
@jenniechung jenniechung requested a review from a team as a code owner June 1, 2026 23:01
if path == "/v1/authz/search":
return {"items": []}
raise Exception(f"Unknown path: {path}")

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Missing put-writes test coverage

The put-writes endpoint had the same execute → update operation change as put, but has no tests at all — neither an authorized path (verifying the update operation is sent to the authz service) nor a denied path (verifying the 404 collapse). The PR test plan lists 7 tests and explicitly names the other four routes, but put-writes is absent. A regression that re-introduced execute on this route would go undetected.

Prompt To Fix With AI
This is a comment left during a code review.
Path: agentex/tests/integration/api/checkpoints/test_checkpoints_authz_api.py
Line: 61

Comment:
**Missing `put-writes` test coverage**

The `put-writes` endpoint had the same `execute → update` operation change as `put`, but has no tests at all — neither an authorized path (verifying the `update` operation is sent to the authz service) nor a denied path (verifying the 404 collapse). The PR test plan lists 7 tests and explicitly names the other four routes, but `put-writes` is absent. A regression that re-introduced `execute` on this route would go undetected.

How can I resolve this? If you propose a fix, please make it concise.

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Fix in Cursor Fix in Claude Code Fix in Codex

Comment on lines +325 to +369
"metadata": {"step": 1},
"blobs": [],
},
)
assert response.status_code == 404

# ── delete-thread ──

@pytest.mark.asyncio
@patch(
"src.api.authentication_middleware.AgentexAuthMiddleware.is_enabled",
return_value=True,
)
@patch(
"src.domain.services.authorization_service.AuthorizationService.is_enabled",
return_value=True,
)
@patch(
"src.utils.http_request_handler.HttpRequestHandler.post_with_error_handling",
side_effect=_mock_post_factory(),
)
async def test_delete_thread_authorized_returns_204(
self,
post_with_error_handling_mock,
is_enabled_authorization_mock,
is_enabled_mock,
isolated_client,
test_checkpoint,
test_task,
):
response = await isolated_client.post(
"/checkpoints/delete-thread", json={"thread_id": test_task.id}
)
assert response.status_code == 204

check_calls = [
call
for call in post_with_error_handling_mock.call_args_list
if call[0][1] == "/v1/authz/check"
]
assert len(check_calls) == 1
authz_data = check_calls[0][1]["json"]
assert authz_data["resource"]["type"] == AgentexResourceType.task.value
assert authz_data["resource"]["selector"] == test_task.id
assert authz_data["operation"] == "delete"
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 No denied-path test for delete-thread

Every other route with a denied case has a corresponding _unauthorized_returns_404 test (get-tuple, list, put), but delete-thread only has the authorized 204 path. Because delete maps to the owner role only — the most restrictive permission — validating that a denied check collapses to 404 rather than 403 is particularly important for this route to preserve the cross-tenant opacity guarantee.

Prompt To Fix With AI
This is a comment left during a code review.
Path: agentex/tests/integration/api/checkpoints/test_checkpoints_authz_api.py
Line: 325-369

Comment:
**No denied-path test for `delete-thread`**

Every other route with a denied case has a corresponding `_unauthorized_returns_404` test (get-tuple, list, put), but `delete-thread` only has the authorized 204 path. Because `delete` maps to the owner role only — the most restrictive permission — validating that a denied check collapses to 404 rather than 403 is particularly important for this route to preserve the cross-tenant opacity guarantee.

How can I resolve this? If you propose a fix, please make it concise.

Fix in Cursor Fix in Claude Code Fix in Codex

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant