Skip to content

chore(openspec): amend prohibition-gate spec for reworked OR contract (#123)#148

Open
rjzondervan wants to merge 3 commits into
developmentfrom
feat/123/anonymisation-grondslagen-and-prohibition-gate-amend
Open

chore(openspec): amend prohibition-gate spec for reworked OR contract (#123)#148
rjzondervan wants to merge 3 commits into
developmentfrom
feat/123/anonymisation-grondslagen-and-prohibition-gate-amend

Conversation

@rjzondervan
Copy link
Copy Markdown
Member

Summary

Spec-only amend to anonymisation-grondslagen-and-prohibition-gate. The OpenRegister-side entity-relation-grondslagen change (PR ConductionNL/openregister#1503) was reworked during explore mode after we discovered that OpenAnonymiser is a detection-time backend, not an anonymise-time backend. The OR contract changed:

  • Original spec assumed the anonymise endpoint would accept a bases array and forward to OpenAnonymiser.
  • Reworked OR contract: bases and skipAnonymization are decoupled into a separate PATCH /api/entity-relations/{id} endpoint. The anonymise call stays unchanged in shape (POST /api/anonymization/anonymize/{fileId} with the existing entities payload).
  • DocuDesk's prohibition-gate now consumes the decoupled OR contract: PATCH per relation to record decision metadata, then call anonymise. Skip-anonymise decisions are honoured by OR's DocumentProcessingHandler via the new findSkippedEntityValuesForFile DI guard.

This PR amends proposal.md, design.md, tasks.md, and the two spec.md files in the change to reflect the reworked contract. No code change yet — the implementation lands when both 1.2 (entity-publication-policies) and 1.3 (entity-relation-grondslagen) have merged.

Dependencies

  • Hard: OpenRegister PR #1503 (entity-relation-grondslagen) provides the PATCH endpoint + skip-aware DI guard this gate consumes.
  • Soft: DocuDesk PR feat: entity-level publication policy layer (#112) #147 (entity-publication-policies) provides the policy layer that's downstream of the prohibition gate.

Test plan

  • openspec validate anonymisation-grondslagen-and-prohibition-gate clean.
  • Spec text references the correct OR endpoints + method names (PATCH /api/entity-relations/{id}, EntityRelationMapper::findSkippedEntityValuesForFile, updateDecisionMetadata).
  • Cross-reference from design.md to PR #1503 + feat: entity-level publication policy layer (#112) #147 for dependency context.

… for new OR contract

OR's reworked entity-relation-grondslagen change (commit ef5464b94 on
openregister:feat/1435/entity-relation-grondslagen-impl) decouples
bases-set from the anonymise call. Bases live on the new
PATCH /api/entity-relations/{id} endpoint (or its DI parallel
EntityRelationMapper::updateDecisionMetadata). This DD amendment
brings the DocuDesk-side spec in line with that contract.

Major changes:

- DROP bases-pass-through. DocuDesk's anonymise payload no longer
  carries per-entity bases. A stray bases field on an incoming
  payload entry is silently dropped (backwards-compatible no-op).
  Bases are set by the caller (frontend, batch tooling) directly
  on OR's PATCH endpoint between extract and anonymise.

- ADD override-acknowledge side effects. When DD's anonymise endpoint
  receives a validated acknowledgedOverrides[] entry, DD must:
  1. Persist a DD-side audit entry (ruleId, entityRelationId, fileId,
     reason, acknowledgedBy, acknowledgedAt) — captures *why* the
     operator released a flagged entity from anonymisation. OR's
     audit-trail records only *what* changed on the row; DD's
     records the operator commentary.
  2. PATCH OR's matching EntityRelation with
     { skipAnonymization: true } via the DI mapper.
  3. Proceed with the anonymise call — OR's markAsAnonymized excludes
     skip-flagged rows automatically.
  DD audit entry MUST be written before the OR PATCH (so a PATCH
  failure leaves the operator intent recorded for forensics/retry).
  Multiple overrides commit atomically: all OR PATCHes succeed or
  the request returns HTTP 500.

- ADD prohibitionOverrideAudit storage hint. Smallest implementation
  is a new schema in docudesk_register.json; implementations may
  choose alternative persistent stores if they fit DD's existing
  audit-log conventions.

- TIGHTEN cross-app dep. The dep on OR's entity-relation-grondslagen
  is now HARD (not Soft): DD's override path calls OR's DI method
  directly. Apply ordering: OR rework lands first, then DD amend.

- UNCHANGED: prohibition gate (high-confidence must be in set; 422
  on missing; threshold config). prohibitionMatch on extract.
  suggestedBases on consolidated-entities response.

Refs: #123
Pair: openregister:feat/1435/entity-relation-grondslagen-impl (ef5464b94)
@github-actions
Copy link
Copy Markdown
Contributor

Quality Report — ConductionNL/docudesk @ e7f05af

Check PHP Vue Security License Tests
lint
phpcs
phpmd
psalm
phpstan
phpmetrics
eslint
stylelint
composer ✅ 108/108
npm
PHPUnit ⏭️
Newman ⏭️
Playwright ⏭️

Quality workflow — 2026-05-15 08:26 UTC

Download the full PDF report from the workflow artifacts.

Copy link
Copy Markdown

@WilcoLouwerse WilcoLouwerse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

REQUEST_CHANGES — Strict mode. 1 blocker, 2 significant issues.

🔴 Blockers (1): The spec's atomicity Requirement is undeliverable: it states all override PATCHes commit atomically (all-or-none), but the implementation sends individual per-relation PATCHes to OpenRegister sequentially with no transaction or rollback on partial failure.

🟡 Significant (2): prohibitionOverrideAudit storage location stated as "smallest" without a SHOULD/MUST preference — both options have different data-retention and query-complexity implications; typo "side-side audit entry" (should be "side-by-side").

All CI failures are pre-existing on the development branch.

…rides

DD #148 reviewer flagged the atomicity Requirement as undeliverable: the
implementation sends individual per-relation PATCHes to OpenRegister
sequentially with no transaction or rollback on partial failure, but the
spec claimed "all overrides commit, or none do".

**Blocker — atomicity rewrite.** Spec updated to explicitly model
sequential best-effort with stop-on-first-failure semantics:

- The override loop iterates `acknowledgedOverrides` in submission order.
- Each override commits its DD audit entry FIRST, then its OR PATCH (so
  the audit captures operator intent even when the PATCH later fails).
- A PATCH failure stops further processing for this request and returns
  HTTP 500. Already-committed DD audit + OR PATCH writes from earlier
  overrides in the same request are NOT rolled back — they form a
  per-relation audit trail of operator intent.
- The anonymise call is NOT forwarded if any override failed.
- Retries are safe: the skip flag is idempotent on OR. A repeat PATCH
  on an already-skipped relation is a semantic no-op on OR's side per
  the `entity-relation-grondslagen` spec; a retry writes a fresh DD
  audit entry (each operator intent event is its own row).

The "Multiple overrides commit atomically" scenario is renamed to
"Multiple overrides commit sequentially (happy path)". The "OR PATCH
fails" scenario is rewritten to assert per-relation effects (R7
committed before failure, R8 audit-only, R9 untouched, anonymise call
NOT forwarded). A new "Retry after partial failure is idempotent"
scenario covers the replay semantics.

**Significant — storage location mandated.** The
`prohibitionOverrideAudit` storage location was ambiguous ("smallest
implementation is X; alternative is Y; pick whichever fits"). Spec now
mandates a `prohibitionOverrideAudit` schema on `docudesk_register.json`
with three explicit reasons: queryable via the existing /objects
endpoints, reuses OpenRegister's retention + RBAC story, and gives the
audit a single source of truth so the "which audit do we trust?"
question never needs an answer. Additional sinks (logger payload,
audit-log table) MAY be added but the register-backed entry is the
mandated one. Updated in both spec.md, tasks.md (6.5), and design.md
(D10).

**Significant — typo.** `side-side audit entry` → `side-by-side audit
entry` in the Scenario at line 149.

Design.md gains a new §D10a section spelling out the no-request-level-
transaction decision, including two edge-case discussions: parallel
operators acknowledging the same low-confidence match, and partial-
failure-then-retry replay semantics.
@github-actions
Copy link
Copy Markdown
Contributor

Quality Report — ConductionNL/docudesk @ 45e776e

Check PHP Vue Security License Tests
lint
phpcs
phpmd
psalm
phpstan
phpmetrics
eslint
stylelint
composer ✅ 108/108
npm
PHPUnit ⏭️
Newman ⏭️
Playwright ⏭️

Quality workflow — 2026-05-15 15:01 UTC

Download the full PDF report from the workflow artifacts.

@rjzondervan
Copy link
Copy Markdown
Member Author

Review response — commit 3bc972d

Addressed the blocker + both significant items.

Blocker — atomicity rewrite

The original spec asserted "all overrides commit atomically (all-or-none)" but the implementation sends individual per-relation PATCHes to OpenRegister sequentially with no transaction or rollback. Per the user-confirmed scoping decision (sequential semantics + best-effort + per-relation audit trail), the spec now models this honestly:

  • The override loop iterates acknowledgedOverrides in submission order.
  • Each override commits its DD audit entry FIRST, then its OR PATCH (so the audit captures operator intent even when the PATCH later fails).
  • A PATCH failure stops further processing for this request and returns HTTP 500. Already-committed DD audit + OR PATCH writes from earlier overrides in the same request are NOT rolled back — they form a per-relation audit trail of operator intent.
  • The anonymise call is NOT forwarded if any override failed.
  • Retries are safe: the skip flag is idempotent on OR. A repeat PATCH on an already-skipped relation is a semantic no-op per the entity-relation-grondslagen spec; a retry writes a fresh DD audit entry (each operator intent event is its own row).

The "Multiple overrides commit atomically" scenario is renamed to "Multiple overrides commit sequentially (happy path)". The "OR PATCH fails" scenario is rewritten to assert per-relation effects (R7 committed before failure, R8 audit-only, R9 untouched, anonymise call NOT forwarded). A new "Retry after partial failure is idempotent" scenario covers the replay semantics.

Design.md gains a new §D10a section spelling out the no-request-level-transaction decision, including two edge-case discussions: parallel operators acknowledging the same low-confidence match, and partial-failure-then-retry replay semantics.

Significant — storage location mandated

The prohibitionOverrideAudit storage location was ambiguous ("smallest implementation is X; alternative is Y; pick whichever fits"). Spec now mandates a prohibitionOverrideAudit schema on docudesk_register.json with three explicit reasons:

  1. Queryable via the existing /objects endpoints (operators run audit lookups through that surface today).
  2. Reuses OpenRegister's retention + RBAC story (no parallel infrastructure).
  3. Gives the audit a single source of truth so the "which audit do we trust?" question never needs an answer.

Additional sinks (logger payload, audit-log table) MAY be added, but the register-backed entry is the mandated one. Updated in spec.md (the override-acknowledge Requirement), tasks.md (item 6.5), and design.md (D10).

Significant — typo

side-side audit entryside-by-side audit entry in the Scenario at line 149.

WilcoLouwerse
WilcoLouwerse previously approved these changes May 18, 2026
Copy link
Copy Markdown

@WilcoLouwerse WilcoLouwerse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-review (Quick) — both prior findings resolved: 🔴 atomicity claim rewritten to honest sequential best-effort semantics with audit-first + stop-on-first-failure (the three rewritten scenarios are self-consistent and design.md §D10a captures the no-request-level-transaction decision with rationale), 🟡 "side-side" typo fixed.

Two minor 🟡 documentation follow-ups (non-blocking):

  • The new "Retry after partial failure is idempotent" scenario's GIVEN says "the underlying R8 issue has been resolved on the next attempt" without specifying mechanism — fine for a happy-path scenario, but the failure-on-retry path (where R8 stays broken) is undocumented. Could add a one-liner noting that a second failure produces the same stop-on-first-failure outcome (more DD audit rows, same HTTP 500).
  • proposal.md still reads "Smallest implementation: add a prohibitionOverrideAudit schema to docudesk_register.json … or place it under a separate audit register" — contradicts spec.md / tasks.md / design.md §D10 which now mandate docudesk_register.json. Worth aligning the proposal bullet for consistency.

…misation-grondslagen-and-prohibition-gate-amend

# Conflicts:
#	openspec/changes/anonymisation-grondslagen-and-prohibition-gate/design.md
#	openspec/changes/anonymisation-grondslagen-and-prohibition-gate/proposal.md
#	openspec/changes/anonymisation-grondslagen-and-prohibition-gate/tasks.md
@github-actions
Copy link
Copy Markdown
Contributor

Quality Report — ConductionNL/docudesk @ 0494bdf

Check PHP Vue Security License Tests
lint
phpcs
phpmd
psalm
phpstan
phpmetrics
eslint
stylelint
composer ✅ 108/108
npm ✅ 529/529
PHPUnit
Newman ⏭️
Playwright ⏭️

Coverage: 0% (0/10 statements)


Quality workflow — 2026-05-19 03:22 UTC

Download the full PDF report from the workflow artifacts.

Copy link
Copy Markdown

@WilcoLouwerse WilcoLouwerse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Standard re-review follow-up notes — none block. See verdict review for APPROVE.


1. **Persist a DocuDesk-side audit entry** capturing `{ruleId, entityRelationId, fileId, reason, acknowledgedBy, acknowledgedAt}`. Implementations MUST use a `prohibitionOverrideAudit` schema in `docudesk_register.json` for this entry, alongside the existing schemas. (Alternative persistent stores — dedicated audit-log table, structured-logger payload — were considered and rejected: keeping audit on the register surface keeps DD's audit volume queryable via the existing `objects` endpoints and reuses the standard OpenRegister retention/RBAC story. Implementations MAY add an additional sink, but the register-backed entry is the mandated one.) This entry records *why* the operator chose to release a flagged entity from anonymisation — operator-commentary metadata that OpenRegister's audit-trail does not carry (OR's PATCH whitelist is decision-only).
2. **PATCH OpenRegister's matching `EntityRelation` row** with `{skipAnonymization: true}` via `EntityRelationMapper::updateDecisionMetadata` (via OR's DI lookup). The DocuDesk-side audit entry MUST be written BEFORE the corresponding OR PATCH so a failure of the OR call doesn't leave the override unrecorded on the DD side. Each override is processed sequentially (per-relation audit + PATCH pair). Best-effort semantics: if one OR PATCH fails, DocuDesk MUST stop processing further overrides in the same request and respond with HTTP 500. Already-committed DD audit entries and already-applied OR PATCHes from earlier overrides in the same request are NOT rolled back — they remain on disk as a per-relation audit trail of operator intent. The skip flag is idempotent (PATCH-ing the same relation again with `skipAnonymization: true` is a semantic no-op on OR's side), so a retry replays cleanly.
3. **Proceed with the anonymise call** to OpenRegister. OR's `markAsAnonymized` will already exclude the skip-flagged row from the redaction set — no further DocuDesk work is needed for execution.
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Scenario asserts anonymise-call abort, but the Requirement body doesn't say so. Step 2 says "stop processing further overrides … HTTP 500" and step 3 says "Proceed with the anonymise call to OpenRegister" — but the "OR PATCH fails" Scenario at line 223 asserts "the anonymise call MUST NOT have been forwarded to OpenRegister". A reader of the Requirement alone could plausibly conclude the anonymise call still proceeds for already-flipped relations (e.g. R7's skipAnonymization=true is committed; OR could still redact the file with R7 excluded). The scenario's behaviour is the right one; the Requirement should be explicit.

Suggested fix: add a sub-clause to step 2 or replace step 3 with: "If any per-override PATCH fails, the anonymise call to OR MUST NOT be issued in this request — the operator must retry. Otherwise, proceed with the anonymise call."

- **AND** the operator retries the same request with the same overrides for R7, R8, R9
- **AND** the underlying R8 issue has been resolved on the next attempt
- **WHEN** DocuDesk processes the retry
- **THEN** R7's repeat PATCH MUST be a semantic no-op on OR (skip is already true; no duplicate OR audit entry per OR's `entity-relation-grondslagen` spec)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Retry-idempotency anchor is unanchored. This bullet load-bears "no duplicate OR audit entry per OR's entity-relation-grondslagen spec" — that's the correctness anchor for the entire retry-is-safe story, but entity-relation-grondslagen is referenced by short name with no link or quoted Requirement ID. If OR's spec doesn't actually guarantee no-op-on-identical-PATCH dedup, this scenario is wrong and the retry semantics collapse.

Suggested fix: either cite the exact OR requirement (e.g. openspec/changes/entity-relation-grondslagen/specs/.../spec.md#Requirement: …) or soften the assertion to "SHOULD be a no-op subject to OR's idempotency contract — see PR ConductionNL/openregister#1503" and capture the cross-spec dependency in design.md.

For every `acknowledgedOverrides` entry that validates per the previous Requirement, DocuDesk's controller MUST:

1. **Persist a DocuDesk-side audit entry** capturing `{ruleId, entityRelationId, fileId, reason, acknowledgedBy, acknowledgedAt}`. Implementations MUST use a `prohibitionOverrideAudit` schema in `docudesk_register.json` for this entry, alongside the existing schemas. (Alternative persistent stores — dedicated audit-log table, structured-logger payload — were considered and rejected: keeping audit on the register surface keeps DD's audit volume queryable via the existing `objects` endpoints and reuses the standard OpenRegister retention/RBAC story. Implementations MAY add an additional sink, but the register-backed entry is the mandated one.) This entry records *why* the operator chose to release a flagged entity from anonymisation — operator-commentary metadata that OpenRegister's audit-trail does not carry (OR's PATCH whitelist is decision-only).
2. **PATCH OpenRegister's matching `EntityRelation` row** with `{skipAnonymization: true}` via `EntityRelationMapper::updateDecisionMetadata` (via OR's DI lookup). The DocuDesk-side audit entry MUST be written BEFORE the corresponding OR PATCH so a failure of the OR call doesn't leave the override unrecorded on the DD side. Each override is processed sequentially (per-relation audit + PATCH pair). Best-effort semantics: if one OR PATCH fails, DocuDesk MUST stop processing further overrides in the same request and respond with HTTP 500. Already-committed DD audit entries and already-applied OR PATCHes from earlier overrides in the same request are NOT rolled back — they remain on disk as a per-relation audit trail of operator intent. The skip flag is idempotent (PATCH-ing the same relation again with `skipAnonymization: true` is a semantic no-op on OR's side), so a retry replays cleanly.
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 No remediation surface for partial-commit state. The spec explicitly accepts that R7's DD audit + OR skip remain committed when R8's PATCH fails — fine, but the HTTP 500 response has no schema, so the operator can't tell which relations committed. From their perspective the request "failed" and the safe assumption ("retry the whole thing") is exactly what the next scenario covers — but only because the retry happens to be idempotent. If the next attempt brought a different acknowledgedOverrides set, observability would matter.

Suggested fix: add {committedRelationIds: [<ids>], failedRelationId: <id>, error: "<message>"} to the 500 response body, or — if that's intentionally out-of-scope — add a **Design note:** line below the Requirement noting that retry-with-same-payload is the prescribed operator workflow.

# Anonymization — Delta for Prohibition Flag (Bases Pass-Through Removed)

This delta extends the existing `anonymization` capability so the per-document anonymise endpoint can carry per-entity legal bases (grondslagen) through to OpenRegister, and so the extract endpoint surfaces prohibition matches in its response. No existing requirement is changed in behaviour; both additions are non-breaking and additive.
This delta extends the existing `anonymization` capability so the extract endpoint surfaces prohibition matches in its response. **Bases pass-through was removed from this delta in the post-explore-mode rework (2026-05-12)** — bases are now set per-relation on OR's own `PATCH /api/entity-relations/{id}` endpoint, not threaded through DocuDesk's anonymise payload. The `entities[]` shape on DocuDesk's anonymise endpoint stays at `{text, entityType, key, ...}` — no `bases` field is added to it.
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟢 Coverage-change rationale missing. Pre-amend version had four scenarios (no-bases / with-bases / empty-array / mixed-payload). New version has three (no-bases / stray-ignored / OR-PATCH-first). The mixed-payload scenario is gone — which is correct, since bases is now universally ignored and a "mixed" case is no longer distinguishable from "all stray" — but the delta preamble doesn't note the coverage change.

Suggested fix: add one line: "Mixed-bases scenario removed: no longer reachable since bases is universally ignored on the DocuDesk side."

Copy link
Copy Markdown

@WilcoLouwerse WilcoLouwerse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both prior findings verified fixed: the 🔴 atomicity rewrite now correctly models per-override best-effort semantics with idempotent retry, and the 🟡 typo is corrected. Four follow-up notes posted (none block) — three 🟡 spec-clarity items (scenario↔requirement mismatch on anonymise-call abort, unanchored OR-spec idempotency claim, no remediation surface for partial-commit state) and one 🟢 polish item (coverage-change rationale). Approving.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants