Phase 4: Approval workflow (high-risk actions → parent app) #122

## Context

In M1-M3, the policy engine is deterministic — accept or deny based on signed cap-tokens and rule sets. Some actions sit between deterministic-allow and deterministic-deny: "I trust this agent but want eyes on THIS specific request." Approval workflows are the surface that handles those.

This issue graduates the M1 schema-only `approval.request` to production. High-risk actions (large payments, cross-namespace memory writes, scope expansions) push a notification to the parent app; parent taps approve/deny; agent proceeds (or fails). Audit row for every decision.

Per [`milestones-roadmap.md` §5](https://github.com/litentry/agentKeys/blob/main/docs/spec/plans/milestones-roadmap.md), this is M4 depth that enables enterprise-tier vendor pilots — without approval workflows, regulated B2B use-cases (kids' devices, health-adjacent agents) can't ship.

## Scope (M4)

### "High-risk" policy definition

Configurable per vendor + per actor; vendor settings (#113) expose policy templates, actor-level overrides via the parent UI (#110 → graduated #115 dashboard).
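
A minimal sketch of how a vendor template and an actor-level override could combine, assuming a simple "override wins where set" rule. The names `HighRiskPolicy`, `ActorOverride`, and `resolve_policy` are hypothetical, and the `max_approvals_per_hour` default of 10 is a placeholder rather than a product decision.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class HighRiskPolicy:
    """Vendor-level template (hypothetical shape)."""
    spend_threshold: int = 500          # default from this issue, in yen
    max_approvals_per_hour: int = 10    # placeholder value, not from this issue


@dataclass(frozen=True)
class ActorOverride:
    """Per-actor override set by a parent; None means 'inherit from vendor'."""
    spend_threshold: Optional[int] = None
    max_approvals_per_hour: Optional[int] = None


def resolve_policy(vendor: HighRiskPolicy,
                   override: Optional[ActorOverride]) -> HighRiskPolicy:
    """Return the effective policy: vendor template with actor overrides applied."""
    if override is None:
        return vendor
    changes = {k: v for k, v in vars(override).items() if v is not None}
    return replace(vendor, **changes)
```

For example, `resolve_policy(HighRiskPolicy(), ActorOverride(spend_threshold=200))` tightens the payment threshold for one child device while leaving the vendor's other settings intact.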

Default-shipping high-risk triggers:
- **Payment over X**: payment cap-mint where `amount > spend_threshold` (vendor-configurable, default ¥500)
- **Cred write for sensitive service**: `cred.put` to an explicit sensitive-service allowlist (banking, healthcare, identity-doc storage)
- **Memory write to `family` namespace from a non-family-context device**: the device's `vendor_context` field is not tagged as family-context, yet the device tries to write to the `family` namespace
- **Scope expansion**: an existing cap mints a child cap with a SCOPE that approaches the parent's max (e.g., 90%+ of parent's scope; defined by per-vendor policy)
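
The four default triggers above could be expressed as one deterministic function. This is only a sketch: the `Request` shape, the action names other than `cred.put`, the service labels, and the reason strings `family_namespace_write` and `scope_expansion` are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical labels for the sensitive-service allowlist.
SENSITIVE_SERVICES = frozenset({"banking", "healthcare", "identity-docs"})


@dataclass(frozen=True)
class Request:
    action: str                   # "cred.put" is from this issue; other names are assumed
    amount: int = 0               # payment amount, in yen
    service: str = ""             # target service for cred.put
    namespace: str = ""           # target namespace for memory writes
    family_context: bool = False  # derived from the device's vendor_context
    scope_ratio: float = 0.0      # child scope as a fraction of the parent's max


def high_risk_reason(req: Request,
                     spend_threshold: int = 500,
                     scope_ratio_limit: float = 0.9) -> Optional[str]:
    """Return a reason string if the request is high-risk, else None."""
    if req.action == "payment.mint" and req.amount > spend_threshold:
        return "high_risk_payment"
    if req.action == "cred.put" and req.service in SENSITIVE_SERVICES:
        return "sensitive_cred"
    if (req.action == "memory.write" and req.namespace == "family"
            and not req.family_context):
        return "family_namespace_write"
    if req.action == "cap.mint_child" and req.scope_ratio >= scope_ratio_limit:
        return "scope_expansion"
    return None
```

Because the function is pure (same inputs, same answer), its result can be replayed later from the audit row and the policy version in force at the time.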

### Approval request flow

```
1. Agent calls agentkeys.approval.request(actor, action, params)
2. AgentKeys policy engine evaluates: is this high-risk?
   - No: return immediate decision (cap-mint or deny) → standard path
   - Yes: enter approval flow ↓
3. Push notification to parent app (web UI #115, native app M5)
4. Parent taps Approve or Deny in the UI
5. AgentKeys mints cap-token (Approve) or returns Denied (Deny)
6. Agent proceeds or fails
7. Audit row appended (request + decision + decider identity)
```
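
The steps above can be sketched as a small in-memory service. Everything here is illustrative: `ApprovalService`, its method names, and the `cap-…` token string are hypothetical stand-ins, and storage, push delivery, and the standard cap-mint path are stubbed out.

```python
import uuid
from typing import Callable, Optional


class ApprovalService:
    """In-memory stand-in for steps 1-7 of the flow."""

    def __init__(self,
                 high_risk_reason: Callable[[str, dict], Optional[str]],
                 notify: Callable[[dict], None]):
        self.high_risk_reason = high_risk_reason
        self.notify = notify
        self.pending: dict = {}
        self.audit: list = []

    def request(self, actor: str, action: str, params: dict) -> dict:
        reason = self.high_risk_reason(action, params)        # step 2
        if reason is None:
            return {"status": "standard_path"}                # normal cap-mint or deny, not modeled
        approval_id = str(uuid.uuid4())
        self.pending[approval_id] = {"actor": actor, "action": action,
                                     "params": params, "reason": reason}
        self.notify({"approval_id": approval_id,
                     "actor": actor, "action": action})       # step 3
        return {"status": "pending", "approval_id": approval_id}

    def decide(self, approval_id: str, approved: bool, parent_omni: str) -> dict:
        req = self.pending.pop(approval_id)                   # step 4; KeyError if already decided
        decision = "approved" if approved else "denied"
        self.audit.append({                                   # step 7
            "actor_omni": req["actor"], "action": req["action"],
            "request_params": req["params"], "decision": decision,
            "decider": {"type": "parent", "identity": parent_omni},
            "reason": req["reason"],
        })
        # Step 5: the token string is a placeholder, not a real cap-token.
        return {"decision": decision,
                "cap_token": f"cap-{approval_id}" if approved else None}
```

The agent (step 6) would proceed only when `decide` returns a non-empty `cap_token`.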

### TTL on pending approvals

- Default 5 minutes
- After TTL, request times out → auto-deny → audit row with `decision: "timeout"`
- Configurable per action type (some actions need longer; some shorter)
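
One way to implement the TTL is to store a deadline on each pending approval and make every state change conditional on the approval still being pending. The sketch below assumes a single process guarded by a lock; in a real store this would more likely be a conditional update. `PendingApproval`, `TTL_BY_ACTION`, and the 120-second override are hypothetical.

```python
import threading
from typing import Optional

DEFAULT_TTL_SECONDS = 5 * 60
# Hypothetical per-action override table; values are placeholders.
TTL_BY_ACTION = {"payment.mint": 120}


class PendingApproval:
    def __init__(self, action: str, created_at: float,
                 ttl: Optional[float] = None):
        if ttl is None:
            ttl = TTL_BY_ACTION.get(action, DEFAULT_TTL_SECONDS)
        self.action = action
        self.deadline = created_at + ttl
        self.decision = "pending"
        self._lock = threading.Lock()

    def resolve(self, decision: str, now: float) -> str:
        """Apply a decision only if still pending; return the decision that stands."""
        with self._lock:
            if self.decision == "pending":
                # A parent tap that lands after the deadline is recorded as a timeout.
                self.decision = "timeout" if now >= self.deadline else decision
            return self.decision
```

A background sweeper would call `resolve("timeout", now)` on expired entries; if the parent's tap landed first, that call simply returns the existing decision.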

### Audit row schema

Every approval decision emits:
```
{
  audit_event_id,
  actor_omni,
  action,
  request_params,
  decision: "approved" | "denied" | "timeout",
  decider: { type: "parent" | "system", identity: <user_omni or "system"> },
  decided_at,
  reason?: "high_risk_payment" | "sensitive_cred" | ...,
}
```
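
A sketch of a builder that enforces the schema above. It assumes that timeouts are attributed to the `system` decider, that `audit_event_id` is a UUID, and that `decided_at` is an ISO 8601 UTC timestamp; none of these are specified in this issue. The function name `build_audit_row` is hypothetical.

```python
import uuid
from datetime import datetime, timezone
from typing import Optional

DECISIONS = {"approved", "denied", "timeout"}


def build_audit_row(actor_omni: str, action: str, request_params: dict,
                    decision: str, decider_identity: Optional[str] = None,
                    reason: Optional[str] = None,
                    decided_at: Optional[datetime] = None) -> dict:
    """Build one audit row; raise ValueError on an invalid combination."""
    if decision not in DECISIONS:
        raise ValueError(f"unknown decision: {decision!r}")
    if decision == "timeout":
        # Assumption: timeouts are decided by the system, not a person.
        decider = {"type": "system", "identity": "system"}
    else:
        if not decider_identity:
            raise ValueError("parent decisions need the tapping parent's identity")
        decider = {"type": "parent", "identity": decider_identity}
    row = {
        "audit_event_id": str(uuid.uuid4()),
        "actor_omni": actor_omni,
        "action": action,
        "request_params": request_params,
        "decision": decision,
        "decider": decider,
        "decided_at": (decided_at or datetime.now(timezone.utc)).isoformat(),
    }
    if reason is not None:          # reason is optional in the schema
        row["reason"] = reason
    return row
```

Rejecting a parent decision with no identity at build time keeps rows without a real decider out of the audit chain.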

## Out of scope (defer)

- Multi-parent approvals ("either parent can approve") — M5 with family-sharing UX work
- Approval delegation ("if I'm asleep, my partner can approve") — M5
- ML-based "would this user have approved?" prediction — M5 if vendor demand surfaces
- Approval workflows for non-action requests (e.g., "give me a peek at sensitive memory" pre-approval) — M5

## Acceptance criteria

- [ ] Demo scenario: a toy asks for a ¥600 payment (over the default ¥500 high-risk cap); parent gets a push notification within 200ms; parent taps Approve; payment cap-mint succeeds; audit row reflects approval
- [ ] Same scenario with Deny: payment cap-mint fails; audit row shows denial reason + decider identity
- [ ] Timeout scenario: parent doesn't tap within TTL (5 min default); audit row shows `decision: "timeout"`; agent receives `ApprovalTimeout` error
- [ ] Per-vendor policy template lets a vendor set their default high-risk triggers + thresholds
- [ ] Per-actor override lets a parent tighten or loosen thresholds for their specific child device
- [ ] Push notification delivery verified across iOS Safari + Chrome Android + desktop (web push)
- [ ] Negative test: parent's device is logged out (session JWT expired) → approval request stays queued and surfaces when the parent logs back in; if the TTL has already passed → timeout-deny audit row

## Risks

| Risk | Mitigation |
|---|---|
| Parent doesn't see push notification in time → approval times out → agent fails → bad UX | UI surfaces queued approvals on next login; agent retry semantics documented in #107 |
| High-risk threshold mis-tuned → parents bombarded with approval requests | Per-vendor + per-actor tuning; default thresholds calibrated against M2-M3 vendor pilot data |
| Race: parent approves while a timeout-deny is firing | Approval mutation is conditional on `decision == "pending"`; whichever decision lands first wins, the other is a no-op |
| Approval ITSELF becomes a phishing vector ("evil agent triggers requests to wear parent down") | Rate-limit approval requests per actor; vendor settings include "max approvals per hour"; audit-feed badge for unusual approval-request volume |
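
The per-actor rate limit in the last row could be a sliding-window counter. This is a sketch under assumptions: `ApprovalRateLimiter` is a hypothetical name, state is kept in memory, and the caller passes the current time explicitly so the behavior is easy to test.

```python
from collections import defaultdict, deque


class ApprovalRateLimiter:
    """Sliding-window limit on approval requests per actor."""

    def __init__(self, max_per_hour: int, window_seconds: float = 3600.0):
        self.max_per_hour = max_per_hour
        self.window_seconds = window_seconds
        self._events = defaultdict(deque)   # actor -> timestamps of accepted requests

    def allow(self, actor: str, now: float) -> bool:
        """Return True and record the request if the actor is under its limit."""
        events = self._events[actor]
        # Drop timestamps that have aged out of the window.
        while events and now - events[0] >= self.window_seconds:
            events.popleft()
        if len(events) >= self.max_per_hour:
            return False
        events.append(now)
        return True
```

Requests rejected here would never reach the parent's device; counting the rejections per actor is one possible input for the audit-feed badge.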

## References

- [`docs/spec/plans/milestones-roadmap.md`](https://github.com/litentry/agentKeys/blob/main/docs/spec/plans/milestones-roadmap.md) §5 (M4 scope)
- [`docs/research/agent-iam-strategy.md`](https://github.com/litentry/agentKeys/blob/main/docs/research/agent-iam-strategy.md) — approval workflows in the "deferred from v1" list; corrected design notes
- [`docs/arch.md`](https://github.com/litentry/agentKeys/blob/main/docs/arch.md) §15 (audit framing)
- #107 (MCP server — `approval.request` schema-only tool graduates here)
- #110 (parent UI) → #115 (audit dashboard) — UI surface for approve/deny
- #113 (vendor portal — per-vendor policy templates)
- #121 (Delegation chains — approval requests can fire mid-delegation; the audit chain includes delegation path)
- #123 (Policy versioning — what counts as high-risk versions with the policy)

## Effort

~2 weeks. Sequencing:

1. (Days 1-3) Policy engine: high-risk evaluator + per-vendor + per-actor config schema
2. (Days 3-6) Approval-request flow: enqueue + push-notification + UI surface
3. (Days 6-9) Approve/deny mutations + cap-token mint-or-reject + audit rows
4. (Days 9-11) TTL + timeout semantics + race conditions
5. (Days 11-14) Demo scenarios + rate-limit + acceptance tests

## Pickup notes for the next agent / developer

- The policy engine is **deterministic**. Don't use an LLM to decide what's high-risk — it has to be auditable and reproducible.
- High-risk thresholds are PRODUCT decisions. Don't set them yourself; check with whoever owns vendor BD what the M4-tier vendor pilots actually need.
- Push notifications: web push first (works in M1's web UI + #115 audit dashboard); native push deferred to M5.
- The audit chain on every approval decision is **the regulator surface**. Every event has a decider identity (parent_omni for parent decisions, `system` for timeouts). Make sure that's the parent_omni who actually tapped, not a session-level user.
- Watch for: approval requests are a phishing surface. If a vendor's agent can trigger 100 requests/hour to wear a parent down, that's bad. Rate-limit hard.
- Use the `/agentkeys-issue-create` skill for follow-up issues (e.g., multi-parent approval, M5 native push)

