Phase 4: Approval workflow (high-risk actions → parent app) #122

@hanwencheng

Context

In M1-M3, the policy engine is deterministic — accept or deny based on signed cap-tokens and rule sets. Some actions sit between deterministic-allow and deterministic-deny: "I trust this agent but want eyes on THIS specific request." Approval workflows are the surface that handles those.

This issue graduates the M1 schema-only approval.request to production. High-risk actions (large payments, cross-namespace memory writes, scope expansions) push a notification to the parent app; parent taps approve/deny; agent proceeds (or fails). Audit row for every decision.

Per milestones-roadmap.md §5, this is M4 depth that enables enterprise-tier vendor pilots — without approval workflows, regulated B2B use-cases (kids' devices, health-adjacent agents) can't ship.

Scope (M4)

"High-risk" policy definition

Configurable per vendor + per actor; vendor settings (#113) expose policy templates, actor-level overrides via the parent UI (#110 → graduated #115 dashboard).
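One way to picture the vendor-template plus actor-override layering is a small merge function. This is only a sketch: `HighRiskPolicy`, `resolve_policy`, and every field name except `spend_threshold` are hypothetical, and the default of 10 approvals per hour is a placeholder, not a product decision.

```python
from dataclasses import dataclass, fields, replace
from typing import Optional


@dataclass(frozen=True)
class HighRiskPolicy:
    """Hypothetical per-vendor policy template."""
    spend_threshold: int = 500           # default from this issue, in yen
    scope_ratio_threshold: float = 0.9   # "90%+ of parent's scope"
    max_approvals_per_hour: int = 10     # placeholder value, not from the issue


def resolve_policy(vendor_template: HighRiskPolicy,
                   actor_override: Optional[dict] = None) -> HighRiskPolicy:
    """Return the effective policy: vendor template with actor-level overrides applied."""
    if not actor_override:
        return vendor_template
    known = {f.name for f in fields(vendor_template)}
    unknown = set(actor_override) - known
    if unknown:
        # Reject unknown keys so a typo cannot silently leave a threshold unchanged.
        raise ValueError(f"unknown policy fields: {sorted(unknown)}")
    return replace(vendor_template, **actor_override)
```

A parent tightening the payment threshold for one device would then pass something like `{"spend_threshold": 200}` as the override, leaving the other vendor defaults intact.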

Default-shipping high-risk triggers:

  • Payment over X: payment cap-mint where amount > spend_threshold (vendor-configurable, default ¥500)
  • Cred write for sensitive service: cred.put to an explicit sensitive-service allowlist (banking, healthcare, identity-doc storage)
  • Memory write to family namespace from a non-family-context device: the device's vendor_context field is not tagged family-context, yet the device tries to write to the family namespace
  • Scope expansion: an existing cap mints a child cap with a SCOPE that approaches the parent's max (e.g., 90%+ of parent's scope; defined by per-vendor policy)
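The four default triggers above could be expressed as a pure function, which keeps the evaluation deterministic and auditable. The sketch below makes several assumptions: the action names other than `cred.put`, the parameter keys, the reason strings other than `high_risk_payment` and `sensitive_cred`, and the idea that scope can be compared as a numeric ratio are all illustrative.

```python
from typing import Optional

# Hypothetical allowlist; the issue names the categories, not the identifiers.
SENSITIVE_SERVICES = frozenset({"banking", "healthcare", "identity-doc-storage"})


def high_risk_reason(action: str, params: dict, *,
                     spend_threshold: int = 500,
                     scope_ratio_threshold: float = 0.9) -> Optional[str]:
    """Return a reason string if the request is high-risk, else None."""
    if action == "payment" and params["amount"] > spend_threshold:
        return "high_risk_payment"
    if action == "cred.put" and params["service"] in SENSITIVE_SERVICES:
        return "sensitive_cred"
    if (action == "memory.write"
            and params["namespace"] == "family"
            and params.get("vendor_context") != "family-context"):
        return "family_namespace_write"
    if action == "cap.mint_child":
        # Assumes scope is reducible to a number; real scopes may need a richer comparison.
        ratio = params["child_scope"] / params["parent_max_scope"]
        if ratio >= scope_ratio_threshold:
            return "scope_expansion"
    return None
```

Because the function has no hidden state, the same inputs always yield the same reason, so a decision can be replayed later from the audit row's `request_params`.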

Approval request flow

1. Agent calls agentkeys.approval.request(actor, action, params)
2. AgentKeys policy engine evaluates: is this high-risk?
   - No: return immediate decision (cap-mint or deny) → standard path
   - Yes: enter approval flow ↓
3. Push notification to parent app (web UI #115, native app M5)
4. Parent taps Approve or Deny in the UI
5. AgentKeys mints cap-token (Approve) or returns Denied (Deny)
6. Agent proceeds or fails
7. Audit row appended (request + decision + decider identity)
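The seven steps above can be sketched as an in-memory broker. This is a toy model under stated assumptions: `ApprovalBroker` and its fields are hypothetical, the push notification is stood in for by a list append, the standard path is simplified to always mint, and TTL handling is left out.

```python
import uuid
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ApprovalBroker:
    is_high_risk: Callable[[str, dict], bool]
    pending: dict = field(default_factory=dict)
    notifications: list = field(default_factory=list)  # stand-in for push delivery
    audit_log: list = field(default_factory=list)

    def request(self, actor: str, action: str, params: dict) -> dict:
        # Steps 1-2: evaluate; non-high-risk requests take the standard path.
        if not self.is_high_risk(action, params):
            return {"status": "cap_minted"}  # standard-path deny omitted for brevity
        # Step 3: enqueue and notify the parent app.
        request_id = str(uuid.uuid4())
        self.pending[request_id] = {"actor": actor, "action": action, "params": params}
        self.notifications.append(request_id)
        return {"status": "pending", "request_id": request_id}

    def decide(self, request_id: str, approve: bool, decider_omni: str) -> dict:
        # Steps 4-5: a parent decision resolves the pending request exactly once.
        req = self.pending.pop(request_id)  # raises KeyError if unknown or already decided
        decision = "approved" if approve else "denied"
        # Step 7: append the audit row (request + decision + decider identity).
        self.audit_log.append({
            "actor_omni": req["actor"],
            "action": req["action"],
            "request_params": req["params"],
            "decision": decision,
            "decider": {"type": "parent", "identity": decider_omni},
        })
        # Step 6: the agent proceeds or fails based on this result.
        return {"status": "cap_minted" if approve else "denied"}
```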

TTL on pending approvals

  • Default 5 minutes
  • After TTL, request times out → auto-deny → audit row with decision: "timeout" and a system decider
  • Configurable per action type (some actions need longer; some shorter)
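A sweep over pending requests might apply these TTL rules roughly as follows. The per-action table, the request dictionary shape, and the `state` field are assumptions made for illustration; the `decision` and `decider` fields follow the audit row schema in this issue.

```python
from typing import Optional

DEFAULT_TTL_SECONDS = 300  # 5 minutes, the default from this issue

# Hypothetical per-action overrides; real values are product decisions.
TTL_BY_ACTION = {"payment": 120}


def ttl_for(action: str) -> int:
    return TTL_BY_ACTION.get(action, DEFAULT_TTL_SECONDS)


def timeout_audit_row(request: dict, now: float) -> Optional[dict]:
    """If a pending request has outlived its TTL, mark it timed out and return the audit row."""
    if request["state"] != "pending":
        return None  # already decided; nothing to do
    if now - request["created_at"] < ttl_for(request["action"]):
        return None  # still within TTL
    request["state"] = "timeout"
    return {
        "actor_omni": request["actor_omni"],
        "action": request["action"],
        "decision": "timeout",
        "decider": {"type": "system", "identity": "system"},
        "decided_at": now,
    }
```

Passing `now` in explicitly, rather than reading the clock inside the function, keeps the timeout logic easy to test and replay.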

Audit row schema

Every approval decision emits:

{
  audit_event_id,
  actor_omni,
  action,
  request_params,
  decision: "approved" | "denied" | "timeout",
  decider: { type: "parent" | "system", identity: <user_omni or "system"> },
  decided_at,
  reason?: "high_risk_payment" | "sensitive_cred" | ...,
}
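As a rough illustration, the schema above could be mirrored by a pair of frozen dataclasses. The class names, the UUID and ISO-8601 defaults, and the rule that a timeout must carry a system decider are assumptions layered on top of the schema, not part of it.

```python
import uuid
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import Optional

DECISIONS = ("approved", "denied", "timeout")
DECIDER_TYPES = ("parent", "system")


@dataclass(frozen=True)
class Decider:
    type: str       # "parent" or "system"
    identity: str   # user_omni of the parent, or "system"


@dataclass(frozen=True)
class ApprovalAuditRow:
    actor_omni: str
    action: str
    request_params: dict
    decision: str
    decider: Decider
    reason: Optional[str] = None
    audit_event_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    decided_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat())

    def __post_init__(self):
        if self.decision not in DECISIONS:
            raise ValueError(f"invalid decision: {self.decision!r}")
        if self.decider.type not in DECIDER_TYPES:
            raise ValueError(f"invalid decider type: {self.decider.type!r}")
        # Assumed invariant: only the system can emit a timeout decision.
        if self.decision == "timeout" and self.decider.type != "system":
            raise ValueError("timeout decisions must have a system decider")
```

`asdict(row)` then yields a plain dictionary with the same eight keys as the schema, ready to be serialized.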

Out of scope (defer)

  • Multi-parent approvals ("either parent can approve") — M5 with family-sharing UX work
  • Approval delegation ("if I'm asleep, my partner can approve") — M5
  • ML-based "would this user have approved?" prediction — M5 if vendor demand surfaces
  • Approval workflows for non-action requests (e.g., "give me a peek at sensitive memory" pre-approval) — M5

Acceptance criteria

  • Demo scenario: a toy asks for a ¥600 payment (over the default ¥500 spend_threshold); parent gets a push notification within 200ms; parent taps Approve; payment cap-mint succeeds; audit row reflects approval
  • Same scenario with Deny: payment cap-mint fails; audit row shows denial reason + decider identity
  • Timeout scenario: parent doesn't tap within TTL (5 min default); audit row shows decision: "timeout"; agent receives ApprovalTimeout error
  • Per-vendor policy template lets a vendor set their default high-risk triggers + thresholds
  • Per-actor override lets a parent tighten or loosen thresholds for their specific child device
  • Push notification delivery verified across iOS Safari + Chrome Android + desktop (web push)
  • Negative test: parent's device is logged out (session JWT expired) → approval request still queued, surfaces when parent re-logs in; if past TTL → timeout-deny audit row

Risks

| Risk | Mitigation |
|---|---|
| Parent doesn't see push notification in time → approval times out → agent fails → bad UX | UI surfaces queued approvals on next login; agent retry semantics documented in #107 |
| High-risk threshold mis-tuned → parents bombarded with approval requests | Per-vendor + per-actor tuning; default thresholds calibrated against M2-M3 vendor pilot data |
| Race: parent approves while another timeout-deny is firing | Approval mutation is conditional on decision == "pending"; whichever decision lands first wins, the other is no-op |
| Approval ITSELF becomes a phishing vector ("evil agent triggers requests to wear parent down") | Rate-limit approval requests per actor; vendor settings include "max approvals per hour"; audit-feed badge for unusual approval-request volume |
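The race mitigation (a mutation conditional on decision == "pending") amounts to a compare-and-set. A minimal sketch using SQLite follows; the `approvals` table, its columns, and the function names are hypothetical, and a production store would need the same atomic conditional update in whatever database is actually used.

```python
import sqlite3


def open_store() -> sqlite3.Connection:
    """Create an in-memory store with a hypothetical approvals table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE approvals ("
        " id TEXT PRIMARY KEY,"
        " decision TEXT NOT NULL DEFAULT 'pending')"
    )
    return conn


def try_decide(conn: sqlite3.Connection, request_id: str, decision: str) -> bool:
    """Apply a decision only if the request is still pending; return True if this call won."""
    with conn:  # commits on success, rolls back on exception
        cur = conn.execute(
            "UPDATE approvals SET decision = ? WHERE id = ? AND decision = 'pending'",
            (decision, request_id),
        )
    # rowcount is 1 only for the first writer; a later writer matches zero rows.
    return cur.rowcount == 1
```

Only the caller that receives `True` should mint the cap-token or write the audit row; the loser treats its attempt as a no-op.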

Effort

~2 weeks. Sequencing:

  1. (Days 1-3) Policy engine: high-risk evaluator + per-vendor + per-actor config schema
  2. (Days 3-6) Approval-request flow: enqueue + push-notification + UI surface
  3. (Days 6-9) Approve/deny mutations + cap-token mint-or-reject + audit rows
  4. (Days 9-11) TTL + timeout semantics + race conditions
  5. (Days 11-14) Demo scenarios + rate-limit + acceptance tests

Pickup notes for the next agent / developer

  • The policy engine is deterministic. Don't use an LLM to decide what's high-risk — it has to be auditable and reproducible.
  • High-risk thresholds are PRODUCT decisions. Don't set them yourself; check with whoever owns vendor BD what the M4-tier vendor pilots actually need.
  • Push notifications: web push first (works in M1's web UI + #115 audit dashboard); native push deferred to M5.
  • The audit chain on every approval decision is the regulator surface. Every event has a decider identity (parent_omni). Make sure that's the parent_omni who actually tapped, not a session-level user.
  • Watch for: approval requests are a phishing surface. If a vendor's agent can trigger 100 requests/hour to wear a parent down, that's bad. Rate-limit hard.
  • Use the /agentkeys-issue-create skill for follow-up issues (e.g., multi-parent approval, M5 native push)
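For the rate-limiting note above, a per-actor sliding-window limiter is one simple option. The class name and its interface are hypothetical, and the limit itself would come from the vendor's "max approvals per hour" setting rather than being hard-coded.

```python
from collections import defaultdict, deque


class ApprovalRateLimiter:
    """Sliding-window limiter: at most max_per_hour approval requests per actor."""

    def __init__(self, max_per_hour: int, window_seconds: float = 3600.0):
        self.max_per_hour = max_per_hour
        self.window_seconds = window_seconds
        self._seen = defaultdict(deque)  # actor -> timestamps of accepted requests

    def allow(self, actor: str, now: float) -> bool:
        timestamps = self._seen[actor]
        # Drop timestamps that have aged out of the window.
        while timestamps and now - timestamps[0] >= self.window_seconds:
            timestamps.popleft()
        if len(timestamps) >= self.max_per_hour:
            return False  # over the limit: reject without notifying the parent
        timestamps.append(now)
        return True
```

A rejected request should still be visible somewhere (for example the audit feed), so that unusual request volume shows up instead of vanishing silently.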

Labels

area/broker: Broker server, cap-token issuance, OIDC issuance
area/ui: Parent-control UI, vendor onboarding portal, audit dashboard