AgentKeys: decoder + per-op_kind renderer for AuditEnvelope v1 (Phases D + E) #12

@hanwencheng

What this issue tracks

A new audit message format (AuditEnvelope v1) is being emitted on-chain by the upstream AgentKeys audit publisher. This issue tracks the indexer-side work in this repo:

  • Decode the new chain events.
  • Fetch the off-chain envelope body from the audit worker by hash.
  • Verify cross-encoder canonical-CBOR determinism.
  • Expose typed REST shapes to the explorer UI.

Everything you need to implement the decoder is in this issue — no upstream docs required. The UI work is a companion PR against litentry/subscan-essentials-ui-react.


Conceptual model

   ┌─ chain event ─────────────────────┐
   │  AuditAppendedV2(                 │
   │    operatorOmni, actorOmni,       │
   │    opKind, envelopeHash )         │
   └────────────┬──────────────────────┘
                │ (envelopeHash)
                ▼
   ┌─ off-chain worker ────────────────┐
   │  GET /v1/audit/envelope/<hash>    │
   │  → canonical CBOR bytes           │
   └────────────┬──────────────────────┘
                │
                ▼
   ┌─ this indexer ────────────────────┐
   │  decode envelope-level CBOR       │
   │  dispatch on opKind → typed body  │
   │  expose via /agentkeys/audit/…    │
   └───────────────────────────────────┘

The chain commits only (opKind, envelopeHash). The full envelope (operator-readable intent text, per-op typed body, result code, timestamps) lives at the worker, addressed by hash. The contract is op-kind-agnostic — new op_kinds need zero contract changes.


Chain event signatures

event AuditAppendedV2(
  bytes32 indexed operatorOmni,
  bytes32 indexed actorOmni,
  uint8   indexed opKind,
  bytes32 envelopeHash
);

event AuditRootAppendedV2(
  bytes32 indexed operatorOmni,
  bytes32 indexed merkleRoot,
  bytes32 opKindBitmap,
  uint64  entryCount
);

Notes:

  • All three of operatorOmni, actorOmni, opKind are indexed event topics. The indexer MUST allow querying by any combination via standard eth_getLogs filters.
  • opKindBitmap is bytes32 where bit N (counting from LSB) indexes one of 256 possible op_kinds present in a tier-A Merkle batch. Lets the UI filter root batches by op_kind without fetching every leaf.
  • V2 is event-only — no on-chain storage of entries or roots. Position within an operator's stream is derivable from (block_number, log_index).
  • The CredentialAudit contract is deployed on Heima Mainnet (chain ID 212013). The operator will supply the exact address alongside the closing PR's tx capture (see Artifact 1 under Test artifacts below); the address ships after the contract redeploy that introduces V2.
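
A minimal Go sketch of the opKindBitmap test described in the notes above. It assumes the bytes32 value is read as a big-endian uint256, so bit 0 (the LSB) is the lowest bit of the last byte; the function names are hypothetical and not part of the spec.

```go
package main

import "fmt"

// HasOpKind reports whether bit opKind is set in a 32-byte opKindBitmap.
// Assumption: big-endian uint256 layout, so bit 0 lives in bitmap[31].
func HasOpKind(bitmap [32]byte, opKind uint8) bool {
	byteIndex := 31 - int(opKind/8)
	mask := byte(1) << (opKind % 8)
	return bitmap[byteIndex]&mask != 0
}

// SetOpKind returns a copy of bitmap with bit opKind set (handy for fixtures).
func SetOpKind(bitmap [32]byte, opKind uint8) [32]byte {
	bitmap[31-int(opKind/8)] |= byte(1) << (opKind % 8)
	return bitmap
}

func main() {
	var bm [32]byte
	bm = SetOpKind(bm, 0)  // CredStore
	bm = SetOpKind(bm, 50) // DeviceAdd
	fmt.Println(HasOpKind(bm, 0), HasOpKind(bm, 50), HasOpKind(bm, 21)) // true true false
}
```

If the contract packs the bitmap differently, only the byteIndex computation should need to change.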

Off-chain worker HTTP API

The audit worker holds the full envelope by hash. Default URL https://audit.litentry.org; per-deployment override via the indexer's config.

POST /v1/audit/append/v2
  Body: AppendV2Request (JSON; envelope fields + op_body as a JSON object)
  Returns: { ok: bool, envelope_hash: "0x<64 hex>" }

GET /v1/audit/envelope/<hash>
  Returns: 200 application/cbor with canonical CBOR bytes
           404 envelope_not_found

The indexer ONLY consumes GET. The POST exists for the on-chain publisher's emit path.


Envelope CBOR shape

The bytes returned by GET /v1/audit/envelope/<hash> are a single CBOR map encoded per RFC 8949 §4.2.1 (deterministic encoding). The map has 9 envelope-level keys; one of them, op_body, is itself a map whose shape depends on op_kind:

AuditEnvelope {
  version           : uint8        // = 1 ; decoders MUST reject other values
  ts_unix           : uint64       // server-side at queue time
  actor_omni        : bytes        // 32 raw bytes — who performed the op
  operator_omni     : bytes        // 32 raw bytes — whose data-class boundary it touched
  op_kind           : uint8        // canonical table below
  op_body           : map          // op-kind-specific ; see schemas below
  result            : uint8        // 0=Success, 1=Failure, 2=NotPermitted
  intent_text       : text | null  // operator-readable text (display)
  intent_commitment : bytes | null // 32 raw bytes; keccak256(intent_text || 0x7c || op_payload_digest)
}

envelope_hash = keccak256(canonical_cbor_bytes). The chain commits this; the indexer MUST be able to verify the hash matches what GET /v1/audit/envelope/<hash> returned (keccak256(body) round-trip).

Canonical CBOR rules (RFC 8949 §4.2)

The encoder + every consumer that re-hashes envelopes MUST follow these rules. Subtle drift here silently desynchronizes chain commitments from worker bodies.

  1. Definite-length for all maps, arrays, strings, and byte strings (no indefinite-length items).
  2. Shortest-form integers — 0..23 in the major-type byte; 24..255 as 1-byte uint8; 256..65535 as 2-byte uint16; etc.
  3. No floats in any envelope-level or op_body field of this spec. If a future op_body introduces a float, it uses the RFC 8949 §4.2.2 shortest-form rules.
  4. Map keys sorted by the bytewise lexicographic order of their canonical CBOR encodings (RFC 8949 §4.2.1: compare the encoded bytes, NOT the decoded text). Each text key's encoding starts with a head byte that carries its length, so for text keys SHORTER keys sort BEFORE longer keys regardless of alphabetical order. Example: result (encoded as 0x66 72 65 73 75 6c 74, 7 bytes) sorts before actor_omni (encoded as 0x6a 61 63 74 6f 72 5f 6f 6d 6e 69, 11 bytes) because 0x66 < 0x6a. Within the same length, sort by raw ASCII bytes.
  5. Recursive canonicalization — the rule above applies to EVERY map in the envelope, including nested maps inside op_body. The reference encoder (Rust) sorts recursively. Indexer encoders that build envelopes for fixture testing MUST do the same — encoder drift here is a common bug.

Canonical order of the 9 envelope-level keys

Applying rule (4) to the actual key lengths:

result            (6  chars)
op_body           (7  chars)
op_kind           (7  chars)
ts_unix           (7  chars)
version           (7  chars)
actor_omni        (10 chars)
intent_text       (11 chars)
operator_omni     (13 chars)
intent_commitment (17 chars)

A canonical encoder MUST emit the top-level map in EXACTLY this order. Any other order = encoder bug.


Canonical op_kind table

Bytes are assigned, never reused, never reordered. Each row gives the body field schema. Field types use CBOR primitives: uint, bytes, text, bool. bytes32 means a CBOR byte-string of length exactly 32. Hex strings in text fields are 0x-prefixed lowercase.

Byte  Kind                 Family    Body field schema (CBOR map keys + types)
0     CredStore            creds     service: text, payload_hash: text
1     CredFetch            creds     service: text, cap_hash: text
2     CredTeardown         creds     actor_target: text
10    MemoryPut            memory    key: text, payload_hash: text
11    MemoryGet            memory    key: text, cap_hash: text
12    MemoryTeardown       memory    actor_target: text
20    SignEip191           signs     message_digest: text, wallet: text
21    SignEip712           signs     chain_id: uint, verifying_contract: text, primary_type: text, type_hash: text, domain_separator: text, digest: text
30    PaymentEscrowRedeem  payments  escrow_addr: text, amount: text, recipient: text, chain_id: uint
31    PaymentDirect        payments  rail: text, ref: text, amount_minor: uint, currency: text
40    ScopeGrant           scope     agent_omni: text, service: text, max_calls: uint, max_amount: text
41    ScopeRevoke          scope     agent_omni: text, service: text
50    DeviceAdd            device    device_key_hash: text, role_bits: uint, attestation_hash: text
51    DeviceRevoke         device    device_key_hash: text
52    K10Rotate            device    old_device_key_hash: text, new_device_key_hash: text
60    EmailSend            email     to_hash: text, subject_hash: text, message_id: text
61    EmailReceive         email     from_hash: text, message_id: text, payload_hash: text
70    K3EpochAdvance       K3        old_epoch: uint, new_epoch: uint, gov_tx: text

Field-semantics notes:

  • text fields carrying hashes / addresses use 0x<lowercase hex> format (0x prefix + lowercase hex, no length padding beyond the natural length of the value). Addresses are 42 chars (0x + 40 hex); 32-byte hashes are 66 chars (0x + 64 hex).
  • amount / amount_minor / max_amount may exceed uint64::MAX (they're U256 values from chain). Use text (string-encoded decimal or 0x-hex) rather than CBOR uint so JSON consumers can round-trip without losing precision. The reference encoder always uses text for these.
  • role_bits is a small u8 bitfield: bit 0 = CAP_MINT, bit 1 = RECOVERY, bit 2 = SCOPE_MGMT. Higher bits reserved.
  • result byte: 0=Success, 1=Failure, 2=NotPermitted. Other values are reserved for future use; see invariant #1 below.

Reserved byte ranges (do NOT claim): 3..=9, 13..=19, 22..=29, 32..=39, 42..=49, 53..=59, 62..=69, 71..=79, 80..=255.
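
A sketch of how the table might be mirrored in Go. The map and function names are hypothetical; the byte assignments are copied from the table above. Because the Go compiler rejects duplicate constant keys in a map literal, the literal itself gives a compile-time byte-uniqueness check.

```go
package main

import "fmt"

// opKindNames mirrors the canonical table. Duplicate keys here fail to compile.
var opKindNames = map[uint8]string{
	0: "CredStore", 1: "CredFetch", 2: "CredTeardown",
	10: "MemoryPut", 11: "MemoryGet", 12: "MemoryTeardown",
	20: "SignEip191", 21: "SignEip712",
	30: "PaymentEscrowRedeem", 31: "PaymentDirect",
	40: "ScopeGrant", 41: "ScopeRevoke",
	50: "DeviceAdd", 51: "DeviceRevoke", 52: "K10Rotate",
	60: "EmailSend", 61: "EmailReceive",
	70: "K3EpochAdvance",
}

// IsAssigned reports whether the byte has a row in the canonical table.
func IsAssigned(b uint8) bool {
	_, ok := opKindNames[b]
	return ok
}

// OpKindName never fails: unassigned bytes render as Unknown(byte).
func OpKindName(b uint8) string {
	if name, ok := opKindNames[b]; ok {
		return name
	}
	return fmt.Sprintf("Unknown(%d)", b)
}

func main() {
	fmt.Println(OpKindName(21), OpKindName(250), IsAssigned(3))
}
```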


REST endpoints to expose

GET /agentkeys/audit/<operator_omni>
    ?op_kind=<byte>           (optional filter)
    &actor_omni=<hex>         (optional filter)
    &from_block=<n>           (optional)
    &to_block=<n>             (optional)
    &cursor=<opaque>          (optional pagination cursor)
    &limit=<n>                (optional ; default 50, max 500)
Response: { events: [ TypedAuditRow, … ], next_cursor: string | null }

GET /agentkeys/audit/envelope/<hash>
Response: 200 application/octet-stream with canonical CBOR bytes (proxies the
          worker fetch with local immutable-by-hash cache).
          404 not_found.

GET /agentkeys/audit/root/<merkle_root>
Response: { merkle_root, op_kind_bitmap_u256, entry_count,
            block, tx, leaves: [ envelope_hash, … ] }

Where TypedAuditRow is the envelope-level fields plus a body field whose shape depends on op_kind:

  • Known op_kind: body is a typed JSON object matching the schema in the table above (e.g. for SignEip712: { chain_id, verifying_contract, primary_type, type_hash, domain_separator, digest }).
  • Unknown op_kind (byte not in the table): body is { op_kind_byte: <byte>, op_body_b64: <base64-encoded raw CBOR bytes> }. The envelope-level fields (actor, operator, ts_unix, intent_text, intent_commitment) are still surfaced — only the body is opaque.

8 non-break invariants

Adding a new op_kind in the future MUST be a no-op for old explorer deployments — they degrade to Unknown(byte) rendering, never crash, never drop the event. These invariants enforce that:

  1. op_kind is u8, NOT a sealed enum. Indexer MUST handle unknown bytes via a generic fallback path. Never 5xx; never drop the event.
  2. Envelope-level fields are stable across all op_kinds. Decoding the 9 envelope-level keys works for ANY op_kind value, even one this code doesn't recognize. Only op_body is op-kind-specific.
  3. version is gated on envelope-level breakage only. Adding a new op_kind does NOT bump version. A future version = 2 envelope means envelope-level fields changed (added, removed, or retyped); decoders MUST reject version != 1 until they're upgraded.
  4. Unknown(byte) fallback path renders a generic row from the envelope-level fields + base64 of the opaque op_body bytes. Never propagate the raw op_body to a typed decoder — the decoder doesn't know the shape.
  5. Worker passes through opaque bytes when it doesn't recognize an op_kind. The indexer follows the same rule: store the CBOR bytes as-is; render via the fallback path.
  6. Chain contract is op-kind-agnostic. opKind is a uint8 event topic only — no on-chain decode of op_body. New op_kinds never require a contract redeploy.
  7. Op_kind table never reuses numbers; never reorders rows. Reviewer for a new-op_kind PR can grep the canonical table in this issue to confirm the byte is unclaimed.
  8. 3 tests per new op_kind in this repo: CBOR decode of a canonical fixture for the typed body, Unknown(byte) non-break check, byte-uniqueness assertion against the canonical table.

Test artifacts — what the closing PR MUST attach

The decoder is only as trustworthy as the txs it's been verified against. The PR closing this issue MUST attach proof that it decodes REAL on-chain V2 traffic correctly, not just hand-crafted fixtures. Five artifacts:

Artifact 1 — Live tx capture from the canonical end-to-end demo runs

The upstream audit publisher provides three canonical end-to-end demo scripts (foundation, hardening, isolation) that exercise every shipped surface against Heima Mainnet. The operator runs all three, captures every AuditAppendedV2 / AuditRootAppendedV2 event emitted, and supplies the resulting txhash dump to this PR.

Per-demo expectations (the PR attaches a YAML manifest with the actual block / log / txhash values):

  • foundation (credential vault smoke + tier-A relay)
      Min V2 events expected: ≥1 AuditAppendedV2 (cred audit) + ≥1 AuditRootAppendedV2
      Op_kinds expected: CredStore=0
  • hardening (multi-device + scope mutation)
      Min V2 events expected: ≥3 AuditAppendedV2 (device add, scope set, K11 enroll)
      Op_kinds expected: DeviceAdd=50, ScopeGrant=40
  • isolation (cred + memory roundtrip on each tier)
      Min V2 events expected: ≥4 AuditAppendedV2 (cred + memory store/fetch)
      Op_kinds expected: CredStore=0, CredFetch=1, MemoryPut=10, MemoryGet=11

For each captured event, the manifest records: txhash, block_number, log_index, indexed topics (operator_omni, actor_omni, op_kind), and the raw envelope_hash from the event payload.

Artifact 2 — Indexer decodes EACH captured event correctly

The PR includes a fixture file tests/fixtures/heima-mainnet-canonical-demos.jsonl with one row per tx from Artifact 1, holding the indexer's full decoded output:

{
  "demo": "isolation",
  "txhash": "0xabcd…",
  "block": 9631478,
  "log_index": 2,
  "operator_omni": "0x941cb1c3…",
  "actor_omni": "0xb3224706…",
  "op_kind": 21,
  "envelope_hash": "0xdead…beef",
  "envelope_fetched_from_worker": true,
  "decoded_typed_body": {
    "chain_id": 212013,
    "verifying_contract": "0x…",
    "primary_type": "Permit",
    "type_hash": "0x…",
    "domain_separator": "0x…",
    "digest": "0x…"
  },
  "intent_text": "Approve USDC 1000 to Uniswap v4 router",
  "intent_commitment_verified": true
}

The intent_commitment_verified boolean is computed by the indexer:

intent_commitment_verified = ( intent_commitment ==
    keccak256( intent_text.bytes() || 0x7c || op_payload_digest ) )

For sign events, op_payload_digest is decoded_typed_body.digest. For other op_kinds, the publisher will supply the digest formula in a follow-up table (track here when those op_kinds first land).

A passing row proves: the chain commitment binds to the rendered intent text the operator actually saw on the K11 WebAuthn confirmation page. This is the load-bearing forensics property.

Artifact 3 — Cross-language envelope_hash determinism

The reference encoder (Rust) ships canonical CBOR test vectors as JSON files, each holding { envelope_json, canonical_cbor_hex, envelope_hash_hex }. The closing PR's Go code MUST, for each vector:

  1. Build the canonical CBOR from envelope_json.
  2. Verify the bytes match canonical_cbor_hex exactly.
  3. Verify keccak256(bytes) matches envelope_hash_hex exactly.

A single mismatching vector means encoder drift: CI MUST fail, and the bug MUST be fixed before merging.

The vectors land in tests/fixtures/cross-language-vectors/<op_kind>.json in this repo, supplied alongside Artifact 1.

This is the load-bearing test for the whole spec. Two encoder bugs were caught on the reference encoder side BEFORE this issue's PR landed (one in recursive op_body sort, one in the top-level map sort). Hand-crafted Go-only fixtures would have missed both. The vector cross-check is the only reliable guard against future encoder drift.

Artifact 4 — Negative-path coverage (per non-break invariant #4)

The PR includes an integration test that covers the following:

func TestUnknownOpKind_RealEventNonBreak(t *testing.T) {
    // Hand-craft an envelope with op_kind=250 (reserved future byte).
    // Submit via POST /v1/audit/append/v2 to the worker.
    // Indexer-side handler MUST process the corresponding chain event
    // AuditAppendedV2(..., opKind=250, envelopeHash=H) and:
    //   1. Store the row WITHOUT erroring or dropping.
    //   2. Mark op_kind as Unknown(250) in the REST response.
    //   3. Surface envelope-level fields (actor, operator, ts_unix,
    //      intent_text, intent_commitment) from the worker-fetched CBOR.
    //   4. Return body as { op_kind_byte: 250, op_body_b64: "…" }.
}

Plus an op_kind=255 test (the canary value the reference encoder uses).

Artifact 5 — Bulk replay test

The PR's CI runs the indexer against a captured cast logs (or equivalent) dump of all AuditAppendedV2 events from the operator's mainnet account between <deploy_block> and <latest> at the time the PR opens. Every event MUST be decoded without indexer-side errors. The dump lives in tests/fixtures/mainnet-bulk-replay.jsonl.

This catches "the decoder works on the 8 hand-picked txs from Artifact 1 but breaks on the 23rd one in production" — the bulk replay is what proves the decoder is robust across the actual stream of events the chain produces.


Acceptance checklist (gate for closing this issue)

  • All five artifacts above attached to the PR.
  • CI passes the full fixture suite including the cross-language hash determinism check (Artifact 3).
  • GET /agentkeys/audit/<operator_omni> returns paged results for the operator's omni against the live CredentialAudit deploy.
  • GET /agentkeys/audit/envelope/<hash> returns canonical CBOR matching keccak256(body) == hash for every captured envelope_hash.
  • At least three op_kinds wired end-to-end with typed JSON shapes: SignEip712 (byte 21), ScopeGrant (byte 40), DeviceAdd (byte 50). All other op_kinds may render with the Unknown(byte) fallback initially.
  • Indexer survives an unknown op_kind without crashing (Artifact 4).
  • Op_kind decoder map in this repo's code matches the canonical table in this issue — verified by a CI step that asserts every claimed byte has a corresponding decoder OR a generic-fallback marker.

Companion issue

UI work (subscan-essentials-ui-react) — per-op_kind renderer components + the generic <Unknown(byte)/> fallback renderer. Tracked separately in that repo.
