
Add KV identity graph with CAS concurrency control (#536) #549

Draft
ChristianPavilonis wants to merge 14 commits into feature/edge-cookies from feature/ec-kv-identity-graph


@ChristianPavilonis
Collaborator

Summary

  • Implements Story 3 (KV identity graph, #536) of the Edge Cookie epic (Implement Edge Cookie (EC) identity system, #532): adds KvIdentityGraph, backed by Fastly KV Store, with optimistic concurrency control (generation markers) for safe concurrent partner ID writes.
  • Defines the full KV schema (KvEntry, KvConsent, KvGeo, KvPartnerId, KvMetadata) with factory methods for initial, minimal recovery, and tombstone entries.
  • Tombstone guards in upsert_partner_id and update_last_seen prevent late syncs from repopulating partner IDs or extending TTL after consent withdrawal.
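
To make the three factory methods above more concrete, here is a minimal sketch of a trimmed-down entry in plain Rust. The field names (`consent_ok`, `created`, `last_seen`, `geo_country`, `partner_ids`) and the constructor signatures are assumptions for illustration only, not the PR's actual schema; the real types also derive serde traits and carry separate KvConsent and KvGeo structures, which are omitted here.

```rust
use std::collections::BTreeMap;

/// Hypothetical, trimmed-down stand-in for the PR's `KvEntry`.
#[derive(Debug, Clone, PartialEq)]
struct KvEntry {
    /// `false` marks a consent-withdrawal tombstone (assumed representation).
    consent_ok: bool,
    /// Unix seconds when the entry was created or reset.
    created: u64,
    /// Unix seconds of the last observed request.
    last_seen: u64,
    /// Country code captured at creation, if known.
    geo_country: Option<String>,
    /// Partner name -> partner-assigned user ID.
    partner_ids: BTreeMap<String, String>,
}

impl KvEntry {
    /// Full initial entry for a freshly created, consented EC.
    fn new(now: u64, country: &str) -> Self {
        Self {
            consent_ok: true,
            created: now,
            last_seen: now,
            geo_country: Some(country.to_owned()),
            partner_ids: BTreeMap::new(),
        }
    }

    /// Minimal recovery entry: no geo data, used when a write finds no entry.
    fn minimal(now: u64) -> Self {
        Self {
            consent_ok: true,
            created: now,
            last_seen: now,
            geo_country: None,
            partner_ids: BTreeMap::new(),
        }
    }

    /// Tombstone: consent withdrawn, partner IDs dropped, `created` reset.
    fn tombstone(now: u64) -> Self {
        Self {
            consent_ok: false,
            created: now,
            last_seen: now,
            geo_country: None,
            partner_ids: BTreeMap::new(),
        }
    }

    /// Guard used by writers that must not touch withdrawn entries.
    fn is_tombstone(&self) -> bool {
        !self.consent_ok
    }
}
```

Keeping the tombstone as an ordinary entry with an explicit marker, rather than deleting the key, is what lets later writers detect withdrawal and refuse to repopulate it.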

Closes #536

Changes

| File | Change |
| --- | --- |
| ec/kv.rs | New. KvIdentityGraph struct with 8 methods: get, get_metadata, create, create_or_revive, upsert_partner_id, update_last_seen, write_withdrawal_tombstone, delete |
| ec/kv_types.rs | New. Schema types with serde, factory methods (KvEntry::new, ::minimal, ::tombstone, KvMetadata::from_entry, KvGeo::from_geo_info), 14 unit tests |
| ec/mod.rs | Added pub mod kv; pub mod kv_types; and module doc entries |
| fastly.toml | Added ec_identity_store and ec_partner_store KV store declarations |
| trusted-server.toml | Added ec_store and partner_store to [ec] section |

Key design decisions

  • CAS via try_insert_add helper: returns Ok(true) for created, Ok(false) for key-exists, Err for real failures. Avoids fragile string matching on error debug output.
  • Bounded retries: MAX_CAS_RETRIES = 3 for all CAS operations. No unbounded recursion.
  • Tombstone guards: upsert_partner_id rejects tombstoned entries; update_last_seen skips them. Prevents late syncs from resurrecting withdrawn consent.
  • 300s debounce on update_last_seen: prevents write thrashing under bursty traffic (Fastly KV enforces 1 write/sec per key).
  • 24h tombstone TTL: allows batch sync clients to distinguish consent_withdrawn from ec_hash_not_found.
  • Methods return Result: callers decide error policy (swallow on organic paths, propagate on sync endpoints).
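
The CAS decisions above can be sketched against an in-memory stand-in for the KV store. Everything except the names `try_insert_add`, `upsert_partner_id`, and `MAX_CAS_RETRIES` is hypothetical: `MockStore`, `Entry`, `UpsertError`, and `put_if_generation` are invented for illustration, and the real code talks to Fastly KV Store rather than a HashMap.

```rust
use std::collections::{BTreeMap, HashMap};

const MAX_CAS_RETRIES: usize = 3;

/// Hypothetical simplified entry; the real schema is richer.
#[derive(Debug, Clone, Default, PartialEq)]
struct Entry {
    tombstone: bool,
    partner_ids: BTreeMap<String, String>,
}

#[derive(Debug, PartialEq)]
enum UpsertError {
    Tombstoned,
    RetriesExhausted,
    Store(String),
}

/// In-memory stand-in for a KV store: key -> (generation, entry).
#[derive(Default)]
struct MockStore {
    data: HashMap<String, (u64, Entry)>,
}

impl MockStore {
    /// Add-only insert: Ok(true) = created, Ok(false) = key already exists.
    fn try_insert_add(&mut self, key: &str, entry: Entry) -> Result<bool, String> {
        if self.data.contains_key(key) {
            return Ok(false);
        }
        self.data.insert(key.to_owned(), (1, entry));
        Ok(true)
    }

    /// Read the entry together with its current generation marker.
    fn get(&self, key: &str) -> Option<(u64, Entry)> {
        self.data.get(key).cloned()
    }

    /// Write only if the stored generation still matches `expected`.
    fn put_if_generation(&mut self, key: &str, expected: u64, entry: Entry) -> bool {
        if let Some((generation, slot)) = self.data.get_mut(key) {
            if *generation == expected {
                *generation += 1;
                *slot = entry;
                return true;
            }
        }
        false
    }
}

/// Merge one partner ID with a bounded read-modify-write loop.
fn upsert_partner_id(
    store: &mut MockStore,
    key: &str,
    partner: &str,
    uid: &str,
) -> Result<(), UpsertError> {
    for _attempt in 0..MAX_CAS_RETRIES {
        match store.get(key) {
            // Miss: try to create a minimal entry; on a lost race, re-read.
            None => {
                let mut minimal = Entry::default();
                minimal.partner_ids.insert(partner.to_owned(), uid.to_owned());
                if store.try_insert_add(key, minimal).map_err(UpsertError::Store)? {
                    return Ok(());
                }
            }
            // Tombstone guard: never repopulate a withdrawn entry.
            Some((_, entry)) if entry.tombstone => return Err(UpsertError::Tombstoned),
            // Hit: merge, then write only if nobody else wrote in between.
            Some((generation, mut entry)) => {
                entry.partner_ids.insert(partner.to_owned(), uid.to_owned());
                if store.put_if_generation(key, generation, entry) {
                    return Ok(());
                }
            }
        }
    }
    Err(UpsertError::RetriesExhausted)
}
```

Returning a typed error lets a sync endpoint propagate `Tombstoned` while an organic path can simply log and continue, which matches the "callers decide error policy" point.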

Verification

  • cargo fmt --all -- --check: clean
  • cargo clippy --workspace --all-targets --all-features -- -D warnings: zero warnings
  • cargo test --workspace: 729 tests passed, 0 failed (14 new tests for schema types + KV helpers)

- Rename all external identifiers: x-synthetic-id → x-ts-ec, synthetic_id
  cookie → ts-ec, synthetic_fresh → ec_fresh
- Simplify hash generation to use only client IP with HMAC-SHA256, removing
  User-Agent, Accept-Language, Accept-Encoding, and template rendering
- Rename config section [synthetic] → [ec] with backward-compat alias
- Rename ec.rs to edge_cookie.rs for clarity
- Remove handlebars dependency (and transitive deps)
- Add x-ts-ec-fresh to internal headers blocklist
- Update all docs with new Edge Cookie (EC) terminology
- Fix review findings: remove redundant serde rename, stale optimization
  entry, leftover 'synthetic' references in agent configs and docs

Closes #462

- Rename allows_ssc_creation → allows_ec_creation and update all doc
  comments, test names, and assertion messages to use Edge Cookie (EC)
- Fix intra-doc link [`ec`] → [`edge_cookie`] in lib.rs
- Downgrade test log from info to debug in edge_cookie.rs for consistency
- Add fallback comment and wire-protocol breaking-change doc in openrtb.rs
- Run prettier --write on 3 doc files to fix format-docs CI
- Update integration-tests Cargo.lock to sync derive_more 2.1.1
…ences

- Rename TRUSTED_SERVER__SYNTHETIC__SECRET_KEY to TRUSTED_SERVER__EC__SECRET_KEY
  in CI action and local integration test scripts (root cause of CI failure:
  Viceroy could not start without the EC secret key)
- Update stale doc reference synthetic.secret_key → ec.secret_key
- Update stale comments in consent_config.rs and consent/types.rs
- Rename struct Ec → EdgeCookie, field settings.ec → settings.edge_cookie
- Add serde alias "ec" for backward compatibility with existing configs
- Update all TOML configs, env vars, CI actions, scripts, and docs
- TRUSTED_SERVER__EC__* env vars → TRUSTED_SERVER__EDGE_COOKIE__*
- Validation messages now reference edge_cookie.secret_key
- Downgrade EC ID/IP log statements from debug to trace to prevent
  sensitive data appearing in production logs (edge_cookie.rs)
- Fix .change_context indentation in HMAC error handling
- Remove unused counter_store and opid_store fields from EdgeCookie
  config struct, all test fixtures, TOML configs, and documentation
- Add serialization test asserting ec_fresh wire field name
- Fix edge-cookies.md config section reference and consent language
- Update error-reference.md to reflect HMAC-based generation
- Update configuration.md to remove dead KV store field docs

Introduce a new ec/ module that owns the Edge Cookie lifecycle with a
two-phase design: read_from_request() extracts existing EC state
pre-routing, generate_if_needed() creates new IDs only in organic
handlers when consent permits.
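
A minimal sketch of the two-phase shape described above, under stated assumptions: the `EcContext` fields, the method signatures, and the cookie-header parsing are hypothetical, and only the names `read_from_request`, `generate_if_needed`, and the `ts-ec` cookie name come from this PR's history.

```rust
/// Hypothetical request-scoped EC state.
#[derive(Debug, Default, PartialEq)]
struct EcContext {
    /// EC value found on the incoming request, if any.
    existing: Option<String>,
    /// EC value minted during this request, if any.
    generated: Option<String>,
}

impl EcContext {
    /// Phase 1 (pre-routing): only reads state; never creates an ID.
    fn read_from_request(cookie_header: Option<&str>) -> Self {
        let existing = cookie_header.and_then(|header| {
            header
                .split(';')
                .filter_map(|pair| pair.trim().strip_prefix("ts-ec="))
                .find(|value| !value.is_empty())
                .map(str::to_owned)
        });
        Self { existing, generated: None }
    }

    /// Phase 2 (organic handlers): mint an ID only when none exists
    /// and the caller has established that consent permits it.
    fn generate_if_needed(&mut self, consent_allows: bool, mint: impl FnOnce() -> String) {
        if self.existing.is_none() && self.generated.is_none() && consent_allows {
            self.generated = Some(mint());
        }
    }
}
```

Splitting the read from the write means non-organic routes can inspect an existing EC without any risk of creating one.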

Fixes and behavioral changes:
- Jurisdiction::Unknown now blocks EC creation (fail-closed)
- GPC independently blocks EC in US states regardless of us_privacy
- Missing client IP returns an error instead of using "unknown" fallback
- IPv6 normalization uses zero-padded hex without separators
- Integration proxy now consent-gates cookie setting (was unconditional)
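
As an illustration of the IP-related items in this list, the sketch below renders an IPv6 address as zero-padded hex without separators and returns an error when the client IP is missing. The function names, the `MissingClientIp` type, the lowercase output, and the IPv4 handling are assumptions; the commit does not specify them.

```rust
use std::net::{IpAddr, Ipv6Addr};

/// Hypothetical error for a request with no client IP.
#[derive(Debug, PartialEq)]
struct MissingClientIp;

/// Render all eight 16-bit segments as 4 hex digits each, no separators.
/// Lowercase is an assumption.
fn normalize_ipv6(addr: Ipv6Addr) -> String {
    addr.segments()
        .iter()
        .map(|segment| format!("{segment:04x}"))
        .collect()
}

/// Produce the string fed into ID generation, failing closed on a missing IP
/// instead of substituting a placeholder such as "unknown".
fn normalized_client_ip(ip: Option<IpAddr>) -> Result<String, MissingClientIp> {
    match ip {
        Some(IpAddr::V4(v4)) => Ok(v4.to_string()), // IPv4 handling is assumed
        Some(IpAddr::V6(v6)) => Ok(normalize_ipv6(v6)),
        None => Err(MissingClientIp),
    }
}
```

A fixed-width form means that textual variants of the same address, such as `::1` and `0:0:0:0:0:0:0:1`, produce identical input to the hash.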

The old edge_cookie module is reduced to a thin re-export shim.
…535)

Remove the cookie_domain config field from Publisher and compute it
as .{domain} automatically. This eliminates a redundant setting that
was always expected to be the dot-prefixed publisher domain.

Move create_ec_cookie, set_ec_cookie, and expire_ec_cookie from the
generic cookies module into ec/cookies where they belong alongside
the rest of the EC subsystem.

Update all TOML fixtures, integration test configs, and documentation
to remove cookie_domain references.

Address PR review findings:

- Strip revoked EC IDs before building auction requests: gate
  ec_value behind ec_allowed() so withdrawn-consent users don't
  leak their EC to bidders (#533)
- Make fresh_id generation best-effort: auction proceeds without
  a fresh EC when client IP is unavailable instead of hard-failing
- Restore cookie_domain field on Publisher for non-EC cookie use;
  rename computed method to ec_cookie_domain() per spec §5.2 which
  says EC cookies derive domain from publisher.domain independently
  while cookie_domain continues serving its existing purpose
- Add cookie expiration on consent withdrawal in handle_proxy,
  matching the publisher proxy revocation path
- Extract parse_ec_from_request() shared helper so get_ec_id and
  EcContext::read_from_request use a single cookie-parsing pass
- Replace ec_hash unwrap_or with explicit find('.') match for clarity
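
Two of the items above lend themselves to a short sketch: deriving the EC cookie domain from the publisher domain, and splitting the hash out of an EC value with an explicit `find('.')` match. The `<hash>.<suffix>` layout, the fall-back to the whole value when no dot is present, and both function signatures are assumptions, not taken from the PR.

```rust
/// Dot-prefixed cookie domain computed from the publisher domain.
fn ec_cookie_domain(publisher_domain: &str) -> String {
    format!(".{publisher_domain}")
}

/// Extract the hash portion of an EC value, assuming "<hash>.<suffix>".
/// With no '.', the whole value is treated as the hash (assumed fallback).
fn ec_hash(ec_value: &str) -> &str {
    match ec_value.find('.') {
        Some(index) => &ec_value[..index],
        None => ec_value,
    }
}
```

Spelling out both match arms makes the no-separator case visible at the call site instead of hiding it behind a default.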
The fresh_id concept was a SyntheticID artifact for per-request freshness
detection. The EC spec explicitly removes it:
- §12.1: Remove ext.synthetic_fresh from openrtb.rs
- §12.5: Auction response headers include only X-ts-ec (no X-ts-ec-fresh)

Removed:
- UserInfo.fresh_id field (types.rs)
- fresh_id generation in convert_tsjs_to_auction_request (formats.rs)
- HEADER_X_TS_EC_FRESH constant and internal header entry
- UserExt.ec_fresh field and serialization test (openrtb.rs)
- ec_fresh population in Prebid adapter
- All test fixture fresh_id values

Rename EdgeCookie struct to Ec, secret_key field to passphrase,
and the TOML section from [edge_cookie] to [ec] to align with the
spec's configuration schema.

Add optional ec_store and partner_store fields to the Ec struct
in preparation for Story 3 (KV identity graph) and Story 4
(partner registry).

Remove the edge_cookie.rs legacy re-export shim — no consumers
remain after the ec/ module migration.

Implement KvIdentityGraph backed by Fastly KV Store for the EC
identity graph. Each EC hash maps to a JSON entry with consent
state, geo, and accumulated partner IDs.

Methods:
- get/get_metadata: read entry or metadata-only (fast path)
- create: InsertMode::Add for safe concurrent creates
- create_or_revive: revive tombstones on re-consent via CAS
- upsert_partner_id: atomic partner ID merge with CAS retry,
  auto-creates minimal entries on miss, rejects tombstones
- update_last_seen: 300s debounce, skips tombstones
- write_withdrawal_tombstone: unconditional overwrite, 24h TTL
- delete: reserved for IAB data deletion framework

Schema types (KvEntry, KvConsent, KvGeo, KvPartnerId, KvMetadata)
with factory methods for new, minimal, and tombstone entries.

Tombstone guards prevent late syncs from repopulating partner IDs
or extending TTL after consent withdrawal.

- Pass store_name to serialize_entry for meaningful error messages
- Add attempt number to upsert_partner_id race-retry log
- Add explicit backwards-timestamp guard in update_last_seen
  (previously absorbed silently into debounce check)
- Reuse store handle and serialized data in create_or_revive
  CAS loop instead of re-opening and re-serializing
- Add comment explaining intentional dual store handles in
  upsert_partner_id (get() opens its own; outer is for writes)
- Document tombstone created-field reset as intentional design
- Add PartialEq derive to all schema types for test ergonomics
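
The update_last_seen checks mentioned in this history (tombstone skip, explicit backwards-timestamp guard, 300s debounce) can be sketched as a pure decision function. The enum, the function name, the constant name, and the exact boundary (a write is allowed once the gap reaches 300 seconds) are assumptions for illustration.

```rust
/// Debounce window in seconds (value from the PR; the name is hypothetical).
const LAST_SEEN_DEBOUNCE_SECS: u64 = 300;

/// Hypothetical outcome of the pre-write checks.
#[derive(Debug, PartialEq)]
enum LastSeenDecision {
    Write,
    SkipTombstone,
    SkipBackwards,
    SkipDebounced,
}

/// Decide whether `last_seen` should be rewritten. All times are Unix seconds.
fn decide_last_seen(is_tombstone: bool, stored: u64, now: u64) -> LastSeenDecision {
    // Never extend the lifetime of a withdrawn entry.
    if is_tombstone {
        return LastSeenDecision::SkipTombstone;
    }
    // Explicit guard: a clock that moved backwards must not rewrite history.
    if now < stored {
        return LastSeenDecision::SkipBackwards;
    }
    // Debounce: avoid hammering the same key under bursty traffic.
    if now - stored < LAST_SEEN_DEBOUNCE_SECS {
        return LastSeenDecision::SkipDebounced;
    }
    LastSeenDecision::Write
}
```

Checking `now < stored` before subtracting also avoids an unsigned underflow, which is one reason to keep the backwards guard separate from the debounce comparison.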