Add idempotency key support for task deduplication #88
AntoineToussaint wants to merge 3 commits into main
Force-pushed from 4b21848 to 42e29c1
Two new fields on SpawnOptions:
- `only_once: bool`: auto-derives key from hash(task_name, params)
- `idempotency_key: Option<String>`: explicit key (takes precedence)

When a key is set, spawn_task checks for an existing non-terminal task with the same key. If found, returns the existing task instead of creating a duplicate. This enables first-served semantics where multiple clients can safely try to spawn the same logical task.

DB changes:
- New nullable `idempotency_key` column on task tables
- Partial unique index on idempotency_key WHERE NOT NULL
- Migration adds column/index to all existing queues
- ensure_queue_tables updated for new queues

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Force-pushed from 42e29c1 to 6023294
Deduplication will be handled at the durable crate level via idempotency keys (tensorzero/durable#88), not per-instance caching. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Per review on #88: auto-deriving a key from `hash(task_name, params)` makes it ambiguous what happens when a task is spawned multiple times with different settings (e.g. a different `max_attempts` would still hash to the same key and be treated as a duplicate). Callers should decide what "the same task" means for their use case and pass an explicit `idempotency_key`.

Drops `SpawnOptions.only_once` and the corresponding SQL fallback in `spawn_task`. `idempotency_key` is now the single mechanism for deduplication.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
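The ambiguity described in this commit can be illustrated with a small sketch. It assumes a hypothetical `derive_key` helper that hashes only the task name and params with the standard library's `DefaultHasher`; the `SpawnRequest` struct, its fields, and the sample values are illustrative and are not the durable crate's actual types or hashing scheme.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical stand-in for a spawn request; not the durable crate's real type.
struct SpawnRequest {
    task_name: String,
    params: String, // serialized params, e.g. JSON text
    max_attempts: u32,
}

/// Hypothetical auto-derived key: hashes only the task name and params,
/// mirroring the `hash(task_name, params)` idea that this commit drops.
fn derive_key(req: &SpawnRequest) -> String {
    let mut hasher = DefaultHasher::new();
    req.task_name.hash(&mut hasher);
    req.params.hash(&mut hasher);
    format!("{:016x}", hasher.finish())
}

fn main() {
    let a = SpawnRequest {
        task_name: "optimize".to_string(),
        params: r#"{"dataset":"d1"}"#.to_string(),
        max_attempts: 1,
    };
    let b = SpawnRequest {
        task_name: "optimize".to_string(),
        params: r#"{"dataset":"d1"}"#.to_string(),
        max_attempts: 5,
    };

    // The two requests differ in their settings...
    assert_ne!(a.max_attempts, b.max_attempts);
    // ...but the derived keys collide, so the second spawn would be
    // treated as a duplicate of the first.
    assert_eq!(derive_key(&a), derive_key(&b));
    println!("derived keys collide despite different max_attempts");
}
```

With an explicit `idempotency_key`, the caller chooses whether these two requests count as the same task, instead of the hash deciding silently.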
end if;
end if;
v_cancellation := p_options->'cancellation';
-- Extract parent_task_id for subtask tracking
Done in 447f2a6 — restored in both sql/schema.sql and the migration.
@@ -0,0 +1,295 @@
-- Add idempotency key support for task deduplication.
-- When set, only the first spawn with a given key creates a task.
-- Subsequent spawns with the same key (for non-terminal tasks) are no-ops.
I think the non-terminal check is somewhat surprising. In the autopilot use case, consider this scenario:
- Gateway 1 streams tool calls and spawns a task, which then fails.
- Gateway 2 starts up and does the same thing (spawning a new task).
I'm not sure this is the behavior we want, especially if the task is something long-running like GEPA.
Gateway 1 wouldn't deal with the error and retry?
Agreed — switched to strict semantics. The check is now unconditional (where idempotency_key = $1), so a key is permanently bound to its first task regardless of state (running/completed/failed/cancelled). Updated the column comment, migration header, and SpawnOptions::idempotency_key docstring to make this explicit. Caller owns retries — pick a new key if you want to retry after failure.
I didn't understand the comment (or the PR), but yes, I think an idempotency key should definitely be burnt after use.
The code was wrong.
Address review feedback: drop the `state not in (terminal)` filter so that a key is bound to its first task forever, regardless of state. This matches the principle that an idempotency key has exactly one owner: if you want to retry after a failure, pick a new key.

- Spawn check is now an unconditional `where idempotency_key = ?`
- Column comment, migration header comment, and SpawnOptions docstring all updated to make the strict semantics explicit
- Restored the `-- Extract parent_task_id for subtask tracking` comment that was accidentally removed (per Aaron's review)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
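The strict semantics can be sketched with an in-memory model. This is only an illustration under stated assumptions: the `Queue` struct, its `spawn` method, the `(task_id, created)` return shape, and the key strings are hypothetical, and the `HashMap` merely stands in for the unique index on `idempotency_key`. The real check lives in the SQL `spawn_task` function.

```rust
use std::collections::HashMap;

#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum TaskState {
    Running,
    Completed,
    Failed,
    Cancelled,
}

/// Hypothetical in-memory stand-in for one queue's task table.
#[derive(Default)]
struct Queue {
    states: Vec<TaskState>,         // index doubles as the task id
    by_key: HashMap<String, usize>, // models the unique index on idempotency_key
}

impl Queue {
    /// Returns `(task_id, created)`. When a key is given, the lookup ignores
    /// task state, so the key stays bound to its first task forever.
    fn spawn(&mut self, idempotency_key: Option<&str>) -> (usize, bool) {
        if let Some(key) = idempotency_key {
            if let Some(&existing) = self.by_key.get(key) {
                return (existing, false);
            }
        }
        let id = self.states.len();
        self.states.push(TaskState::Running);
        if let Some(key) = idempotency_key {
            self.by_key.insert(key.to_string(), id);
        }
        (id, true)
    }
}

fn main() {
    let mut queue = Queue::default();

    // Gateway 1 spawns the task, which then fails.
    let (first, created) = queue.spawn(Some("gepa-run-1"));
    assert!(created);
    queue.states[first] = TaskState::Failed;

    // Gateway 2 spawns with the same key: no new task, even though the
    // first one is in a terminal state.
    let (second, created_again) = queue.spawn(Some("gepa-run-1"));
    assert_eq!(second, first);
    assert!(!created_again);

    // Retrying after failure requires a new key chosen by the caller.
    let (retry, created_retry) = queue.spawn(Some("gepa-run-1-retry"));
    assert!(created_retry);
    assert_ne!(retry, first);

    // Without a key, every spawn creates a new task.
    let (x, _) = queue.spawn(None);
    let (y, _) = queue.spawn(None);
    assert_ne!(x, y);
    println!("strict idempotency semantics hold in the model");
}
```

Under the earlier non-terminal filter, the second spawn in this scenario would have created a fresh task once the first had failed, which is the behavior the review flagged.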
Summary
See https://www.notion.so/tensorzerodotcom/Fanout-Only-once-Durable-Executions-33d7520bbad380c480a9c015c66cbd62?source=copy_link
Two new fields on `SpawnOptions` for controlling task deduplication:
- `only_once: bool`: auto-derives idempotency key from `hash(task_name, params)`. First spawn creates the task, duplicates are no-ops.
- `idempotency_key: Option<String>`: explicit key for when params differ but the operation is the same. Takes precedence over `only_once`.

Default behavior (`only_once: false`, no key) is unchanged: every spawn creates a new task.

Motivation
In multi-instance deployments, multiple clients may observe the same event and independently try to spawn the same logical task. Without dedup, N instances create N tasks that all execute. This is wasteful for expensive operations and incorrect for non-idempotent ones.

DB Changes
- `idempotency_key TEXT` column on task tables
- Partial unique index `WHERE idempotency_key IS NOT NULL`
- `spawn_task` checks for existing non-terminal task with same key before inserting
- `ensure_queue_tables` updated for new queues

Design doc
See `docs/design/durable-idempotency-key.md` in the autopilot repo.

Test plan
- `only_once: true`: second spawn returns existing task
- `idempotency_key`: same behavior

🤖 Generated with Claude Code