Skip to content

Latest commit

 

History

History
233 lines (184 loc) · 12.7 KB

File metadata and controls

233 lines (184 loc) · 12.7 KB
title Agent task
description Agent Runtime task lifecycle, orchestration, attempts, progress, and recovery contract.

Agent task

An agent task is the runtime-owned unit of work that gives an agent objective, scope, lifecycle, progress, relationships, and delivery semantics.

A task is not a checklist item, not a chat message, not a model request, and not only a background job. It is the durable execution object that can span turns, start runs, spawn subagents, wait for input, produce artifacts, and be audited after recovery.

Design pressure

Real runtimes show the same task pressure in different forms:

  • Terminal agents keep foreground work, local shell work, remote agent work, scheduled work, and backgrounded work under one task surface.
  • Gateway and scheduler agents need durable jobs, delivery state, per-run output, checkpoint/resume, missed-run handling, and failure notification.
  • Typed coding runtimes need thread goals, todo lists, plan items, turn status, job items, approval state, and spawn edges to join through stable ids.
  • Desktop runtimes need task profiles, automation jobs, subagent turns, execution summaries, evidence export, and UI projection to read from the same fact chain.
  • Durable workflow systems show why workflow id, run id, task queue, child work, signals, cancellation, retries, and history cannot be left as prose.

The portable contract therefore needs a task model above individual tool calls and below host-product workflows.

Task is not job, run, step, or todo

Object Meaning Runtime rule
task Semantic objective with lifecycle, owner, relationships, constraints, and acceptance. Stable across retries, turns, backgrounding, and recovery.
run One execution attempt for a task. New run for retry, resume-after-crash, or alternate worker execution.
step Ordered item inside a run or turn. Model, reasoning, tool, process, approval, artifact, warning, or evidence item.
job Durable batch or scheduled dispatcher. May create or own tasks, but should not replace task semantics.
todo / plan item Agent-visible checklist. Useful progress hint, not authoritative lifecycle.

A compatible runtime SHOULD expose all five concepts when they exist instead of flattening them into one message or one task string.

Task record

A task SHOULD include these fields:

Field Purpose
task_id Stable task id.
parent_task_id / root_task_id Task graph linkage.
session_id / thread_id / turn_id Conversation or execution context linkage when applicable.
title / objective Human-readable work statement.
task_kind / task_family Portable classification and coarse grouping.
visibility foreground, background, internal, or hidden.
status Normalized lifecycle status.
priority Scheduling hint, not a completion guarantee.
requested_by / owner / assignee User, agent, workflow, channel, or worker attribution.
scope Workspace, project, thread, account, environment, or host boundary refs.
constraints Permission, sandbox, network, model, tool, cost, time, and output constraints.
task_profile Capability, latency, budget, fallback, and continuity profile.
acceptance Completion criteria or refs.
progress Percent, phase, current step, summary, counters, or output refs.
current_run_id Active run, if any.
attempts Prior and active runs.
relationships Dependencies, blocks, child tasks, source tasks, spawned agents, artifacts.
artifacts / evidence_refs Produced or consumed refs.
last_error / status_reason Structured failure, block, or wait explanation.
created_at / updated_at / started_at / ended_at Lifecycle timestamps.

Status model

A portable runtime SHOULD support these normalized task statuses:

Status Meaning
draft Defined but not yet accepted by runtime.
accepted Runtime accepted the task and assigned identity.
queued Waiting for scheduler, queue, dependency, or worker capacity.
preparing Resolving context, tools, model, policy, or environment.
running Active execution is making progress.
waiting_input Waiting for user or external structured input.
waiting_permission Waiting for human, policy, or host approval.
waiting_resource Waiting for credential, quota, file, network, worker, or external system.
blocked Cannot proceed until a named blocker is resolved.
paused Intentionally paused and resumable.
retrying A retry or fallback run is being prepared or active.
cancelling Cancellation requested; cleanup is in progress.
cancelled Stopped by user, policy, or runtime.
timed_out Stopped because a time or inactivity limit fired.
failed Terminal failure with error facts.
lost Runtime cannot prove whether the worker is still alive.
completed Acceptance criteria are satisfied and facts are reconciled.
archived No longer active in scheduling, but retained for history.
stale Snapshot may not reflect current execution.
unknown Runtime lacks enough facts to assert a status.

Implementation-native states MAY be preserved in native_status, but portable consumers SHOULD receive the normalized status.

Attempts and runs

A task SHOULD keep attempts rather than replacing history on retry.

A task_attempt or run SHOULD include:

Field Purpose
run_id / attempt_id Stable execution attempt id.
status Run lifecycle status.
worker Agent, process, hosted worker, scheduler, or external runtime.
input_refs Prompt, files, dataset rows, schedule trigger, or event refs.
output_refs Result, stdout/stderr, artifact, report, or external output refs.
checkpoint_refs Resume, rollback, or reconstruction boundaries.
started_at / ended_at Attempt timing.
attempt_count Retry count or ordinal.
retry_policy Max attempts, backoff, non-retryable errors, timeout policy.
last_error Structured failure fact.
completion_summary Final human-readable but non-authoritative summary.

A new attempt SHOULD be created for retry after terminal error, worker loss, crash recovery, different routing decision, or user-requested rerun.

Task graph

A task graph SHOULD represent relationships explicitly:

Relationship Meaning
parent / child Decomposition or delegation.
blocks / blocked_by Dependency ordering.
source_task Output or context came from another task.
source_attempt Output or context came from a specific run.
spawned_subagent Child agent context was created for the task.
assigned_thread Thread currently executing part of the task.
produced_artifact Artifact ref produced by the task.
consumed_artifact Artifact ref used as input.
evidence Evidence, replay, review, trace, or audit ref.

Edges SHOULD carry created_at, updated_at, status, and optional reason. Cancellation intent SHOULD stick to the graph so schedulers stop adding new child work while active children settle.

A2A peer tasks

When a task is delegated to an Agent2Agent peer, the local runtime SHOULD create a local task wrapper or remote task ref instead of replacing its task model with the peer protocol object.

An A2A mapping SHOULD preserve:

Field Purpose
native_protocol a2a or another peer protocol name.
remote_task_id A2A taskId or peer-native task id.
remote_context_id A2A contextId when supplied.
remote_agent_ref / agent_card_ref Agent Card, discovery record, or configured peer.
delivery_mechanism polling, streaming, subscription, push notification, or custom.
remote_status / native_status Peer-native state before normalization.
remote_artifact_refs Artifact refs received from the peer.

A2A messages SHOULD map to task input, clarification, or status events. A2A artifacts SHOULD map to durable artifact refs and task graph edges. Completion SHOULD require both peer terminal state and local reconciliation of artifacts, evidence, and delivery facts.

Progress and output

A runtime SHOULD report progress as facts, not only natural language:

  • phase: planning, working, verifying, delivering, waiting, or implementation-specific values.
  • current step or checklist summary.
  • counters for items, child tasks, job items, tool calls, tests, or files.
  • new output refs and output offsets for append-only logs.
  • delivery state: pending, delivered, queued, failed, parent missing, or not applicable.
  • validation state and acceptance coverage.

Large output SHOULD use refs. A progress event may carry a small summary and point to stdout, artifact, report, or evidence refs.

Events

A compatible runtime SHOULD emit the normalized task family:

Event When emitted
task.created Task identity and initial objective exist.
task.accepted Runtime accepted ownership.
task.queued Task entered a queue.
task.started First run or active execution started.
task.updated Metadata, owner, constraints, or profile changed.
task.progress Progress, counters, phase, summary, or output refs changed.
task.waiting Task waits for input, permission, resource, dependency, or worker.
task.blocked Task cannot proceed until a blocker is resolved.
task.paused / task.resumed Task was paused or resumed.
task.retrying Retry or fallback attempt began.
task.cancel_requested Cancellation intent recorded.
task.cancelled Task reached cancelled terminal state.
task.timed_out Runtime timeout or inactivity timeout fired.
task.failed Task reached failed terminal state.
task.lost Runtime lost authoritative worker state.
task.completed Task reached completed terminal state.
task.archived Task was removed from active scheduling.
task.delegated Child task, subagent, job, or worker assignment was created.
task.dependency.updated Relationship or blocker state changed.
task.attempt.started / task.attempt.completed / task.attempt.failed Attempt lifecycle changed.

Task events SHOULD carry task_id; attempt events SHOULD also carry run_id or attempt_id.

Control plane

A runtime SHOULD expose or map these operations:

Command Required semantics
create_task Create task identity with objective, scope, profile, constraints, and idempotency key.
update_task Update title, objective, metadata, priority, assignee, acceptance, or constraints.
start_task Start a run, bind worker/thread/environment, and emit attempt facts.
append_task_progress Append progress, output refs, counters, or delivery state.
pause_task / resume_task Pause or resume without losing graph and attempt history.
cancel_task Record cancellation intent and propagate to active workers or child tasks.
retry_task Start a new attempt with explicit retry reason and inherited constraints.
complete_task / fail_task Reconcile terminal state, artifacts, evidence, and acceptance facts.
list_tasks / get_task Return durable task read models.
link_tasks / unlink_tasks Update parent, child, block, source, artifact, or evidence edges.

Mutating commands MUST write runtime facts. UI-only task cards or local optimistic state are not authoritative.

Snapshot projection

Read models SHOULD expose:

  • task_summary: active count, terminal count, failed count, lost count, waiting count, and recent terminal tasks.
  • tasks: compact task records with status, title, owner, scope, current run, progress, relationships, and refs.
  • task_graph: edges needed to recover parent/child and dependency views.
  • attempts: active and recent attempts with output and checkpoint refs.
  • blocked_tasks: blockers and required action ids.
  • delivery_state: whether output was delivered back to parent, channel, or UI.

Snapshots SHOULD mark missing data as unknown, stale, or not_applicable rather than inferring success.

Anti-patterns

  • Treating the model's todo list as the task lifecycle authority.
  • Replacing a task record on retry and losing prior attempt failures.
  • Keeping background work only in UI memory, so restart loses the task.
  • Reporting completed before artifacts, evidence, or delivery facts are reconciled.
  • Creating child agents or jobs without task graph edges.
  • Emitting only final prose for a long-running task without progress, output refs, or errors.
  • Using scheduler ticks as product-level tasks.
  • Treating a lost worker as success because no error was observed.