SSOT: Ilchul Agent OS — MCP TaskStream control plane MVP #236

## SSOT status

This issue is the active SSOT for the Ilchul Agent OS rewrite.

Ilchul is being reset around a Rust-first, DDD-style MCP TaskStream kernel/control plane.

Previous roadmap issues such as #114, #167, #190, #195, and #201 are no longer active implementation owners. They may be used as design references, but they do not define the organizing spine for this rewrite.

Core identity:

```text
Ilchul = product/repo brand
Rust core = TaskStream authority kernel
MCP = control-plane adapter
Pi/TS = facade/runtime adapter surface
Filesystem = artifacts/projections, not canonical authority
```

MVP boundary:

```text
one human goal
→ one TaskStream
→ one Run
→ one RuntimeSession
→ evidence
→ verification
→ kernel-owned CompletionDecision
```

Non-goals for MVP:

```text
multi-runtime orchestration
parallel worker DAGs
repair loop convergence
GitHub PR automation
Discord/Hermes presentation
destructive cleanup automation
```

## Summary

Implement the first Ilchul Agent OS MVP as a **single-session governed coding TaskStream over MCP**.

`main` submits a human goal and queries status/evidence, while the Rust TaskStream kernel owns stream state, event/evidence records, runtime dispatch, verification, and the final completion decision.

Runtime output is never completion authority. Runtime output may be a claim or evidence candidate; the kernel decides completion from accepted evidence and verification.

## Active child issue queue

- #238: Rust workspace scaffold + DDD core skeleton
- #239: Core domain invariants for TaskStream authority
- #240: Core services and explicit port contracts
- #241: SQLite canonical store adapter
- #242: Filesystem artifact inbox and projection adapter
- #243: MCP control-plane adapter for TaskStream kernel
- #244: Pi runtime adapter and TS facade boundary
- #245: End-to-end coding_test_fix TaskStream fixture
- #246: Documentation and migration notes for #236 reset

## Historical reference issues

These prior issues are design reference only. They are not active implementation parents for #236.

- #114: RunContract Harness roadmap — superseded as roadmap owner.
- #167: objective-driven / parallel runtime roadmap — superseded as roadmap owner.
- #185: RunState schema — harvest state/versioning ideas.
- #186: runtime event taxonomy — harvest event/accountability ideas.
- #188: adapter/substrate contracts — harvest runtime adapter ideas.
- #191: verification matrix — harvest verification policy ideas.
- #194/#196/#197: task graph, worker evidence, leases — defer until post-MVP orchestration.
- #190/#195: integration/repair semantics — post-MVP design reference.
- #201: safe cleanup semantics — post-MVP operational hardening reference.

## MVP proof shape

Use a small repo coding task with a clear failing test:

```text
Human goal:
  Fix the implementation so the intentionally failing test passes.
```

Expected flow:

```text
MCP goal submission
→ TaskStream created
→ Run created
→ one Pi RuntimeSession started
→ run contract dispatched
→ implementation changed
→ test evidence collected
→ verification gate evaluated
→ completion decided by kernel
```
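The expected flow can be read as a strictly linear lifecycle. The following is a minimal sketch of such a state machine, assuming hypothetical phase names (`Phase`, `advance`) that are not defined in this issue. The "implementation changed" step is treated as runtime activity rather than a kernel phase, which is also an assumption.

```rust
/// Kernel-side phases mirroring the expected flow (names are illustrative).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Phase {
    GoalSubmitted,
    StreamCreated,
    RunCreated,
    SessionStarted,
    ContractDispatched,
    EvidenceCollected,
    VerificationEvaluated,
    CompletionDecided,
}

impl Phase {
    /// The single legal successor of each phase; `None` once completion is decided.
    pub fn next(self) -> Option<Phase> {
        match self {
            Phase::GoalSubmitted => Some(Phase::StreamCreated),
            Phase::StreamCreated => Some(Phase::RunCreated),
            Phase::RunCreated => Some(Phase::SessionStarted),
            Phase::SessionStarted => Some(Phase::ContractDispatched),
            Phase::ContractDispatched => Some(Phase::EvidenceCollected),
            Phase::EvidenceCollected => Some(Phase::VerificationEvaluated),
            Phase::VerificationEvaluated => Some(Phase::CompletionDecided),
            Phase::CompletionDecided => None,
        }
    }
}

/// Move to `target` only if it is the immediate successor of `current`.
pub fn advance(current: Phase, target: Phase) -> Result<Phase, String> {
    match current.next() {
        Some(expected) if expected == target => Ok(target),
        Some(expected) => Err(format!(
            "illegal transition {:?} -> {:?}, expected {:?}",
            current, target, expected
        )),
        None => Err(format!("{:?} is terminal", current)),
    }
}
```

Rejecting skipped steps in the kernel is one way to keep a runtime from jumping straight to a completed state; a real lifecycle would likely also need failure and cancellation branches.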

## Naming decision

- Workspace folder names should stay generic and structural: `core`, `store`, `artifacts`, `mcp`, `runtime-pi`, `cli`, `testkit`.
- Rust package/crate names should use the repository/product namespace: `ilchul-core`, `ilchul-store`, `ilchul-artifacts`, `ilchul-mcp`, `ilchul-runtime-pi`, `ilchul-cli`, `ilchul-testkit`.
- Domain type names should remain generic: `TaskStream`, `Run`, `RuntimeSession`, `EvidenceRecord`, `VerificationResult`, `CompletionDecision`, `RuntimeAdapter`, `ArtifactStore`, `EventStore`.
- Avoid proper-noun leakage into reusable domain concepts unless the name is clearly packaging/product namespace rather than domain semantics.


## Core / adapter boundary decision

The #236 rewrite must keep a hard core/adapter boundary. The Rust core is the only place where TaskStream authority lives; adapters are replaceable projections and integrations.

Core owns:

- TaskStream, Run, RuntimeSession, TurnRef, Event, EvidenceRecord, VerificationResult, CompletionDecision domain models.
- Contract versioning and immutability rules.
- Stream/run lifecycle state machines and invariants.
- Canonical storage writes and transaction boundaries.
- Append-only accountability events.
- Evidence envelope validation and artifact reference validation.
- Verification result acceptance rules.
- Completion authority: runtime output is input; only core can emit accepted completion decisions.

Core must not own:

- Pi/Node UI behavior.
- tmux/process quirks except through ports.
- GitHub issue/PR lifecycle automation.
- Runtime-specific prompt syntax beyond abstract contracts.
- Filesystem artifact formatting beyond artifact refs/inbox rules.
- Discord/Hermes/main presentation concerns.

Adapters own:

- MCP tool/resource transport over the core API.
- SQLite implementation of the storage port.
- Filesystem artifact inbox/projection implementation.
- Pi runtime/session adapter and prompt projection.
- Future Codex/Claude/Hermes adapters.
- CLI/TS compatibility facades.
- GitHub/Discord/operator-facing presentation layers.

Architecture style rule:

- Do not enforce Clean Architecture dogmatically. Use it only where it clarifies dependency direction.
- Prefer domain-driven development: model the ubiquitous language and invariants first, then place ports/adapters around the domain.
- The primary boundary is not `domain/application/infrastructure` purity; it is whether TaskStream authority, state transitions, evidence, verification, and completion semantics remain in the Rust core.
- Module boundaries should follow domain concepts before framework or transport concerns.

Dependency rule:

```text
core domain model -> no adapter imports
core services -> domain + explicit ports only
adapters -> implement ports around the core
facades -> call core services through adapters
```
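As an illustration of the dependency rule, here is a minimal sketch of a core-owned port with an adapter implementing it. `RuntimeAdapter` is the type name used in this issue; `RunContract`, `RuntimeClaim`, `dispatch_run`, and `FakeRuntime` are hypothetical names introduced only for this example, and the method signatures are assumptions.

```rust
// --- core: domain types and the port; no adapter imports ---

/// Contract handed to a runtime (fields are illustrative).
#[derive(Debug, Clone, PartialEq)]
pub struct RunContract {
    pub run_id: String,
    pub goal: String,
}

/// What a runtime hands back: a claim, never a completion decision.
#[derive(Debug, Clone, PartialEq)]
pub struct RuntimeClaim {
    pub session_id: String,
    pub summary: String,
}

/// Port owned by the core; adapters implement it.
pub trait RuntimeAdapter {
    fn name(&self) -> &str;
    fn dispatch(&mut self, contract: &RunContract) -> Result<RuntimeClaim, String>;
}

/// Core service: depends only on domain types and the explicit port.
pub fn dispatch_run(
    adapter: &mut dyn RuntimeAdapter,
    contract: &RunContract,
) -> Result<RuntimeClaim, String> {
    if contract.goal.trim().is_empty() {
        return Err("run contract has an empty goal".to_string());
    }
    adapter.dispatch(contract)
}

// --- adapter: would live in a separate crate and import the core ---

/// In-memory stand-in for a real runtime adapter.
pub struct FakeRuntime {
    pub dispatched: Vec<String>,
}

impl RuntimeAdapter for FakeRuntime {
    fn name(&self) -> &str {
        "fake"
    }

    fn dispatch(&mut self, contract: &RunContract) -> Result<RuntimeClaim, String> {
        self.dispatched.push(contract.run_id.clone());
        Ok(RuntimeClaim {
            session_id: format!("session-{}", self.dispatched.len()),
            summary: format!("accepted goal: {}", contract.goal),
        })
    }
}
```

The core service never names a concrete runtime, so swapping `FakeRuntime` for a Pi adapter would not touch core code.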

Acceptance rule: a PR that introduces Pi/GitHub/Discord/CLI transport concerns or runtime-specific prompt mechanics into the Rust core authority model fails the #236 architecture boundary. A PR does not fail merely because it is not textbook Clean Architecture.

## Implementation language decision

- Rust is the primary implementation language for the #236 kernel-first MVP.
- The canonical TaskStream kernel, storage authority, event/evidence/verification/completion model, and MCP control-plane server should be designed Rust-first.
- TypeScript remains acceptable for Pi extension facades, compatibility shims, UI/command surfaces, and adapter glue where Pi/Node integration requires it.
- The architecture must avoid letting the TS/Pi facade become the canonical authority; facades call into the Rust kernel/control-plane boundary rather than owning stream state or completion decisions.

## Storage / source-of-truth decision

MVP 1 uses a hybrid storage model, with a stricter authority boundary than a plain filesystem workflow:

```text
Canonical authority = kernel-owned SQLite/store
Filesystem = artifacts, runtime logs, transcripts, diffs, test output, and human-readable projections
Events = append-only accountability history, not raw transcript storage
```

Required boundaries:

- Kernel-owned typed records are canonical for current/queryable state: `TaskStream`, `TaskStreamContract`, `Run`, `RuntimeSession`, `EvidenceRecord`, `VerificationResult`, and `CompletionDecision`.
- Append-only events are canonical for history/accountability: lifecycle transitions, contract finalization, runtime dispatch/output observation, evidence registration, verification, and completion decisions.
- Filesystem artifacts are referenced by ID/path/digest/summary; they are not completion authority by themselves.
- Runtime workers may write only to controlled artifact inboxes or sandboxed workspaces. They must not directly mutate canonical stream/run/evidence/verification/completion state.
- Raw transcripts, long stdout/stderr, diffs, test logs, screenshots, and generated reports stay as artifacts; events carry bounded summaries, hashes/digests, artifact refs, actor/source, and decision facts.
- `completion.decided` must be emitted by Ilchul/kernel authority and must reference the evidence and verification result IDs that justify the decision.
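The last boundary can be sketched as a kernel-side constructor that refuses to decide without accepted evidence and always stamps the kernel as the authority. This is a minimal sketch, not the actual model: the field layout, the `Status` enum, and the `decide_completion` function are assumptions made for illustration.

```rust
#[derive(Debug, Clone, PartialEq)]
pub enum Status {
    Completed,
    Failed,
}

/// Outcome of the verification gate (fields are illustrative).
#[derive(Debug, Clone, PartialEq)]
pub struct VerificationResult {
    pub id: String,
    pub passed: bool,
}

#[derive(Debug, Clone, PartialEq)]
pub struct CompletionDecision {
    pub status: Status,
    pub rationale: String,
    pub evidence_ids: Vec<String>,
    pub verification_id: String,
    pub decided_by: String,
}

/// The only authority allowed to appear in `decided_by`.
pub const KERNEL_AUTHORITY: &str = "ilchul";

/// Build a decision from accepted evidence and a verification result.
/// Runtime output is not an input here; only kernel-accepted records are.
pub fn decide_completion(
    accepted_evidence_ids: &[String],
    verification: &VerificationResult,
) -> Result<CompletionDecision, String> {
    if accepted_evidence_ids.is_empty() {
        return Err("no accepted evidence; refusing to decide".to_string());
    }
    let (status, rationale) = if verification.passed {
        (Status::Completed, "verification gate passed")
    } else {
        (Status::Failed, "verification gate failed")
    };
    Ok(CompletionDecision {
        status,
        rationale: rationale.to_string(),
        evidence_ids: accepted_evidence_ids.to_vec(),
        verification_id: verification.id.clone(),
        decided_by: KERNEL_AUTHORITY.to_string(),
    })
}
```

Because `decided_by` is set inside the function rather than passed in, a caller cannot forge a different authority through this path.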


## Required artifact surface

The MVP must write enough artifacts for mechanical verification:

```text
stream.json
  TaskStream identity, goal, contract/status, selected runtime, completion state

events.jsonl
  compact accountability timeline, not full turn/raw transcript

run/session records
  Run id, runtime adapter, session id, dispatch contract, exit/runtime state

evidence records
  normalized evidence used for completion, including test command, exit code, changed files, before/after status where available

verification output
  raw stdout/stderr/logs stored as artifact refs, not inline event payloads

completion_decision
  completed/failed, rationale, evidence_ids, decided_by=ilchul
```

## Required event policy

Events must not record every token, every turn, or full raw runtime logs inline.

Event log principle:

```text
Event log = compact accountability timeline
Artifacts = raw logs, transcripts, diffs, stdout/stderr, test outputs
Evidence = normalized facts used for completion
```

Turns and events do not map 1:1:

```text
one turn may produce zero events
many turns may be summarized into one event
critical runtime output may become one bounded event plus artifact refs
```

Use `runtime.output_observed` with bounded progress summaries instead of a mandatory per-turn `turn_observed` event.
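A bounded `runtime.output_observed` event could look like the sketch below: the event carries a capped summary plus an artifact reference, while the raw output stays in the artifact. The 200-character cap, the struct layouts, and the `observe_output` helper are assumptions for illustration; a digest field is omitted here to keep the example dependency-free.

```rust
/// Pointer to raw output stored outside the event log (fields are illustrative).
#[derive(Debug, Clone, PartialEq)]
pub struct ArtifactRef {
    pub id: String,
    pub path: String,
    pub byte_len: usize,
}

/// Bounded event payload: summary plus artifact ref, never the full log.
#[derive(Debug, Clone, PartialEq)]
pub struct OutputObserved {
    pub event: &'static str,
    pub summary: String,
    pub truncated: bool,
    pub artifact: ArtifactRef,
}

/// Assumed cap on inline summary length, counted in characters.
pub const MAX_SUMMARY_CHARS: usize = 200;

/// Summarize raw runtime output into a bounded event.
/// Truncation is by character so multi-byte UTF-8 text is never split.
pub fn observe_output(raw: &str, artifact_id: &str, artifact_path: &str) -> OutputObserved {
    let truncated = raw.chars().count() > MAX_SUMMARY_CHARS;
    let summary: String = raw.chars().take(MAX_SUMMARY_CHARS).collect();
    OutputObserved {
        event: "runtime.output_observed",
        summary,
        truncated,
        artifact: ArtifactRef {
            id: artifact_id.to_string(),
            path: artifact_path.to_string(),
            byte_len: raw.len(),
        },
    }
}
```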

Minimum required event set:

```text
stream.created
stream.contract_finalized
run.created
runtime.session_started
runtime.dispatched
runtime.output_observed
evidence.recorded
verification.started
verification.finished
completion.decided
```

A mechanically successful run must include at least:

```text
stream.created
run.created
runtime.dispatched
evidence.recorded
verification.finished
completion.decided
```

`completion.decided` must reference `evidence_ids` and include `decided_by=ilchul`.
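The mechanical check described above could be sketched as follows. It assumes events have already been parsed from `events.jsonl` into a hypothetical `EventRecord` struct; the parsing step and the exact field layout are not specified by this issue.

```rust
/// Events that every mechanically successful run must contain.
pub const REQUIRED_EVENTS: [&str; 6] = [
    "stream.created",
    "run.created",
    "runtime.dispatched",
    "evidence.recorded",
    "verification.finished",
    "completion.decided",
];

/// Parsed view of one event line (fields are illustrative).
#[derive(Debug, Clone, PartialEq)]
pub struct EventRecord {
    pub name: String,
    pub evidence_ids: Vec<String>,
    pub decided_by: Option<String>,
}

/// Return every violation found; an empty list means the timeline passes.
pub fn check_timeline(events: &[EventRecord]) -> Vec<String> {
    let mut problems = Vec::new();
    for required in REQUIRED_EVENTS {
        if !events.iter().any(|e| e.name == required) {
            problems.push(format!("missing event: {}", required));
        }
    }
    for e in events.iter().filter(|e| e.name == "completion.decided") {
        if e.evidence_ids.is_empty() {
            problems.push("completion.decided has no evidence_ids".to_string());
        }
        if e.decided_by.as_deref() != Some("ilchul") {
            problems.push("completion.decided lacks decided_by=ilchul".to_string());
        }
    }
    problems
}
```

Returning a list of violations rather than a boolean keeps the check free of subjective interpretation while still telling the operator what is missing.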

## Adapter boundary

`pi` and `Codex` are both first-class reference adapter targets so the core contract does not become dependent on one runtime vocabulary.

However, each MVP execution uses only one adapter/session at a time:

```text
one TaskStream
one Run
one RuntimeSession
one selected adapter: pi OR Codex
```

The same `TaskStreamContract` should be projectable to both paths.

Core must avoid adapter-specific leakage:

- no tmux-specific fields in generic TaskStream core
- no Codex-specific approval/sandbox fields in generic TaskStream core
- runtime-specific details belong in adapter projection/session records
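One way to keep runtime vocabulary out of the core is to project a single generic contract into per-adapter session records. The sketch below assumes hypothetical fields (`verify_command`, `tmux_target`, `sandbox_mode`) and a hypothetical prompt format; none of these are defined by this issue or by the Pi or Codex runtimes.

```rust
/// Generic contract: nothing here names tmux, Codex, or any other runtime.
#[derive(Debug, Clone, PartialEq)]
pub struct TaskStreamContract {
    pub stream_id: String,
    pub goal: String,
    pub verify_command: String,
}

/// Runtime-specific details stay in the adapter's session record.
#[derive(Debug, Clone, PartialEq)]
pub enum SessionDetails {
    Pi { tmux_target: String },
    Codex { sandbox_mode: String },
}

/// Adapter-side projection of the contract for one session.
#[derive(Debug, Clone, PartialEq)]
pub struct SessionRecord {
    pub stream_id: String,
    pub prompt: String,
    pub details: SessionDetails,
}

/// Shared rendering so both adapters receive the same contract content.
fn render_prompt(contract: &TaskStreamContract) -> String {
    format!("Goal: {}\nVerify with: {}", contract.goal, contract.verify_command)
}

pub fn project_for_pi(contract: &TaskStreamContract) -> SessionRecord {
    SessionRecord {
        stream_id: contract.stream_id.clone(),
        prompt: render_prompt(contract),
        details: SessionDetails::Pi { tmux_target: "ilchul:0".to_string() },
    }
}

pub fn project_for_codex(contract: &TaskStreamContract) -> SessionRecord {
    SessionRecord {
        stream_id: contract.stream_id.clone(),
        prompt: render_prompt(contract),
        details: SessionDetails::Codex { sandbox_mode: "workspace-write".to_string() },
    }
}
```

In a real workspace the two projection functions would sit in separate adapter crates; they share a file here only to keep the example self-contained.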

## TaskStream / RunState boundary

`TaskStream` must not accidentally replace or duplicate the existing runtime spine.

Use this relationship unless later design explicitly changes it:

```text
TaskStream = Agent OS-facing purpose/governance contract or projection
Run / RunState = runtime operational execution state
RuntimeSession = adapter/substrate execution session
Event = accountability/runtime transition record
Evidence = normalized completion basis
Artifact = raw logs/diffs/stdout/stderr/transcripts/test outputs
```

The first implementation should prefer a thin projection/contract relationship over a second competing source of truth.

## Event compatibility boundary

The event names in this issue define the Agent OS-level accountability timeline. They should remain compatible with #186's runtime event taxonomy.

Acceptable implementation approaches:

- shared EventStore shape for StreamEvent and RuntimeEvent;
- StreamEvent records that reference lower-level RuntimeEvent ids;
- a projection layer that maps runtime events into TaskStream accountability events.

Avoid creating an isolated event vocabulary that cannot replay or audit against the runtime event model.
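The projection-layer approach could be sketched as below. The runtime event kinds (`session.started`, `prompt.sent`, `output.chunk`) are placeholders invented for this example; the real names would come from the #186 taxonomy, which is not reproduced here. Each stream event keeps the IDs of the runtime events it was derived from, so the timeline can be audited against the lower-level model.

```rust
/// Lower-level runtime event (kinds here are hypothetical placeholders).
#[derive(Debug, Clone, PartialEq)]
pub struct RuntimeEvent {
    pub id: String,
    pub kind: String,
}

/// Accountability event that references the runtime events it summarizes.
#[derive(Debug, Clone, PartialEq)]
pub struct StreamEvent {
    pub name: &'static str,
    pub runtime_event_ids: Vec<String>,
}

/// Map runtime events into stream events, folding consecutive output
/// chunks into a single `runtime.output_observed` event.
pub fn project_runtime_events(runtime_events: &[RuntimeEvent]) -> Vec<StreamEvent> {
    let mut out: Vec<StreamEvent> = Vec::new();
    for ev in runtime_events {
        let name = match ev.kind.as_str() {
            "session.started" => "runtime.session_started",
            "prompt.sent" => "runtime.dispatched",
            "output.chunk" => "runtime.output_observed",
            // Unmapped kinds produce no accountability event.
            _ => continue,
        };
        let fold = name == "runtime.output_observed"
            && out.last().map_or(false, |last| last.name == name);
        if fold {
            if let Some(last) = out.last_mut() {
                last.runtime_event_ids.push(ev.id.clone());
            }
        } else {
            out.push(StreamEvent {
                name,
                runtime_event_ids: vec![ev.id.clone()],
            });
        }
    }
    out
}
```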

## Non-goals

Out of scope for MVP 1:

- multi-runtime handoff
- multi-runtime parallel execution
- N-session orchestration
- long-lived ambient agent behavior
- robot real-time control
- automatic durable memory promotion
- full GitHub PR lifecycle automation
- raw turn/token transcript persistence as default event behavior
- destructive cleanup of tmux/worktree/session state

Session cleanup must follow #201. MVP 1 may retain runtime session artifacts rather than trying to clean them destructively.

## Acceptance criteria

- [ ] Main can submit a human goal to Ilchul through MCP.
- [ ] Ilchul creates one `TaskStream` from the human goal.
- [ ] Ilchul finalizes a `TaskStreamContract`.
- [ ] Ilchul creates one `Run` inside the TaskStream.
- [ ] Ilchul starts one `pi` or `Codex` `RuntimeSession`.
- [ ] Ilchul dispatches the Run contract to the selected runtime adapter.
- [ ] Runtime output is observed or summarized as bounded progress.
- [ ] Ilchul records normalized evidence from the coding fixture.
- [ ] Ilchul runs a verification gate based on test evidence.
- [ ] Ilchul writes `stream.json`.
- [ ] Ilchul writes `events.jsonl` with the minimum required events.
- [ ] Ilchul writes run/session records.
- [ ] Ilchul writes evidence records.
- [ ] Ilchul writes verification output as artifact refs.
- [ ] Ilchul writes a final report or `completion_decision`.
- [ ] `completion_decision.status` is `completed` or `failed`.
- [ ] `completion_decision.evidence_ids` is non-empty.
- [ ] `completion_decision.decided_by` is `ilchul`.
- [ ] A mechanical check can verify required events and completion fields without subjective interpretation.
- [ ] `TaskStream` is documented or implemented as an Agent OS-facing layer over/alongside the runtime spine rather than an uncoordinated replacement for RunState.
- [ ] Event records are compatible with or explicitly mapped to the runtime event taxonomy from #186.
- [ ] MVP behavior does not require destructive session cleanup; cleanup semantics are deferred to #201.

## Verification

Implement targeted tests or fixtures that prove:

- a small failing test fixture can be driven through one TaskStream/Run/session;
- required artifacts are written;
- required minimum events exist in `events.jsonl`;
- raw runtime/test output is referenced as artifacts rather than stored inline in events;
- a `completion.decided` event is rejected when it lacks the required evidence references or `decided_by=ilchul`;
- the generic TaskStream core does not require pi- or Codex-specific fields;
- Stream/Run/Event/Evidence artifacts can be mechanically checked against this issue's minimum criteria.

## Notes

Core phrase to preserve:

```text
A Turn is the rhythm of execution; an Event is the record of accountability.
(Original: Turn은 실행의 리듬이고, Event는 책임의 기록이다.)
```

