SSOT status
This issue is the active SSOT for the Ilchul Agent OS rewrite.
Ilchul is being reset around a Rust-first, DDD-style MCP TaskStream kernel/control plane.
Previous roadmap issues such as #114, #167, #190, #195, and #201 are no longer active implementation owners. They may be used as design references, but they do not define the organizing spine for this rewrite.
Summary
Implement the first Ilchul Agent OS MVP as a single-session governed coding TaskStream over MCP.
main submits a human goal and queries status/evidence, while the Rust TaskStream kernel owns stream state, event/evidence records, runtime dispatch, verification, and the final completion decision.
Runtime output is never completion authority. Runtime output may be a claim or evidence candidate; the kernel decides completion from accepted evidence and verification.
MVP proof shape
Use a small repo coding task with a clear failing test:
Human goal:
Fix the implementation so the intentionally failing test passes.
Expected flow:
MCP goal submission
→ TaskStream created
→ Run created
→ one Pi RuntimeSession started
→ run contract dispatched
→ implementation changed
→ test evidence collected
→ verification gate evaluated
→ completion decided by kernel
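The expected flow above can be read as a strictly linear lifecycle. The following is a minimal sketch of that reading, assuming each step is a stage that may only advance to the next one; the `Stage` enum and its variant names are hypothetical and are not taken from the Ilchul codebase.

```rust
/// Stages of the MVP flow, in the order listed in the expected flow.
/// Variant names are illustrative assumptions.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Stage {
    GoalSubmitted,
    StreamCreated,
    RunCreated,
    SessionStarted,
    ContractDispatched,
    ImplementationChanged,
    EvidenceCollected,
    VerificationEvaluated,
    CompletionDecided,
}

impl Stage {
    /// Returns the only stage allowed to follow `self`, or `None` at the end.
    pub fn next(self) -> Option<Stage> {
        use Stage::*;
        match self {
            GoalSubmitted => Some(StreamCreated),
            StreamCreated => Some(RunCreated),
            RunCreated => Some(SessionStarted),
            SessionStarted => Some(ContractDispatched),
            ContractDispatched => Some(ImplementationChanged),
            ImplementationChanged => Some(EvidenceCollected),
            EvidenceCollected => Some(VerificationEvaluated),
            VerificationEvaluated => Some(CompletionDecided),
            CompletionDecided => None,
        }
    }

    /// A transition is legal only if it moves exactly one step forward.
    pub fn can_advance_to(self, target: Stage) -> bool {
        self.next() == Some(target)
    }
}

/// Walks the flow from the first stage and counts the stages visited.
pub fn walk_full_flow() -> usize {
    let mut stage = Stage::GoalSubmitted;
    let mut visited = 1;
    while let Some(next) = stage.next() {
        stage = next;
        visited += 1;
    }
    visited
}
```

Under this sketch a runtime cannot jump from goal submission straight to completion, because `CompletionDecided` is reachable only after the verification stage.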
Naming decision
Workspace folder names should stay generic and structural: core, store, artifacts, mcp, runtime-pi, cli, testkit.
Rust package/crate names should use the repository/product namespace: ilchul-core, ilchul-store, ilchul-artifacts, ilchul-mcp, ilchul-runtime-pi, ilchul-cli, ilchul-testkit.
Domain type names should remain generic: TaskStream, Run, RuntimeSession, EvidenceRecord, VerificationResult, CompletionDecision, RuntimeAdapter, ArtifactStore, EventStore.
Avoid proper-noun leakage into reusable domain concepts unless the name is clearly packaging/product namespace rather than domain semantics.
Core / adapter boundary decision
The #236 rewrite must keep a hard core/adapter boundary. The Rust core is the only place where TaskStream authority lives; adapters are replaceable projections and integrations.
Do not enforce Clean Architecture dogmatically. Use it only where it clarifies dependency direction.
Prefer domain-driven development: model the ubiquitous language and invariants first, then place ports/adapters around the domain.
The primary boundary is not domain/application/infrastructure purity; it is whether TaskStream authority, state transitions, evidence, verification, and completion semantics remain in the Rust core.
Module boundaries should follow domain concepts before framework or transport concerns.
Dependency rule:
core domain model -> no adapter imports
core services -> domain + explicit ports only
adapters -> implement ports around the core
facades -> call core services through adapters
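One way to picture the dependency rule is a port trait that the core defines and an adapter implements. This is a sketch under stated assumptions: `RunContract`, `RuntimeClaim`, `dispatch_run`, and `FakeAdapter` are hypothetical names, and only `RuntimeAdapter` appears in this issue (as a type name, without a defined signature).

```rust
// --- core: domain model and ports, no adapter imports ---

/// Contract handed to a runtime; the fields are illustrative.
#[derive(Debug, Clone, PartialEq)]
pub struct RunContract {
    pub run_id: String,
    pub goal: String,
}

/// What a runtime reports back. This is a claim, not a completion decision.
#[derive(Debug, Clone, PartialEq)]
pub struct RuntimeClaim {
    pub run_id: String,
    pub summary: String,
}

/// Port owned by the core. Adapters implement it; the core never names them.
pub trait RuntimeAdapter {
    fn name(&self) -> &str;
    fn dispatch(&mut self, contract: &RunContract) -> Result<RuntimeClaim, String>;
}

/// Core service: depends on the domain types and the port only.
pub fn dispatch_run(
    adapter: &mut dyn RuntimeAdapter,
    contract: &RunContract,
) -> Result<RuntimeClaim, String> {
    if contract.goal.trim().is_empty() {
        return Err("run contract has an empty goal".to_string());
    }
    adapter.dispatch(contract)
}

// --- adapter: implements the port around the core ---

/// Stand-in adapter for illustration; a real one would talk to a runtime.
pub struct FakeAdapter {
    pub dispatched: usize,
}

impl RuntimeAdapter for FakeAdapter {
    fn name(&self) -> &str {
        "fake"
    }

    fn dispatch(&mut self, contract: &RunContract) -> Result<RuntimeClaim, String> {
        self.dispatched += 1;
        Ok(RuntimeClaim {
            run_id: contract.run_id.clone(),
            summary: format!("{} accepted the contract", self.name()),
        })
    }
}
```

The core half of the sketch compiles without the adapter half, which is the property the dependency rule asks for.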
Acceptance rule: a PR that introduces Pi/GitHub/Discord/CLI transport concerns or runtime-specific prompt mechanics into the Rust core authority model fails the #236 architecture boundary. A PR does not fail merely because it is not textbook Clean Architecture.
Implementation language decision
The canonical TaskStream kernel, storage authority, event/evidence/verification/completion model, and MCP control-plane server should be designed Rust-first.
TypeScript remains acceptable for Pi extension facades, compatibility shims, UI/command surfaces, and adapter glue where Pi/Node integration requires it.
The architecture must avoid letting the TS/Pi facade become the canonical authority; facades call into the Rust kernel/control-plane boundary rather than owning stream state or completion decisions.
Storage / source-of-truth decision
MVP 1 uses a hybrid storage model, with a stricter authority boundary than a plain filesystem workflow:
Canonical authority = kernel-owned SQLite/store
Filesystem = artifacts, runtime logs, transcripts, diffs, test output, and human-readable projections
Events = append-only accountability history, not raw transcript storage
Required boundaries:
Kernel-owned typed records are canonical for current/queryable state: TaskStream, TaskStreamContract, Run, RuntimeSession, EvidenceRecord, VerificationResult, and CompletionDecision.
Append-only events are canonical for history/accountability: lifecycle transitions, contract finalization, runtime dispatch/output observation, evidence registration, verification, and completion decisions.
Filesystem artifacts are referenced by ID/path/digest/summary; they are not completion authority by themselves.
Runtime workers may write only to controlled artifact inboxes or sandboxed workspaces. They must not directly mutate canonical stream/run/evidence/verification/completion state.
Raw transcripts, long stdout/stderr, diffs, test logs, screenshots, and generated reports stay as artifacts; events carry bounded summaries, hashes/digests, artifact refs, actor/source, and decision facts.
completion.decided must be emitted by Ilchul/kernel authority and must reference the evidence and verification result IDs that justify the decision.
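The last boundary above can be sketched as a kernel-side function that takes only typed evidence and verification records as input. The field layout, the `accepted` flag, and the `decide_completion` name are assumptions made for illustration; the issue names the record types but does not define their fields.

```rust
/// Normalized evidence; `accepted` is an assumed flag set by the kernel.
#[derive(Debug, Clone, PartialEq)]
pub struct EvidenceRecord {
    pub id: String,
    pub accepted: bool,
}

#[derive(Debug, Clone, PartialEq)]
pub struct VerificationResult {
    pub id: String,
    pub passed: bool,
}

#[derive(Debug, Clone, PartialEq)]
pub enum Status {
    Completed,
    Failed,
}

#[derive(Debug, Clone, PartialEq)]
pub struct CompletionDecision {
    pub status: Status,
    pub evidence_ids: Vec<String>,
    pub verification_id: String,
    pub decided_by: &'static str,
}

/// Decides completion from accepted evidence and a verification result.
/// Runtime output is deliberately not a parameter: it can only influence
/// the outcome after the kernel has turned it into an evidence record.
pub fn decide_completion(
    evidence: &[EvidenceRecord],
    verification: &VerificationResult,
) -> Result<CompletionDecision, String> {
    let evidence_ids: Vec<String> = evidence
        .iter()
        .filter(|e| e.accepted)
        .map(|e| e.id.clone())
        .collect();
    if evidence_ids.is_empty() {
        return Err("no accepted evidence; refusing to decide".to_string());
    }
    let status = if verification.passed {
        Status::Completed
    } else {
        Status::Failed
    };
    Ok(CompletionDecision {
        status,
        evidence_ids,
        verification_id: verification.id.clone(),
        decided_by: "ilchul",
    })
}
```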
Required artifact surface
The MVP must write enough artifacts for mechanical verification:
stream.json
TaskStream identity, goal, contract/status, selected runtime, completion state
events.jsonl
compact accountability timeline, not full turn/raw transcript
run/session records
Run id, runtime adapter, session id, dispatch contract, exit/runtime state
evidence records
normalized evidence used for completion, including test command, exit code, changed files, before/after status where available
verification output
raw stdout/stderr/logs stored as artifact refs, not inline event payloads
completion_decision
completed/failed, rationale, evidence_ids, decided_by=ilchul
Required event policy
Events must not record every token, every turn, or full raw runtime logs inline.
Event log principle:
Event log = compact accountability timeline
Artifacts = raw logs, transcripts, diffs, stdout/stderr, test outputs
Evidence = normalized facts used for completion
Turn and event are not 1:1:
one turn may produce zero events
many turns may be summarized into one event
critical runtime output may become one bounded event plus artifact refs
Use runtime.output_observed / bounded progress summaries instead of mandatory turn_observed.
completion.decided must reference evidence_ids and include decided_by=ilchul.
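As an illustration of the event policy, the sketch below turns raw runtime output into one bounded `runtime.output_observed` event plus an artifact reference. The 200-character bound, the struct fields, and the `observe_output` helper are assumptions; the digest is supplied by the caller because computing it is outside this sketch.

```rust
/// Upper bound on the inline summary, in characters. The value is an assumption.
pub const MAX_SUMMARY_CHARS: usize = 200;

/// Pointer to the raw output stored as an artifact.
#[derive(Debug, Clone, PartialEq)]
pub struct ArtifactRef {
    pub id: String,
    pub path: String,
    pub digest: String,
}

/// Compact accountability event: bounded summary plus an artifact reference.
#[derive(Debug, Clone, PartialEq)]
pub struct Event {
    pub name: String,
    pub summary: String,
    pub truncated: bool,
    pub artifact: ArtifactRef,
}

/// Builds one bounded event for a chunk of runtime output.
/// The full text is expected to live in the artifact, not in the event.
pub fn observe_output(raw: &str, artifact: ArtifactRef) -> Event {
    // Truncate on character boundaries so multi-byte text is never split.
    let summary: String = raw.chars().take(MAX_SUMMARY_CHARS).collect();
    let truncated = raw.chars().count() > MAX_SUMMARY_CHARS;
    Event {
        name: "runtime.output_observed".to_string(),
        summary,
        truncated,
        artifact,
    }
}
```

Many turns can be passed through this helper as one concatenated chunk, which matches the rule that turns and events are not 1:1.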
Adapter boundary
pi and Codex are both first-class reference adapter targets so the core contract does not become dependent on one runtime vocabulary.
However, each MVP execution uses only one adapter/session at a time:
one TaskStream
one Run
one RuntimeSession
one selected adapter: pi OR Codex
The same TaskStreamContract should be projectable to both paths.
Core must avoid adapter-specific leakage:
no tmux-specific fields in generic TaskStream core
no Codex-specific approval/sandbox fields in generic TaskStream core
runtime-specific details belong in adapter projection/session records
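The projection idea can be sketched as one generic contract mapped to two adapter-side records. Everything except the `TaskStreamContract` name is hypothetical here, including the `verify_command` field, the `AdapterKind` enum, and the option keys and values, which are placeholders and not real pi or Codex settings.

```rust
/// Generic contract: no tmux, approval, or sandbox fields.
#[derive(Debug, Clone, PartialEq)]
pub struct TaskStreamContract {
    pub stream_id: String,
    pub goal: String,
    pub verify_command: String,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AdapterKind {
    Pi,
    Codex,
}

/// Adapter-side record; runtime-specific details live here, not in the core.
#[derive(Debug, Clone, PartialEq)]
pub struct AdapterProjection {
    pub adapter: AdapterKind,
    pub stream_id: String,
    pub prompt: String,
    /// Opaque key/value pairs. Keys and values are invented placeholders.
    pub runtime_options: Vec<(String, String)>,
}

/// Projects the same contract onto the selected adapter.
pub fn project(contract: &TaskStreamContract, adapter: AdapterKind) -> AdapterProjection {
    let prompt = format!(
        "Goal: {}\nVerify with: {}",
        contract.goal, contract.verify_command
    );
    let runtime_options = match adapter {
        AdapterKind::Pi => vec![("tmux_session".to_string(), "example".to_string())],
        AdapterKind::Codex => vec![("sandbox_mode".to_string(), "example".to_string())],
    };
    AdapterProjection {
        adapter,
        stream_id: contract.stream_id.clone(),
        prompt,
        runtime_options,
    }
}
```

The contract-derived parts of both projections are identical, and only `runtime_options` differs, which keeps the adapter vocabulary out of the generic contract.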
TaskStream / RunState boundary
TaskStream must not accidentally replace or duplicate the existing runtime spine.
Use this relationship unless later design explicitly changes it:
TaskStream = Agent OS-facing purpose/governance contract or projection
Run / RunState = runtime operational execution state
RuntimeSession = adapter/substrate execution session
Event = accountability/runtime transition record
Evidence = normalized completion basis
Artifact = raw logs/diffs/stdout/stderr/transcripts/test outputs
The first implementation should prefer a thin projection/contract relationship over a second competing source of truth.
Event compatibility boundary
The event names in this issue define the Agent OS-level accountability timeline. They should remain compatible with #186's runtime event taxonomy.
Acceptable implementation approaches:
shared EventStore shape for StreamEvent and RuntimeEvent;
StreamEvent records that reference lower-level RuntimeEvent ids;
a projection layer that maps runtime events into TaskStream accountability events.
Avoid creating an isolated event vocabulary that cannot replay or audit against the runtime event model.
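The second and third approaches can be combined in a small projection function: StreamEvent records are derived from RuntimeEvent records and keep the lower-level ids for audit. This is a sketch only; the runtime kinds `session_started` and `output` and the stream event name `runtime.session_started` are invented for illustration, and #186 remains the authority on the real taxonomy.

```rust
/// Lower-level runtime event; `kind` values here are assumed, not from #186.
#[derive(Debug, Clone, PartialEq)]
pub struct RuntimeEvent {
    pub id: u64,
    pub kind: String,
}

/// Accountability event that references the runtime events it summarizes.
#[derive(Debug, Clone, PartialEq)]
pub struct StreamEvent {
    pub name: String,
    pub runtime_event_ids: Vec<u64>,
}

/// Maps runtime events into accountability events.
/// Unmapped kinds are skipped, and an output event is folded into the
/// previous stream event when that event is also an output observation.
pub fn project_events(runtime: &[RuntimeEvent]) -> Vec<StreamEvent> {
    let mut out: Vec<StreamEvent> = Vec::new();
    for ev in runtime {
        let name = match ev.kind.as_str() {
            "session_started" => "runtime.session_started",
            "output" => "runtime.output_observed",
            _ => continue,
        };
        let fold = name == "runtime.output_observed"
            && out.last().map_or(false, |last| last.name == name);
        if fold {
            if let Some(last) = out.last_mut() {
                last.runtime_event_ids.push(ev.id);
            }
        } else {
            out.push(StreamEvent {
                name: name.to_string(),
                runtime_event_ids: vec![ev.id],
            });
        }
    }
    out
}
```

Because every StreamEvent carries the ids it was built from, the accountability timeline can be replayed or audited against the runtime event log.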
Non-goals
Out of scope for MVP 1:
multi-runtime handoff
multi-runtime parallel execution
N-session orchestration
long-lived ambient agent behavior
robot real-time control
automatic durable memory promotion
full GitHub PR lifecycle automation
raw turn/token transcript persistence as default event behavior
destructive cleanup of tmux/worktree/session state
Session cleanup must follow #201. MVP 1 may retain runtime session artifacts rather than trying to clean them destructively.
Acceptance criteria
Main can submit a human goal to Ilchul through MCP.
Ilchul creates one TaskStream from the human goal.
Ilchul finalizes a TaskStreamContract.
Ilchul creates one Run inside the TaskStream.
Ilchul starts one pi or Codex RuntimeSession.
Ilchul dispatches the Run contract to the selected runtime adapter.
Runtime output is observed or summarized as bounded progress.
Ilchul records normalized evidence from the coding fixture.
Ilchul runs a verification gate based on test evidence.
Ilchul writes stream.json.
Ilchul writes events.jsonl with the minimum required events.
Ilchul writes run/session records.
Ilchul writes evidence records.
Ilchul writes verification output as artifact refs.
Ilchul writes a final report or completion_decision.
completion_decision.status is completed or failed.
completion_decision.evidence_ids is non-empty.
completion_decision.decided_by is ilchul.
A mechanical check can verify required events and completion fields without subjective interpretation.
TaskStream is documented or implemented as an Agent OS-facing layer over/alongside the runtime spine rather than an uncoordinated replacement for RunState.
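The completion-field criteria lend themselves to a check that needs no interpretation. The sketch below assumes the completion decision and the event names have already been parsed into plain values; `CompletionRecord` and `check_completion` are hypothetical names, and the only required event checked is `completion.decided` because the full minimum event set is not reproduced here.

```rust
/// Parsed view of a completion_decision record; parsing is out of scope here.
#[derive(Debug, Clone, PartialEq)]
pub struct CompletionRecord {
    pub status: String,
    pub evidence_ids: Vec<String>,
    pub decided_by: String,
}

/// Returns one message per violated criterion; an empty list means pass.
pub fn check_completion(record: &CompletionRecord, event_names: &[&str]) -> Vec<String> {
    let mut violations = Vec::new();
    if record.status != "completed" && record.status != "failed" {
        violations.push(format!(
            "status must be completed or failed, got {:?}",
            record.status
        ));
    }
    if record.evidence_ids.is_empty() {
        violations.push("evidence_ids must be non-empty".to_string());
    }
    if record.decided_by != "ilchul" {
        violations.push(format!(
            "decided_by must be ilchul, got {:?}",
            record.decided_by
        ));
    }
    if !event_names.contains(&"completion.decided") {
        violations.push("events.jsonl has no completion.decided event".to_string());
    }
    violations
}
```

A fixture test could run this against the artifacts written by a passing run and against a tampered copy, expecting an empty list in the first case and at least one violation in the second.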
Active child issue queue
Historical reference issues
These prior issues are design reference only. They are not active implementation parents for #236.
Required event policy
Minimum required event set:
Mechanical success minimum must include at least:
Verification
Implement targeted tests or fixtures that prove:
[...] events.jsonl;
completion.decided fails if required evidence or decided_by=ilchul is missing;
Notes
Core phrase to preserve: