Goal: mostly-autonomous software development while maintaining rigorous quality control
- Recursive decomposition of tasks
- Automated quality control using role-based agents
- Web interface as primary mode of interaction
- Merge train for CI
- Built-in issue tracking: tasks
- One task is:
- A description of an actionable issue:
- A plan generated by an agent that should be executed (or split into sub-tasks)
- A bug report from the user or an agent which should be investigated
- A vague or open-ended request from the user which must be expanded into a concrete plan
- Maybe just the initial prompt of a conversation with the user
- An agent session which will handle this issue
- All sessions are tied to a task
- Tasks start out pending (no session) until they are dequeued and given to an agent
- Metadata:
- Link to a parent issue
- Role of the agent who will handle this issue
- Optional existing session to resume, and optionally from where (session forking)
- Dependencies
- Approvals
- Operator
- Stewards
- Link to execution session
- "Committable" flag - whether the work should be saved as an individual commit, or is part of a larger commit (parent issue)
- When "committable" is true, execution runs in a new worktree
- Status is implicitly derived from dependencies / approvals / execution session status
- Parallel-heavy:
- All work happens in worktrees
- Merge train - all commits must pass CI
- CI can be slow, so it is handled by the system (optimized with a merge train) - not executed by agents directly
- Stewards:
- Stateful agents which concern themselves with one aspect of the project
- Stewards review all plans and patches
- Any steward can reject any plan or patch
- Examples:
- Steward of Vision (does this align with the project's overall vision?)
- Steward of Documentation (does this include necessary documentation updates?)
- Steward of Testing (does this include necessary tests?)
- Steward of Security (does this introduce any potential security issues?)
- Steward of Quality (does this plan use any known antipatterns?)
- Stewards are notified of landed changes, and maintain their own internal documentation and tooling of what they need to care about, especially regressions in that area
- Stewards can interrogate the agent which generated the artifact (plan/patch)
- For plans, stewards either approve or reject with a rationale
- For patches, stewards may approve, reject, or approve with a list of issues to be addressed in the future
- Encourage splitting up work:
- After an agent reads an issue description, the very first thing they must do is decide if it should be split up into multiple issues
- Wrap Claude Code in headless mode (rather than driving it through tmux, as Gas Town does)
- Claude runs sandboxed with --dangerously-skip-permissions (no interactive permission prompts)
- Provide a custom set of tools instead of the built-in Claude Code ones
- Hierarchy:
- Workspace (sandbox boundary)
- Project (has .git)
- Branch (has merge train)
- Task (runs in parallel)
- User interaction is via a web interface
- Task tree - how tasks have been recursively split up
- Agent log - view agent logs for each task
- Work queue - for roles that process their work serially (stewards), show the items waiting in their queue
- Pausing/stopping - operator can pause (next tool call will block) or halt (kill process) an agent
- Live steering - allow injecting user messages into the agent session in real time
- Interrogation - finished sessions can be forked to create an interactive session
- "Full-auto" (a.k.a. AFK / overnight) mode (human approval of plans is skipped)
- Backend: D + ae (~/work/ae)
- Frontend: TypeScript + Preact
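The task metadata above says that status is implicitly derived from dependencies, approvals, and the execution session. A minimal TypeScript sketch of that derivation follows. The field names and the status labels (blocked, awaiting_approval, and so on) are assumptions made for illustration and do not come from the design.

```typescript
// Hypothetical shapes: every name below is illustrative.
type SessionState = "none" | "running" | "done" | "failed";
type Status =
  | "blocked" | "awaiting_approval" | "pending"
  | "running" | "done" | "failed";

interface Task {
  id: number;
  parent?: number;                    // link to a parent issue
  role: string;                       // role of the agent handling the issue
  dependencies: number[];             // ids of tasks that must finish first
  approvals: Record<string, boolean>; // approver name -> has approved
  session: SessionState;              // state of the execution session
  committable: boolean;               // true: own commit in a new worktree
}

function deriveStatus(task: Task, all: Map<number, Task>): Status {
  // Once a session exists, its state is the task's status.
  if (task.session !== "none") return task.session;
  const depsDone = task.dependencies.every((id) => {
    const dep = all.get(id);
    return dep !== undefined && dep.session === "done";
  });
  if (!depsDone) return "blocked";
  const approved = Object.keys(task.approvals).every(
    (name) => task.approvals[name] === true,
  );
  if (!approved) return "awaiting_approval";
  return "pending"; // no session yet; waiting to be dequeued
}
```

Because nothing stores the status, it cannot drift out of sync with the data it summarizes; the cost is that every reader has to recompute it.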
Wrap Claude Code CLI in a web UI with real-time streaming.
Deliverable: web UI with same capabilities as the first-party CLI.
- Claude Code CLI wrapped in stream-json protocol
- Web UI renders sessions with real-time streaming via WebSocket
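Wrapping the stream-json output mostly comes down to splitting the CLI's stdout into complete lines before parsing. The sketch below assumes each line is one JSON object with a string `type` field; `LineDecoder` and `StreamEvent` are hypothetical names, and forwarding the events over the WebSocket is left out.

```typescript
// Assumption: each line of stream-json output is one JSON object
// carrying a string `type` field. Names here are illustrative.
interface StreamEvent {
  type: string;
  [key: string]: unknown;
}

class LineDecoder {
  private buffer = "";

  // Feed a raw stdout chunk; returns the events completed by this chunk.
  push(chunk: string): StreamEvent[] {
    this.buffer += chunk;
    const lines = this.buffer.split("\n");
    // The last element is an incomplete line (or "" right after a newline).
    this.buffer = lines.pop() ?? "";
    const events: StreamEvent[] = [];
    for (const line of lines) {
      if (line.trim() === "") continue; // skip blank lines
      events.push(JSON.parse(line) as StreamEvent);
    }
    return events;
  }
}
```

Chunk boundaries from a child process do not align with line boundaries, so the decoder keeps the trailing partial line until the next chunk completes it.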
Multiple concurrent agent sessions managed from a single interface.
Deliverable: sidebar for selecting and interacting with all running sessions, with live steering (inject user messages into running sessions).
- Sidebar for selecting and interacting with multiple sessions
- Live steering: inject user messages into running sessions
All state persisted so no data is lost on backend restart.
Deliverable: backend restart preserves all sessions and their full message logs; UI can display previous sessions immediately on reconnect.
- SQLite data model for tasks (tid, session ID, description, type, parent, status)
- Session history loaded from Claude's JSONL files
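After a restart, the UI's task tree has to be rebuilt from the flat task rows. One way to do that is sketched below. The row shape follows the columns listed above, but the exact column names and types, and the `buildTree` helper itself, are assumptions.

```typescript
// Row shape follows the listed columns; names and types are assumptions.
interface TaskRow {
  tid: number;
  sessionId: string | null; // null while the task is still pending
  description: string;
  type: string;
  parent: number | null;    // null for root tasks
  status: string;
}

interface TaskNode extends TaskRow {
  children: TaskNode[];
}

// Rebuild the task tree from flat rows using the parent links.
function buildTree(rows: TaskRow[]): TaskNode[] {
  const nodes = new Map<number, TaskNode>();
  for (const row of rows) nodes.set(row.tid, { ...row, children: [] });
  const roots: TaskNode[] = [];
  nodes.forEach((node) => {
    const parent = node.parent === null ? undefined : nodes.get(node.parent);
    if (parent) parent.children.push(node);
    else roots.push(node); // a root, or an orphan whose parent row is missing
  });
  return roots;
}
```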
Project discovery and bwrap-based sandbox isolation for agent sessions.
Deliverable: agents run inside sandboxed containers with configurable filesystem access; projects are auto-discovered from workspace roots.
- Workspace configuration with per-workspace sandbox overrides
- bwrap isolation with read-only / read-write path control
- Config hot-reload on file change
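The read-only / read-write path control maps naturally onto bwrap's bind options. The following sketch shows one possible translation from a merged workspace configuration to a bwrap command line. The `SandboxConfig` shape and both function names are hypothetical; only the bwrap flags themselves (--ro-bind, --bind, --proc, --dev, --unshare-pid, --die-with-parent) are real options.

```typescript
// Hypothetical config shape; only the bwrap flags are real.
interface SandboxConfig {
  readOnly: string[];  // paths visible but not writable
  readWrite: string[]; // paths the agent may modify
}

// Per-workspace overrides extend the global defaults.
function mergeSandbox(
  base: SandboxConfig,
  override?: Partial<SandboxConfig>,
): SandboxConfig {
  return {
    readOnly: [...base.readOnly, ...(override?.readOnly ?? [])],
    readWrite: [...base.readWrite, ...(override?.readWrite ?? [])],
  };
}

function bwrapArgs(cfg: SandboxConfig, command: string[]): string[] {
  const args = [
    "--die-with-parent", "--unshare-pid",
    "--proc", "/proc", "--dev", "/dev",
  ];
  // bwrap applies mount operations in order, so a read-write path nested
  // inside a read-only one has to be bound after it.
  for (const p of cfg.readOnly) args.push("--ro-bind", p, p);
  for (const p of cfg.readWrite) args.push("--bind", p, p);
  return ["bwrap", ...args, ...command];
}
```

For example, `bwrapArgs(mergeSandbox(defaults, workspace.sandbox), ["claude"])` would produce the argv to spawn, where `defaults` and `workspace.sandbox` stand for whatever the configuration loader provides.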
MCP server delivering custom tools to Claude Code sessions.
Deliverable: agents use CyDo's Task tool via MCP to create child sessions visible in the UI task tree, with promise-based result return.
- Task tool (sub-task creation with result await)
- Agents use Claude Code's built-in tools for everything else
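"Promise-based result return" can be pictured as a small registry: the Task tool call parks on a promise that is resolved when the child session exits. The sketch below illustrates that idea only. `PendingResults` is a hypothetical helper, the actual backend is written in D, and the MCP wiring is omitted.

```typescript
// Hypothetical registry: a Task tool call stays pending until the child
// session finishes, at which point its result resolves the promise.
class PendingResults<T> {
  private waiters = new Map<number, (result: T) => void>();

  // Called when the Task tool spawns child `tid`; the tool call awaits this.
  wait(tid: number): Promise<T> {
    return new Promise<T>((resolve) => {
      this.waiters.set(tid, resolve);
    });
  }

  // Called when child `tid` exits. Returns false if nobody was waiting.
  complete(tid: number, result: T): boolean {
    const resolve = this.waiters.get(tid);
    if (!resolve) return false;
    this.waiters.delete(tid);
    resolve(result);
    return true;
  }

  // Number of tool calls still waiting for a child result.
  pending(): number {
    return this.waiters.size;
  }
}
```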
YAML-driven task type definitions controlling agent behavior, capabilities, and flow control.
Deliverable: task types configure model, tools, prompt, and sub-task permissions declaratively; a simulator and dot generator validate the design.
- YAML-defined types with model_class, read_only, output_type, prompt_template
- creatable_tasks enforcement (parent controls which sub-task types child can create)
- Prompt template rendering with {{task_description}} substitution
- Simulator and Graphviz dot generator for design validation
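The pieces listed above can be illustrated with a plain object standing in for the parsed YAML. In this sketch the field names (model_class, read_only, output_type, prompt_template, creatable_tasks) and the {{task_description}} placeholder come from the design; the concrete values, the `taskTypes` table, and the three helper functions are assumptions.

```typescript
// Stand-in for parsed YAML; the field values are made-up examples.
interface TaskType {
  model_class: string;
  read_only: boolean;
  output_type: string;
  prompt_template: string;
  creatable_tasks: string[];
}

const taskTypes: Record<string, TaskType> = {
  plan: {
    model_class: "large", read_only: true, output_type: "plan",
    prompt_template: "Write a plan for: {{task_description}}",
    creatable_tasks: ["spike"],
  },
  spike: {
    model_class: "small", read_only: false, output_type: "report",
    prompt_template: "Investigate: {{task_description}}",
    creatable_tasks: [],
  },
};

function renderPrompt(t: TaskType, description: string): string {
  // split/join replaces every occurrence and avoids the special "$"
  // patterns that String.prototype.replace interprets.
  return t.prompt_template.split("{{task_description}}").join(description);
}

// creatable_tasks enforcement: may `parentType` spawn `childType`?
function canCreate(
  defs: Record<string, TaskType>, parentType: string, childType: string,
): boolean {
  const def = defs[parentType];
  return def !== undefined && def.creatable_tasks.includes(childType);
}

// Emit a Graphviz digraph with one edge per allowed parent -> child pair.
function toDot(defs: Record<string, TaskType>): string {
  const edges: string[] = [];
  for (const name of Object.keys(defs)) {
    const def = defs[name];
    if (!def) continue;
    for (const child of def.creatable_tasks) {
      edges.push(`  "${name}" -> "${child}";`);
    }
  }
  return ["digraph tasks {", ...edges, "}"].join("\n");
}
```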
Tasks with worktree: true (implement, spike) run in their own git worktree.
Deliverable: an implement sub-task produces a commit in an isolated worktree; the parent can adopt the result without conflicts in the main tree.
- Create git worktrees on task spawn, pass working directory to agent session
- Include worktree path in sub-task result for parent to adopt changes
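Creating the worktree on task spawn amounts to one `git worktree add` invocation per task. The sketch below only builds the argument list. The directory layout (.worktrees/task-N) and branch naming (task/N) are assumptions, as is the `SubTaskResult` shape used to hand the worktree path back to the parent.

```typescript
// Hypothetical result shape returned to the parent task.
interface SubTaskResult {
  tid: number;
  summary: string;
  worktree?: string; // set for task types with worktree: true
}

// Hypothetical naming scheme for per-task worktree paths.
function worktreePath(repoRoot: string, tid: number): string {
  return `${repoRoot}/.worktrees/task-${tid}`;
}

// Builds: git -C <repo> worktree add -b <new-branch> <path> <commit-ish>
function worktreeAddArgs(repoRoot: string, tid: number, base: string): string[] {
  const branch = `task/${tid}`;
  return [
    "git", "-C", repoRoot,
    "worktree", "add", "-b", branch, worktreePath(repoRoot, tid), base,
  ];
}
```

The returned path doubles as the working directory passed to the agent session, and as the `worktree` field of the sub-task result.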
When a task completes, the system automatically spawns a successor task based on the type's continuation definitions.
Deliverable: the full plan → triage → implement/decompose → review chain runs end-to-end without manual intervention. A full-auto toggle skips operator approval on continuation gates (stewards still review).
- On task exit, look up chosen continuation and spawn successor
- keep_context: fork the session (reuse existing JSONL fork logic) so successor inherits conversation history
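The continuation lookup on task exit might look roughly like the following. Only keep_context is a name from the design. The table layout, the `chosen` field, and the `SpawnRequest` shape are assumptions, and the approval gate that sits between lookup and spawn is not shown.

```typescript
interface Continuation {
  task_type: string;     // type of the successor task
  keep_context: boolean; // fork the finished session into the successor
}

interface FinishedTask {
  tid: number;
  type: string;
  sessionId: string;
  chosen: string; // continuation name the agent selected on exit
  output: string; // e.g. the plan text handed to the successor
}

interface SpawnRequest {
  predecessor: number;
  type: string;
  description: string;
  forkFrom?: string; // session to fork when keep_context is set
}

// Table layout (assumed): task type -> continuation name -> definition.
function nextTask(
  done: FinishedTask,
  continuations: Record<string, Record<string, Continuation>>,
): SpawnRequest | null {
  const forType = continuations[done.type];
  const next = forType ? forType[done.chosen] : undefined;
  if (!next) return null; // no matching continuation: nothing to spawn
  const request: SpawnRequest = {
    predecessor: done.tid,
    type: next.task_type,
    description: done.output,
  };
  if (next.keep_context) request.forkFrom = done.sessionId;
  return request;
}
```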
Session forking for user-initiated interrogation already works (JSONL truncation). Richer communication between tasks in the tree is needed.
Deliverable: agents can ask questions up the task tree (child → parent → user) and parents can re-engage completed children for follow-up.
- Parent wakes a completed child to ask follow-up questions
- Child asks parent for clarification of its prompt
- Clarification requests bubble up the task tree, ultimately reaching the user if no ancestor can answer
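Bubbling a clarification request up the tree is a walk along parent links that stops at the first ancestor able to answer. A sketch under simplifying assumptions: each task exposes a synchronous `answer` callback (in the real system an ancestor agent would be woken and asked), and all names are hypothetical.

```typescript
interface Asker {
  tid: number;
  parent: number | null;
  // Returns an answer, or null when this task cannot answer.
  answer: (question: string) => string | null;
}

type Routed = { by: number; answer: string } | { by: "user" };

// Walk from the asking task's parent toward the root; fall back to the user.
function routeQuestion(
  from: number, question: string, tasks: Map<number, Asker>,
): Routed {
  let current: number | null = tasks.get(from)?.parent ?? null;
  while (current !== null) {
    const node = tasks.get(current);
    if (!node) break; // broken parent link: escalate to the user
    const answer = node.answer(question);
    if (answer !== null) return { by: node.tid, answer };
    current = node.parent;
  }
  return { by: "user" };
}
```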
Stateful review agents that gate continuation spawning via approval.
Deliverable: every plan and patch is reviewed by stewards before landing. Rejections feed back to the originating agent, which can rework and retry.
- Approval gates on continuations invoke all stewards in parallel (reviews are read-only)
- Rejection modeled as a retryable tool call — agent reworks and resubmits
- Knowledge bases loaded from the knowledge_base path into steward sessions
- Steward upkeep: notified of landed changes, maintain internal docs. Upkeep tasks are serial (one at a time per steward); may require a task queue.
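Once every steward has returned a verdict, the gate has to combine them: any single rejection blocks the continuation, and "approve with issues" (patches only) contributes follow-up work. The sketch below covers only that aggregation step. The `Verdict` and `GateResult` shapes are assumptions, and collecting the verdicts in parallel is omitted.

```typescript
type Verdict =
  | { kind: "approve" }
  | { kind: "reject"; rationale: string }
  | { kind: "approve_with_issues"; issues: string[] }; // patches only

interface GateResult {
  approved: boolean;
  rejections: string[]; // fed back to the originating agent for rework
  followUps: string[];  // issues to address in the future
}

// Keys of `verdicts` are steward names, e.g. "security" or "testing".
function aggregate(verdicts: Record<string, Verdict>): GateResult {
  const result: GateResult = { approved: true, rejections: [], followUps: [] };
  for (const steward of Object.keys(verdicts)) {
    const v = verdicts[steward];
    if (!v) continue;
    if (v.kind === "reject") {
      result.approved = false; // any steward can reject
      result.rejections.push(`${steward}: ${v.rationale}`);
    } else if (v.kind === "approve_with_issues") {
      result.followUps.push(...v.issues);
    }
  }
  return result;
}
```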
Worktree commits land via a merge queue that runs CI and handles conflicts.
Deliverable: agents produce commits; the system lands them via a serialized queue with CI validation, without agents running CI directly.
- Investigate whether implementable within the task system (as task types and continuations) or requires a dedicated facility
- Rebase and retry on conflict or CI failure
- May require a task queue as prerequisite
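To make the "serialized queue with rebase and retry" idea concrete, here is a small simulation with the CI run injected as a function. It is a plain one-at-a-time queue, not a speculative merge train, and the names, the retry limit, and the requeue-at-the-back policy are all assumptions.

```typescript
interface Candidate {
  tid: number;
  commit: string;
  attempts: number; // how many times this commit has been tried
}

type CiResult = "pass" | "fail" | "conflict";

// Process candidates one at a time; failed ones are retried up to a limit.
function runQueue(
  candidates: Candidate[],
  ci: (c: Candidate) => CiResult,
  maxAttempts = 3,
): { landed: string[]; rejected: string[] } {
  const queue = candidates.map((c) => ({ ...c })); // do not mutate the input
  const landed: string[] = [];
  const rejected: string[] = [];
  while (queue.length > 0) {
    const c = queue.shift()!;
    c.attempts += 1;
    if (ci(c) === "pass") landed.push(c.commit);
    else if (c.attempts < maxAttempts) queue.push(c); // rebase, retry later
    else rejected.push(c.commit); // give up; report back to the agent
  }
  return { landed, rejected };
}
```

In a real queue the `ci` callback would rebase the commit onto the current branch head before running the pipeline; here it only reports the outcome.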