AgentGuardHQ · jpleva91 · Apr 1, 2026
diff --git a/README.md b/README.md
@@ -2,14 +2,14 @@
 
 # ShellForge
 
-**Governed AI agent runtime — one Go binary, local or cloud.**
+**Governed AI coding CLI and agent runtime — one Go binary, local or cloud.**
 
 [![Go](https://img.shields.io/badge/Go-1.18+-00ADD8?style=for-the-badge&logo=go&logoColor=white)](https://go.dev)
 [![GitHub Pages](https://img.shields.io/badge/Live_Site-agentguardhq.github.io/shellforge-ff6b2b?style=for-the-badge)](https://agentguardhq.github.io/shellforge)
 [![License: MIT](https://img.shields.io/badge/License-MIT-blue?style=for-the-badge)](LICENSE)
 [![AgentGuard](https://img.shields.io/badge/Governed_by-AgentGuard-green?style=for-the-badge)](https://github.com/AgentGuardHQ/agentguard)
 
-*Run autonomous AI agents with policy enforcement on every tool call. Local via Ollama or cloud via Anthropic API — your choice.*
+*Interactive pair-programming with local or cloud models, plus autonomous multi-task execution, with governance on every tool call.*
 
 [Website](https://agentguardhq.github.io/shellforge) · [Docs](docs/architecture.md) · [Roadmap](docs/roadmap.md) · [AgentGuard](https://github.com/AgentGuardHQ/agentguard)
 
@@ -54,12 +54,17 @@ shellforge setup                 # creates agentguard.yaml + output dirs
 
 This creates `agentguard.yaml` (governance policy) in your project root. Edit it to customize which actions are allowed/denied.
 
-### 5. Run an agent
+### 5. Start a chat session
+
+```bash
+shellforge chat                     # interactive REPL — pair-program with a local model
+```
+
+Or run a one-shot agent:
 
 ```bash
 shellforge agent "describe what this project does"
 shellforge agent "find test gaps and suggest improvements"
-shellforge agent "create a hello world program"
 ```
 
 Every tool call (file reads, writes, shell commands) passes through governance before execution.
@@ -70,17 +75,55 @@ Every tool call (file reads, writes, shell commands) passes through governance b
 
 ## What Is ShellForge?
 
-ShellForge is a **governed agent runtime** — not an agent framework, not an orchestration layer, not a prompt wrapper.
+ShellForge is a **governed AI coding CLI and agent runtime** — like Claude Code or Cursor, but with local models and policy enforcement built in.
 
-It sits between any agent driver and the real world. The agent decides what it wants to do. ShellForge decides whether it's allowed.
+Two modes:
+
+1. **Interactive REPL** (`shellforge chat`) — pair-program with a local or cloud model. Persistent conversation history, shell escapes, color output.
+2. **Autonomous agents** (`shellforge agent`, `shellforge ralph`) — one-shot tasks or multi-task loops with automatic validation and commit.
+
+Both modes share the same governance layer. Every tool call passes through [AgentGuard](https://github.com/AgentGuardHQ/agentguard) policy enforcement before execution.
 
 ```
-Agent Driver (Goose, Claude Code, Copilot CLI)
-  → ShellForge Governance (allow / deny / correct)
-    → Your Environment (files, shell, git)
+You (chat) or Octi Pulpo (dispatch)
+  → ShellForge Agent Loop (tool calling, drift detection)
+    → AgentGuard Governance (allow / deny / correct)
+      → Your Environment (files, shell, git)
 ```
 
-**The core insight:** ShellForge's value is governance, not the agent loop. [Goose](https://block.github.io/goose) handles local agent execution. [Dagu](https://github.com/dagu-org/dagu) handles workflow orchestration. ShellForge wraps them all with [AgentGuard](https://github.com/AgentGuardHQ/agentguard) policy enforcement on every tool call.
+---
+
+## Interactive REPL (`shellforge chat`)
+
+Pair-programming mode. Persistent conversation history across prompts — the model remembers what you discussed.
+
+```bash
+shellforge chat                          # local model via Ollama (default)
+shellforge chat --provider anthropic     # Anthropic API (Haiku/Sonnet/Opus)
+shellforge chat --model qwen3:14b        # pick a specific model
+```
+
+Features:
+- **Color output** — green prompt, red errors, yellow governance denials
+- **Shell escapes** — `!git status` runs a command without leaving the session
+- **Ctrl+C** — interrupts the current agent run without killing the session
+- **Governance** — every tool call checked against `agentguard.yaml`, same as autonomous mode
+
+---
+
+## Ralph Loop (`shellforge ralph`)
+
+Stateless-iterative multi-task execution. Each task gets a fresh context window — no accumulated confusion across tasks.
+
+```bash
+shellforge ralph tasks.json                    # run tasks from a JSON file
+shellforge ralph --validate "go test ./..."    # validate after each task
+shellforge ralph --dry-run                     # preview without executing
+```
+
+The loop: **PICK** a task → **IMPLEMENT** it → **VALIDATE** (run tests) → **COMMIT** on success → **RESET** context → next task.
+
+Tasks come from a JSON file or Octi Pulpo MCP dispatch. A task that fails validation is not committed and the loop moves on to the next one, so no broken code lands.
 
 ---
 
@@ -112,8 +155,14 @@ shellforge status
 
 | Command | Description |
 |---------|-------------|
-| `shellforge agent "prompt"` | Run a governed agent (Ollama, default) |
-| `shellforge agent --provider anthropic "prompt"` | Run via Anthropic API (Haiku/Sonnet/Opus, prompt caching) |
+| `shellforge chat` | Interactive REPL — pair-program with a local or cloud model |
+| `shellforge chat --provider anthropic` | REPL via Anthropic API (Haiku/Sonnet/Opus) |
+| `shellforge chat --model qwen3:14b` | REPL with a specific Ollama model |
+| `shellforge ralph tasks.json` | Multi-task loop — stateless-iterative execution |
+| `shellforge ralph --validate "go test ./..."` | Ralph Loop with post-task validation |
+| `shellforge ralph --dry-run` | Preview tasks without executing |
+| `shellforge agent "prompt"` | One-shot governed agent (Ollama, default) |
+| `shellforge agent --provider anthropic "prompt"` | One-shot via Anthropic API (prompt caching) |
 | `shellforge agent --thinking-budget 8000 "prompt"` | Enable extended thinking (Sonnet/Opus) |
 | `shellforge run <driver> "prompt"` | Run a governed CLI driver (goose, claude, copilot, codex, gemini) |
 | `shellforge setup` | Install Ollama, create governance config, verify stack |
@@ -125,6 +174,23 @@ shellforge status
 
 ---
 
+## Built-in Tools
+
+The agent loop (used by `chat`, `agent`, and `ralph`) has 8 built-in tools, all governed:
+
+| Tool | What It Does |
+|------|-------------|
+| `read_file` | Read file contents |
+| `write_file` | Write a complete file |
+| `edit_file` | Targeted find-and-replace (like Claude Code's Edit tool) |
+| `glob` | Pattern-based file discovery with recursive `**` support |
+| `grep` | Regex content search with `file:line` output |
+| `run_shell` | Execute shell commands (via RTK for token compression) |
+| `list_directory` | List directory contents |
+| `search_files` | Search files by name pattern |
+
+---
+
 ## Multi-Driver Governance
 
 ShellForge governs any CLI agent driver via AgentGuard hooks. Each driver keeps its own model and agent loop — ShellForge ensures governance is active and spawns the driver as a subprocess.
@@ -151,13 +217,20 @@ See `dags/multi-driver-swarm.yaml` and `dags/workspace-swarm.yaml` for examples.
 
 ```
 ┌───────────────────────────────────────────────────┐
+│  Entry Points                                      │
+│  chat (REPL) · agent (one-shot) · ralph (multi)   │
+│  run <driver> · serve (daemon)                     │
+└────────────────────┬──────────────────────────────┘
+                     │ prompt / task
+┌────────────────────▼──────────────────────────────┐
 │  Octi Pulpo (Coordination)                         │
 │  Budget-aware dispatch · Memory · Model cascading  │
 └────────────────────┬──────────────────────────────┘
                      │ task
 ┌────────────────────▼──────────────────────────────┐
 │  ShellForge Agent Loop                             │
 │  LLM provider · Tool calling · Drift detection     │
+│  Sub-agent orchestrator (spawn sync/async)         │
 │  Anthropic API or Ollama                           │
 └────────────────────┬──────────────────────────────┘
                      │ tool call
@@ -171,6 +244,7 @@ See `dags/multi-driver-swarm.yaml` and `dags/workspace-swarm.yaml` for examples.
 ┌────────────────────▼──────────────────────────────┐
 │  Your Environment                                  │
 │  Files · Shell (RTK) · Git · Network               │
+│  8 tools: read/write/edit/glob/grep/shell/ls/find  │
 │  Sandboxed by OpenShell                            │
 └───────────────────────────────────────────────────┘
 ```

diff --git a/docs/architecture.md b/docs/architecture.md
@@ -4,6 +4,41 @@
 
 ShellForge is a single Go binary (~7.5MB) that provides governed AI agent execution. Its core value is **governance** — every agent driver, whether a CLI tool, browser session, or local model, runs through AgentGuard policy enforcement on every action.
 
+## Entry Points
+
+ShellForge provides multiple entry points, all sharing the same agent loop and governance layer:
+
+| Entry Point | Mode | Context |
+|-------------|------|---------|
+| `shellforge chat` | Interactive REPL | Persistent — conversation history across prompts |
+| `shellforge agent "prompt"` | One-shot | Single task, single context window |
+| `shellforge ralph tasks.json` | Multi-task loop | Stateless-iterative — fresh context per task |
+| `shellforge run <driver>` | CLI driver | Governed subprocess (Goose, Claude Code, etc.) |
+| `shellforge serve agents.yaml` | Daemon | 24/7 swarm with memory-aware scheduling |
+
+### Interactive REPL (`chat`)
+
+Pair-programming mode. The user and model share a persistent conversation: the model remembers previous prompts and results within the session. Output is color-coded (green prompt, red errors, yellow governance denials), `!command` runs a shell command without leaving the session, and Ctrl+C interrupts the current agent run without killing the session.
+
+### Ralph Loop (`ralph`)
+
+Stateless-iterative execution for multi-task workloads. Each task gets a fresh context window to prevent accumulated confusion:
+
+```
+PICK task from queue → IMPLEMENT → VALIDATE (run tests) → COMMIT on success → RESET context → next
+```
+
+Tasks come from a JSON file or Octi Pulpo MCP dispatch. `--validate` runs a command (e.g., `go test ./...`) after each task. `--dry-run` previews without executing.
+
+### Sub-Agent Orchestrator
+
+The agent loop can spawn sub-agents for parallel work:
+
+- **SpawnSync** — block and wait for a sub-agent to complete
+- **SpawnAsync** — fire multiple sub-agents, collect results
+- Concurrency controlled via semaphore
+- Sub-agent results compressed to ~750 tokens before returning to parent
+
 ## Execution Model
 
 ShellForge supports three classes of agent driver, all governed uniformly:
@@ -110,7 +145,6 @@ Octi Pulpo routes tasks to the cheapest capable driver:
 | **Optimize** | [RTK](https://github.com/rtk-ai/rtk) | Token compression — 70-90% reduction on shell output |
 | **Execute** | [Goose](https://block.github.io/goose) / [OpenClaw](https://github.com/openclaw/openclaw) | Agent execution + browser automation |
 | **Coordinate** | [Octi Pulpo](https://github.com/AgentGuardHQ/octi-pulpo) | Budget-aware dispatch, episodic memory, model cascading |
-| **Coordinate** | [Octi Pulpo](https://github.com/AgentGuardHQ/octi-pulpo) | Swarm coordination via MCP |
 | **Govern** | [AgentGuard](https://github.com/AgentGuardHQ/agentguard) | Policy enforcement on every action |
 | **Sandbox** | [OpenShell](https://github.com/NVIDIA/OpenShell) | Kernel-level isolation (Docker on macOS) |
 | **Scan** | [DefenseClaw](https://github.com/cisco-ai-defense/defenseclaw) | Supply chain scanner — AI Bill of Materials |
@@ -120,6 +154,8 @@ Octi Pulpo routes tasks to the cheapest capable driver:
 ```
 cmd/shellforge/
 ├── main.go         # CLI entry point (cobra-style subcommands)
+├── chat.go         # Interactive REPL (`shellforge chat`)
+├── ralph.go        # Multi-task loop (`shellforge ralph`)
 └── status.go       # Ecosystem health check
 
 internal/
@@ -128,10 +164,13 @@ internal/
 │   └── anthropic.go# Anthropic API adapter (stdlib HTTP, prompt caching, tool_use)
 ├── agent/          # Agentic loop
 │   ├── loop.go     # runProviderLoop (Anthropic) + runOllamaLoop, drift detection wiring
-│   └── drift.go    # Drift detector — self-score every 5 calls, steer/kill on low scores
+│   ├── drift.go    # Drift detector — self-score every 5 calls, steer/kill on low scores
+│   └── repl.go     # Interactive REPL — persistent history, color output, shell escapes
+├── ralph/          # Ralph Loop — stateless-iterative multi-task execution
+│   └── loop.go     # PICK → IMPLEMENT → VALIDATE → COMMIT → RESET cycle
 ├── governance/     # agentguard.yaml parser + policy engine
 ├── ollama/         # Ollama HTTP client (chat, generate)
-├── tools/          # 5 tool implementations + RTK wrapper
+├── tools/          # 8 tool implementations (read/write/edit/glob/grep/shell/ls/find) + RTK wrapper
 ├── engine/         # Pluggable engine interface (Goose, OpenClaw, OpenCode)
 ├── logger/         # Structured JSON logging
 ├── scheduler/      # Memory-aware scheduling + cron
@@ -146,17 +185,19 @@ internal/
 
 ShellForge uses a pluggable engine system:
 
-1. **Goose** (preferred local driver) — subprocess, native Ollama support, SHELL wrapped via `govern-shell.sh`
-2. **OpenClaw** (browser + integrations) — browser automation, web app access, 100+ skills
-3. **NemoClaw** (enterprise) — OpenClaw + NVIDIA OpenShell sandbox + Nemotron local models
-4. **CLI Drivers** (cloud coding) — Claude Code, Codex, Copilot CLI, Gemini CLI
-5. **Native** (fallback) — built-in multi-turn loop with Ollama + tool calling
+1. **Native REPL** (`shellforge chat`) — interactive pair-programming, persistent history, 8 built-in tools
+2. **Native Agent** (`shellforge agent`) — one-shot autonomous execution with the same tool set
+3. **Ralph Loop** (`shellforge ralph`) — stateless-iterative multi-task with validation and auto-commit
+4. **Goose** (local driver) — subprocess, native Ollama support, SHELL wrapped via `govern-shell.sh`
+5. **OpenClaw** (browser + integrations) — browser automation, web app access, 100+ skills
+6. **NemoClaw** (enterprise) — OpenClaw + NVIDIA OpenShell sandbox + Nemotron local models
+7. **CLI Drivers** (cloud coding) — Claude Code, Codex, Copilot CLI, Gemini CLI
 
 ## Governance Flow
 
 ```
-User Request → Engine (Goose/OpenClaw/CLI/Native)
-  → Tool Call → Governance Check (agentguard.yaml)
+User Request → Entry Point (chat/agent/ralph/run/serve)
+  → Agent Loop → Tool Call → Governance Check (agentguard.yaml)
     → ALLOW → Execute Tool → Return Result
     → DENY  → Log Violation → Correction Feedback → Retry
 ```

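Illustrative sketch, not part of this patch: the Sub-Agent Orchestrator section in `docs/architecture.md` mentions SpawnSync, SpawnAsync, and semaphore-based concurrency control. One common way to express that in Go is a buffered channel used as a counting semaphore. Only the names `SpawnSync` and `SpawnAsync` come from the docs; the `Orchestrator` type, the constructor, and every signature below are assumptions, and the compression of results to ~750 tokens is left out.

```go
package main

import (
	"fmt"
	"sync"
)

// Orchestrator limits how many sub-agents run at once (hypothetical type).
type Orchestrator struct {
	sem chan struct{} // buffered channel acting as a counting semaphore
}

// NewOrchestrator expects maxConcurrent >= 1; a zero-capacity channel
// would make every SpawnSync call block forever.
func NewOrchestrator(maxConcurrent int) *Orchestrator {
	return &Orchestrator{sem: make(chan struct{}, maxConcurrent)}
}

// SpawnSync runs one sub-agent and blocks until it finishes.
func (o *Orchestrator) SpawnSync(run func() string) string {
	o.sem <- struct{}{}        // acquire a slot
	defer func() { <-o.sem }() // release the slot when done
	return run()
}

// SpawnAsync starts every sub-agent and returns results in input order.
func (o *Orchestrator) SpawnAsync(runs []func() string) []string {
	results := make([]string, len(runs))
	var wg sync.WaitGroup
	for i, run := range runs {
		wg.Add(1)
		go func(i int, run func() string) {
			defer wg.Done()
			results[i] = o.SpawnSync(run) // each goroutine writes only its own index
		}(i, run)
	}
	wg.Wait()
	return results
}

func main() {
	o := NewOrchestrator(2) // at most two sub-agents at a time
	runs := []func() string{
		func() string { return "lint: ok" },
		func() string { return "tests: ok" },
		func() string { return "docs: ok" },
	}
	fmt.Println(o.SpawnAsync(runs))
}
```

Writing each result to its own slice index avoids a mutex and keeps the output order stable regardless of which sub-agent finishes first.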
diff --git a/docs/roadmap.md b/docs/roadmap.md
@@ -40,7 +40,7 @@
 - [x] Fixed catch-all deny bug (bounded-execution policy was denying everything)
 - [x] Dagu DAG templates (sdlc-swarm, studio-swarm, workspace-swarm, multi-driver)
 
-### v0.7.0 — Anthropic API Provider ← CURRENT
+### v0.7.0 — Anthropic API Provider
 - [x] LLM provider interface (`llm.Provider`) — pluggable Ollama vs Anthropic backends
 - [x] Anthropic API adapter — stdlib HTTP, structured `tool_use` blocks, multi-turn history
 - [x] Prompt caching — `cache_control: ephemeral` on system + tools, ~90% savings on cached tokens
@@ -49,6 +49,21 @@
 - [x] Drift detection — self-score every 5 tool calls, steer below 7, kill below 5 twice
 - [x] RTK token compression wired into `runShellWithRTK()` (70-90% savings on shell output)
 
+### v0.8.0 — UMAAL (Interactive REPL + Ralph Loop + Enhanced Tools)
+- [x] Interactive REPL (`shellforge chat`) — pair-programming with persistent conversation history
+- [x] Color output (green prompt, red errors, yellow governance denials)
+- [x] Shell escapes (`!command`) and Ctrl+C interrupt without session kill
+- [x] Ollama (local) and Anthropic API provider support in REPL
+- [x] Ralph Loop (`shellforge ralph`) — stateless-iterative multi-task execution
+- [x] PICK → IMPLEMENT → VALIDATE → COMMIT → RESET cycle
+- [x] Task input from JSON file or Octi Pulpo MCP dispatch
+- [x] `--validate` flag for post-task test commands, `--dry-run` for preview
+- [x] Sub-agent orchestrator — SpawnSync (block), SpawnAsync (fire and collect)
+- [x] Concurrency control via semaphore, context compression (~750 tokens)
+- [x] `edit_file` tool — targeted find-and-replace
+- [x] `glob` tool — pattern-based file discovery with recursive `**` support
+- [x] `grep` tool — regex content search with `file:line` output
+
 ---
 
 ## In Progress
@@ -142,17 +157,21 @@ Bugs identified during v0.6.x development. Fix before v1.0.
 
 ---
 
-## Stack (as of v0.6.1)
+## Stack (as of v0.8.0)
 
 | Component | Role | Status |
 |---|---|---|
+| `shellforge chat` | Interactive REPL | Working |
+| `shellforge ralph` | Multi-task loop | Working |
+| `shellforge agent` | One-shot agent | Working |
 | Goose (Block) | Local model driver | Working |
 | Claude Code | API driver (Linux) | Working (via hooks) |
 | Copilot CLI | API driver (Linux) | Working (via hooks) |
 | Codex CLI | API driver (Linux) | Coming soon |
 | Gemini CLI | API driver (Linux) | Coming soon |
 | Ollama | Local inference | Working |
+| Anthropic API | Cloud inference | Working (prompt caching) |
 | AgentGuard | Governance kernel | Working (YAML eval + Go kernel) |
-| Dagu | Orchestration | Working (DAGs + web UI) |
+| Octi Pulpo | Swarm coordination | Working (MCP) |
 | RTK | Token compression | Optional |
 | Docker | Sandbox | Optional |
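Illustrative sketch, not part of this patch: the v0.7.0 roadmap entry describes drift detection as "self-score every 5 tool calls, steer below 7, kill below 5 twice". The Go below is one reading of that rule. The `DriftDetector` type, the `Action` values, and the `Observe` method are hypothetical, and it assumes "twice" means two scores below 5 in total over the run rather than two in a row; `internal/agent/drift.go` may interpret it differently.

```go
package main

import "fmt"

// Action is what the loop should do after a tool call (hypothetical type).
type Action int

const (
	Continue Action = iota
	Steer
	Kill
)

// DriftDetector tracks tool calls and low self-scores.
type DriftDetector struct {
	calls     int
	lowScores int
}

// Observe is called once per tool call. The score function is only
// consulted on every fifth call.
func (d *DriftDetector) Observe(score func() int) Action {
	d.calls++
	if d.calls%5 != 0 {
		return Continue
	}
	s := score()
	if s < 5 {
		d.lowScores++
		if d.lowScores >= 2 { // assumption: two low scores in total
			return Kill
		}
	}
	if s < 7 {
		return Steer
	}
	return Continue
}

func main() {
	d := &DriftDetector{}
	for i := 1; i <= 10; i++ {
		a := d.Observe(func() int { return 4 }) // constant low score for the demo
		if a != Continue {
			fmt.Printf("call %d: action %d\n", i, a)
		}
	}
}
```

With a constant score of 4, the first check (call 5) steers and the second (call 10) kills; a score of 7 or higher never triggers either action.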