Aegis is a guarded, skill-discovering autonomous agent for the UpMoltWork marketplace. It monitors tasks, bids intelligently, executes deliverables with security guardrails, and extends its own capabilities by discovering new skills from online catalogs.
- Dynamic Skill Discovery: Autonomously searches online catalogs, evaluates relevance, then downloads, verifies, and sandboxes new skills under a 3-gate trust model
- 3-Gate Security: Prompt Guard (<10ms screening) + Llama Guard 3 (deep taxonomy) + sandboxed execution for third-party skills
- Credential Isolation: API keys stored in env vars, accessed only by wallet client — never exposed to LLM context
- Full Observability: OpenTelemetry tracing for all LLM calls, phase transitions, skill activations — self-hosted Phoenix UI
- Retro Terminal UI: Textual-based TUI with 4 regions (tasks, errors, status, commands) and slash command interface
Orchestrator Engine (State Machine)
├── PHASE_DISCOVERY → bidding-strategy
├── PHASE_RESEARCH → research
├── PHASE_DELIVERY → code-delivery
├── PHASE_VALIDATION → validation
└── PHASE_SUBMISSION → wallet-management
Supporting Services:
├── Guardrails: Prompt Guard + Llama Guard 3 (direct imports)
├── Wallet: UpMoltWork API client with tenacity retries
├── Sandbox: LXC containers for code execution
├── Skills: 5 built-in + dynamic cache + 3-gate vetting
└── State: SQLite (tasks, skills, review queue, command log)
# 1. Install dependencies
uv sync
# 2. Configure credentials
cp .env.example .env
# Edit .env with your API keys
# 3. Run the agent
uv run python -m src.cli.ui

See .env.example for all available options:
| Variable | Required | Description |
|---|---|---|
| `UPMOLTWORK_API_KEY` | Yes | UpMoltWork marketplace API key |
| `OPENROUTER_API_KEY` | Yes | OpenRouter API key for LLM access |
| `IMAP_HOST` | Yes | IMAP server for command polling |
| `IMAP_USER` | Yes | IMAP username/email |
| `IMAP_PASS` | Yes | IMAP password/app password |
| `VALIDATION_CONFIDENCE_THRESHOLD` | No | Min quality confidence (default: 0.8) |
| `MAX_VALIDATION_ITERATIONS` | No | Max validation retries (default: 3) |
| `SPECIALIZATIONS` | No | Comma-separated task categories |
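
As a rough sketch of how these variables might be read, the loader below checks the required names and applies the documented defaults. It is an assumption about the shape of the config module, not its actual code: `Settings`, `REQUIRED`, and `load_settings` are hypothetical names. Secrets are deliberately checked for presence but not copied into the returned object, in keeping with the credential-isolation goal.

```python
import os
from dataclasses import dataclass

# Variables marked "Yes" in the Required column of the table above.
REQUIRED = (
    "UPMOLTWORK_API_KEY",
    "OPENROUTER_API_KEY",
    "IMAP_HOST",
    "IMAP_USER",
    "IMAP_PASS",
)


@dataclass(frozen=True)
class Settings:
    """Non-secret tuning options only (hypothetical container)."""

    validation_confidence_threshold: float = 0.8
    max_validation_iterations: int = 3
    specializations: tuple = ()


def load_settings(env=None) -> Settings:
    """Validate required variables and parse the optional ones."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("Missing required variables: " + ", ".join(missing))
    # Split the comma-separated list and drop empty entries.
    raw = env.get("SPECIALIZATIONS", "")
    specs = tuple(s.strip() for s in raw.split(",") if s.strip())
    return Settings(
        validation_confidence_threshold=float(
            env.get("VALIDATION_CONFIDENCE_THRESHOLD", "0.8")
        ),
        max_validation_iterations=int(env.get("MAX_VALIDATION_ITERATIONS", "3")),
        specializations=specs,
    )
```

Passing a plain dictionary as `env` makes the loader easy to exercise in tests without touching the real environment.
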
| Command | Purpose |
|---|---|
| `/status` | Overall system status, current phase |
| `/skills` | List all available skills |
| `/tasks` | Active tasks with status |
| `/review` | Halted tasks awaiting review |
| `/balance` | Points and USDC balance |
| `/trace <id>` | Phoenix trace deep link |
| `/halt <id>` | Halt a running task |
| `/config` | System configuration |
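
One way to parse these slash commands is a lookup table that records which commands take an `<id>` argument. The sketch below is illustrative only: `COMMANDS` and `parse_command` are hypothetical names, and rejecting extra arguments is an assumed policy rather than documented behavior.

```python
from typing import Optional, Tuple

# Command name -> whether it requires an <id> argument (per the table above).
COMMANDS = {
    "/status": False,
    "/skills": False,
    "/tasks": False,
    "/review": False,
    "/balance": False,
    "/trace": True,
    "/halt": True,
    "/config": False,
}


def parse_command(line: str) -> Tuple[str, Optional[str]]:
    """Split a raw input line into (command, id), validating the argument count."""
    parts = line.strip().split()
    if not parts or parts[0] not in COMMANDS:
        raise ValueError(f"unknown command: {line!r}")
    name, args = parts[0], parts[1:]
    needs_id = COMMANDS[name]
    if needs_id and len(args) != 1:
        raise ValueError(f"{name} expects exactly one <id> argument")
    if not needs_id and args:
        raise ValueError(f"{name} takes no arguments")
    return name, (args[0] if needs_id else None)
```

For example, `parse_command("/halt 42")` yields the command name together with the task id as a string, leaving it to the caller to look the task up.
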
sdd-swe/
├── src/
│ ├── cli/ # Terminal UI (Textual)
│ ├── orchestrator/ # State machine engine
│ ├── skills/ # Skill management
│ ├── guardrails/ # Security pipeline
│ ├── wallet/ # API client
│ ├── execution/ # LXC sandbox
│ ├── alerts/ # Email polling
│ ├── config/ # .env loader
│ └── db/ # SQLite store
├── skills/ # Built-in SKILL.md files
├── docs/ # Scope, PRD, spec, checklist
└── tests/ # Unit + integration tests
- Discovery: Orchestrator scans `/tasks`, bidding strategy evaluates fit, places bids
- Research: When bid won, research skill investigates requirements
- Delivery: Code delivery skill generates solution, tests in sandbox
- Validation: LLM-as-judge checks acceptance criteria + architectural quality
- Submission: Wallet client submits result, earns points, returns to discovery
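
The delivery and validation steps above, combined with the `VALIDATION_CONFIDENCE_THRESHOLD` and `MAX_VALIDATION_ITERATIONS` settings, suggest a bounded retry loop. The following is a sketch under stated assumptions: `run_validation`, `deliver`, and `judge` are hypothetical names, the `>=` comparison is a guess, and routing exhausted tasks to the review queue is inferred from the `/review` command rather than confirmed by the source.

```python
from typing import Callable, Optional, Tuple


def run_validation(
    deliver: Callable[[int], str],
    judge: Callable[[str], float],
    threshold: float = 0.8,
    max_iterations: int = 3,
) -> Tuple[str, Optional[str]]:
    """Regenerate the deliverable until the judge's confidence meets the threshold.

    Returns ("submit", artifact) on success, or ("review", None) once the
    iteration budget is spent (assumed hand-off to human review).
    """
    for attempt in range(1, max_iterations + 1):
        artifact = deliver(attempt)  # produce a deliverable for this attempt
        if judge(artifact) >= threshold:  # LLM-as-judge confidence score
            return "submit", artifact
    return "review", None
```

Capping the loop keeps a task that never passes validation from consuming LLM calls indefinitely.
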
- 3-Gate Skill Verification: Checksum → Heuristic Scan → Sandbox → Human Approval
- Guardrails: All LLM inputs/outputs filtered through Prompt Guard + Llama Guard 3
- Credential Isolation: API keys in env vars, never exposed to LLM
- Sandboxed Execution: All code runs in ephemeral LXC containers (network disabled, read-only FS)
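
The verification chain listed above can be sketched as an ordered list of checks that stops at the first failure. Only the checksum gate is shown concretely, using SHA-256 as an assumed digest algorithm. The function names are hypothetical, and the heuristic scan, sandbox run, and human approval would be supplied as further callables in the same list.

```python
import hashlib
import hmac
from typing import Callable, Iterable, Optional, Tuple

# A gate is a (name, check) pair; check returns True when the payload passes.
Gate = Tuple[str, Callable[[bytes], bool]]


def checksum_ok(payload: bytes, expected_sha256: str) -> bool:
    """Compare the payload's SHA-256 hex digest with the published one."""
    actual = hashlib.sha256(payload).hexdigest()
    # compare_digest avoids leaking the mismatch position through timing.
    return hmac.compare_digest(actual, expected_sha256.lower())


def first_failed_gate(payload: bytes, gates: Iterable[Gate]) -> Optional[str]:
    """Run gates in order; return the name of the first that fails, else None."""
    for name, check in gates:
        if not check(payload):
            return name
    return None
```

Running the cheapest check first means a tampered download is rejected before any scanning or sandbox time is spent on it.
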
- Language: Python 3.12+
- Terminal UI: Textual 0.80+
- Package Manager: uv
- State: SQLite (aiosqlite)
- LLM: OpenRouter (provider-agnostic interface)
- Retries: tenacity (exponential backoff + jitter)
- Tracing: OpenTelemetry + Phoenix (self-hosted)
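
To illustrate what "exponential backoff + jitter" means in the Retries entry, here is a standard-library sketch of the delay schedule. It is not the project's tenacity configuration: `backoff_delays` and its `base` and `cap` parameters are hypothetical, and the "full jitter" variant (a uniform draw between zero and the exponential bound) is one common choice among several.

```python
import random


def backoff_delays(attempts, base=1.0, cap=30.0, rng=None):
    """Return one randomized delay (in seconds) per retry attempt.

    The upper bound doubles each attempt (base * 2**n) until it reaches cap;
    the actual delay is drawn uniformly from [0, bound].
    """
    rng = rng or random.Random()
    return [rng.uniform(0, min(cap, base * 2 ** n)) for n in range(attempts)]
```

The random draw spreads out retries from clients that failed at the same moment, so they do not hit the API again in lockstep.
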
MIT
Built with Claude Code for the UpMoltWork hackathon. Architecture designed with Spec-Driven Development methodology.