Maverick

AI-powered development workflow orchestration. Point your AI agents at a flight plan and let them fly — Maverick handles implementation, code review, spec compliance, human escalation, and commit curation autonomously.

What is Maverick?

Maverick is a Python CLI that orchestrates the complete development lifecycle on top of airframe, a vendor-neutral agent runtime protocol. From a PRD, it generates a flight plan, decomposes it into work units, implements them with AI agents, validates against project conventions, reviews code, escalates to humans when needed, and curates clean commit history — all driven by a bead-based work graph where humans and agents create work for each other.

Core idea: Everything is a bead. A bead is a unit of work managed by the bd CLI tool. Implementation beads, review findings, human escalations, and correction tasks all live in the same dependency graph.
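
To make the shared dependency graph concrete, here is a minimal in-memory sketch. The `Bead` class, its fields, and `ready_beads` are hypothetical names for illustration only; they are not the bd data model or CLI, which this README does not describe in detail.

```python
from dataclasses import dataclass, field


@dataclass
class Bead:
    """Hypothetical in-memory stand-in for a bead; not the bd data model."""
    bead_id: str
    kind: str                      # e.g. "task", "review", "human", "correction"
    depends_on: set[str] = field(default_factory=set)
    closed: bool = False


def ready_beads(graph: dict[str, Bead]) -> list[str]:
    """Return IDs of open beads whose dependencies are all closed."""
    return sorted(
        bead.bead_id
        for bead in graph.values()
        if not bead.closed
        and all(graph[dep].closed for dep in bead.depends_on)
    )


graph = {
    "impl-1": Bead("impl-1", "task"),
    "impl-2": Bead("impl-2", "task", depends_on={"impl-1"}),
    "human-1": Bead("human-1", "human", depends_on={"impl-1"}),
}
print(ready_beads(graph))          # only impl-1 is unblocked
graph["impl-1"].closed = True
print(ready_beads(graph))          # impl-2 and human-1 become ready
```

Because agent tasks and human escalations are the same kind of node, one readiness rule covers both.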

Key Features

  • Full PRD-to-code pipeline: plan generates a flight plan, refuel decomposes into work units with acceptance criteria, fly implements and validates, land curates commits and merges
  • Deterministic spec compliance — Grep-based convention checker catches unwrap() in runtime code, blocking std::process::Command in async functions, and other project-specific anti-patterns before review
  • Enriched code review — Reviewer agents receive the full work unit spec (acceptance criteria, file scope), pre-flight briefing (contrarian findings, risk assessment), and runway historical context
  • Post-flight aggregate review — After all beads complete, a cross-bead review checks architectural coherence and dead code across the full diff
  • Human-in-the-loop via assumption beads — When agents exhaust fix attempts, they create human-assigned review beads with full escalation context. Humans review via maverick review (approve/reject/defer) and correction beads flow back to agents. Works from a phone terminal.
  • Continuous fly mode: maverick fly --watch polls for new beads, enabling concurrent plan/refuel in another terminal while fly continuously drains work
  • Cross-epic dependency wiring — New epics automatically depend on existing open epics, serializing execution while allowing tasks within each epic to parallelize
  • Multi-provider routing — Each of the five canonical roles (implement, review, briefing, decompose, generate) binds to one airframe-supported provider + model in maverick.yaml. Airframe ships adapters for claude, github-copilot, opencode, opencode-go, opencode-zen, openrouter, and bedrock — each speaks its native vendor SDK directly (OAuth for Claude Max, copilot CLI, API keys elsewhere). Misconfigured bindings fail at squadron-open with a clear UnsupportedBindingError, not mid-workflow.
  • Runway knowledge store — Episodic records of bead outcomes, review findings, and fix attempts build project-specific context. Agents progressively discover this context via the .maverick/runway/ directory
  • Apache Burr workflows — Each command (plan, refuel, fly) is an Apache Burr Application: a state machine of @action-decorated async functions over a per-workflow Squadron that owns the airframe runtimes. Single-process, all coroutines on one loop. Transitions encode the control flow (per-bead pipeline, fix loops, tier escalation, watch-mode polling, post-loop aggregate review); progress events stream out via a hook.
  • Jujutsu (jj) VCS — Write operations use jj for snapshot/rollback safety. Curation skips immutable commits gracefully.
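
The deterministic spec compliance feature listed above is described as grep-based. A rough sketch of that idea follows, assuming a simple per-line regex scan. The `RULES` table, `Finding`, and `check_source` are hypothetical names; the real checker presumably also distinguishes runtime code from tests and async from sync functions, which this sketch does not attempt.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    line_no: int
    rule: str
    text: str


# Hypothetical rule table: rule name -> pattern matched against each line.
RULES = {
    "no-unwrap": re.compile(r"\.unwrap\(\)"),
    "no-blocking-command": re.compile(r"std::process::Command"),
}


def check_source(source: str) -> list[Finding]:
    """Scan source line by line and report every rule match."""
    findings = []
    for line_no, line in enumerate(source.splitlines(), start=1):
        if line.lstrip().startswith("//"):
            continue  # skip line comments so documentation does not trip rules
        for rule, pattern in RULES.items():
            if pattern.search(line):
                findings.append(Finding(line_no, rule, line.strip()))
    return findings


rust = "fn main() {\n    let cfg = load().unwrap();\n}\n"
for finding in check_source(rust):
    print(finding.line_no, finding.rule, finding.text)
```

A scan like this needs no model call, so it can run on every bead before the review step.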

Quick Start

Prerequisites

  • Python 3.11+
  • uv — Fast Python package manager
  • GitHub CLI (gh)
  • Jujutsu (jj)
  • bd for bead/work-item management
  • Authentication for at least one airframe-supported provider:
    • claude — Claude Max subscription OAuth (~/.claude/.credentials.json), or ANTHROPIC_API_KEY
    • github-copilot — GitHub Copilot subscription via the gh copilot CLI
    • opencode-go / opencode-zen: opencode auth login <provider>
    • openrouter: OPENROUTER_API_KEY
    • bedrock — AWS credentials + AWS_REGION
  • Git repository with remote origin

Installation

uv tool install maverick-cli

Or from the repository:

uv tool install git+https://github.com/get2knowio/maverick.git

The Pipeline

# 1. Initialize the project
maverick init

# 2. Seed the runway with codebase knowledge
maverick runway seed

# 3. Generate a flight plan from a PRD
maverick plan generate my-feature --from-prd spec.md

# 4. Decompose into work units and create beads
maverick refuel my-feature

# 5. Implement beads
maverick fly --epic <epic-id> --auto-commit

# 6. Review any human-escalated beads
maverick brief --human
maverick review <bead-id>

# 7. Curate history and merge
maverick land --yes

Continuous Mode

Run fly as a long-lived daemon while adding work from another terminal:

# Terminal 1: fly drains beads continuously
maverick fly --watch --auto-commit

# Terminal 2: keep adding work
maverick plan generate feature-2 --from-prd feature-2.md
maverick refuel feature-2

# Terminal 3: review escalations
maverick brief --human
maverick review <bead-id> --reject "use tokio::process::Command instead"

Commands

maverick plan generate — Flight Plan from PRD

Runs a Pre-Flight Briefing Room — four parallel AI agents analyze the PRD, then a generator synthesizes a flight plan with success criteria and scope.

Agent             Role
Scopist           Defines scope boundaries
Codebase Analyst  Maps relevant modules and patterns
Criteria Writer   Drafts acceptance criteria
Contrarian        Identifies risks and blind spots

maverick plan generate my-feature --from-prd spec.md
maverick plan generate my-feature --from-prd spec.md --skip-briefing
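
The parallel briefing step can be pictured as a fan-out over coroutines. This is a sketch under assumptions: `run_agent` is a stub standing in for a real agent call, and the role names are derived from the table above rather than from Maverick's code.

```python
import asyncio


async def run_agent(role: str, prd: str) -> dict[str, str]:
    """Stand-in for a real agent call; returns a canned finding."""
    await asyncio.sleep(0)  # yield to the event loop, as a network call would
    return {"role": role, "finding": f"{role} notes on {len(prd)} chars of PRD"}


async def briefing_room(prd: str) -> list[dict[str, str]]:
    roles = ["scopist", "codebase_analyst", "criteria_writer", "contrarian"]
    # gather() runs the four coroutines concurrently and preserves input order.
    return await asyncio.gather(*(run_agent(role, prd) for role in roles))


briefs = asyncio.run(briefing_room("Add OAuth login"))
for brief in briefs:
    print(brief["role"], "->", brief["finding"])
```

A generator step would then take the four briefs as input and synthesize the flight plan.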

maverick refuel — Decompose into Beads

Decomposes a flight plan into work units with acceptance criteria, file scope, and verification commands. Creates epic + task beads via bd. New epics automatically chain behind existing open epics.

maverick refuel my-feature
maverick refuel my-feature --skip-briefing
maverick refuel my-feature --dry-run
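
The epic chaining described for refuel amounts to adding a dependency edge from each new epic to every epic that is still open. A minimal sketch, with `add_epic` and `runnable_epics` as hypothetical helper names and plain dicts standing in for the bd graph:

```python
def add_epic(deps: dict[str, set[str]], new_epic: str, closed: set[str]) -> None:
    """Register new_epic so it depends on every epic that is still open."""
    deps[new_epic] = {epic for epic in deps if epic not in closed}


def runnable_epics(deps: dict[str, set[str]], closed: set[str]) -> list[str]:
    """Epics that are open and whose prerequisite epics are all closed."""
    return sorted(
        epic for epic, needs in deps.items()
        if epic not in closed and needs <= closed
    )


deps: dict[str, set[str]] = {}
closed: set[str] = set()
add_epic(deps, "epic-a", closed)     # nothing open yet, so no dependencies
add_epic(deps, "epic-b", closed)     # chained behind epic-a
print(runnable_epics(deps, closed))  # epic-a only
closed.add("epic-a")
print(runnable_epics(deps, closed))  # epic-b is now unblocked
```

Epics run one after another under this rule, while the tasks inside the currently runnable epic remain free to parallelize.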

maverick fly — Bead-Driven Development

Iterates over ready beads. For each bead:

Implement → Gate (fmt/lint/test) → AC Check → Spec Check → Review → Commit
     ↑                                              |
     └──── Fix (if rejected, up to 3 rounds) ───────┘
     
If fix attempts exhausted → escalate to human (assumption bead)
                          → commit optimistically → continue

After all beads complete, fly runs an aggregate cross-bead review, then prints a report with structured per-bead events and an ACTION REQUIRED notice for any beads that need human review.
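
The per-bead loop in the diagram can be sketched as a bounded retry. This is an illustration only: `process_bead` and its callable parameters are hypothetical, and the single `checks_pass` callable collapses the gate, AC check, spec check, and review into one step.

```python
from typing import Callable

MAX_FIX_ROUNDS = 3  # matches "up to 3 rounds" in the diagram above


def process_bead(
    implement: Callable[[], None],
    checks_pass: Callable[[], bool],
    fix: Callable[[], None],
) -> str:
    """Run one bead through implement -> check -> fix, escalating if needed."""
    implement()
    for _ in range(MAX_FIX_ROUNDS):
        if checks_pass():
            return "committed"
        fix()
    if checks_pass():
        return "committed"
    # Fix attempts exhausted: hand off to a human but keep the pipeline moving.
    return "escalated-and-committed"


outcomes = iter([False, False, True])  # fails twice, then passes
print(process_bead(lambda: None, lambda: next(outcomes), lambda: None))
```

Committing optimistically after escalation means later beads are not blocked while a human decides.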

Flag                  Default  Description
--epic <id>           (any)    Filter to beads under this epic
--max-beads <n>       30       Maximum beads to process
--watch               false    Poll for new beads when queue is empty
--watch-interval <s>  30       Seconds between polls
--auto-commit         false    Auto-commit uncommitted changes
--skip-review         false    Skip code review step
--dry-run             false    Preview mode
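
How --watch, --watch-interval, and --max-beads might interact is sketched below. The `drain` function is hypothetical, and whether Maverick applies the bead limit in watch mode is an assumption. `max_idle_polls` and the injectable `sleep` exist only so the demo terminates quickly.

```python
import time
from typing import Callable, Optional


def drain(
    fetch_ready: Callable[[], list[str]],
    process: Callable[[str], None],
    max_beads: int = 30,
    watch: bool = False,
    watch_interval: float = 30.0,
    max_idle_polls: Optional[int] = None,  # hypothetical; only for bounded demos
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Process ready beads; with watch=True, poll again when the queue is empty."""
    processed = 0
    idle_polls = 0
    while processed < max_beads:
        ready = fetch_ready()
        if ready:
            idle_polls = 0
            for bead_id in ready[: max_beads - processed]:
                process(bead_id)
                processed += 1
            continue
        if not watch:
            break  # one-shot mode stops as soon as the queue is empty
        idle_polls += 1
        if max_idle_polls is not None and idle_polls >= max_idle_polls:
            break
        sleep(watch_interval)
    return processed


batches = [["impl-1", "impl-2"], [], ["impl-3"], [], []]
count = drain(
    lambda: batches.pop(0) if batches else [],
    lambda bead_id: print("processing", bead_id),
    watch=True,
    max_idle_polls=2,
    sleep=lambda seconds: None,  # skip real waiting in the demo
)
print("processed", count)
```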

maverick land — Curate and Merge

AI curator reorganizes commits — squashes fix commits, strips bead IDs, writes conventional commit messages, reorders for logical flow. Skips immutable commits gracefully. Consolidates runway after merge.

maverick land --yes              # Curate + merge + cleanup
maverick land --dry-run          # Show plan only
maverick land --no-curate        # Skip curation, just merge
maverick land --heuristic-only   # Keyword-based curation (no agent)
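
A keyword-based pass like the one --heuristic-only suggests could look roughly like this. Everything here is assumed for illustration: the `[bd-12]` tag format for bead IDs, the prefix list in `FIX_PREFIX`, and the `curate` function itself. The sketch only groups commit messages and does not touch a repository.

```python
import re

# Assumption: bead IDs appear as a bracketed tag such as "[bd-12]".
BEAD_TAG = re.compile(r"\s*\[bd-[\w.-]+\]")
# Assumption: messages starting with these words mark follow-up fix commits.
FIX_PREFIX = re.compile(r"^(fix|fixup|wip)\b", re.IGNORECASE)


def curate(messages: list[str]) -> list[list[str]]:
    """Group commit messages: fix-style commits fold into the previous group."""
    groups: list[list[str]] = []
    for raw in messages:
        message = BEAD_TAG.sub("", raw).strip()
        if groups and FIX_PREFIX.match(message):
            groups[-1].append(message)   # squash into the preceding commit
        else:
            groups.append([message])
    return groups


history = [
    "feat: add parser [bd-12]",
    "fix lint [bd-12]",
    "docs: update readme",
]
for group in curate(history):
    print(group)
```

A prefix match this crude would also fold in a genuine "fix: ..." commit, which is one reason an agent-driven curator is the default.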

maverick review — Human Decision Capture

Lightweight review of human-assigned assumption beads. Displays escalation context and captures a decision: approve, reject (with guidance), or defer. Rejection spawns a correction bead back into the agent pipeline.

maverick brief --human                              # See the queue
maverick review <bead-id>                           # Interactive
maverick review <bead-id> --approve                 # Scriptable
maverick review <bead-id> --reject "use Dockerfile" # With guidance

maverick brief — Bead Dashboard

maverick brief                   # All ready/blocked beads
maverick brief --epic <id>       # Children of an epic
maverick brief --human           # Human review queue only
maverick brief --watch           # Live polling
maverick brief --format json     # JSON output

maverick runway — Knowledge Store

maverick runway init             # Initialize the store
maverick runway seed             # AI-generated codebase analysis
maverick runway status           # Show metrics
maverick runway consolidate      # Distill old records into summaries

The runway records bead outcomes, review findings, and fix attempts as JSONL. Agents discover this context progressively via the .maverick/runway/ directory. Consolidation (automatic during land) distills episodic records into semantic summaries.
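
Since the records are JSONL, appending and reading them needs only the standard library. In this sketch the file name `bead-outcomes.jsonl` and the record fields are assumptions; only the `.maverick/runway/` directory comes from this README.

```python
import json
import tempfile
from pathlib import Path


def append_record(path: Path, record: dict) -> None:
    """Append one JSON object per line (the JSONL convention)."""
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(record, sort_keys=True) + "\n")


def load_records(path: Path) -> list[dict]:
    """Read every non-empty line back as a dict."""
    with path.open(encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]


with tempfile.TemporaryDirectory() as tmp:
    # "bead-outcomes.jsonl" is an assumed file name, not one taken from Maverick.
    store = Path(tmp) / ".maverick" / "runway" / "bead-outcomes.jsonl"
    append_record(store, {"bead": "impl-1", "outcome": "committed", "fix_rounds": 1})
    append_record(store, {"bead": "impl-2", "outcome": "escalated", "fix_rounds": 3})
    escalated = [r["bead"] for r in load_records(store) if r["outcome"] == "escalated"]
    print(escalated)
```

Append-only lines keep writes cheap during fly and leave summarizing to a later consolidation pass.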

maverick init — Project Setup

Initializes maverick.yaml, probes installed airframe adapters via runtime.list_models() to discover which providers are authenticated, and writes a starter agents: block binding the five canonical roles (implement, review, briefing, decompose, generate).

maverick init                                        # auto-detect
maverick init --providers claude,opencode-go         # narrow the spread
maverick init --models claude:claude-sonnet-4-6      # pin per-provider models

Configuration

project_type: rust

github:
  owner: your-org
  repo: your-repo
  default_branch: main

validation:
  format_cmd: [cargo, fmt]
  lint_cmd: [cargo, clippy, --fix, --allow-dirty]
  test_cmd: [make, test-nextest-fast]
  timeout_seconds: 600

# One airframe binding per canonical role. Provider IDs come from
# airframe.list_providers(); model IDs are whatever that adapter's
# list_models() returns. Bindings are validated at squadron-open —
# misconfigurations fail fast.
agents:
  implement:
    provider: claude
    model_id: claude-sonnet-4-6
  review:
    provider: claude
    model_id: claude-haiku-4-5
  briefing:
    provider: opencode-go
    model_id: minimax-m2.7
  decompose:
    provider: claude
    model_id: claude-sonnet-4-6
  generate:
    provider: github-copilot
    model_id: gpt-5-mini

# Per-complexity overrides for a specific actor still work — they take
# priority over the role binding for that (workflow, actor, tier).
actors:
  fly:
    implementer:
      tiers:
        complex:
          provider: claude
          model_id: claude-opus-4-7
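
The precedence rule in the comments above (a tier override beats the role binding) can be expressed as a small lookup. This is a sketch: `resolve_binding` is a hypothetical helper, the dict mirrors part of the YAML by hand, and the pairing of the implementer actor with the implement role is assumed.

```python
from typing import Optional

CONFIG = {
    "agents": {
        "implement": {"provider": "claude", "model_id": "claude-sonnet-4-6"},
        "review": {"provider": "claude", "model_id": "claude-haiku-4-5"},
    },
    "actors": {
        "fly": {
            "implementer": {
                "tiers": {
                    "complex": {"provider": "claude", "model_id": "claude-opus-4-7"}
                }
            }
        }
    },
}


def resolve_binding(
    config: dict, role: str, workflow: str, actor: str, tier: Optional[str] = None
) -> dict:
    """Prefer a (workflow, actor, tier) override; fall back to the role binding."""
    tiers = (
        config.get("actors", {}).get(workflow, {}).get(actor, {}).get("tiers", {})
    )
    if tier in tiers:
        return tiers[tier]
    return config["agents"][role]


print(resolve_binding(CONFIG, "implement", "fly", "implementer", "complex"))
print(resolve_binding(CONFIG, "implement", "fly", "implementer", "simple"))
```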

Architecture

CLI (Click)
  │
Workflow Layer (async Python)
  │ Resolves cwd at the entry boundary and threads it down.
  │
Squadron (per-workflow lifecycle container)
  │ Constructs one airframe.AgentRuntime per role via runtime_for_agent(),
  │ runs validate_binding() at open, exposes the typed agents the actions
  │ call directly (coder.implement, reviewer.review, decomposer.outline, ...).
  │
Apache Burr Application (one per workflow run)
  │ A state machine of @action-decorated async functions over a shared
  │ State dict. Transitions encode the control flow. A ProgressEventHook
  │ streams StepStarted / StepCompleted / AgentStarted / StepOutput
  │ events into a queue the CLI drains for the live Rich tables.
  │
  ├── fly_beads graph:
  │   select_next_bead → implement → gate → ac_check → spec_check
  │     → review → (create_human_bead?) → commit → record_outcome
  │     → ... cycle ... → aggregate_review → done
  │
  ├── refuel_maverick graph:
  │   parallel_briefings + contrarian_briefing → synthesize_briefing
  │     → outline → detail_fan_out → validate → request_fix?
  │     → create_beads → done
  │
  └── generate_flight_plan graph:
      parallel_briefings + contrarian_briefing → synthesize_briefing
        → generate → done
  │
airframe runtime (one AgentRuntime per role)
  ClaudeCodeRuntime, CopilotRuntime, OpenCodeRuntime,
  OpenCodeGoRuntime, OpenCodeZenRuntime, OpenRouterRuntime,
  BedrockRuntime — each fronts its vendor SDK directly behind a
  uniform Runtime protocol (execute / reset / validate_binding).
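
The fail-fast validation at squadron-open can be illustrated with a structural protocol. The method names execute, reset, and validate_binding come from the diagram above, but their signatures here are guesses. `FakeRuntime`, `open_squadron`, and the local `UnsupportedBindingError` class are stand-ins, not airframe's API.

```python
from typing import Protocol


class UnsupportedBindingError(Exception):
    """Local stand-in for the error named in this README."""


class Runtime(Protocol):
    # Signatures are assumed; only the method names appear in the README.
    def validate_binding(self, model_id: str) -> None: ...
    def execute(self, prompt: str) -> str: ...
    def reset(self) -> None: ...


class FakeRuntime:
    """Toy adapter that only knows a fixed set of models."""

    def __init__(self, provider: str, models: set[str]) -> None:
        self.provider = provider
        self.models = models

    def validate_binding(self, model_id: str) -> None:
        if model_id not in self.models:
            raise UnsupportedBindingError(
                f"{self.provider} has no model {model_id!r}"
            )

    def execute(self, prompt: str) -> str:
        return f"[{self.provider}] {prompt}"

    def reset(self) -> None:
        pass


def open_squadron(bindings: dict[str, tuple[Runtime, str]]) -> dict[str, Runtime]:
    """Validate every role binding up front so errors surface before any work."""
    for role, (runtime, model_id) in bindings.items():
        runtime.validate_binding(model_id)
    return {role: runtime for role, (runtime, _) in bindings.items()}


claude = FakeRuntime("claude", {"claude-sonnet-4-6"})
squadron = open_squadron({"implement": (claude, "claude-sonnet-4-6")})
print(squadron["implement"].execute("implement bead impl-1"))
```

Checking every binding before the first action runs is what turns a typo in maverick.yaml into an immediate error instead of a failure midway through a workflow.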

How Agents Communicate

Each Burr action calls the role's typed Agent directly (coder = squadron.coder_for(tier); await coder.implement(prompt)). Domain methods invoke _send_structured(prompt), which calls runtime.execute() with format=json_schema derived from the agent's result_model. Airframe synthesizes a StructuredOutput tool the model is forced to call, normalises any vendor-specific envelope wrapping, and validates the result against the Pydantic model. The typed payload flows back into Burr State via the action's state.update(...) return; downstream actions read it from there. Built-in tools (Read, Write, Bash) are for doing work; the StructuredOutput tool is for reporting results.
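
The unwrap-then-validate step described above can be approximated with the standard library. Note the substitution: this sketch uses a dataclass and manual type checks in place of Pydantic so that it has no dependencies. `ReviewResult`, `parse_structured`, and the `{"result": ...}` envelope shape are all hypothetical.

```python
import json
from dataclasses import dataclass, fields


@dataclass
class ReviewResult:
    """Hypothetical result model; the README says Maverick uses Pydantic models."""
    approved: bool
    findings: list


def parse_structured(raw: str) -> ReviewResult:
    """Unwrap an optional envelope, then check field names and types."""
    payload = json.loads(raw)
    # Assumed envelope shape: some vendors might wrap the payload in {"result": ...}.
    if isinstance(payload, dict) and set(payload) == {"result"}:
        payload = payload["result"]
    expected = {f.name: f.type for f in fields(ReviewResult)}
    if not isinstance(payload, dict) or set(payload) != set(expected):
        raise ValueError(f"expected exactly the fields {sorted(expected)}")
    for name, kind in expected.items():
        if not isinstance(payload[name], kind):
            raise ValueError(f"field {name!r} should be {kind.__name__}")
    return ReviewResult(**payload)


wrapped = '{"result": {"approved": false, "findings": ["unwrap() in runtime code"]}}'
print(parse_structured(wrapped))
```

Downstream steps then read a typed object rather than parsing free-form model text.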

Technology Stack

Category           Technology
Language           Python 3.11+
Package Manager    uv
Agent Runtime      airframe — one AgentRuntime per role, vendor SDKs behind a uniform protocol
Workflow Engine    Apache Burr — state machines of @action-decorated async functions
Structured Output  Pydantic + format=json_schema (via airframe StructuredOutput)
CLI                Click + Rich
VCS (writes)       Jujutsu (jj) in colocated mode
VCS (reads)        GitPython
Logging            structlog
Testing            pytest + pytest-asyncio
Linting            Ruff

Development

git clone https://github.com/get2knowio/maverick.git
cd maverick
uv sync
make check           # lint + format + test
make ci-coverage     # Full CI pipeline
uv run maverick --help

License

MIT
