A voice-first mobile coding assistant that turns spoken requests into code changes, runs tests, and pings you only when approval is needed.
This doc is both a dev reference and an LLM agent brief. It defines the product behavior, architecture, APIs, data contracts, and success criteria so humans and agents stay aligned.
- You press one mic button and say what you want: “Add an `--stdin` flag”
- The app uploads a short WAV to the backend
- Backend transcribes the audio → builds an instruction → runs an autonomous code job
- A runner writes the patch, runs lint + tests, and either auto-merges or opens a PR
- You get a push: ✅ merged or ⚠️ approve?
- Nightly we email simple cost/usage totals
- Optional: weekly sprint tag auto-generates an 8-second demo video
Latency target: 2–4 s for simple tasks; typical cloud cost ≈ $0.017/task.
- iOS/Android app with one mic screen and a notifications view
- Single-shot WAV upload over HTTPS (no streaming)
- Whisper v3 Turbo for transcription
- Anthropic Claude Code for patch generation
- Python projects only (CLI utilities, small FastAPI APIs)
- Lint + test gates: `ruff` + `pytest`
- Heuristic merge rule: auto-merge if LOC delta < 500 and no `security/` path touched
- Push notifications with Approve / Details
- Supabase for Postgres/Storage/Auth
- One EC2 runner (t4g.micro spot) executing a Docker image
- Weekly Veo demo clip on tag
- On-device wake-word, on-device Whisper, WebSocket streaming
- Risk-engine microservice, vector/RAG memory
- Non-Python stacks
- Production SSO, org admin, payments
flowchart TD
    A["Mobile App (React Native + Swift AudioRecorder)"] -->|"POST /task (JWT + WAV)"| B("API Gateway + FastAPI Lambda")
    B -->|Whisper Turbo| C[Transcript]
    B -->|SQS publish| D[SQS runner-jobs]
    E[Runner-0 EC2 t4g.micro] -->|poll| D
    E -->|docker run| F["Claude Code container\n+ ruff + pytest"]
    F -->|patch + status| E --> G["GitHub\n(auto-merge or PR)"]
    B --> H["Supabase Postgres\nusers, task_events"]
    B ---> I["Push notifications (APNS/FCM)"]
    G --> J((Weekly tag))
    J --> K["Vertex AI Veo\n8s demo"]
    K --> L["Supabase Storage\n(mp4)"] --> I
- Mobile
  - React Native (Expo) UI
  - Swift `AudioRecorder` using AVFoundation → writes `task.wav` → returns file path to RN
  - Expo / FCM push
- API Edge (Lambda)
  - FastAPI handlers
  - Supabase JWT verification
  - Whisper Turbo call
  - Build prompt for Claude Code
  - Write `task_events` row (pending)
  - Publish job JSON to SQS
- Runner-0 (EC2)
  - Tiny Python agent polls SQS
  - For each job: `git clone` → run Docker image with Claude Code
  - Run `ruff` and `pytest`, collect results
  - If small/safe → commit + push to `main`, else open PR
  - Send compact status back to Lambda callback endpoint
- Supabase
  - Postgres: `users`, `task_events`
  - Storage: WAV files (optional), Veo videos
  - Auth: issue JWT for app
- Veo (optional)
  - GitHub Action on tag → screenshot staging → Veo → mp4 → push link
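
The API Edge steps listed above (transcribe, build the job, record the event, publish) could be wired together roughly as follows. This is a minimal sketch, not the actual handler: `build_job`, `handle_task`, and the three injected callables (`transcribe`, `insert_event`, `publish`) are hypothetical names standing in for the Whisper call, the Supabase insert, and the SQS publish. The job fields mirror the job JSON used in this document.

```python
import json
import uuid
from typing import Callable


def build_job(user_id: str, repo: str, task_text: str) -> dict:
    """Assemble a job message in the shape this document uses for SQS."""
    return {
        "job_id": str(uuid.uuid4()),
        "user_id": user_id,
        "repo": repo,
        "branch": "main",
        "task_text": task_text,
        "style_guide": "PEP8, use argparse, avoid globals",
        "heuristics": {"auto_merge_loc_limit": 500, "blocked_paths": ["security/"]},
    }


def handle_task(
    audio: bytes,
    user_id: str,
    repo: str,
    transcribe: Callable[[bytes], str],    # hypothetical: wraps the Whisper call
    insert_event: Callable[[dict], None],  # hypothetical: writes the task_events row
    publish: Callable[[str], None],        # hypothetical: sends the JSON body to SQS
) -> dict:
    """Run the /task steps in order and return the body of the 202 response."""
    task_text = transcribe(audio).strip()
    if not task_text:
        raise ValueError("empty transcript")
    job = build_job(user_id, repo, task_text)
    # The row is created before the runner reports; its final status comes later.
    insert_event({"id": job["job_id"], "user_id": user_id})
    publish(json.dumps(job))
    return {"job_id": job["job_id"]}
```

Passing the side effects in as callables keeps the ordering testable without network access; a real handler would bind them to the Whisper, Supabase, and SQS clients.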
Job message (API → SQS):
{
  "job_id": "uuid",
  "user_id": "uuid",
  "repo": "git@github.com:org/project.git",
  "branch": "main",
  "task_text": "Add an --stdin flag so the tool can read from STDIN",
  "style_guide": "PEP8, use argparse, avoid globals",
  "heuristics": { "auto_merge_loc_limit": 500, "blocked_paths": ["security/"] }
}

Runner status (Runner → Lambda callback):
{
  "job_id": "uuid",
  "commit_sha": "abcd1234",
  "pr_url": null,
  "loc_delta": 42,
  "files_touched": ["cli.py", "tests/test_cli.py"],
  "tests_passed": true,
  "lint_passed": true,
  "status": "auto_merged",
  "tok_in": 15000,
  "tok_out": 5000,
  "duration_ms": 2800,
  "notes": "Added --stdin flag, updated docs, 3 tests"
}

Postgres schema:
create table if not exists users (
  id uuid primary key,
  email text unique not null,
  provider_choice text default 'claude',
  notify_threshold_loc int default 500,
  created_at timestamptz default now()
);
create table if not exists task_events (
  id uuid primary key,
  user_id uuid references users(id),
  ts_start timestamptz default now(),
  duration_ms int,
  tok_in int,
  tok_out int,
  loc_delta int,
  files_touched jsonb,
  status text check (status in ('ok','fail','auto_merged','pr_opened')),
  notes text
);

`/task`:
- Auth: Bearer JWT (Supabase)
- Body: `multipart/form-data` with `audio` (WAV), optional `repo` override
- Behavior:
  - Whisper Turbo → transcript
  - Construct prompt + job JSON → SQS
  - Insert `task_events` pending row
  - Return `202 Accepted` `{ job_id }`

`/callback/runner-status`:
- Runner → Lambda
- Updates `task_events`
- Triggers push: ✅ or ⚠️ (with Approve action)

`/webhook/approve`:
- Merges PR created for a job (idempotent)

Merge heuristic:
- Auto-merge if:
  - `loc_delta < 500`
  - and none of `files_touched` start with `security/`
  - and `lint_passed` && `tests_passed`
- Otherwise open PR and push an approval request
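
The merge rule above is small enough to express as one pure function. The sketch below is illustrative only: `decide_merge` is a hypothetical name, and the defaults simply restate the 500-line limit and the `security/` prefix from this document. The return values reuse the `auto_merged` and `pr_opened` status strings from the schema.

```python
from typing import Iterable, Tuple


def decide_merge(
    loc_delta: int,
    files_touched: Iterable[str],
    lint_passed: bool,
    tests_passed: bool,
    loc_limit: int = 500,
    blocked_paths: Tuple[str, ...] = ("security/",),
) -> str:
    """Return 'auto_merged' when every rule holds, otherwise 'pr_opened'."""
    # A file is blocked if its path starts with any blocked prefix.
    touches_blocked = any(
        path.startswith(prefix)
        for path in files_touched
        for prefix in blocked_paths
    )
    # The comparison is strict, matching `loc_delta < 500` in the rule.
    small = loc_delta < loc_limit
    if small and not touches_blocked and lint_passed and tests_passed:
        return "auto_merged"
    return "pr_opened"
```

For the sample status payload (`loc_delta` of 42, `cli.py` and `tests/test_cli.py` touched, both gates green) this yields `auto_merged`. The rule as written compares the signed delta, so a large deletion would also pass; whether to compare the absolute value instead is left open here.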
- Goal: Convert user’s natural-language task into minimal, correct Python code changes plus tests
- Constraints:
  - Must pass `ruff` and `pytest`
  - Follow repository style (argparse, docstrings)
  - Prefer small diffs that are easy to review
- Patch etiquette:
  - Write tests for new behavior
  - Update README/help text when flags change
  - Avoid touching `security/` unless explicitly asked
- Commit message format:
  - `feat(cli): add --stdin flag (tests included)`
  - Body: short rationale + test summary
Prompt skeleton the backend will send:
Project: <name>
Task: <transcribed text>
Repository map (top-level): <files/dirs>
Style: PEP8, argparse, no globals, ruff + pytest must pass
Deliverables: minimal diff, tests updated, README/help updated when applicable
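
Filling that skeleton is plain string assembly. A minimal sketch follows, assuming the backend already has the project name, the transcript, and a list of top-level repository entries; `build_prompt` and its parameters are hypothetical names, not part of the existing code.

```python
from typing import List


def build_prompt(project: str, task_text: str, top_level: List[str]) -> str:
    """Fill the prompt skeleton with a project name, transcript, and repo map."""
    # Sort the entries so the same repository always produces the same prompt.
    repo_map = ", ".join(sorted(top_level))
    return "\n".join(
        [
            f"Project: {project}",
            f"Task: {task_text.strip()}",
            f"Repository map (top-level): {repo_map}",
            "Style: PEP8, argparse, no globals, ruff + pytest must pass",
            "Deliverables: minimal diff, tests updated, "
            "README/help updated when applicable",
        ]
    )
```

Keeping the builder deterministic makes it easy to snapshot-test prompts alongside the mock transcript fixture.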
Mock mode:
- Mock Whisper: load fixture transcript from `tests/fixtures/utterance.txt`
- Mock Claude: apply a canned patch in `/mocks/patches/*.diff`
- Toggle via `MVP_MOCK=1`

Tests:
- `pytest` for Lambda helpers and runner agent
- Integration: spin a local SQS (e.g., LocalStack) and run one full job with a small sample repo

Manual end-to-end check:
- Launch Runner-0 locally (docker + poller)
- `curl -F "audio=@sample.wav" -H "Authorization: Bearer <jwt>" https://api/.../task`
- Verify commit on `main` and push notification log
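
The `MVP_MOCK=1` toggle for the Whisper mock could look roughly like this. It is a sketch under assumptions: `mock_enabled` and `transcribe` are hypothetical names, and `real_transcribe` stands in for whatever function performs the actual Whisper request. Only the environment variable and the fixture path come from this document.

```python
import os
from pathlib import Path
from typing import Callable

# Fixture path as given in the mock-mode notes.
FIXTURE = Path("tests/fixtures/utterance.txt")


def mock_enabled() -> bool:
    """True when the MVP_MOCK toggle is set to 1."""
    return os.environ.get("MVP_MOCK") == "1"


def transcribe(
    audio: bytes,
    real_transcribe: Callable[[bytes], str],  # hypothetical: the real Whisper call
    fixture: Path = FIXTURE,
) -> str:
    """Return the fixture transcript in mock mode, else call the real backend."""
    if mock_enabled():
        return fixture.read_text(encoding="utf-8").strip()
    return real_transcribe(audio)
```

Reading the toggle at call time rather than import time lets a test flip `MVP_MOCK` without reloading the module.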
- OpenAI Whisper key
- Anthropic Claude key
- Supabase project + service key
- GitHub PAT (`repo` scope)
- Expo/FCM/APNS push credentials
- Supabase: create `users`, `task_events`, bucket `media`
- Build runner Docker image: Claude Code + `ruff` + `pytest` → push to ECR
- EC2: launch t4g.micro spot, install Docker, runner poller
- SQS: `runner-jobs`
- API Gateway + Lambda (FastAPI): `/task`, `/callback/runner-status`, `/webhook/approve`
- Supabase Edge cron: nightly SQL aggregate → email CSV
- GitHub Action (on tag): screenshot staging → Veo → upload mp4 → notify
# Lambda
SUPABASE_URL=...
SUPABASE_SERVICE_KEY=...
OPENAI_API_KEY=...
ANTHROPIC_API_KEY=...
PUSH_API_KEY=... # Expo/FCM/APNS
GITHUB_TOKEN=...
# Runner EC2
GITHUB_TOKEN=...
ECR_IMAGE=...
SQS_URL=...

- Latency: p50 voice→merge < 4 s, p95 < 8 s (simple tasks)
- Success rate: > 85% tasks auto-merge without human fix
- Cost: < $0.03 per task average
- Error budget: < 2% Lambda 5xx, Runner job failure rate < 5%
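
The latency and auto-merge targets above could be checked against a batch of `task_events` rows along these lines. This is a rough sketch: `percentile` and `check_targets` are hypothetical helpers, the percentile uses the nearest-rank method, and it assumes `duration_ms` is an acceptable proxy for voice→merge latency, which this document does not state.

```python
import math
from typing import Dict, List


def percentile(values: List[float], pct: float) -> float:
    """Nearest-rank percentile: the value at rank ceil(pct/100 * n)."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def check_targets(rows: List[dict]) -> Dict[str, bool]:
    """Compare task_events-style rows against the p50, p95, and auto-merge targets."""
    durations = [row["duration_ms"] for row in rows]
    merged = sum(1 for row in rows if row["status"] == "auto_merged")
    return {
        "p50_ok": percentile(durations, 50) < 4000,    # p50 < 4 s
        "p95_ok": percentile(durations, 95) < 8000,    # p95 < 8 s
        "auto_merge_ok": merged / len(rows) > 0.85,    # > 85% auto-merged
    }
```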
Nightly email includes:
`sum(tok_in)`, `sum(tok_out)`, `sum(duration_ms)`, `count(*)` by status
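
The deployment notes describe this report as a nightly SQL aggregate; the same totals can be sketched in Python for local testing. Everything here is illustrative: `nightly_totals_csv` is a hypothetical helper, it assumes all four aggregates are grouped by status, and it buckets rows with no status yet under the made-up label `unset`.

```python
import csv
import io
from collections import defaultdict
from typing import List


def nightly_totals_csv(rows: List[dict]) -> str:
    """Group task_events-style rows by status and render the totals as CSV text."""
    totals = defaultdict(
        lambda: {"tok_in": 0, "tok_out": 0, "duration_ms": 0, "count": 0}
    )
    for row in rows:
        status = row.get("status") or "unset"  # assumption: label for NULL status
        bucket = totals[status]
        for field in ("tok_in", "tok_out", "duration_ms"):
            bucket[field] += row.get(field) or 0  # NULL columns count as 0
        bucket["count"] += 1

    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["status", "tok_in", "tok_out", "duration_ms", "count"])
    for status in sorted(totals):
        t = totals[status]
        writer.writerow(
            [status, t["tok_in"], t["tok_out"], t["duration_ms"], t["count"]]
        )
    return out.getvalue()
```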
- Least-privilege IAM: Lambda can put to SQS and read Secrets, Runner can get from SQS and pull ECR
- GitHub PAT restricted to target repo(s)
- JWT checked on every `/task`
- Supabase bucket private; signed URLs for media when needed
- Latency: WebSocket streaming + on-device Whisper tiny
- Throughput: Runner Auto-Scaling Group / Fargate
- Safety: risk engine (OPA + semgrep/bandit), review routing
- Breadth: Node/TS support (ESLint + Jest), then Go
- Observability: Grafana/Loki dashboards
- Runner: a small EC2 that executes Claude Code inside Docker and runs tests
- Claude Code: Anthropic’s agentic code editor capable of planning & patching
- Whisper Turbo: fast cloud speech-to-text
- Veo: Google Vertex AI image-to-video model used for 8 s demo clips
- Talk → transcript → code patch → lint/tests pass
- Auto-merge happens for small/safe diffs
- PR + push approval for risky diffs
- Push notification actions work on device
- Nightly cost email received
- Weekly tag produces a Veo clip and sends a link
SUPABASE_URL=...
SUPABASE_SERVICE_KEY=...
OPENAI_API_KEY=...
ANTHROPIC_API_KEY=...
PUSH_API_KEY=...
GITHUB_TOKEN=...
SQS_URL=...
AUTO_MERGE_LOC_LIMIT=500
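BLOCKED_PATHS=security/

// React Native: upload the recorded WAV as multipart/form-data
const form = new FormData()
// RN's FormData accepts a { uri, name, type } file descriptor; the cast works around the DOM typings
form.append('audio', {
  uri: fileUri,
  name: 'task.wav',
  type: 'audio/wav',
} as any)
// No explicit Content-Type: fetch sets the multipart boundary itself
const res = await fetch(`${API_BASE}/task`, {
  method: 'POST',
  headers: { Authorization: `Bearer ${jwt}` },
  body: form,
})
// The backend answers 202 Accepted with { job_id }
if (res.status !== 202) throw new Error(`Upload failed: ${res.status}`)
const { job_id } = await res.json()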