An autonomous fintech-aware pipeline: a Bugsnag incident in, a reviewed pull request out.
An autonomous pipeline that picks up a Bugsnag incident, gathers code context (stacktrace, blame, related PRs, Jira), fixes the bug via Claude Code inside an isolated workspace, opens a Bitbucket pull request and pings Slack — all while a PII scrubber, a protected-path blocklist and a full audit log keep the financial core off-limits.
Built in a weekend by a 6-person team. The demo is reproducible today: run `bash scripts/bootstrap.sh` and you get a UI with 6 seeded tasks across every lifecycle state (queued, running, done, failed, needs_human, rejected_blocklist), plus a kill switch (`POST /api/consumer/pause`).

    git clone git@github.com:Shahinyanm/hackaton.git
    cd hackaton
    bash scripts/bootstrap.sh

`bootstrap.sh` is idempotent and does, in order:
- Copies `.env.example` → `.env` (if missing) and prompts you to fill in credentials.
- Clones `workspace/finance` if not already cloned.
- `docker compose build && docker compose up -d`.
- Waits for PostgreSQL to be healthy.
- Runs `composer install`, Doctrine migrations, and `messenger:setup-transports`.
- Runs `app:demo-seed`, which creates 6 demo tasks across all lifecycle states.

After it finishes:
| URL | What you get |
|---|---|
| http://localhost:5174 | UI (Next.js 15) — task list, 2 s polling, status chips, audit timeline |
| http://localhost:4001/api/health | API health check |
| http://localhost:4001/api/tasks | Raw JSON task list |

    # Pipeline smoke test (skips Claude, Bitbucket, Slack)
    docker exec hackathon-php-cli php bin/console app:smoke-task --skip-workspace
    # Smoke test against a protected path (must land in rejected_blocklist)
    docker exec hackathon-php-cli php bin/console app:smoke-task --skip-workspace --protected
    # Re-seed demo data (wipes + recreates 6 tasks)
    docker exec hackathon-php-cli php bin/console app:demo-seed --clean
    # Pause / resume worker (kill switch for live demo)
    curl -X POST http://localhost:4001/api/consumer/pause
    curl -X POST http://localhost:4001/api/consumer/resume
    # Logs
    docker compose logs -f php-cli-consumer
    docker compose logs -f php-fpm-input

sequenceDiagram
participant BS as Bugsnag
participant API as Input API (Symfony)
participant DB as PostgreSQL
participant W as Consumer Worker
participant CC as Claude Code
participant BB as Bitbucket
participant SL as Slack
BS->>API: new event (polled every 2 min, or seed endpoint)
API->>API: PII scrub + context gather
API->>DB: INSERT Task(queued) + dispatch ProcessTaskMessage
W->>DB: pull ProcessTaskMessage (FOR UPDATE SKIP LOCKED)
W->>DB: Task → running, audit_event(started)
W->>CC: claude -p (headless, agents from agents-config/)
CC->>CC: read context, edit code, commit
W->>W: BlocklistChecker.check(git diff)
W->>BB: git push + open PR
W->>SL: post message with PR link
W->>DB: Task → done, audit_event(done)
sequenceDiagram
participant W as Consumer Worker
participant CC as Claude Code
participant BL as BlocklistChecker
participant DB as PostgreSQL
participant SL as Slack
W->>CC: claude -p (with bad bug pointing at src/Ledger/)
CC->>CC: commit changes to src/Ledger/Account.php
W->>BL: check(git diff --name-only)
BL-->>W: BLOCKED on rule "src/Ledger/**"
W->>DB: Task → rejected_blocklist
W->>DB: INSERT BlockedAttempt + audit_event(rejected)
W->>SL: ⚠️ Auto-fix blocked: protected path. Manual review required.
Note over W,BB: No git push, no PR. Branch stays local.
| Layer | Stack |
|---|---|
| Backend API + Worker | PHP 8.4, Symfony 7.3 (Messenger, Doctrine, Scheduler, Console, HTTP Client) |
| Database | PostgreSQL 16-alpine (Doctrine ORM 3.3 + Doctrine Messenger transport) |
| Frontend | Next.js 15.1.6, React 19, TanStack Query 5.66, Tailwind CSS 3.4 |
| AI Orchestrator | Claude Code (claude -p headless) + Ralphex CLI for plan execution |
| Source Integration | Bitbucket CLI (bbkt), Atlassian MCP (OAuth-remote) for Jira / Confluence |
| Notify | Slack Web API (signing-secret verified webhook) |
| Infra | Docker Compose (nginx, php-fpm-input, php-cli-consumer, postgres, claude-consumer, front) |
| Node runtime | Node.js ≥ 22 (pnpm), Next.js dev server on port 5174 |
High-level data flow (from architecture.md):

    ┌─────────────┐   poll    ┌──────────────┐  Messenger    ┌──────────────────────┐
    │ Bugsnag     │ ◀──────── │ Input svc    │  dispatch     │ PostgreSQL           │
    │ REST API    │  (2 min)  │ (Symfony,    │ ────────────▶ │ messenger_messages   │
    │             │           │ Scheduler)   │               │ tasks                │
    └─────────────┘           └──────┬───────┘               │ audit_events         │
                                     │ writes Task + context │ blocked_attempts     │
                                     ▼                       └────┬─────────────────┘
                              Front UI (Next.js)                  │ messenger:consume
                              polls /api/tasks 2s                 ▼
                                                        ┌──────────────────┐
                                                        │ Consumer svc     │──► Claude Code (headless)
                                                        │ (Symfony CLI)    │──► Bitbucket PR
                                                        └──────────────────┘──► Slack

Key decisions:
- One Symfony codebase, two entrypoints. `php-fpm-input` serves HTTP (Bugsnag poll + UI API). `php-cli-consumer` runs `bin/console messenger:consume async scheduler_default` as a long-running worker. Shared entities, repositories and services.
- Polling, not webhooks. The MVP has no public URL, so a `#[AsCronTask('*/2 * * * *')]` task hits the Bugsnag Data Access API every 2 minutes. A `POST /webhooks/test` seed endpoint lets the demo bypass the wait.
- Shared workspace clone. `workspace/finance` is cloned once and reused: every task does `git fetch && checkout master && pull && checkout -b hackathon/bugsnag-{taskId}`. Sequential consumer means no race conditions.
- Claude Code as the framework. No custom orchestrator. Agents live in `agents-config/.claude/agents/*.md`; `initial-prompt.md` and `protected-paths.yml` are copied into the workspace before each run.

Full diagram + sequence in architecture.md.
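
The shared-clone decision above comes down to a fixed sequence of git commands per task. The following is a minimal Python sketch of that sequence, not the project's PHP code: the function names `branch_name` and `prepare_commands` and the task-id character check are assumptions made for illustration. Only the branch pattern `hackathon/bugsnag-{taskId}` and the command order come from the description above.

```python
import re


def branch_name(task_id: str) -> str:
    """Build the per-task branch name: hackathon/bugsnag-{taskId}."""
    # Assumption (not from the source): restrict the id to characters that
    # are safe inside a git ref, so a task id cannot smuggle in ref syntax.
    if not re.fullmatch(r"[A-Za-z0-9-]+", task_id):
        raise ValueError(f"unsafe task id: {task_id!r}")
    return f"hackathon/bugsnag-{task_id}"


def prepare_commands(task_id: str, base: str = "master") -> list[list[str]]:
    """Return the git commands, in order, that reset the shared clone for one task."""
    return [
        ["git", "fetch"],
        ["git", "checkout", base],
        ["git", "pull"],
        ["git", "checkout", "-b", branch_name(task_id)],
    ]


if __name__ == "__main__":
    for argv in prepare_commands("3f2a9c1e"):
        print(" ".join(argv))
```

Returning argument lists instead of one shell string means each step can be passed to `subprocess.run(argv, cwd=..., check=True)` without shell quoting, and a failing step stops the sequence before the branch is created.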
Core tables (all in PostgreSQL 16):
| Entity | Key fields | Purpose |
|---|---|---|
| Task | id (UUID), bugsnag_id (UNIQUE), error_title, status, context (JSONB), branch_name, pr_url, jira_key, cost_usd, duration_ms | Incident lifecycle record |
| TaskStatus (enum) | Queued → Running → ContextReady → ContextDispatched → ContextConsumed → Done, plus Failed, NeedsHuman, RejectedBlocklist | Lifecycle state machine |
| AuditEvent | id, task_id (FK), agent_name, event_type, payload (JSONB), created_at | One row per phase / agent action; UI timeline reads from here |
| BlockedAttempt | id, task_id (FK), blocked_path, diff_excerpt | Recorded when Claude tried to modify a protected path |
| Setting | key (PK), value | KV store: used for bugsnag_last_polled_at watermark and consumer_paused flag |

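
To make the TaskStatus lifecycle concrete, here is a small Python sketch of a transition check. It is an illustration rather than the project's enum: the string values for the three `context_*` states, the helper `can_transition`, and the rule that any non-final state may exit to Failed, NeedsHuman or RejectedBlocklist are all assumptions. Only the state names and the happy-path order are taken from the table.

```python
from enum import Enum


class TaskStatus(Enum):
    QUEUED = "queued"
    RUNNING = "running"
    CONTEXT_READY = "context_ready"            # value assumed by analogy
    CONTEXT_DISPATCHED = "context_dispatched"  # value assumed by analogy
    CONTEXT_CONSUMED = "context_consumed"      # value assumed by analogy
    DONE = "done"
    FAILED = "failed"
    NEEDS_HUMAN = "needs_human"
    REJECTED_BLOCKLIST = "rejected_blocklist"


# Happy path in the order given by the table.
HAPPY_PATH = [
    TaskStatus.QUEUED,
    TaskStatus.RUNNING,
    TaskStatus.CONTEXT_READY,
    TaskStatus.CONTEXT_DISPATCHED,
    TaskStatus.CONTEXT_CONSUMED,
    TaskStatus.DONE,
]

# Assumption: these are final states reachable from any non-final state.
EXIT_STATES = {
    TaskStatus.FAILED,
    TaskStatus.NEEDS_HUMAN,
    TaskStatus.REJECTED_BLOCKLIST,
}


def can_transition(src: TaskStatus, dst: TaskStatus) -> bool:
    """Allow one step forward on the happy path, or an exit from a live state."""
    if src is TaskStatus.DONE or src in EXIT_STATES:
        return False  # final states never change
    if dst in EXIT_STATES:
        return True
    return HAPPY_PATH.index(src) + 1 == HAPPY_PATH.index(dst)


if __name__ == "__main__":
    print(can_transition(TaskStatus.QUEUED, TaskStatus.RUNNING))  # True
    print(can_transition(TaskStatus.QUEUED, TaskStatus.DONE))     # False
```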
Messenger transports (app/config/packages/messenger.yaml):
- `async`: default work queue (`ProcessTaskMessage`). `max_retries = 0` for MVP (manual failure handling).
- `context_ready`: separate queue an external Node-based consumer polls (atomic `SELECT FOR UPDATE SKIP LOCKED` via `/api/external/tasks/next`).
- `failed`: dead-letter queue.

The schema lives in app/migrations/Version20260517120000.php (initial) and Version20260519140000.php (incremental).
The financial nature of the target codebase makes safety non-optional. Three pillars enforce it:
PII scrubber. A recursive walker over context payloads with regex matchers for EMAIL, IBAN, CARD (13–19 digits) and PHONE. Each match is replaced with `[REDACTED-{LABEL}]`, and the list of scrubbed field paths is written into the corresponding AuditEvent payload, so reviewers can see what was sanitized without seeing the original data.
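
A recursive scrubber of this kind can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the project's PHP implementation: the four regexes are simplified stand-ins (the phone pattern only handles numbers with a leading `+`), and the function name `scrub` and the dotted/indexed path format are invented for the example. The labels and the `[REDACTED-{LABEL}]` replacement follow the description above.

```python
import re

# Simplified, assumed patterns; order matters (IBAN runs before CARD).
PATTERNS = [
    ("EMAIL", re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")),
    ("IBAN", re.compile(r"\b[A-Z]{2}\d{2}[A-Z0-9]{11,30}\b")),
    # 13 to 19 digits, optionally separated by single spaces or dashes.
    ("CARD", re.compile(r"\b\d(?:[ -]?\d){12,18}\b")),
    # International format with a leading "+" only.
    ("PHONE", re.compile(r"\+\d[\d -]{7,14}\d")),
]


def scrub(payload):
    """Return (scrubbed copy of payload, list of field paths that were changed)."""
    scrubbed_paths = []

    def walk(value, path):
        if isinstance(value, dict):
            return {
                key: walk(item, f"{path}.{key}" if path else str(key))
                for key, item in value.items()
            }
        if isinstance(value, list):
            return [walk(item, f"{path}[{i}]") for i, item in enumerate(value)]
        if isinstance(value, str):
            cleaned = value
            for label, pattern in PATTERNS:
                cleaned = pattern.sub(f"[REDACTED-{label}]", cleaned)
            if cleaned != value:
                scrubbed_paths.append(path)  # record where, never what
            return cleaned
        return value  # numbers, booleans, None pass through unchanged

    return walk(payload, ""), scrubbed_paths


if __name__ == "__main__":
    clean, paths = scrub({"user": {"email": "jane@example.com"}, "count": 3})
    print(clean)
    print(paths)
```

Only the paths go into the audit payload, which is what lets a reviewer see that `user.email` was sanitized without the original value ever being stored.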
Path blocklist. After Claude Code commits, the output of `git diff --name-only` is matched against the globs below (`fnmatch` with `FNM_PATHNAME`, plus a custom `**` → regex expansion for nested patterns):

    blocked:
      - "src/Ledger/**"         # financial core
      - "src/Payment/Core/**"   # payments
      - "src/Compliance/**"     # GDPR / audit
      - "migrations/**"         # DB schema: human review only
      - "config/*.prod.yaml"    # prod configs
      - ".env*"                 # environment variables
      - "**/*.key"
      - "**/*.pem"
      - "**/*.crt"

Any veto → the task is moved to RejectedBlocklist, the push is blocked, a BlockedAttempt row is written and Slack is notified. The branch stays local.
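
The glob matching described above can be sketched as follows. This is a Python illustration, not the project's `BlocklistChecker`: the translation rules are one plausible reading of "`fnmatch` with `FNM_PATHNAME` plus `**` expansion" (`*` stays inside one path segment, `**` crosses directories, and a leading `**/` also matches files at the repository root), and the names `glob_to_regex` and `check` are assumptions.

```python
import re

BLOCKED = [
    "src/Ledger/**",
    "src/Payment/Core/**",
    "src/Compliance/**",
    "migrations/**",
    "config/*.prod.yaml",
    ".env*",
    "**/*.key",
    "**/*.pem",
    "**/*.crt",
]


def glob_to_regex(pattern: str) -> re.Pattern:
    """Translate a blocklist glob into a regex meant for fullmatch()."""
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            parts.append("(?:.*/)?")  # zero or more leading directories
            i += 3
        elif pattern.startswith("**", i):
            parts.append(".*")        # anything, including "/"
            i += 2
        elif pattern[i] == "*":
            parts.append("[^/]*")     # anything within one path segment
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(parts))


COMPILED = [(rule, glob_to_regex(rule)) for rule in BLOCKED]


def check(changed_files):
    """Return (path, rule) for every changed file that hits a blocked rule."""
    violations = []
    for path in changed_files:
        for rule, regex in COMPILED:
            if regex.fullmatch(path):
                violations.append((path, rule))
                break  # first matching rule is enough to veto
    return violations


if __name__ == "__main__":
    diff = ["src/Ledger/Account.php", "src/Service/Mailer.php"]
    print(check(diff))
```

A non-empty result is the veto: the caller would skip the push and record the first `(path, rule)` pair as the blocked attempt.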
Audit log. Every phase transition, every agent action and every veto produces an audit_events row with a JSONB payload (including the PII-scrub field list). The UI task-detail timeline is a direct read from this table, which is indexed by (task_id, created_at) for fast scans.
- Slack webhook signature verification in `SlackWebhookController` (when `SLACK_SIGNING_SECRET` is set).
- Wall-clock + budget caps on each Claude Code run: 5 minutes / 50 turns / `$5` per task (whichever hits first).
- No JWT in MVP for `/api/external/*` endpoints: they trust the internal Docker network. Documented honestly rather than hidden.
- Kill switch: `POST /api/consumer/pause` flips `settings.consumer_paused = true`, which the worker checks before pulling each task.

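
For the first item in the list above, Slack's documented `v0` signing scheme is small enough to sketch. The Python below is an illustration and not the code in `SlackWebhookController`: the function name `verify_slack_signature`, its parameters and the 5-minute replay window are assumptions for the example. The inputs correspond to the signing secret (`SLACK_SIGNING_SECRET`), the `X-Slack-Request-Timestamp` header, the raw request body and the `X-Slack-Signature` header.

```python
import hashlib
import hmac
import time


def verify_slack_signature(signing_secret, timestamp, body, signature,
                           now=None, tolerance=300):
    """Check a Slack request signature (v0 scheme) and reject stale requests.

    signing_secret: str, timestamp: str (header value), body: raw bytes,
    signature: str (header value, expected to look like "v0=<hex>").
    """
    now = time.time() if now is None else now
    try:
        sent_at = int(timestamp)
    except ValueError:
        return False
    # Reject old or future-dated requests to limit replay attacks.
    if abs(now - sent_at) > tolerance:
        return False
    base = b"v0:" + timestamp.encode() + b":" + body
    digest = hmac.new(signing_secret.encode(), base, hashlib.sha256).hexdigest()
    expected = "v0=" + digest
    # Constant-time comparison on bytes avoids timing leaks.
    return hmac.compare_digest(expected.encode(), signature.encode())


if __name__ == "__main__":
    secret, ts, raw = "demo-secret", str(int(time.time())), b'{"text":"hi"}'
    good = "v0=" + hmac.new(secret.encode(), b"v0:" + ts.encode() + b":" + raw,
                            hashlib.sha256).hexdigest()
    print(verify_slack_signature(secret, ts, raw, good))
```

The check has to run on the raw body bytes, before any JSON or form decoding, because re-serializing the payload can change it and invalidate the signature.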
| Role | Person | Owns |
|---|---|---|
| Consumer / Front | Mher Shahinyan | docker-compose, consumer service, UI |
| Input + Infra co-owner | Yahor Dziukarau | Bugsnag poller, context gather, queue producer |
| AI Engineer | Vitautas Brazas | Claude Code wrapper, agent configs (CLAUDE.md, .claude/agents/*) |
| QA / Test | Tetiana Kryvko | Demo dataset, unit + integration tests, humanity-of-output checks |
| PM / Storytelling | Oksana Titarenko | Demo narrative, slide deck, humanity review |
| Lead / Demo | Konstantin Bogomolov | Live presentation, voice, fallback video |
In scope:
- One Bugsnag project: finapi-prod (Finance PayIn).
- One Bitbucket repository: finance (Finance API).
- One sequential consumer (no parallelism).
- One task queue + one audit history table.
- Read-only UI with a Pause button.
- Trust layer: PII scrubber + path blocklist + audit log.
Out of scope (shown only as roadmap):
- Slack as an input source (not just Bugsnag).
- Multiple teams / repositories.
- Settings UI.
- Sandboxed test execution by the agent.
- Deploy to staging / production.
- Parallel consumers.
Jury-facing (English, read these first):
| File | About |
|---|---|
| architecture.md | System architecture — services, data flow, DB schema, workspace layout |
| demo-script.md | 5-minute demo script with live walkthrough, fallback plans, Q&A answers |
| risks.md | R1–R11 risk register + the must-have trust-layer items |
| feasibility-audit.md | Pre-hackathon component-by-component confidence audit |
| handoff.md | Team handoff — what's done, what's next, quick start |
Internal / team-only (some still in Russian):
| File | About |
|---|---|
| plan.md | Overall plan, milestones, time budget |
| branch-walkthrough.md | What's in each git branch (12 commits, demo plan) |
| qa/README.md | QA stories — 6 manual UI checks |
| day-0-prep.md | Sunday pre-hackathon checklist |
| learning-roadmap.md | Self-learning track for the team |
| ralphex-integration.md | Roadmap to switch to ralphex (OSS) — 5-phase review |
| HANDOFF-TO-MHER.md | Consumer container handoff Vitautas → Mher |
| contracts/queue-schemas.md | JSON schemas for queue messages |
| contracts/agent-md-template.md | Template for Claude Code agent MDs |
| roles/*.md | Per-person plans |
- Maximize parallelism. Yahor (Input), Vitautas (Claude Code) and Mher (Consumer + Front) work independently. Module contracts are frozen at Day-0.
- Trust layer is not optional. PII scrubber + path blocklist + audit log are required before the demo. The fintech story is hollow without them.
- Demo dataset ready before the hackathon. Tetiana collects 5–10 real bugs + their fix-PRs over the weekend.
- Claude Code as the framework. No custom orchestrator. We use `claude -p` (headless), agents in `.claude/agents/`, context via `CONTEXT.md`.
- One shared clone. Per-task branch, no fresh clone per task.