docs(specs): add iterative spec documents for StackMemory

StackMemory Bot (CLI) · StackMemory Bot (CLI) · commit b40a572924b6 · 2026-02-06T16:01:14.000-05:00
Complete 4-doc spec chain: ONE_PAGER, DEV_SPEC, PROMPT_PLAN, AGENTS.md
with real StackMemory project content. Stages A-D are marked complete;
stages E-G are the roadmap for team collaboration, hosted service, and polish.
diff --git a/docs/specs/AGENTS.md b/docs/specs/AGENTS.md
@@ -0,0 +1,65 @@
+# StackMemory — AGENTS.md
+
+> Generated from ONE_PAGER.md, DEV_SPEC.md, PROMPT_PLAN.md
+
+## Repository Structure
+```
+src/
+  cli/           → CLI entry point + commands (commander.js)
+  core/          → Business logic (frames, database, query, retrieval)
+  integrations/  → External services (Linear, MCP, Claude Code, Ralph)
+  skills/        → Claude Code skills (spec, linear-run, orchestrator)
+  features/      → Feature modules (tasks, TUI)
+  utils/         → Shared utilities
+packages/        → Workspace packages (linear-extension)
+.claude/         → Claude Code config (hooks, skills, settings)
+docs/specs/      → Iterative spec documents
+```
+
+## Agent Responsibilities
+
+### When editing `src/core/`
+- Frame operations are the foundation — test thoroughly
+- FrameManager, DualStackManager, ContextRetriever are hot paths
+- SQLiteAdapter uses better-sqlite3 (synchronous API, not async)
+- Always use `.js` extensions on relative ESM imports
+
+### When editing `src/skills/`
+- Skills return `SkillResult { success, message, data?, action? }`
+- Register new skills in `ClaudeSkillsManager.executeSkill()` switch
+- Add to `getAvailableSkills()` and `getSkillHelp()` too
+- RecursiveAgentOrchestrator has 8 subagent types — don't add without reason
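+
+A minimal sketch of the registration pattern described above, under stated assumptions. The `SkillResult` field names come from this document, but the field types are guesses. `executeSkill` here is a standalone stand-in for the switch in `ClaudeSkillsManager.executeSkill()`, and the `echo` skill is hypothetical:
+
+```typescript
+// Field names are from this document; the types are assumptions.
+interface SkillResult {
+  success: boolean;
+  message: string;
+  data?: unknown;
+  action?: string;
+}
+
+// Hypothetical skill, used only for illustration.
+async function runEchoSkill(args: string[]): Promise<SkillResult> {
+  return {
+    success: true,
+    message: args.join(' '),
+    data: { count: args.length },
+  };
+}
+
+// Simplified stand-in for the switch in ClaudeSkillsManager.executeSkill().
+async function executeSkill(name: string, args: string[]): Promise<SkillResult> {
+  switch (name) {
+    case 'echo':
+      return runEchoSkill(args);
+    default:
+      // Unknown skills report failure instead of throwing.
+      return { success: false, message: `Unknown skill: ${name}` };
+  }
+}
+
+// Keep this list in sync with the cases in the switch above.
+function getAvailableSkills(): string[] {
+  return ['echo'];
+}
+```
+
+In this sketch, adding a skill means touching both the `switch` and the list, which mirrors the checklist above.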
+
+### When editing `src/cli/`
+- Feature-flagged commands are loaded with dynamic `import()`; collect the resulting promises in `lazyCommands[]`
+- All lazy commands must resolve before `program.parse()`
+- Use `ora` for spinners, `chalk` for colors
+
+### When editing `src/integrations/`
+- Linear: always check for `LINEAR_API_KEY` before API calls
+- MCP: tools must follow `@modelcontextprotocol/sdk` patterns
+- Claude Code: agent bridge maps oracle/worker/reviewer types
+
+## Guardrails
+- **Never** commit secrets (`.env`, API keys, tokens)
+- **Never** use `--no-verify` on git operations
+- **Never** use `jest` — this project uses `vitest`
+- **Never** skip `.js` extensions on relative imports (ESM requirement)
+- **Always** run `npm run lint && npm run test:run && npm run build` before pushing
+- **Always** prefer returning `undefined` over throwing exceptions
+- **Always** log + continue rather than crash
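+
+The last two guardrails can be illustrated with a short sketch. `Config` and `parseConfig` are hypothetical names, not part of the StackMemory codebase; the point is only the shape of the error path:
+
+```typescript
+// Hypothetical config shape, used only for illustration.
+interface Config {
+  name: string;
+}
+
+// Returns undefined on bad input instead of throwing.
+function parseConfig(raw: string): Config | undefined {
+  try {
+    const value: unknown = JSON.parse(raw);
+    if (typeof value === 'object' && value !== null) {
+      const name = (value as { name?: unknown }).name;
+      if (typeof name === 'string') {
+        return { name };
+      }
+    }
+    // Log and continue: the caller decides what to do with undefined.
+    console.warn('parseConfig: missing "name" field, continuing');
+    return undefined;
+  } catch (err) {
+    console.warn('parseConfig: invalid JSON, continuing', err);
+    return undefined;
+  }
+}
+
+// One bad entry does not stop the remaining entries from being processed.
+const configs = ['{"name":"a"}', 'not json', '{"name":"b"}']
+  .map(parseConfig)
+  .filter((c): c is Config => c !== undefined);
+```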
+
+## Testing
+- Framework: Vitest (not Jest)
+- Run: `npm run test:run` (single pass) or `npm test` (watch mode)
+- Location: `src/**/__tests__/*.test.ts` or colocated `*.test.ts`
+- Target: 498 tests, all passing, ~17s
+- New features require tests — no untested code paths
+
+## When to Ask the User
+- Before creating new subagent types in RecursiveAgentOrchestrator
+- Before modifying database schema (migrations needed)
+- Before changing feature flag defaults
+- Before adding new npm dependencies
+- Before modifying `.claude/hooks/` behavior
+- When a test fails and the fix isn't obvious
diff --git a/docs/specs/DEV_SPEC.md b/docs/specs/DEV_SPEC.md
@@ -0,0 +1,96 @@
+# StackMemory — Development Specification
+
+> Generated from ONE_PAGER.md
+
+## Architecture
+
+```
+┌──────────────────────────────────────────────────┐
+│  CLI (commander)              MCP Server (SSE)   │
+├──────────────────────────────────────────────────┤
+│  Skills Layer                                    │
+│  ├─ SpecGeneratorSkill    (4-doc chain)          │
+│  ├─ LinearTaskRunner      (task → RLM → Linear)  │
+│  ├─ ClaudeSkillsManager   (skill router)         │
+│  └─ UnifiedRLMOrchestrator (RLM + skills)        │
+├──────────────────────────────────────────────────┤
+│  Core                                            │
+│  ├─ FrameManager          (push/pop/query)       │
+│  ├─ DualStackManager      (hot + cold stacks)    │
+│  ├─ ContextRetriever      (semantic search)      │
+│  ├─ RecursiveAgentOrchestrator (8 subagents)     │
+│  └─ ParallelExecutor      (concurrent tasks)     │
+├──────────────────────────────────────────────────┤
+│  Storage                                         │
+│  ├─ SQLiteAdapter         (local, better-sqlite3)│
+│  └─ ParadeDB adapter      (hosted, optional)     │
+├──────────────────────────────────────────────────┤
+│  Integrations                                    │
+│  ├─ Linear                (OAuth + webhook)      │
+│  ├─ Claude Code           (agent bridge + hooks) │
+│  └─ Ralph                 (swarm coordinator)    │
+└──────────────────────────────────────────────────┘
+```
+
+## Tech Stack
+- **Language**: TypeScript (strict mode, ESM with .js extensions)
+- **Runtime**: Node.js 20+
+- **Build**: esbuild (fast, single-pass)
+- **Test**: Vitest (498 tests, ~17s)
+- **Lint**: ESLint + Prettier
+- **Database**: better-sqlite3 (local), ParadeDB (hosted)
+- **CLI**: Commander.js
+- **MCP**: Custom SSE server (@modelcontextprotocol/sdk)
+
+## API Contracts
+
+### MCP Tools
+- `stackmemory_push_frame` — Push context frame onto stack
+- `stackmemory_pop_frame` — Pop and return top frame
+- `stackmemory_query` — Semantic search across frames
+- `stackmemory_capture` — Snapshot current state for handoff
+- `stackmemory_restore` — Rehydrate from captured state
+
+### CLI Commands
+- `stackmemory capture` / `restore` — Session handoff
+- `stackmemory skills spec <cmd>` — Spec document generation
+- `stackmemory skills linear-run <cmd>` — Linear task execution
+- `stackmemory ralph linear <cmd>` — Ralph-Linear bridge
+
+### Internal Interfaces
+- `SkillContext` — Shared context passed to all skills
+- `SkillResult` — Uniform { success, message, data?, action? } return
+- `SubagentConfig` — Model, tokens, temperature, systemPrompt, capabilities
+- `TaskNode` — Recursive task tree with dependencies and status
+
+## Data Models
+
+### Frame
+```typescript
+{ id, projectId, type, topic, summary, content, metadata,
+  parentId, status, score, createdAt, updatedAt }
+```
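+
+A typed sketch of the model above may help when reading the push/pop/query contract. The field names are from this document; every type is an assumption. `InMemoryFrameStack` is a hypothetical illustration rather than the real `FrameManager`, which persists to SQLite, and its `query` uses a plain substring match as a stand-in for semantic search:
+
+```typescript
+// Field names come from the Frame model above; all types are assumptions.
+interface Frame {
+  id: string;
+  projectId: string;
+  type: string;
+  topic: string;
+  summary: string;
+  content: string;
+  metadata: Record<string, unknown>;
+  parentId?: string;
+  status: string;
+  score: number;
+  createdAt: number;
+  updatedAt: number;
+}
+
+// Hypothetical in-memory stack; not the real FrameManager.
+class InMemoryFrameStack {
+  private frames: Frame[] = [];
+
+  push(frame: Frame): void {
+    this.frames.push(frame);
+  }
+
+  // Returns undefined on an empty stack instead of throwing.
+  pop(): Frame | undefined {
+    return this.frames.pop();
+  }
+
+  // Substring match on topic and summary, standing in for semantic search.
+  query(text: string): Frame[] {
+    const needle = text.toLowerCase();
+    return this.frames.filter((f) =>
+      `${f.topic} ${f.summary}`.toLowerCase().includes(needle),
+    );
+  }
+}
+```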
+
+### LinearTask (synced)
+```typescript
+{ id, identifier, title, description, status, priority,
+  labels[], team, assignee, url }
+```
+
+## Auth
+- Local: no auth (single-user SQLite)
+- Hosted: JWT via `stackmemory login`
+- Linear: OAuth2 flow or `LINEAR_API_KEY` env var
+- Claude: `ANTHROPIC_API_KEY` env var
+
+## Error Handling
+- Prefer returning `undefined` over throwing (per CLAUDE.md convention)
+- Log + continue over crash
+- Skills return `{ success: false, message }` on failure
+- Hooks fail silently so they never block Claude
+
+## Deploy
+- npm package: `@stackmemoryai/stackmemory`
+- Binary: `stackmemory` (global install)
+- Feature flags: `STACKMEMORY_SKILLS`, `STACKMEMORY_RALPH`, etc.
+- Auto-update check on CLI startup
diff --git a/docs/specs/ONE_PAGER.md b/docs/specs/ONE_PAGER.md
@@ -0,0 +1,42 @@
+# StackMemory — One-Pager
+
+## Problem
+AI coding agents (Claude Code, Cursor, Copilot) lose all context between sessions. Developers repeat themselves, decisions get lost, and handoffs between agents or team members are painful. Current tools treat chat as linear logs — there's no structured memory layer.
+
+## Audience
+- Solo developers using AI coding assistants daily
+- Engineering teams (2-20) collaborating with AI agents across sessions
+- AI-first startups where agents do 50%+ of the coding
+
+## Platform
+CLI + MCP server — runs locally alongside Claude Code / VS Code. Optional hosted sync for teams.
+
+## Core Flow
+1. Developer works with Claude Code — StackMemory auto-captures context as frames on a call stack
+2. Session ends — `stackmemory capture` commits state + generates handoff prompt
+3. New session starts — `stackmemory restore` rehydrates full context
+4. Team member picks up — frames show decisions, progress, blockers with full provenance
+5. Ralph (RLM orchestrator) decomposes complex tasks into parallel subagent execution
+
+## MVP Features
+- [x] Frame-based context management (push/pop/query)
+- [x] Session capture and restore with handoff prompts
+- [x] SQLite local storage with dual-stack manager
+- [x] MCP server for Claude Desktop integration
+- [x] Linear task sync (bidirectional)
+- [x] Recursive Language Model (RLM) orchestrator with 8 subagent types
+- [x] Claude Code skills system (/spec, /linear-run, checkpoint, dig)
+- [ ] Team collaboration with shared frame stacks
+- [ ] Hosted sync service (Railway/Supabase)
+- [ ] Browser extension for context capture
+
+## Non-Goals
+- Not a replacement for git — complements version control
+- Not a chat UI — headless memory layer for existing tools
+- Not a project management tool — integrates with Linear rather than replacing it
+
+## Metrics
+- Session restoration accuracy (% of context successfully rehydrated)
+- Handoff quality score (does the next agent/human have sufficient context?)
+- Token savings (fewer repeated explanations across sessions)
+- Task completion rate via Ralph orchestrator
diff --git a/docs/specs/PROMPT_PLAN.md b/docs/specs/PROMPT_PLAN.md
@@ -0,0 +1,50 @@
+# StackMemory — Prompt Plan
+
+> Generated from ONE_PAGER.md, DEV_SPEC.md
+
+## Stage A: Foundation (Complete)
+- [x] Initialize repository and tooling
+- [x] Configure CI/CD pipeline (lint-staged + pre-commit)
+- [x] Set up development environment (esbuild, vitest)
+- [x] Define database schema (SQLite frames table)
+- [x] Implement FrameManager (push/pop/query)
+- [x] Implement DualStackManager (hot + cold stacks)
+
+## Stage B: Core Features (Complete)
+- [x] Session capture and restore
+- [x] Handoff prompt generation
+- [x] Context retrieval with semantic search
+- [x] CLI commands (capture, restore, status, context)
+- [x] MCP server with SSE transport
+
+## Stage C: Integrations (Complete)
+- [x] Linear OAuth + task sync
+- [x] Linear webhook handler
+- [x] Claude Code agent bridge
+- [x] Claude Code hooks system
+
+## Stage D: Skills & Orchestration (Complete)
+- [x] RecursiveAgentOrchestrator with 8 subagent types
+- [x] ClaudeSkillsManager with skill routing
+- [x] SpecGeneratorSkill (4-doc chain)
+- [x] LinearTaskRunner (task → RLM → Linear)
+- [x] Agent prompt consolidation (structured templates, latest models)
+- [x] Workflow integration (hooks, skill-rules, CLI)
+
+## Stage E: Team Collaboration (Next)
+- [ ] Shared frame stacks across team members
+- [ ] Conflict resolution for concurrent frame edits
+- [ ] Team activity feed and notifications
+- [ ] Role-based access control for frames
+
+## Stage F: Hosted Service
+- [ ] Railway/Supabase hosted database
+- [ ] User signup and JWT auth
+- [ ] Remote MCP server (HTTP/SSE)
+- [ ] Cross-device sync
+
+## Stage G: Polish & Scale
+- [ ] Browser extension for context capture
+- [ ] Performance optimization (frame indexing, lazy loading)
+- [ ] Telemetry and usage analytics
+- [ ] Plugin marketplace for custom skills