⚠️ Breaking change since062f127(2026-04-04). Everything from hook behavior to memory file layout to the TOML schema has changed between062f127(2026-04-04) andcfd5fb5(2026-04-17). If you have memories written before this change, do not start a new session until you run the/migrateskill. See Changes since062f127and Migrating existing memories at the end of this document.
Engram is a Claude Code plugin that gives agents persistent memory. Four skills help agents prepare for work, recall context, learn from experience, and remember explicitly.
flowchart LR
prepare["/prepare<br/>load context"] --> work[work]
work --> learn["/learn<br/>extract feedback"]
learn -. next session .-> prepare
work -. any time .-> recall["/recall<br/>search"]
work -. any time .-> remember["/remember<br/>capture"]
/prepare loads relevant context before starting work. /learn extracts feedback after work completes. /recall searches session history and memories on demand. /remember captures explicit knowledge immediately.
- Run
/pluginin Claude Code and selectengram. That's the whole install. - On first use (and whenever Go sources change), the
SessionStarthook builds theengrambinary in the background into~/.claude/engram/bin/engramand symlinks it to~/.local/bin/engram. You nevergit cloneor runtarg buildyourself. Requires Go 1.25+ onPATH. - Upgrading from a pre-
cfd5fb5install? Run/migratebefore your first new session. See Migrating existing memories.
These are the two you reach for by hand. Everything else in the core loop should happen without you asking.
| Command | When you use it |
|---|---|
/recall |
Ask Claude to recall something. With no query, Claude pulls recent history for this project and any engram memories from that same span of time. With a query, Claude searches all memory — engram's own feedback/facts plus Claude Code's various memory files — for anything relevant, and pulls it into the current context. |
/remember "..." |
Tell Claude to remember something going forward — a project fact, a convention, or a correction Claude just received. Instead of "don't do that again," say /remember not to do that again, so the lesson carries into future sessions instead of living only in this one. |
/prepare and /learn are expected to fire without you asking, via two complementary mechanisms:
- Skill frontmatter. Each skill's
descriptionlists triggering conditions ("before starting new work", "at completion boundaries"). Claude Code auto-invokes the matching skill when those conditions are met. - Hook reminders. On every
UserPromptSubmit, engram injects a reminder to call/preparebefore new work. On everyPostToolUse, it injects a reminder to call/learnat completion boundaries. The hooks don't force the call — they make sure the model considers it.
When /prepare fires, it runs /recall against the current task and primes Claude with the highest-ranked memories, rules, and skill pointers before work begins. When /learn fires, it reviews what happened in the session so far, identifies reusable patterns (not one-off events), and writes SBIA feedback memories. /learn runs autonomously at completion boundaries and interactively when you invoke it by hand.
You can still call /prepare and /learn by hand, but if the automation is working you shouldn't need to. If you notice you're running them manually a lot, that's a sign the skill frontmatter or the hook reminders need sharpening — please open an issue so we can tune them.
All data lives under ~/.local/share/engram/ (respects $XDG_DATA_HOME):
~/.local/share/engram/memory/
├── feedback/ Behavioral observation memories (SBIA)
└── facts/ Declarative knowledge memories (SPO)
Memories are TOML files with two types: feedback and fact.
Feedback (behavioral observations, SBIA: situation/behavior/impact/action):
schema_version = 2
type = "feedback"
source = "agent"
situation = "implementing new features"
[content]
behavior = "skipped writing tests before implementation"
impact = "bugs found late, rework required"
action = "always write failing test first (TDD red phase)"
created_at = "2026-04-14T12:00:00Z"
updated_at = "2026-04-14T12:00:00Z"Fact (declarative knowledge, SPO: subject/predicate/object):
schema_version = 2
type = "fact"
source = "human"
situation = "building engram"
[content]
subject = "engram"
predicate = "uses"
object = "targ build system for all build/test/check operations"
created_at = "2026-04-14T12:00:00Z"
updated_at = "2026-04-14T12:00:00Z"Memories split into feedback and fact because they answer different questions and surface at different moments.
- Feedback answers "how should I behave here?" — a correction or pattern you want applied or avoided. Only useful if you can tell when to apply it and what to do differently.
- Fact answers "what's true about this project or environment?" — declarative knowledge the agent would otherwise have to rediscover.
Mixing them under a single schema forces every memory to either lose structure ("freeform blob") or pretend to be one type. Keeping them separate means recall can weight them differently and /learn / /remember can guide you through the right fields for each kind.
This split mirrors a long-standing distinction in cognitive psychology. Tulving (1972)¹ distinguished semantic memory (general facts about the world) from episodic memory (personal experiences); Graf & Schacter (1985)² formalized the explicit/implicit divide; Squire (2004)³ consolidated these into the modern taxonomy of long-term memory systems (declarative, with semantic and episodic subtypes, vs. non-declarative, including procedural memory). The distinction also appears in computational cognitive architectures: Anderson's ACT-R⁴ treats declarative facts (chunks) and procedural knowledge (productions) as separate memory stores with different retrieval and learning rules.
Engram's fact type maps to semantic memory — declarative knowledge about a project or environment. Engram's feedback type is closer to an explicit behavioral rule: a procedure stated as a rule the agent can read and apply, rather than truly implicit procedural memory (which is unconscious skill acquired through practice and not directly readable).
¹ Tulving, E. (1972). "Episodic and semantic memory." In Organization of Memory, ed. Tulving & Donaldson. Academic Press. ² Graf, P. & Schacter, D. L. (1985). "Implicit and explicit memory for new associations in normal and amnesic subjects." Journal of Experimental Psychology: Learning, Memory, and Cognition, 11(3), 501–518. ³ Squire, L. R. (2004). "Memory systems of the brain: A brief history and current perspective." Neurobiology of Learning and Memory, 82(3), 171–177. ⁴ Anderson, J. R., Bothell, D., Byrne, M. D., Douglass, S., Lebiere, C., & Qin, Y. (2004). "An integrated theory of the mind." Psychological Review, 111(4). See the Carnegie Mellon ACT-R project for a current overview of the declarative/procedural split.
SBIA = Situation, Behavior, Impact, Action.
- Situation — the task the agent would be doing before this lesson is known (e.g. "writing async Go tests"). This is what the recall query matches against. If it describes the diagnosed problem instead ("after a race condition in test X"), the memory never surfaces for a fresh attempt.
- Behavior — what was actually done that needs changing.
- Impact — the consequence. Without impact, the agent has no reason to prefer the new action over the old one.
- Action — what to do instead.
SBIA is the minimum structure that makes a behavioral correction generalize. Without Situation you can't retrieve it; without Impact the agent rationalizes around it; without Action it isn't actionable.
SBIA builds on the SBI model (Situation-Behavior-Impact™), a feedback framework developed by the Center for Creative Leadership⁵ and canonicalized in Weitzel (2000)⁶. The Action extension has been applied in medical education — notably in analyzing narrative feedback from clinical preceptors in Entrustable Professional Activities (EPAs) e-portfolio systems — where structured action steps are needed alongside the observational triad (Hsu et al., 2021⁷). SBI's three-part structure was designed for live human conversations: pin the moment, describe the observable behavior, name its impact. That's enough in person because the receiver can ask "what should I do differently?" A memory file has no such feedback channel, so engram uses the SBIA form with Action written down at capture time. Engram didn't invent the "SBI + Action" shape — it adopts an existing research-grade framework for the same reason medical educators did: written feedback without an action step isn't actionable.
⁵ Center for Creative Leadership. "Use SBI to Understand Intent vs. Impact." ⁶ Weitzel, S. R. (2000). Feedback That Works: How to Build and Deliver Your Message. Center for Creative Leadership. ⁷ Hsu, T.-C. et al. (2021). "A Study to Analyze Narrative Feedback Record of an Emergency Department." International Journal of Environmental Research and Public Health 18(12): 6265. — uses the SBIA framework to evaluate clinical preceptor feedback in EPA-based e-portfolios. ⁸ Mindtools. "The Situation-Behavior-Impact™ Feedback Tool."
Facts use SPO because declarative knowledge is naturally a triple: some entity has some relationship to some value. "engram uses targ", "the reorder-decls linter requires alphabetical test functions", "the auto-memory path derives from the git main repo root".
The triple form forces you to name the entity explicitly, which keeps facts from drifting into freeform prose that's hard to search and impossible to update. It also makes duplicate detection tractable — two facts with the same subject and predicate are candidates for merging.
Situation still matters for facts — it's when the fact is relevant, not what the fact says. A fact with no situation surfaces for every query; a fact with a sharp situation surfaces only when it matters.
This shape is the RDF triple — the atomic data unit in the W3C Resource Description Framework⁹, introduced in RDF 1.0 (Lassila & Swick, 1999)¹⁰ and refined as RDF 1.1 (2014)¹¹. Triples are also the foundation of Berners-Lee, Hendler & Lassila's (2001) vision of the Semantic Web¹². A triple makes a claim: the relationship indicated by the predicate holds between the subject and the object. Engram doesn't build an RDF graph, but it borrows the shape because the same constraint — you have to name the entity explicitly to say anything about it — keeps facts retrievable rather than drifting into freeform paragraphs.
⁹ Semantic triple — Wikipedia for an overview of how triples are used in knowledge graphs. ¹⁰ Lassila, O. & Swick, R. R., eds. (1999). Resource Description Framework (RDF) Model and Syntax Specification. W3C Recommendation. ¹¹ W3C (2014). RDF 1.1 Concepts and Abstract Syntax. W3C Recommendation. ¹² Berners-Lee, T., Hendler, J. & Lassila, O. (2001). "The Semantic Web." Scientific American, 284(5), 34–43.
Claude Code already persists memory across sessions via CLAUDE.md, .claude/rules/, and auto memory — the last of which has Claude write notes to itself across sessions. The built-in surfaces get loaded either in full or by a static cap (auto memory's MEMORY.md is capped at the first 200 lines or 25 KB), and there is no query-based retrieval. The more memories accumulate, the more context gets spent on irrelevant ones.
Engram doesn't replace any of this — engram reads from all of it, and adds a query-ranked retrieval layer on top.
| Dimension | Auto memory (built-in) | Engram |
|---|---|---|
| Who writes | Claude, implicitly during a session | You (via /remember) or Claude (via /learn), with a quality gate before write |
| Shape | Freeform markdown | Typed TOML: feedback (SBIA) or fact (SPO) |
| How retrieved | First 200 lines / 25 KB of MEMORY.md loaded at session start; topic files read on-demand during the session by Claude's judgement |
Query-ranked via Haiku against the task in /prepare or /recall |
| Scope of retrieval | Auto memory only | Auto memory plus .claude/rules/, CLAUDE.md (with @-imports), project/user/plugin skills, and engram's own feedback/facts — merged and ranked together |
| Duplicate detection | None — Claude just appends | engram learn returns CREATED / DUPLICATE / CONTRADICTION; /learn can broaden an existing memory's situation when it sees a near-miss |
| Capture point | Implicit, inside a session | Explicit: /learn at completion boundaries, /remember on user cue |
The short version: auto memory persists text. Engram persists structure, ranks against a query, and pulls from every surface Claude Code exposes — not just its own directory. Feedback memories enforce situation/behavior/impact/action; fact memories enforce situation + subject/predicate/object. The shape exists so ranking has something sharper to match against than freeform prose.
/prepare and /recall merge memories from several sources before ranking them with Haiku. You write only to engram's own data directory; you read from everywhere Claude Code already knows about.
| Kind | Path (resolved per session) | Source |
|---|---|---|
| Engram feedback | ~/.local/share/engram/memory/feedback/*.toml |
Written by /learn, /remember |
| Engram facts | ~/.local/share/engram/memory/facts/*.toml |
Written by /learn, /remember |
| Claude Code auto-memory | ~/.claude/projects/<project-slug>/memory/*.md |
Written by Claude Code's built-in memory system |
| Project rules | .claude/rules/*.md (walking up from cwd) |
Written by the project (checked in) |
| CLAUDE.md | Project root, user (~/.claude/CLAUDE.md), plus @-imports |
Written by project and user |
| Skills | Project .claude/skills/, user ~/.claude/skills/, plugin cache |
Written by project, user, and installed plugins |
The <project-slug> for auto-memory is derived from the git main repo root (not the worktree), replacing / with -. External sources are discovered once per recall invocation, cached via a per-call FileCache, then ranked against your query.
flowchart LR
src1[CLAUDE.md] -. read .-> eng
src2[.claude/rules/] -. read .-> eng
src3[auto memory] -. read .-> eng
src4[skills<br/>project + user + plugin] -. read .-> eng
own[engram feedback/facts] -. read .-> eng
eng{{engram}} == write ==> own
The asymmetry is deliberate. Reading everywhere surfaces knowledge already captured in CLAUDE.md, rules, or auto memory without requiring you to re-enter it. Writing only to engram's own directory means engram never rewrites your CLAUDE.md, edits a skill it didn't author, or mutates auto-memory Claude Code manages. Uninstall engram and every other source survives untouched; only ~/.local/share/engram/ and the built binary at ~/.claude/engram/bin/engram need cleanup.
/recall is the retrieval engine the other skills build on. No query ⇒ plain session read (Mode A). Query ⇒ six extraction phases (Mode B), each sharing one byte budget, with early exits when the budget is exhausted or nothing was found:
flowchart TB
start([recall]) --> qcheck{query<br/>given?}
qcheck -- no --> modeA["Mode A<br/>read recent sessions<br/>until Mode-A budget cap"]
modeA --> ret([return context])
qcheck -- yes --> p1["Phase 1<br/>engram memories"]
p1 --> g1{budget<br/>left?}
g1 -- no --> empty{buffer<br/>empty?}
g1 -- yes --> p2["Phase 2<br/>auto memory"]
p2 --> g2{budget<br/>left?}
g2 -- no --> empty
g2 -- yes --> p3["Phase 3<br/>session transcripts"]
p3 --> g3{budget<br/>left?}
g3 -- no --> empty
g3 -- yes --> p4["Phase 4<br/>skills"]
p4 --> g4{budget<br/>left?}
g4 -- no --> empty
g4 -- yes --> p5["Phase 5<br/>CLAUDE.md + rules"]
p5 --> empty
empty -- yes --> ret_empty([return empty])
empty -- no --> p6["Phase 6<br/>Haiku summarize"]
p6 --> ret
Every extraction phase (Phases 1–5) shares the same inner shape. The rank call is a single Haiku request; the extract calls are one per winner, and the loop can cancel on context or budget mid-phase:
flowchart LR
enter([enter phase]) --> bcheck{budget<br/>remaining?}
bcheck -- no --> skip([skip phase])
bcheck -- yes --> idxcheck{index<br/>file<br/>readable?}
idxcheck -- no --> skip
idxcheck -- yes --> rank["Haiku rank<br/>one call"]
rank --> loop{next<br/>ranked<br/>item?}
loop -- none --> done([return bytes added])
loop -- yes --> itemok{budget<br/>and ctx<br/>ok?}
itemok -- no --> done
itemok -- yes --> extract["Haiku extract<br/>snippet → buffer"]
extract --> loop
So "the pipeline" is really six conditional phases feeding a shared buffer through an inner rank-then-iterate loop, with three classes of early exit: budget exhausted, context cancelled, or nothing relevant found. File I/O is cached once per invocation via FileCache so the same path isn't read twice across phases.
/prepare answers "what do I need to know before starting this?" It breaks the pending task into 2–3 sub-topic queries, runs /recall for each, and presents a combined briefing. The situations engram writes at learn-time are phrased as tasks ("writing Go tests in internal/"), so /prepare queries in the same voice — by task, not by fear. Asking "common mistakes when writing tests" will miss memories that were actually stored against "writing tests."
flowchart LR
start([prepare]) --> analyze["Analyze pending task<br/>activity + domain"]
analyze --> queries["Pick 2–3 sub-topic queries<br/>(phrased by task)"]
queries --> fan(("fan out"))
fan --> r1[["/recall sub-topic 1"]]
fan --> r2[["/recall sub-topic 2"]]
fan --> r3[["/recall sub-topic 3"]]
r1 --> brief
r2 --> brief
r3 --> brief["Summarize into<br/>task briefing"]
brief --> done([ready to work])
/learn fires at completion boundaries (via skill trigger, the PostToolUse reminder, or an explicit invocation). It scans the recent session for lessons worth persisting, then filters each candidate through three gates before writing. The gates exist to keep retrieval clean: memories that won't recur, aren't actionable, or belong in a different home (code, CLAUDE.md, a rule, a skill) get dropped rather than poisoning future recalls. Autonomous runs use --source agent; interactive runs present findings for approval first.
flowchart TB
start([learn]) --> scan["Scan session for candidates<br/>corrections · failures · facts · patterns"]
scan --> loop{next<br/>candidate?}
loop -- none --> fin([done])
loop -- yes --> recurs{Recurs<br/>beyond this<br/>project?}
recurs -- no --> drop[drop]
drop --> loop
recurs -- yes --> actionable{Actionable<br/>change to<br/>behavior?}
actionable -- no --> drop
actionable -- yes --> home{Home missed<br/>or no home<br/>fits?}
home -- home already surfaced --> drop
home -- yes --> write["engram learn<br/>--source agent"]
write --> r{result}
r -- CREATED --> loop
r -- DUPLICATE --> broaden["update existing<br/>situation to broaden"]
broaden --> loop
r -- CONTRADICTION --> skip["skip (autonomous)<br/>or ask (interactive)"]
skip --> loop
/remember is /learn's user-triggered sibling. The same three gates apply, but nothing writes without your approval, and the resulting memory is marked --source human. A DUPLICATE doesn't get silently dropped — engram already knew this but failed to surface it in time, so the existing memory's situation gets broadened so next session finds it.
flowchart TB
user(["user: /remember X"]) --> classify{"feedback<br/>or fact?<br/>(maybe both)"}
classify --> gates["Same 3 gates as /learn<br/>Recurs · Actionable · Right home"]
gates -- drop --> tell["tell user why,<br/>suggest the right home"]
gates -- pass --> draft["Draft fields,<br/>present for approval"]
draft --> approve{user<br/>approves?}
approve -- no --> tell
approve -- yes --> write["engram learn<br/>--source human"]
write --> r{result}
r -- CREATED --> saved([saved])
r -- DUPLICATE --> broaden["broaden existing<br/>situation"] --> saved
r -- CONTRADICTION --> conflict["present conflict:<br/>update · replace · keep both"] --> saved
The engram binary provides CLI access to memory operations:
engram recall Recall recent session context
engram list List all memories with type, name, and situation
engram learn feedback --behavior "..." --impact "..." --action "..." --source human --situation "..."
engram learn fact --subject "..." --predicate "..." --object "..." --source human --situation "..."
engram update Modify fields of an existing memory (--name required)
engram show Display full memory details (--name required)
cmd/engram/ CLI entry point (thin wiring layer)
internal/ Business logic (DI boundaries)
anthropic/ Anthropic API client
cli/ CLI command wiring
context/ Context extraction
externalsources/ CLAUDE.md, rules, skills, auto-memory discovery
memory/ Memory storage/retrieval
recall/ Recall pipeline (six phases)
tokenresolver/ Token budgeting
tomlwriter/ TOML serialization
skills/ Plugin skills (recall, prepare, learn, remember, migrate)
hooks/ Shell hooks for Claude Code integration
.claude-plugin/ Plugin manifest
archive/ Historical planning artifacts
targ build— build theengrambinarytarg test— run unit + integration teststarg check-full— lint + coverage (use this to see ALL errors at once)- Never run
go test/go build/go vetdirectly — usetarg
- DI everywhere — No function in
internal/callsos.*,http.*, or any I/O directly. All I/O through injected interfaces, wired at CLI edges. - Pure Go, no CGO — TF-IDF for text similarity. External API for LLM classification only.
- Plugin form factor — Skills for behavior, slim Go binary for computation.
- Measure impact, not frequency — Content quality over mechanical sophistication.
- Read everywhere, write only what you own — Pull context from every surface Claude Code exposes; never mutate files engram didn't create.
- No always-on surfacing. The BM25 hook-surfacing pipeline was removed in this revision. Skills decide when to load memories; nothing is injected at every prompt.
- No self-tuning adaptation loop. The previous
/adaptskill and effectiveness quadrants are gone. Memory quality depends on situation-query alignment (see/learnand/rememberskill guidance). - No cross-agent coordination. Engram is per-user, per-machine. For multi-agent coordination, use
mycelium. - No vector embeddings. Text similarity uses TF-IDF + Haiku classification. Pure Go, no CGO, no ONNX.
This is a substantial rewrite. Everything listed below changed between commit 062f127 (2026-04-04) and commit cfd5fb5 (2026-04-17).
| Area | Before (at 062f127) |
After (at cfd5fb5) |
|---|---|---|
| Surfacing | BM25 scoring on every UserPromptSubmit, hook-driven |
Skills load context on demand (/prepare, /recall) |
| Memory file layout | Flat ~/.local/share/engram/memories/*.toml |
Split: ~/.local/share/engram/memory/feedback/*.toml and ~/.local/share/engram/memory/facts/*.toml |
| TOML schema | Flat fields: title, content, concepts, keywords, principle, anti_pattern, confidence, outcome counters |
schema_version = 2, type discriminator, source, situation, [content] sub-table |
| Outcome tracking | Per-memory counters (surfaced_count, followed_count, not_followed_count, irrelevant_count) |
Removed — focus moved to situation-query matching |
| Confidence tiers | A / B / C | Removed — replaced by source = "human" or source = "agent" |
| Adaptation | /adapt skill, effectiveness quadrants, proposals |
Removed — simpler model, no self-tuning loop |
| Hooks | Stop (async extract), UserPromptSubmit (surface) |
SessionStart, UserPromptSubmit, PostToolUse — reminders only, no surfacing |
| Recall | Always-on injection via BM25 | Three-phase pipeline: auto-memory ranking, skill frontmatter ranking, CLAUDE.md/rules extraction. Haiku filters for relevance. Triggered by /recall or /prepare. |
If you ran engram before cfd5fb5 (2026-04-17), you have memories in the old flat layout with confidence, surfaced_count, and other fields that no longer exist. They will not load under the current code.
Do not start a new session on the updated engram until you migrate. Running fresh against unmigrated memories will create a mixed state that's hard to untangle.
The /migrate skill walks you through:
- Locating your existing memory files under
~/.local/share/engram/. - Reading each file and classifying it as feedback or fact (judgement required — the skill guides this).
- Rewriting the situation to a task-shaped form that recall can actually match.
- Dropping obsolete fields (
surfaced_count,followed_count,not_followed_count,irrelevant_count,project_scoped,confidence,keywords,concepts,title). - Writing the new file to
~/.local/share/engram/memory/feedback/or~/.local/share/engram/memory/facts/. - Verifying each migrated file with
engram show --name <slug>. - Archiving the old files to a dated directory (not deleting them).
The skill file is at skills/migrate/SKILL.md. Invoke with /migrate in Claude Code.