Local, project-scoped memory system for language models with evidence-based truth validation.
tinyMem is a standalone Go executable that gives small and medium language models reliable long-term memory in complex codebases. It acts as a truth-aware prompt governor, sitting between the developer and the LLM to inject verified context and capture validated facts—all without requiring model retraining or cloud services.
tinyMem exists because language models forget context quickly, hallucinate unchecked facts, and do not know when they should double-check their answers. By keeping a local, project-scoped database of classified, evidence-backed statements, it:
- empowers developers with deterministic, token-budgeted context injection so models can remember decisions without re-reading every file,
- enforces truth discipline so claims are only promoted to facts when local verification (greps, tests, file checks) succeeds,
- stays entirely on your machine, avoiding cloud lock-in while remaining auditable and transparent.
These benefits make tinyMem particularly useful for teams that use smaller models (7B–13B) or need consistent memory across multiple IDEs and tools.
tinyMem operates on three core principles:
- Memory is not gospel – Model output is never trusted by default
- Facts require evidence – Claims without verification are stored as claims, not facts
- Reality checks are free – Evidence verification happens locally using filesystem checks, grep, command execution, and test runs
This approach prevents language models from hallucinating institutional knowledge while dramatically improving their ability to maintain context across long development sessions.
- Evidence-Based Truth System: All memory entries are typed (fact, claim, plan, decision, constraint, observation, note). Only claims with verified evidence become facts.
- Chain-of-Verification (CoVe): Optional LLM-based quality filter that reduces hallucinated memory candidates before storage (enabled by default; can be disabled)
- Local Execution: Runs entirely on your machine as a single executable. No cloud dependencies.
- Project-Scoped: All state lives in the `.tinyMem/` directory within your project
- Streaming First: Responses stream immediately, with no buffering delays
- Zero Configuration: Works out of the box with sensible defaults
- Dual Integration Mode: Operates as HTTP proxy or MCP server for IDE integration
- Token Budget Control: Deterministic prompt injection with configurable limits
- Hybrid Search: Combines FTS (lexical) with optional semantic search
- Recall Tiers: Memories are categorized into recall tiers (always, contextual, opportunistic) for efficient token usage
- Truth State Management: Memories have truth states (tentative, asserted, verified) for better context prioritization
- Comprehensive Metrics: Built-in instrumentation for monitoring token usage and recall effectiveness
For a simpler, step-by-step guide aimed at less technical users, please see the Quick Start Guide for Beginners.
Download the latest release for your platform from the releases page.
For macOS and Linux:
# The command will download the correct binary for your system.
curl -L "https://github.com/andrzejmarczewski/tinyMem/releases/latest/download/tinymem-$(uname -s | tr '[:upper:]' '[:lower:]')-$(uname -m)" -o tinymem
chmod +x tinymem
# For global access, move it to a directory in your PATH:
sudo mv tinymem /usr/local/bin/

For Windows:
- Download the `tinymem-windows-amd64.exe` (or other architecture) file from the releases page.
- Rename it to `tinymem.exe` for convenience.
- Place it in a folder (e.g., `C:\Program Files\tinymem`).
- Add that folder to your system's `Path` environment variable to run `tinymem` from any terminal.
See Adding tinymem to your PATH for detailed instructions on updating your PATH.
git clone https://github.com/andrzejmarczewski/tinyMem.git
cd tinymem
# Build with all features (FTS5 enabled by default)
make

The build-minimal target still enforces FTS5 and is provided for compatibility with older workflows:
make build-minimal

Or run go build directly with the required tag:
go build -tags fts5 -o tinymem ./cmd/tinymem

FTS5 support is mandatory; there is no supported build that omits the fts5 tag.
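A quick sanity check after building is to run the version command against the freshly built binary:

# Confirm the build produced a working executable
./tinymem version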
Once built, the tinymem executable will be in your current directory. For easier access, consider moving it to a directory included in your system's PATH (e.g., /usr/local/bin/ on macOS/Linux) or adding your project directory to your PATH environment variable.
It's highly recommended to have the tinymem executable available in your system's PATH. This allows you to run tinymem commands from any directory without specifying the full path (e.g., tinymem health instead of ./tinymem health). This is particularly important for seamless integration with IDEs and other tools that expect tinymem to be globally accessible.
To make tinymem easily callable from any directory:
Option 1: Move to a system PATH directory (recommended for global access)
# For macOS/Linux users, after building or downloading:
# Move the compiled binary to a directory already in your PATH, like /usr/local/bin/
sudo mv tinymem /usr/local/bin/

Note: This requires administrator/root privileges.
Option 2: Add your project directory to your PATH (recommended for project-specific versions)
If you prefer to keep the tinymem binary within your project directory, you can add that directory to your PATH. This is useful if you work on multiple projects that might require different tinymem versions.
- macOS/Linux (Bash/Zsh): Open your `~/.bashrc`, `~/.bash_profile`, or `~/.zshrc` file and add the following line, replacing `/path/to/your/project` with the actual absolute path to your `tinymem` executable:

  export PATH="/path/to/your/project:$PATH"

  After saving, run `source ~/.bashrc` (or your respective shell config file) or restart your terminal.

- Windows (Command Prompt): Open Command Prompt as administrator and run:

  setx PATH "%PATH%;C:\path\to\your\project"

  Replace `C:\path\to\your\project` with the actual absolute path. You may need to restart your command prompt or computer for changes to take effect.

- Windows (PowerShell): Run PowerShell as administrator and execute:

  [Environment]::SetEnvironmentVariable("Path", "$env:Path;C:\path\to\your\project", "User")

  Replace `C:\path\to\your\project` with the actual absolute path. Restart PowerShell for changes to apply.
Requirements: Go 1.22 or later
cd /path/to/your/project
tinymem health

This creates the `.tinyMem/` directory structure and initializes the SQLite database.
# Start the proxy server
tinymem proxy

Now, in a separate terminal where you run your LLM client (e.g., a script using the OpenAI library), configure it to use the tinymem proxy by setting the API base URL. This directs your client to send requests to tinymem instead of directly to the LLM provider.
# In your LLM client's terminal:
export OPENAI_API_BASE_URL=http://localhost:8080/v1

The proxy intercepts requests to your LLM, injects relevant memories, and captures new context automatically.
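A request routed through the proxy is an ordinary OpenAI-compatible chat completion call; in the sketch below the model name is a placeholder for whatever your backend actually serves:

# Minimal example of a chat request sent through the tinyMem proxy.
# "llama3" is a placeholder model name; substitute your backend's model.
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "llama3", "messages": [{"role": "user", "content": "Where is authentication handled?"}]}'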
# Start MCP server for stdio-based IDEs
tinymem mcp

Configure your IDE (Cursor, VS Code, etc.) to use tinyMem as an MCP server. See IDE Integration below.
# Health and diagnostics
tinymem health # Check system health
tinymem doctor # Run detailed diagnostics
tinymem stats # Show memory statistics
tinymem dashboard # Show memory state snapshot
# Memory operations
tinymem query "authentication flow" # Search memories
tinymem recent # Show recent memories
tinymem write --type note --summary "My note" # Write a new memory
# Server modes
tinymem proxy # Start HTTP proxy server
tinymem mcp # Start MCP server
# Utilities
tinymem run -- your-command # Run command with memory context
tinymem version # Show version
tinymem addContract # Add tinyMem protocol to agent config files
tinymem completion [bash|zsh|fish|powershell] # Generate shell completion script

The write command allows you to add memories directly from the command line:
# Write a simple note
tinymem write --type note --summary "API refactoring complete"
# Write a decision with detail
tinymem write --type decision --summary "Use PostgreSQL for production" \
--detail "SQLite for dev, PostgreSQL for prod due to concurrency needs" \
--source "architecture review"
# Write a claim
tinymem write --type claim --summary "Frontend uses React 18" \
--key "frontend-framework"Available types: claim, plan, decision, constraint, observation, note
Note: fact type cannot be created directly via CLI as facts require verified evidence. Use claim instead, or use the MCP interface with evidence.
tinyMem categorizes all memory entries into typed buckets:
| Type | Description | Evidence Required | Auto-Promoted | Default Recall Tier | Default Truth State |
|---|---|---|---|---|---|
| fact | Verified truth about the codebase | Yes | No | Always | Verified |
| claim | Model assertion not yet verified | No | No | Contextual | Tentative |
| plan | Intended future action | No | No | Opportunistic | Tentative |
| decision | Confirmed choice or direction | Yes (confirmation) | No | Contextual | Asserted |
| constraint | Hard requirement or limitation | Yes | No | Always | Asserted |
| observation | Neutral context or state | No | Yes (low priority) | Opportunistic | Tentative |
| note | General information | No | Yes (lowest priority) | Opportunistic | Tentative |
| task | Specific task or action item | No | No | Contextual | Tentative |
Memory entries can have an optional classification field to improve recall precision:
| Classification | Purpose |
|---|---|
| decision | Important architectural or design decisions |
| constraint | Technical or business constraints |
| glossary | Definitions of terms or concepts |
| invariant | System invariants or guarantees |
| best-practice | Recommended approaches or patterns |
| pitfall | Common mistakes or gotchas to avoid |
Classification is optional and does not affect memory behavior, but can be used to improve search precision.
Memory entries are assigned recall tiers that determine their inclusion priority during prompt injection:
- Always: High-priority memories (facts, constraints) that are always included when relevant
- Contextual: Medium-priority memories (decisions, claims) included based on relevance and token budget
- Opportunistic: Low-priority memories (observations, notes) only included if space permits
To prevent token waste and irrelevant recall, follow these guidelines:
- Startup Phase: Use an empty query (`memory_query(query="")`) or `memory_recent()` to establish initial context
- Working Phase: Use targeted keyword queries (`memory_query(query="authentication flow")`) for specific topics
- Token Efficiency: Limit results with the `limit` parameter when appropriate
- Classification Filtering: Use the classification field to improve precision when available
- Avoid Over-Recall: Don't use broad queries that return many irrelevant memories
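Putting those guidelines together, a typical session might issue calls like these (the query strings are illustrative):

# Startup: establish broad initial context
memory_query(query="")

# Working: targeted lookup, capped to keep token usage low
memory_query(query="database migrations", limit=5)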
Memory entries have truth states that indicate their reliability level:
- Verified: Confirmed with evidence (facts that have passed verification)
- Asserted: Confirmed importance (decisions and constraints)
- Tentative: Unverified or lower-confidence information (claims, observations, notes)
Evidence is verified locally without LLM calls:
# Example: Model claims "User authentication is handled in auth.go"
# tinyMem checks:
- file_exists: auth.go
- grep_hit: "func.*[Aa]uthenticate" in auth.go
- test_pass: go test ./internal/auth/...
# If checks pass → stored as fact
# If checks fail → stored as claim

Evidence types:
- `file_exists`: File or directory exists
- `grep_hit`: Pattern matches in file
- `cmd_exit0`: Command exits successfully
- `test_pass`: Test suite passes
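As an illustration of how evidence might attach to a claim via MCP, consider the sketch below; the exact argument shape for evidence is not documented here, so treat the field names as hypothetical:

# Hypothetical sketch only - the real memory_write evidence schema may differ.
memory_write(
    type="claim",
    summary="User authentication is handled in auth.go",
    evidence=[
        {"type": "file_exists", "path": "auth.go"},
        {"type": "grep_hit", "path": "auth.go", "pattern": "func.*[Aa]uthenticate"}
    ]
)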
When using command-based evidence verification (cmd_exit0 and test_pass), be aware of the following security implications:
- Command Execution: These evidence types execute commands on your system, which could pose security risks if malicious patterns are introduced.
- Whitelist Configuration: By default, command execution is disabled. To enable it, you must explicitly set `TINYMEM_EVIDENCE_ALLOW_COMMAND=true` and configure `TINYMEM_EVIDENCE_ALLOWED_COMMANDS` with a whitelist of permitted commands.
- Path Safety: tinyMem implements path traversal protection to prevent access to files outside the project directory.
- Command Validation: All commands undergo validation to prevent shell injection attacks.
For production environments, carefully consider whether to enable command-based evidence verification and maintain a strict whitelist of allowed commands.
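For example, enabling command evidence with a narrow whitelist might look like this (the comma-separated list format is an assumption; check the documentation for your version's exact syntax):

# Enable command-based evidence with an explicit whitelist.
# The comma-separated value format below is an assumption.
export TINYMEM_EVIDENCE_ALLOW_COMMAND=true
export TINYMEM_EVIDENCE_ALLOWED_COMMANDS="go test,grep"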
┌─────────────┐
│ LLM Client │ (IDE, CLI tool, API client)
└──────┬──────┘
│
↓
┌─────────────────────────────────────────┐
│ tinyMem Proxy/MCP │
│ ┌───────────────────────────────────┐ │
│ │ 1. Recall Engine │ │
│ │ - FTS search (BM25) │ │
│ │ - Optional semantic search │ │
│ │ - Token budget enforcement │ │
│ │ - Recall tier prioritization │ │
│ │ - Truth state filtering │ │
│ │ - Optional CoVe filtering │ │
│ └───────────────────────────────────┘ │
│ ↓ │
│ ┌───────────────────────────────────┐ │
│ │ 2. Prompt Injection │ │
│ │ - Bounded system message │ │
│ │ - Type annotations │ │
│ │ - Evidence markers │ │
│ │ - Tier and truth state info │ │
│ └───────────────────────────────────┘ │
└──────────┬──────────────────────────────┘
│
↓
┌──────────────┐
│ LLM Backend │ (Ollama, LM Studio, etc.)
└──────┬───────┘
│
↓ (streaming response)
┌──────────────────────────┐
│ 3. Extraction │
│ - Parse response │
│ - Extract claims │
│ - CoVe filter (opt.) │◄─ Quality gate
│ - Validate evidence │
│ - Store safely │
│ - Apply truth states │
└──────────────────────────┘
↓
┌──────────────────┐
│ SQLite Storage │
│ (.tinyMem/store.sqlite3)
└──────────────────┘
tinyMem works with zero configuration. Override defaults via .tinyMem/config.toml:
[proxy]
port = 8080
base_url = "http://localhost:11434/v1" # Ollama default
[recall]
max_items = 10
max_tokens = 2000
semantic_enabled = false
hybrid_weight = 0.5 # Balance between FTS and semantic
[memory]
auto_extract = true
require_confirmation = false
[cove]
# Chain-of-Verification quality filter (enabled by default)
enabled = true
confidence_threshold = 0.6 # Min confidence to keep (0.0-1.0)
max_candidates = 20 # Max candidates per batch
timeout_seconds = 30 # LLM call timeout
model = "" # Empty = use default LLM
recall_filter_enabled = false # Enable recall filtering
[logging]
level = "info" # off, error, warn, info, debug
file = ".tinyMem/logs/tinymem.log"
[metrics]
enabled = false # Enable comprehensive metrics and logging (off by default)

The same settings can be overridden with environment variables:

TINYMEM_PROXY_PORT=8080
TINYMEM_LLM_BASE_URL=http://localhost:11434/v1
TINYMEM_LOG_LEVEL=debug
TINYMEM_METRICS_ENABLED=false # Enable comprehensive metrics and logging
# CoVe settings (optional)
TINYMEM_COVE_ENABLED=true
TINYMEM_COVE_CONFIDENCE_THRESHOLD=0.7
TINYMEM_COVE_MAX_CANDIDATES=20

CoVe is an optional quality filter that evaluates memory candidates before storage. When enabled (the default), it:
- Assigns confidence scores (0.0-1.0) to each candidate based on specificity and certainty
- Filters out low-confidence, speculative, or hallucinated extractions
- Operates transparently with fail-safe fallback (errors don't block storage)
- Never participates in fact promotion (evidence verification is separate)
CoVe significantly improves memory quality by reducing hallucinated extractions, but it does add some overhead:
- Token Usage: CoVe makes additional LLM calls to evaluate memory candidates, which can slightly increase your token usage.
- Latency: Each extraction event will have a small delay while CoVe evaluates candidates (typically 0.5-2 seconds).
- Cost: Additional API calls to your LLM provider may incur extra costs.
If you need to disable CoVe (for performance reasons or to reduce token usage), you can do so:
Via TOML configuration:
[cove]
enabled = false # CoVe completely disabled

Via environment variable:
export TINYMEM_COVE_ENABLED=false

See docs/COVE.md for detailed documentation, configuration examples, and performance considerations.
When using tinyMem as an MCP server for AI agents, ensure that your agents follow the MANDATORY TINYMEM CONTROL PROTOCOL.
Include the contract content from docs/AGENT_CONTRACT.md in your agent's system prompt to ensure proper interaction with tinyMem.
Quick Start: Run the verification script to ensure MCP is ready:
./verify_mcp.sh

This will test your setup and provide the exact configuration to copy.
Manual Configuration:
Add the following server configuration to your claude_desktop_config.json file. Note that the exact path to this file may vary slightly depending on your operating system and how you installed Claude Desktop.
- macOS: `~/Library/Application Support/Claude/claude_desktop_config.json`
- Windows: `%APPDATA%\Claude\claude_desktop_config.json`
- Linux: `~/.config/Claude/claude_desktop_config.json`
{
"mcpServers": {
"tinymem": {
"command": "/path/to/tinymem",
"args": ["mcp"],
"env": {}
}
}
}

Important: Use the absolute path to your tinymem executable. After updating the configuration, restart Claude Desktop.
For detailed MCP troubleshooting, see docs/MCP_TROUBLESHOOTING.md.
Available MCP tools:
- `memory_query` - Search memories using full-text or semantic search
- `memory_recent` - Retrieve the most recent memories
- `memory_write` - Create a new memory entry with optional evidence
- `memory_stats` - Get statistics about stored memories (includes CoVe metrics when enabled)
- `memory_health` - Check the health status of the memory system
- `memory_doctor` - Run diagnostics on the memory system
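Once the standard MCP initialize handshake has completed, clients invoke these tools with JSON-RPC tools/call messages; a memory_query request would look roughly like this:

{"jsonrpc":"2.0","id":2,"method":"tools/call","params":{"name":"memory_query","arguments":{"query":"authentication flow","limit":5}}}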
Configure your LLM extension to use the tinymem proxy. Since the proxy forwards the request to your actual LLM backend (which is configured with the real API key), you can often use a dummy key in your editor's settings.
{
"continue.apiBase": "http://localhost:8080/v1",
"continue.apiKey": "dummy"
}

See docs/agents/QWEN.md for detailed Qwen integration setup.
For Qwen and Gemini, you can configure MCP integration by adding the following to your respective configuration files:
For Qwen (in .qwen/QWEN.md or project configuration):
{
"mcpServers": {
"tinymem": {
"command": "/path/to/tinymem",
"args": ["mcp"],
"env": {}
}
}
}

For Gemini (in .gemini/CONFIG.md or project configuration):
{
"mcpServers": {
"tinymem": {
"command": "/path/to/tinymem",
"args": ["mcp"],
"env": {}
}
}
}

Important: Use the absolute path to your tinymem executable. After updating the configuration, restart your respective application.
tinyMem is designed to be integrated with AI agents, providing them with a local, project-scoped memory system. To ensure effective and reliable interaction, AI agents should adhere to specific directives when using tinyMem.
Your primary function is to leverage tinyMem's memory to provide contextually-aware answers. Before providing any code or explanation from your general knowledge, you MUST first consult tinyMem's memory using memory_query.
- `memory_query(query: str, limit: int = 10)` - Search project memory (use this FIRST)
- `memory_recent(count: int = 10)` - Get recent memory entries
- `memory_write(type: str, summary: str, ...)` - Create new memory entries
- `memory_stats()` - View memory statistics (includes CoVe metrics when enabled)
- `memory_health()` - Check system health
- `memory_doctor()` - Run detailed diagnostics
User asks: "How should we implement authentication?"
Wrong approach (❌):
You respond with general JWT/OAuth advice from your training...
Correct approach (✅):
# Step 1: Query memory FIRST
memory_query(query='authentication implementation')
# Step 2: Synthesize from results
# Found: DECISION - "Use OAuth2 with Google/GitHub"
# Found: CONSTRAINT - "Must support enterprise SSO"
# Step 3: Answer based on memory
"Based on project decisions, you've chosen OAuth2 with
Google and GitHub providers, with plans to add enterprise
SSO. Would you like me to outline the implementation?"

For comprehensive AI Assistant Directives including:
- Mandatory reasoning process for every query
- Detailed tool usage guidelines
- Chain-of-Verification (CoVe) transparency
- Memory type best practices
- Complete workflow examples
- Critical reminders and error patterns
See the full directives in docs/agents/CLAUDE.md
Pick the directive file that matches your model and paste its contents verbatim into your system prompt or project instructions (do not summarize or paraphrase).
Choose the correct file:
- Claude: `docs/agents/CLAUDE.md`
- Gemini: `docs/agents/GEMINI.md`
- Qwen: `docs/agents/QWEN.md`
- Custom/other agents: `docs/AGENT_CONTRACT.md`
This ensures the assistant:
- Always queries memory before providing project-specific answers
- Understands all memory types and tools
- Knows when to create new memories
- Recognizes CoVe's transparent operation
- Follows evidence-based truth discipline
Concrete examples:
- Claude Desktop/Cursor: Paste `docs/agents/CLAUDE.md` into project instructions or `.clinerules`
- Continue (VS Code): Paste the matching file into the system message or a context file
- Custom agents: Prepend the matching file to the system prompt at initialization
1. Start tinyMem proxy in your project:

   cd ~/projects/myapp
   tinymem proxy

2. Configure your LLM client to point to http://localhost:8080/v1

3. Work naturally with your LLM:
   - Ask questions about your codebase
   - Request changes
   - Discuss architecture decisions

4. tinyMem automatically:
   - Injects relevant memories into each prompt
   - Captures facts from responses (with evidence)
   - Maintains truth discipline (claims ≠ facts)
   - Streams responses without delay

5. Query memory state:

   tinymem stats
   tinymem query "database schema"
   tinymem recent
# Query specific topic
tinymem query "API endpoints" --limit 5
# View recent activity
tinymem recent --count 20
# Clear all memories (nuclear option)
rm -rf .tinyMem/store.sqlite3
tinymem health # Recreates DB

# Inject memory context into command environment
tinymem run -- your-test-runner --verbose

# Run comprehensive diagnostics
tinymem doctor
# Check what's failing:
# - DB connectivity
# - FTS availability
# - Semantic search status
# - LLM backend reachability
# - Filesystem permissions
# - Port conflicts

Error: "Request timed out" or "Client is not connected"
These errors indicate the MCP server isn't maintaining a stable connection. The most common cause is logging output interfering with the stdio protocol. The latest version fixes this by using silent logging (file-only) for MCP mode.
To verify the fix worked:
1. Verify the tinymem path is absolute:

   # Find the full path
   which tinymem
   # or, if it's in your project directory:
   pwd  # then use /full/path/to/tinymem

2. Test the MCP server manually:

   cd /path/to/your/project
   ./tinymem mcp
   # Then send a test message:
   {"jsonrpc":"2.0","method":"initialize","params":{"protocolVersion":"2024-11-05"},"id":1}
   # You should see a JSON response immediately

3. Check logs:

   # If logging is enabled, check the logs
   cat .tinyMem/logs/tinymem.log

4. Verify database initialization:

   # Run from your project directory
   ./tinymem health

5. Restart Claude Desktop after updating the MCP configuration
Error: "Tool not found"
If you get "tool not found" errors, make sure you're using underscore names (memory_query) not dot names (memory.query).
MCP Logging
When running in MCP mode, tinyMem automatically uses silent logging - all log messages go to .tinyMem/logs/tinymem-YYYY-MM-DD.log and nothing is written to stderr/stdout (which are reserved for JSON-RPC). This prevents log output from interfering with the MCP protocol.
To view logs while MCP is running:
tail -f .tinyMem/logs/tinymem-$(date +%Y-%m-%d).log

Enable semantic search for better phrasing flexibility:
1. Install Ollama with an embedding model:

   ollama pull nomic-embed-text

2. Update config:

   [recall]
   semantic_enabled = true
   embedding_model = "nomic-embed-text"

3. Restart tinyMem
Semantic search gracefully degrades to FTS-only if unavailable.
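To confirm which mode is active after restarting, run the diagnostics command, which reports semantic search status among its checks:

tinymem doctor   # includes semantic search status in its report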
.tinyMem/
├── store.sqlite3 # Memory database with FTS5
├── config.toml # Optional configuration
├── logs/ # Log files (if enabled)
└── run/ # Runtime state
When a prompt arrives, tinyMem:
- Searches memories using FTS (BM25 ranking)
- Optionally combines with semantic similarity
- Applies recall tier prioritization (Always → Contextual → Opportunistic)
- Filters by truth state (Verified → Asserted → Tentative)
- Prioritizes constraints and decisions
- Enforces token budget
Selected memories are formatted into a bounded system message:
[tinyMem Context]
[ALWAYS] CONSTRAINT: API keys must be stored in environment variables
(evidence: .env.example exists, grep confirms pattern)
(truth state: asserted)
[CONTEXTUAL] FACT: Authentication uses JWT tokens
(evidence: auth.go:42, test suite passes)
(truth state: verified)
[OPPORTUNISTIC] CLAIM: Frontend uses React 18
(no evidence verification yet)
(truth state: tentative)
After the LLM responds:
- Parse output for claims, plans, decisions
- Optional: CoVe filtering - Assign confidence scores and filter low-quality candidates
- Apply recall tiers and truth states
- Default to non-fact types (never auto-promote to facts)
- Verify evidence for fact promotion (independent of CoVe)
- Store with timestamps and supersession tracking
Note: CoVe filtering is enabled by default and operates transparently. It reduces hallucinated extractions but never participates in fact promotion; only evidence verification can promote claims to facts.
When enabled, tinyMem provides comprehensive metrics:
- Per-request recall statistics (total memories, token counts, memory IDs and types)
- Tier-based breakdowns (always, contextual, opportunistic counts)
- Response token counts
- Token delta measurements for optimization
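Metrics are off by default; enable them in `.tinyMem/config.toml` or via the environment before starting the server:

# Enable metrics for a proxy session
export TINYMEM_METRICS_ENABLED=true
tinymem proxy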
Keep memory lean and relevant with these manual practices:
- Review obsolete memories: Periodically review and remove outdated information
- Consolidate duplicates: Merge similar memories with overlapping information
- Verify stale facts: Check if facts are still accurate and relevant
- Regular audits: Manually review memory content periodically
- Prune unused memories: Remove memories that haven't been recalled in months
- Update classifications: Ensure classifications remain accurate over time
Note: All cleanup is intentional and manual. No automatic compaction or scheduled cleanup occurs.
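The built-in commands cover most of a manual audit pass (the query term below is just an example):

tinymem stats                          # overall memory statistics
tinymem recent --count 50              # recently written memories
tinymem query "deprecated" --limit 10  # hunt for stale or outdated entries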
These guarantees hold everywhere in tinyMem:
- Memory ≠ Gospel: Model output never auto-promoted to truth
- Typed Memory: All entries have explicit types
- Evidence Required: No evidence → not a fact (CoVe cannot bypass this)
- Bounded Injection: Prompt injection is deterministic and token-limited
- Streaming Mandatory: No response buffering (where supported)
- Project-Scoped: All state lives in `.tinyMem/`
- Single Executable: No dependencies beyond SQLite (embedded)
- CoVe Safety: When enabled, CoVe filters quality but never changes types, creates facts, or overrides evidence verification
- Tiered Recall: Memories are prioritized by recall tier (Always → Contextual → Opportunistic)
- Truth State Management: Memories are filtered by reliability (Verified → Asserted → Tentative)
- Metrics Transparency: Comprehensive logging available when enabled for performance monitoring
Violating any invariant is a bug, not a feature gap.
tinyMem includes a protocol for managing multi-step tasks using tinyTasks.md files. This system allows for structured task tracking that integrates with the memory system:
# Tasks – <Goal>
- [ ] Top-Level Task
- [ ] Atomic subtask
- [ ] Atomic subtask
- [ ] Next Task
- [ ] Atomic subtask
The protocol requires:
- Two levels only (top-level tasks and subtasks)
- One responsibility per top-level task
- Subtasks must be atomic and verifiable
- Checkboxes define all state
- Order is execution order
This system helps track complex development work and integrates with the memory system for persistent task management.
# Build with all features (FTS5 enabled by default)
make
# Or build with the minimal target (FTS5 remains enabled)
make build-minimal

Both make and make build-minimal enforce -tags fts5 so every build includes SQLite FTS5 support.
Alternatively, use go build directly:
go build -tags fts5 -o tinymem ./cmd/tinymem

Build scripts are located in the build/ directory but can be executed from the project root:
- `build.sh` - Unix/Linux/macOS build script
- `build.bat` - Windows build script
# Run the standard Go tests
go test ./...

Comprehensive test suite (located in the test/ directory):
# Run all tests using the test runner
cd test && python3 run_tests.py
# Or run individual test files
cd test && python3 test_tinymem.py
cd test && python3 test_tinymem_mcp.py
cd test && python3 test_tinymem_config.py

You can build tinymem for different operating systems and architectures by setting the GOOS (target operating system) and GOARCH (target architecture) environment variables before running the go build command. Each command below enables FTS5 by passing -tags fts5, mirroring the defaults used in build.sh (located in the build/ directory but run from the project root). FTS5 support is required for every build.
Here are some common examples:
For Linux:
# For AMD64 (most common desktops and servers)
GOOS=linux GOARCH=amd64 go build -tags fts5 -o tinymem-linux-amd64 ./cmd/tinymem
# For ARM64 (e.g., Raspberry Pi, some cloud instances)
GOOS=linux GOARCH=arm64 go build -tags fts5 -o tinymem-linux-arm64 ./cmd/tinymem

For macOS:
# For Apple Silicon (M1, M2, etc.)
GOOS=darwin GOARCH=arm64 go build -tags fts5 -o tinymem-darwin-arm64 ./cmd/tinymem
# For Intel-based Macs
GOOS=darwin GOARCH=amd64 go build -tags fts5 -o tinymem-darwin-amd64 ./cmd/tinymem

For Windows:
# For AMD64
GOOS=windows GOARCH=amd64 go build -tags fts5 -o tinymem-windows-amd64.exe ./cmd/tinymem
# For ARM64
GOOS=windows GOARCH=arm64 go build -tags fts5 -o tinymem-windows-arm64.exe ./cmd/tinymem

The output binary will be named according to the -o flag in the command. You can then move this binary to the target machine and run it.
Contributions welcome! Please ensure:
- Truth discipline is maintained: No shortcuts around evidence validation
- Streaming is preserved: No buffering regressions
- Zero-config remains: Defaults must work out of the box
- Tests pass: `go test ./...`
- Doctor explains it: If it can fail, `tinymem doctor` should diagnose it
MIT License - see LICENSE for details.
Language models are powerful but have limited context windows and no persistent memory. Existing solutions either:
- Require expensive fine-tuning
- Depend on cloud services
- Trust model output uncritically
- Add latency through buffering
- Lack quality filtering for hallucinated extractions
tinyMem takes a different approach: treat the model as a conversational partner, but verify everything it claims against reality. Optional Chain-of-Verification (CoVe) filtering reduces hallucinated extractions before they pollute the memory system. This gives small models (7B-13B) the behavior of much larger models with long-term memory, while reducing token costs for all models through smart context injection.
The result: better model performance, lower costs, higher memory quality, and guaranteed truth discipline—all running locally with zero configuration.
These features are explicitly NOT goals for tinyMem, to protect against complexity creep:
- Chat History Storage: tinyMem does not store conversation history or chat logs
- Automatic Memory Management: No automatic cleanup, summarization, or lifecycle management
- Embeddings or Vector Search: No built-in vector database or neural search; recall is lexical FTS plus optional semantic search backed by an external embedding model
- Agent Orchestration: No coordination between multiple agents or workflow management
- Predictive Prefetching: No speculative loading of memories based on patterns
- Multi-Modal Memory: No storage of images, audio, or other non-text content
- Real-Time Collaboration: No shared memory spaces between concurrent users
- External Knowledge Integration: No connections to external knowledge bases or APIs
- Machine Learning Models: No ML-based classification or clustering of memories
These limitations preserve tinyMem's focus: a simple, deterministic, auditable memory system that enhances LLM interactions without adding complexity.
Built for developers who want their LLMs to remember context without hallucinating facts.
