foxmemory-store

Long-term memory API for self-hosted AI applications.

foxmemory-store is the memory persistence layer in the FoxMemory stack. It wraps Mem0 OSS and exposes REST endpoints for writing, searching, and managing memories — with optional graph memory (Neo4j) for entity/relation extraction.

What it does

Memory writes — sends messages to an LLM, extracts facts worth remembering, stores them as vector embeddings in Qdrant
Semantic search — find relevant memories by query; returns ranked results and (optionally) related graph entities
Graph memory (optional) — on every write, extracts named entities and relations into Neo4j; surfaces them on search
Analytics DB — persists every write/search/graph event to SQLite; powers the /v2/stats/memories dashboard endpoint
Decision observability — every write response includes the LLM's per-decision reasoning, extracted facts, and candidates; persisted and queryable via GET /v2/write-events
Runtime config — LLM prompts (extraction, update, graph) and active models are runtime-configurable via API and survive restarts
Cascade delete — opt-in graph cleanup on memory delete: orphaned Neo4j nodes/edges are removed when no other memories back them

Requirements

Docker (recommended) — or Node.js 22+ for local dev
OpenAI-compatible inference — hosted OpenAI, local Ollama, or foxmemory-infer
Qdrant — the Docker image bundles an embedded Qdrant binary (simplest path); a separate container is also supported

Getting started

Option A — Docker with hosted OpenAI (simplest)

The Docker image bundles an embedded Qdrant binary, so this is the fastest way to a working setup.

1. Pull the image

docker pull foxlightfoundation/foxmemory-store:latest

2. Run it

docker run -d \
  --name foxmemory-store \
  -p 8082:8082 \
  -p 6333:6333 \
  -v foxmemory_data:/qdrant/storage \
  -e OPENAI_API_KEY=sk-... \
  -e OPENAI_BASE_URL=https://api.openai.com/v1 \
  -e MEM0_LLM_MODEL=gpt-4.1-nano \
  -e MEM0_EMBED_MODEL=text-embedding-3-small \
  -e QDRANT_HOST=127.0.0.1 \
  -e FOXMEMORY_ANALYTICS_DB_PATH=/qdrant/storage/foxmemory-analytics.db \
  foxlightfoundation/foxmemory-store:latest

3. Verify

curl -s http://localhost:8082/v2/health | jq .

The response should include "ok": true and qdrantConnected: true.

Option B — Docker Compose (recommended for production)

See the foxmemory-deploy repo for full Compose recipes:

compose.one.yml — embedded Qdrant + Neo4j + foxmemory-infer, all in one stack
compose.split.yml — services split across hosts
compose.external.yml — external Qdrant container

Copy the relevant compose file and a .env from .env.example:

cp .env.example .env
# edit .env with your values
docker compose -f path/to/compose.one.yml up -d

Option C — Local dev

1. Install Node.js 22+

node --version   # must be 22+

2. Install dependencies

Note: This repo depends on @foxlight-foundation/mem0ai, a private package published to GitHub Packages. You need a GitHub personal access token (PAT) with read:packages scope. Set it in your environment before installing:
export NPM_TOKEN=ghp_your_token_here
The .npmrc in the project root is pre-configured to use it.

npm install

3. Configure environment

cp .env.example .env

Edit .env — minimum required values:

OPENAI_API_KEY=sk-...
OPENAI_BASE_URL=https://api.openai.com/v1
QDRANT_HOST=localhost        # assumes Qdrant running locally on port 6333

4. Start Qdrant (if not using the embedded image)

docker run -d -p 6333:6333 qdrant/qdrant

5. Start the server

npm run dev

Default port: 8082

6. Verify

curl -s http://localhost:8082/v2/health | jq .

Enabling graph memory (Neo4j)

Graph memory is disabled by default. When enabled, every write runs two extra LLM calls to extract entities and relations, which are stored in Neo4j and surfaced on search.

Add to your .env:

NEO4J_URL=bolt://localhost:7687
NEO4J_USERNAME=neo4j
NEO4J_PASSWORD=changeme
MEM0_GRAPH_LLM_MODEL=gpt-4.1-mini

Start Neo4j:

docker run -d \
  --name neo4j \
  -p 7474:7474 -p 7687:7687 \
  -e NEO4J_AUTH=neo4j/changeme \
  neo4j:5-community

Check connectivity after startup:

curl -s http://localhost:8082/v2/health | jq .diagnostics.neo4jConnected
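
The enable/disable rule can be sketched as a small config resolver. This is a minimal TypeScript sketch, not the project's actual code: the `GraphConfig` type and `resolveGraphConfig` function are hypothetical names, and throwing on a missing password is an assumption about how "required if URL set" might be enforced. The fallback values mirror the defaults documented for these variables.

```typescript
// Hypothetical sketch: graph memory is enabled only when NEO4J_URL is set.
type GraphConfig =
  | { enabled: false }
  | {
      enabled: true;
      url: string;
      username: string;
      password: string;
      llmModel: string;
    };

function resolveGraphConfig(
  env: Record<string, string | undefined>,
): GraphConfig {
  const url = env.NEO4J_URL;
  if (!url) return { enabled: false }; // unset = graph memory disabled

  const password = env.NEO4J_PASSWORD;
  if (!password) {
    // Assumption: a missing password is treated as a startup error.
    throw new Error("NEO4J_PASSWORD is required when NEO4J_URL is set");
  }

  return {
    enabled: true,
    url,
    username: env.NEO4J_USERNAME ?? "neo4j",
    password,
    llmModel: env.MEM0_GRAPH_LLM_MODEL ?? "gpt-4.1-mini",
  };
}
```

Calling `resolveGraphConfig({})` returns `{ enabled: false }`, which corresponds to the default state where no extra graph LLM calls run on write.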

Environment variables

Inference provider

| Variable | Default | Description |
| --- | --- | --- |
| `OPENAI_BASE_URL` | `https://api.openai.com/v1` | OpenAI-compatible endpoint |
| `OPENAI_API_KEY` | `local-infer-no-key` | API key for the inference provider |
| `MEM0_LLM_MODEL` | `gpt-4.1-nano` | Model for fact extraction and update decisions |
| `MEM0_EMBED_MODEL` | `text-embedding-3-small` | Embedding model; cannot be changed after the collection is created |

Vector store (Qdrant)

| Variable | Default | Description |
| --- | --- | --- |
| `QDRANT_HOST` | (required) | Qdrant hostname |
| `QDRANT_PORT` | `6333` | Qdrant port |
| `QDRANT_COLLECTION` | `foxmemory` | Collection name |
| `QDRANT_API_KEY` | (none) | Optional, if Qdrant requires auth |

Analytics DB

| Variable | Default | Description |
| --- | --- | --- |
| `FOXMEMORY_ANALYTICS_DB_PATH` | `/data/foxmemory-analytics.db` | SQLite path; must be on a mounted volume, or analytics reset on restart |

Graph memory (Neo4j — optional)

| Variable | Default | Description |
| --- | --- | --- |
| `NEO4J_URL` | (unset = disabled) | Bolt URL, e.g. `bolt://neo4j:7687` |
| `NEO4J_USERNAME` | `neo4j` | Neo4j username |
| `NEO4J_PASSWORD` | (required if URL set) | Neo4j password |
| `MEM0_GRAPH_LLM_MODEL` | `gpt-4.1-mini` | Separate model for entity extraction |
| `MEM0_GRAPH_CUSTOM_PROMPT` | (none) | Initial graph extraction prompt (also settable at runtime) |
| `MEM0_GRAPH_SEARCH_THRESHOLD` | `0.7` | Cosine similarity threshold for graph candidate retrieval |
| `MEM0_GRAPH_NODE_DEDUP_THRESHOLD` | `0.9` | Cosine similarity threshold for node deduplication |
| `MEM0_GRAPH_BM25_TOPK` | `5` | BM25 reranking top-K in graph search |
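
To give a feel for what the two cosine similarity thresholds mean, here is a minimal TypeScript sketch. The function names `cosineSimilarity` and `isDuplicateNode` are hypothetical, and treating a score at or above the threshold as a match is an assumption; the service's actual comparison may differ.

```typescript
// Cosine similarity between two embedding vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("vector length mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction; report no similarity.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Hypothetical dedup check: two nodes count as the same entity when their
// embeddings score at or above the threshold (assumed inclusive).
function isDuplicateNode(
  a: number[],
  b: number[],
  threshold = 0.9, // mirrors MEM0_GRAPH_NODE_DEDUP_THRESHOLD
): boolean {
  return cosineSimilarity(a, b) >= threshold;
}
```

Under these assumptions, raising a threshold toward 1 makes matching stricter (fewer merged nodes or fewer graph candidates), while lowering it makes matching looser.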

Write gate (pre-flight filter)

Writes that fail the gate are returned immediately as mode: "skipped" — no LLM calls, no Qdrant writes, no graph ops.

| Variable | Default | Description |
| --- | --- | --- |
| `MEM0_MIN_INPUT_CHARS` | `1` | Skip writes whose total message content is below this character count |
| `MEM0_SKIP_PATTERNS` | (see below) | Comma-separated regex patterns (case-insensitive). Any write whose combined content matches is skipped. |

Default skip patterns (always active): HEARTBEAT_OK, HEARTBEAT, bare [[tag]] messages, PING, PONG, OK.

Custom patterns set via MEM0_SKIP_PATTERNS are appended to the defaults; they do not replace them.
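
The gate logic can be illustrated with a minimal TypeScript sketch. This is not the project's actual code: `shouldSkipWrite` and `DEFAULT_SKIP_PATTERNS` are hypothetical names, the regexes only approximate the documented defaults, and joining message contents with newlines and matching against the whole string are assumptions.

```typescript
// Hypothetical sketch of a pre-flight write gate.
type Message = { role: string; content: string };

// Assumed approximations of the documented defaults (HEARTBEAT_OK, HEARTBEAT,
// bare [[tag]] messages, PING, PONG, OK). The real regexes may differ.
const DEFAULT_SKIP_PATTERNS: string[] = [
  "^HEARTBEAT(_OK)?$",
  "^\\[\\[[^\\]]*\\]\\]$",
  "^(PING|PONG|OK)$",
];

function shouldSkipWrite(
  messages: Message[],
  env: Record<string, string | undefined> = {},
): boolean {
  const combined = messages
    .map((m) => m.content)
    .join("\n")
    .trim();

  // MEM0_MIN_INPUT_CHARS: skip when total content is below the minimum.
  const minChars = Number(env.MEM0_MIN_INPUT_CHARS ?? "1");
  if (combined.length < minChars) return true;

  // MEM0_SKIP_PATTERNS: comma-separated, appended to the defaults.
  const custom = (env.MEM0_SKIP_PATTERNS ?? "")
    .split(",")
    .map((p) => p.trim())
    .filter((p) => p.length > 0);

  // Case-insensitive match against any default or custom pattern.
  return [...DEFAULT_SKIP_PATTERNS, ...custom].some((p) =>
    new RegExp(p, "i").test(combined),
  );
}
```

One consequence of a comma-separated format is that, in a naive split like this one, a custom pattern cannot itself contain a comma (for example a `{2,5}` quantifier).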

Retry / misc

| Variable | Default | Description |
| --- | --- | --- |
| `MEM0_ADD_RETRIES` | `3` | Retry attempts for `memory.add()` on exception |
| `MEM0_ADD_RETRY_DELAY_MS` | `250` | Delay between retries (ms) |
| `MEM0_HISTORY_DB_PATH` | `/tmp/history.db` | Mem0-internal dedup state (ephemeral; not needed for analytics) |
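
The retry settings describe a fixed-delay retry loop, which could look roughly like the TypeScript sketch below. The helper name `withRetries` is hypothetical, and reading `MEM0_ADD_RETRIES` as the total number of attempts (rather than retries after the first try) is an assumption.

```typescript
// Hypothetical fixed-delay retry helper; not the project's actual code.
async function withRetries<T>(
  fn: () => Promise<T>,
  attempts = 3, // mirrors MEM0_ADD_RETRIES (assumed to mean total attempts)
  delayMs = 250, // mirrors MEM0_ADD_RETRY_DELAY_MS
): Promise<T> {
  let lastError: unknown;
  for (let i = 1; i <= attempts; i++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      // Wait before the next attempt, but not after the final failure.
      if (i < attempts) {
        await new Promise((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  // Every attempt failed: surface the last error to the caller.
  throw lastError;
}
```

With the defaults, a write that keeps failing would give up after roughly 500 ms of added delay (two waits of 250 ms), plus the time spent in the calls themselves.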

API overview

Full request/response contract: docs/API_CONTRACT.md or GET /v2/docs.md from a running instance.

Health & observability

| Endpoint | Description |
| --- | --- |
| `GET /health` | Service metadata (flat) |
| `GET /v2/health` | Same, normalized envelope + live Neo4j connectivity check |
| `GET /v2/stats` | Runtime counters |
| `GET /v2/stats/memories` | SQLite analytics: write NONE rate, search quality, graph stats, byDay charts |
| `GET /v2/write-events` | Queryable write event log with per-decision LLM reasons |
| `GET /v2/openapi.json` | Machine-readable OpenAPI 3.0 spec |
| `GET /v2/docs` | Interactive Redoc UI |
| `GET /v2/docs.md` | Full API contract as Markdown (agent-friendly) |

Memory (v2 — primary)

| Endpoint | Description |
| --- | --- |
| `POST /v2/memories` | Write memories (returns `decisions`, `relations[]`, `added_entities[]` when graph enabled) |
| `POST /v2/memories/search` | Semantic search (returns `relations[]` when graph enabled) |
| `POST /v2/memories/forget` | Batch delete up to 1000 memories by ID; accepts `cascade_graph: true` |
| `GET /v2/memories` | List memories by `user_id` / `run_id` |
| `POST /v2/memories/list` | List with OR filters (body-based) |
| `GET /v2/memories/:id` | Get single memory |
| `PUT /v2/memories/:id` | Update memory text |
| `DELETE /v2/memories/:id` | Delete memory; accepts `?cascade_graph=true` for graph cleanup |

Graph (v2 — graph-enabled only)

| Endpoint | Description |
| --- | --- |
| `GET /v2/graph` | Full node+edge graph for a user/session |
| `GET /v2/graph/relations` | Raw Neo4j triples for a user/session |
| `GET /v2/graph/node/:id` | Single node + its neighbors |

Config

| Endpoint | Description |
| --- | --- |
| `GET/PUT /v2/config/prompt` | Call 1 (fact extraction) prompt |
| `GET/PUT /v2/config/update-prompt` | Call 2 (ADD/UPDATE/DELETE/NONE) prompt |
| `GET/PUT /v2/config/graph-prompt` | Graph entity extraction prompt (graph-enabled only) |
| `GET /v2/config/models` | Active LLM and embedding model overrides |
| `PUT /v2/config/models` | Override active LLM / graph LLM model at runtime |
| `GET /v2/config/catalog` | List all models in the model catalog |
| `POST /v2/config/catalog` | Add a model to the catalog |
| `GET /v2/config/catalog/:id` | Get catalog entry |
| `PUT /v2/config/catalog/:id` | Update catalog entry |
| `DELETE /v2/config/catalog/:id` | Remove from catalog |

v1 (frozen — back-compat only)

POST /v1/memories, POST /v1/memories/search
GET /v1/memories, GET /v1/memories/:id, DELETE /v1/memories/:id
Legacy aliases: POST /memory.write, POST /memory.search, POST /memory.raw_write

Usage examples

Write a memory

curl -s -X POST http://localhost:8082/v2/memories \
  -H 'content-type: application/json' \
  -d '{
    "user_id": "demo",
    "messages": [
      { "role": "user", "content": "I prefer concise answers." }
    ]
  }' | jq .

The response includes decisions.actions[].reason, the LLM's explanation for each ADD/UPDATE/DELETE/NONE decision.

Search memories

curl -s -X POST http://localhost:8082/v2/memories/search \
  -H 'content-type: application/json' \
  -d '{
    "user_id": "demo",
    "query": "response style",
    "top_k": 5
  }' | jq .

List all memories for a user

curl -s "http://localhost:8082/v2/memories?user_id=demo" | jq .

Delete a memory (with graph cascade)

curl -s -X DELETE "http://localhost:8082/v2/memories/<id>?cascade_graph=true" | jq .

graph_cascade in the response reports how many orphaned Neo4j edges/nodes were removed. If the memory has no graph links recorded, cascade is a no-op regardless of the flag.
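
To make the orphan rule concrete, here is a minimal TypeScript sketch of how orphaned edges could be identified. The `GraphLinks` shape and the `findOrphanedEdges` function are hypothetical and exist only for illustration; the service's real bookkeeping and Neo4j queries are not shown and may work differently.

```typescript
// Hypothetical model: memory ID -> IDs of the graph edges that memory backs.
type GraphLinks = Record<string, string[]>;

function findOrphanedEdges(links: GraphLinks, memoryId: string): string[] {
  const own = links[memoryId];
  // No recorded graph links for this memory: cascade is a no-op.
  if (!own || own.length === 0) return [];

  // Gather every edge that some other memory still backs.
  const stillBacked: string[] = [];
  for (const id of Object.keys(links)) {
    if (id === memoryId) continue;
    for (const edge of links[id]) stillBacked.push(edge);
  }

  // An edge is orphaned only if no other memory references it.
  return own.filter((edge) => stillBacked.indexOf(edge) === -1);
}

// Example: m1 and m2 both back e2, so deleting m1 orphans only e1.
const exampleLinks: GraphLinks = { m1: ["e1", "e2"], m2: ["e2", "e3"] };
const orphaned = findOrphanedEdges(exampleLinks, "m1"); // ["e1"]
```

In this model, an edge shared with another memory survives the delete, which matches the rule that graph data is removed only when no other memories back it.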

Inspect write decision reasoning

curl -s "http://localhost:8082/v2/write-events?user_id=demo&limit=10" | jq '.data[] | {event, reason, memory}'

Troubleshooting

| Symptom | Likely cause | Fix |
| --- | --- | --- |
| 500 on every write | Qdrant unreachable | Check `QDRANT_HOST`/`QDRANT_PORT`; confirm Qdrant is running |
| Auth errors on write | Bad inference API key or wrong `OPENAI_BASE_URL` | Check `OPENAI_API_KEY` and `OPENAI_BASE_URL` |
| `neo4jConnected: false` in health | Neo4j unreachable or wrong credentials | Check `NEO4J_URL` and `NEO4J_PASSWORD` |
| Analytics reset on restart | Analytics DB not on a volume | Mount a volume and set `FOXMEMORY_ANALYTICS_DB_PATH` to a path on it |
| High write latency (>10s) | All 4 LLM calls (2 for memory, 2 for graph) run in series | Expected when graph is enabled; use a fast model for `MEM0_GRAPH_LLM_MODEL` |
| NONE rate very high | Extraction prompt too aggressive or content not worth storing | Tune via `PUT /v2/config/prompt` |
| `npm install` fails with 401 | GitHub Packages auth not configured | Set `NPM_TOKEN` env var to a PAT with `read:packages` scope |

Build

npm run build   # TypeScript → dist/
npm start       # run compiled output

License

MIT (see LICENSE)
