Semantic memory system with knowledge graph, spreading activation, embedding-based recall, autonomous dream consolidation, and GPU-accelerated recall for the Hermes Agent.
🌐 Live Website — interactive demos, architecture, and dashboard.
- Semantic Memory Storage: Store memories with auto embedding (FastEmbed ONNX, 1024d)
- Knowledge Graph: Auto-connect related memories via cosine similarity
- Spreading Activation: Explore connected ideas (BFS graph traversal with decay)
- Conflict Detection: Detect and supersede conflicting memories
- Dream Engine: Autonomous consolidation (NREM/REM/Insight phases)
- GPU Recall: CUDA-accelerated cosine similarity (~100ms, torch.matmul)
- SQLite-First: Always works, no external DB needed
- MSSQL Optional: Shared DB for multi-agent setups
```bash
cd ~/projects/neural-memory-adapter
bash install.sh          # auto-detect hermes-agent
bash install.sh /path    # explicit path
```

The installer handles everything:
- Python deps (FastEmbed, torch, numpy)
- CUDA detection for GPU recall
- Plugin deployment to hermes-agent
- Database init (SQLite at ~/.neural_memory/memory.db)
- config.yaml setup
Restart hermes after install: `hermes gateway restart`
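Once installed, you can sanity-check the library directly from a repo checkout. This is a minimal sketch that mirrors the smoke test near the end of this README; the paths and the `label` value are illustrative.

```python
# Quick sanity check from the repo root — same API as the smoke test below.
import sys
sys.path.insert(0, 'python')             # import the repo-local package
from neural_memory import NeuralMemory

nm = NeuralMemory(
    db_path='/tmp/demo.db',              # throwaway SQLite file
    embedding_backend='cpu',             # backend flag used by the smoke test
    use_cpp=False,                       # C++ bridge is optional; Python fallback is fine
)
nm.remember('Hermes uses FastEmbed for embeddings', label='demo')
print(nm.recall('embedding backend'))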
Live Dashboard — Knowledge Graph
| Priority | Backend | Model | Speed | Requirements |
|---|---|---|---|---|
| 1st | FastEmbed | intfloat/multilingual-e5-large | ~50ms | pip install fastembed |
| 2nd | sentence-transformers | BAAI/bge-m3 1024d | ~200ms | GPU recommended |
| 3rd | tfidf | — | varies | numpy only |
| 4th | hash | — | instant | nothing |
FastEmbed uses ONNX runtime — no PyTorch conflict, works on CPU. Falls back automatically.
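The fallback order above amounts to an import-probe chain. A minimal sketch of that idea (the adapter's real selection logic lives in `embed_provider.py` and may differ):

```python
# Illustrative backend selection — probe imports in priority order.
def pick_embedding_backend(preferred: str = 'auto') -> str:
    """Return the first usable backend in priority order."""
    if preferred != 'auto':
        return preferred                      # explicit config wins
    try:
        import fastembed                      # ONNX runtime, CPU-friendly
        return 'fastembed'
    except ImportError:
        pass
    try:
        import sentence_transformers          # heavier, GPU recommended
        return 'sentence-transformers'
    except ImportError:
        pass
    try:
        import numpy                          # enough for a TF-IDF backend
        return 'tfidf'
    except ImportError:
        return 'hash'                         # zero-dependency last resort
```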
```python
# gpu_recall.py — CUDA cosine similarity
# Loads all embeddings into GPU memory, uses torch.matmul for batch similarity
# ~100ms for 10K memories vs ~500ms on CPU
from gpu_recall import GPURecall

engine = GPURecall()
results = engine.recall(query_embedding, all_embeddings, top_k=10)
```

Auto-detects CUDA. Falls back to Python/numpy if no GPU is available.
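In the snippet above, `query_embedding` is a single 1024-d vector and `all_embeddings` is the stacked memory bank. The batch similarity itself boils down to one normalized matmul; a minimal sketch of that step (not the actual `gpu_recall.py` code, names and normalization details are assumptions):

```python
# Sketch of batch cosine similarity on GPU, with CPU fallback.
import torch

def cosine_top_k(query: torch.Tensor, bank: torch.Tensor, top_k: int = 10):
    """query: (d,), bank: (n, d). Returns (scores, indices) of the top_k most similar rows."""
    device = 'cuda' if torch.cuda.is_available() else 'cpu'
    query = torch.nn.functional.normalize(query.to(device), dim=0)
    bank = torch.nn.functional.normalize(bank.to(device), dim=1)
    scores = torch.matmul(bank, query)        # (n,) cosine similarities in one matmul
    return torch.topk(scores, k=min(top_k, bank.shape[0]))
```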
```mermaid
flowchart TD
subgraph Store["neural_remember"]
A[User content] --> B[FastEmbed encode]
B --> C[1024d vector]
C --> D[SQLite INSERT]
C --> E[Cosine similarity search]
E --> F[Create connections]
end
subgraph Recall["neural_recall"]
G[User query] --> H[FastEmbed encode]
H --> I{CUDA available?}
I -->|Yes| J[GPU torch.matmul]
I -->|No| K[CPU numpy dot]
J --> L[Top-k results]
K --> L
end
subgraph Think["neural_think"]
M[Source memory] --> N[BFS on connections]
N --> O[Apply decay factor]
O --> P[Ranked activation]
end
subgraph Dream["neural_dream"]
Q[Idle trigger] --> R[NREM replay]
R --> S[REM bridge discovery]
S --> T[Insight communities]
T --> U[Consolidated graph]
end
```
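The `neural_think` path above is a breadth-first walk over the connections graph with an energy decay per hop. An illustrative sketch (the adapter's decay constant, hop limit, and ranking details are assumptions here):

```python
# Illustrative spreading activation — BFS with per-hop decay, ranked by activation.
from collections import deque

def spread_activation(graph, source_id, decay=0.7, max_hops=3):
    """graph: {memory_id: [(neighbor_id, weight), ...]}. Returns {memory_id: activation}."""
    activation = {source_id: 1.0}
    queue = deque([(source_id, 1.0, 0)])
    while queue:
        node, energy, hops = queue.popleft()
        if hops >= max_hops:
            continue
        for neighbor, weight in graph.get(node, []):
            spread = energy * weight * decay          # signal weakens at every hop
            if spread > activation.get(neighbor, 0.0):
                activation[neighbor] = spread
                queue.append((neighbor, spread, hops + 1))
    return dict(sorted(activation.items(), key=lambda kv: kv[1], reverse=True))
```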
- SQLite (always): `~/.neural_memory/memory.db` — source of truth
- Embeddings cache: `~/.neural_memory/models/` (auto-downloaded, ~2.2 GB)
- GPU cache: `~/.neural_memory/gpu_cache/` (embeddings.npy + metadata.pkl)
- Access logs: `~/.neural_memory/access_logs/` (JSON Lines)
- MSSQL (optional): 127.0.0.1/NeuralMemory — multi-agent mirror
```sql
-- Core tables
memories (id, content, embedding, category, salience, ...)
connections (source_id, target_id, weight, edge_type)
connection_history (source_id, target_id, last_weight, last_updated)

-- Dream engine
dream_sessions (id, phase, started_at, completed_at, stats)
dream_insights (id, session_id, type, data)

-- Indexes
idx_memories_category ON memories(category)
idx_connections_source ON connections(source_id)
idx_connections_target ON connections(target_id)
```

All settings in `~/.hermes/config.yaml`:
```yaml
memory:
  provider: neural
  neural:
    db_path: ~/.neural_memory/memory.db
    embedding_backend: fastembed   # auto | fastembed | sentence-transformers | tfidf | hash
    prefetch_limit: 10
    search_limit: 10
    dream:
      enabled: true
      idle_threshold: 600          # seconds before dream cycle
      memory_threshold: 50         # dream after N new memories
    mssql:                         # optional — only if using MSSQL
      server: 127.0.0.1
      database: NeuralMemory
      username: SA
      password: 'your_password'
      driver: '{ODBC Driver 18 for SQL Server}'
```

When active, these tools are available in Hermes:
| Tool | Description |
|---|---|
| `neural_remember` | Store a memory (with conflict detection) |
| `neural_recall` | Search memories by semantic similarity |
| `neural_think` | Spreading activation from a memory |
| `neural_graph` | View knowledge graph statistics |
| `neural_dream` | Force a dream cycle (all/nrem/rem/insight) |
| `neural_dream_stats` | Dream engine statistics |
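Internally the plugin's `handle_tool_call` routes these names onto the memory client. A rough sketch of that mapping — the real signature, argument names, and any method names beyond `remember`/`recall` are assumptions, not the plugin's actual API:

```python
# Illustrative tool dispatch — argument names and the think/graph method names are hypothetical.
def handle_tool_call(nm, tool_name: str, args: dict):
    if tool_name == 'neural_remember':
        return nm.remember(args['content'], label=args.get('label'))
    if tool_name == 'neural_recall':
        return nm.recall(args['query'])
    if tool_name == 'neural_think':
        return nm.think(args['memory_id'])      # spreading activation (hypothetical name)
    if tool_name == 'neural_graph':
        return nm.graph_stats()                 # graph statistics (hypothetical name)
    raise ValueError(f'unknown tool: {tool_name}')
```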
Autonomous background memory consolidation (inspired by biological sleep):
```mermaid
flowchart LR
subgraph Trigger
T1[Idle 600s] --> D
T2[50 new memories] --> D
T3[Manual / Cron] --> D
end
D[Dream Cycle] --> NREM
D --> REM
D --> INSIGHT
subgraph NREM["Phase 1 — NREM"]
direction TB
N1[Replay 100 recent memories] --> N2[Spreading activation]
N2 --> N3{Connection active?}
N3 -->|Yes| N4[Strengthen +0.05]
N3 -->|No| N5[Weaken -0.01]
N3 -->|Dead <0.05| N6[Prune]
end
subgraph REM["Phase 2 — REM"]
direction TB
R1[Find 50 isolated memories] --> R2[Search similar unconnected]
R2 --> R3[Create bridge connections]
R3 --> R4[weight = similarity × 0.3]
end
subgraph INSIGHT["Phase 3 — Insight"]
direction TB
I1[BFS connected components] --> I2[Identify communities]
I2 --> I3[Find bridge nodes]
I3 --> I4[Store dream_insights]
end
```
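The weight rules in the diagram reduce to a few lines. A sketch using the constants shown above (the clamping behaviour and exact thresholds in `dream_engine.py` are assumptions):

```python
# NREM/REM weight rules from the diagram — strengthen, weaken, prune, bridge.
def nrem_update(weight: float, active: bool) -> float | None:
    """Strengthen active connections, weaken idle ones, prune dead ones."""
    weight = min(1.0, weight + 0.05) if active else weight - 0.01   # clamp at 1.0 is assumed
    return None if weight < 0.05 else weight                        # None => prune the edge

def rem_bridge_weight(similarity: float) -> float:
    """New bridge connections start at a fraction of cosine similarity."""
    return similarity * 0.3
```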
- Automatic: after 600s idle (configurable)
- Automatic: every 50 new memories (configurable)
- Manual: `neural_dream` tool
- Standalone: `python python/dream_worker.py --daemon`
```bash
cd ~/projects/neural-memory-adapter/python
python3 demo.py
```

```bash
# Plugin test suite
cd ~/.hermes/hermes-agent/plugins/memory/neural
python3 test_suite.py

# Upside-Down Test Suite — edge cases, corruption, concurrency, SQL injection
cd ~/projects/neural-memory-adapter
python3 tests/test_upside_down.py
```

```bash
cd ~/projects/neural-memory-adapter
python3 -c "
import sys; sys.path.insert(0, 'python')
from neural_memory import NeuralMemory
nm = NeuralMemory(db_path='/tmp/test.db', embedding_backend='cpu', use_cpp=False)
mid = nm.remember('test memory', label='smoke')
results = nm.recall('test')
assert len(results) > 0, 'recall failed'
print(f'SMOKE TEST PASS: {len(results)} results')
"Tested on a fresh Debian 12 QEMU/KVM VM — hermes-agent + neural memory only, no jack-in-a-box.
| Property | Value |
|---|---|
| VM | Debian 12, 4 GB RAM, KVM enabled |
| hermes-agent | git clone (itsXactlY fork) |
| neural-memory | git clone + FastEmbed ONNX |
| Embedding | intfloat/multilingual-e5-large (1024d) |
| C++ bridge | Not built (Python fallback) |
All 12 integration tests passed:
| # | Test | Result |
|---|---|---|
| 1 | NeuralMemory standalone (remember/recall/graph) | PASS |
| 2 | Memory Provider (FastEmbed 1024d) | PASS |
| 3 | NeuralMemoryProvider.init | PASS |
| 4 | is_available() | PASS |
| 5 | initialize(session_id) | PASS |
| 6 | get_tool_schemas() → 4 tools | PASS |
| 7 | system_prompt_block() (250 chars) | PASS |
| 8 | handle_tool_call — neural_remember | PASS |
| 9 | handle_tool_call — neural_recall | PASS |
| 10 | handle_tool_call — neural_graph | PASS |
| 11 | prefetch() | PASS |
| 12 | shutdown() | PASS |
- 4 GB RAM minimum — FastEmbed model download (~500 MB). 2 GB = OOM killed.
- HashBackend works as a fallback on low-RAM systems (1024d, instant, no deps) — see the sketch after this list.
- C++ bridge optional — Python fallback covers all functionality.
- FastEmbed >= 0.5.1 — earlier versions default to CLS embedding (deprecated).
- `python3-venv` required on Debian — `apt install python3.11-venv` if missing.
- PEP 668 (Debian) — `pip install` needs a venv or `--break-system-packages`.
- Cloud-init delay — 60–90 s on first boot. Don't assume SSH is ready immediately.
- prefetch() returns empty on fresh DB — expected, no prior memories to pre-load.
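The hash fallback mentioned above needs nothing beyond the standard library. One way such a 1024-d embedding can be built with the hashing trick (the adapter's actual HashBackend implementation may differ):

```python
# Dependency-free hash embedding sketch — signed hashing trick into a fixed 1024-d vector.
import hashlib
import math

def hash_embed(text: str, dim: int = 1024) -> list[float]:
    vec = [0.0] * dim
    for token in text.lower().split():
        digest = hashlib.md5(token.encode()).digest()
        idx = int.from_bytes(digest[:4], 'little') % dim   # bucket for this token
        sign = 1.0 if digest[4] % 2 == 0 else -1.0         # signed counts reduce collision bias
        vec[idx] += sign
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]                         # L2-normalize for cosine similarity
```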
```
neural-memory-adapter/
├── install.sh # Installer
├── hermes-plugin/ # Plugin (deployed to hermes-agent)
│ ├── __init__.py # MemoryProvider + tools
│ ├── config.py # Config loader
│ ├── plugin.yaml # Plugin metadata
│ ├── neural_memory.py # Unified Memory class
│ ├── memory_client.py # Main client (NeuralMemory, SQLiteStore)
│ ├── embed_provider.py # Embedding backends (FastEmbed, st, tfidf, hash)
│ ├── gpu_recall.py # CUDA cosine similarity engine
│ ├── dream_engine.py # Dream engine (NREM/REM/Insight)
│ ├── dream_worker.py # Standalone daemon
│ ├── access_logger.py # Recall event logger
│ └── ...
├── python/ # Python source (mirrors hermes-plugin)
│ └── ...
├── src/ # C++ source (optional, legacy)
│ ├── memory/lstm.cpp # LSTM predictor
│ ├── memory/knn.cpp # kNN engine
│ └── memory/hopfield.cpp # Hopfield network
└── README.md
```
- FastEmbed > sentence-transformers — ONNX runtime, no PyTorch conflict, fast on CPU.
- FastEmbed >= 0.5.1 — earlier versions default to CLS embedding. Pin the version or set `add_custom_model`.
- GPU recall > C++ bridge — the C++ Hopfield network had bias issues; GPU matmul is clean.
- numpy before FastEmbed — FastEmbed imports numpy at load time; install order matters.
- Don't force PyTorch — let FastEmbed handle CPU. torch only needed for GPU recall.
- SQLite = Source of Truth — MSSQL is optional. SQLite always works.
- Auto-detect everything — CUDA, backends, venv paths. Minimize config burden.
- 4 tool schemas exposed by NeuralMemoryProvider: `neural_remember`, `neural_recall`, `neural_think`, `neural_graph`. (`neural_dream`/`neural_dream_stats` are available on the standalone Memory class only.)
See LICENSE.

