187 changes: 187 additions & 0 deletions ARCHITECTURE.md
# Architecture Overview

This document describes how the different scripts and subprojects in `nova-scripts` interconnect.

## Memory / Embeddings Pipeline

The memory pipeline is a multi‑stage system that extracts structured knowledge from chat messages, embeds it for semantic search, and enables proactive recall.

### Flow

```
┌─────────────────────┐
│   Incoming Chat     │
│      Message        │
└──────────┬──────────┘
           │
           ▼
┌─────────────────────┐
│ extract-memories.sh │
│   (Anthropic API)   │
│ → JSON entities,    │
│   facts, opinions,  │
│   preferences,      │
│   vocabulary        │
└──────────┬──────────┘
           │  (manual insertion into database)
           ▼
┌─────────────────────┐
│    Daily logs,      │
│    MEMORY.md,       │
│  lessons, events,   │
│       SOPs          │
└──────────┬──────────┘
           │
           ▼
┌─────────────────────┐
│  embed-memories.py  │
│ (OpenAI embeddings) │
│ → memory_embeddings │
│   table (pgvector)  │
└──────────┬──────────┘
           │
           ▼
┌─────────────────────┐
│   Semantic Search   │
│ (proactive-recall,  │
│  semantic-search)   │
│ → similarity match  │
└─────────────────────┘
```

### Components

1. **Extraction** (`extract-memories.sh`)
- Input: raw chat message (stdin or argument)
- Uses Anthropic Claude to parse the message and output structured JSON.
- Categories: entities, facts, opinions, preferences, vocabulary, events.
- Privacy detection: applies a default visibility level and overrides it when the message contains explicit privacy phrases.

2. **Embedding** (`embed-memories.py`)
- Reads multiple memory sources:
- Daily log files (`~/clawd/memory/*.md`)
- Central `MEMORY.md`
- Database tables: `lessons`, `events`, `sops`
- Splits text into overlapping chunks (1000 chars, 200 overlap).
- Calls OpenAI `text-embedding-3-small` to get vector embeddings.
- Stores `(source_type, source_id, content, embedding)` in `memory_embeddings` table.
- Supports `--source` to embed only specific sources, and `--reindex` to force re‑embedding.

3. **Cron Jobs**
- `embed-memories-cron.sh`: daily embedding of all sources (logs to `~/clawd/logs/embed-memories.log`).
- `decay-confidence.sh`: nightly decay of `lessons.confidence` for lessons not referenced in 30+ days (multiplies by 0.95, floor 0.1).

4. **Recall & Search**
- `proactive-recall.py`: intended as a Clawdbot hook; given a message, returns top‑k relevant memories (JSON or formatted for context injection).
- `semantic-search.py`: command‑line semantic search with similarity threshold.

5. **Benchmarking**
- `recall-benchmark.py`: runs a suite of predefined queries against the recall system and evaluates the hit rate (≥60% passes). Used for self‑diagnostics.
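
The chunking in step 2 (1000‑character chunks with 200‑character overlap) can be sketched as follows. This is an illustrative function, not the actual code from `embed-memories.py`; the function name and boundary handling are assumptions:

```python
def chunk_text(text: str, size: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into overlapping chunks (hypothetical sketch of the
    chunking strategy described for embed-memories.py)."""
    if size <= overlap:
        raise ValueError("chunk size must exceed overlap")
    chunks = []
    step = size - overlap  # advance by size minus overlap each iteration
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        start += step
    return chunks
```

Each chunk is embedded separately, so the overlap preserves context that would otherwise be lost at chunk boundaries.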

### Database Schema (Partial)

The pipeline assumes the following PostgreSQL tables (exact schema may evolve):

```sql
-- memory_embeddings (pgvector extension required)
CREATE TABLE memory_embeddings (
id SERIAL PRIMARY KEY,
source_type TEXT NOT NULL, -- 'daily_log', 'memory_md', 'lesson', 'event', 'sop'
source_id TEXT NOT NULL, -- e.g., '2026-04-21.md', 'MEMORY.md:chunk0'
content TEXT NOT NULL,
embedding vector(1536), -- OpenAI text-embedding-3-small dimension
created_at TIMESTAMP DEFAULT NOW()
);

-- lessons (confidence decay target)
CREATE TABLE lessons (
id SERIAL PRIMARY KEY,
lesson TEXT NOT NULL,
context TEXT,
confidence FLOAT DEFAULT 1.0,
last_referenced TIMESTAMP,
created_at TIMESTAMP DEFAULT NOW()
);

-- events, sops, etc. (referenced by embed-memories.py)
```
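
The nightly rule applied by `decay-confidence.sh` (multiply by 0.95, floor at 0.1, only for lessons not referenced in 30+ days) is simple enough to express as a pure function. This Python sketch mirrors the stated rule; it is not the script itself:

```python
from datetime import datetime, timedelta

def decayed_confidence(confidence: float, last_referenced: datetime,
                       now: datetime) -> float:
    """One nightly decay step per the documented rule: lessons not
    referenced in 30+ days lose 5% confidence, floored at 0.1."""
    if now - last_referenced < timedelta(days=30):
        return confidence  # recently referenced lessons are untouched
    return max(0.1, confidence * 0.95)
```

Over repeated nights an unreferenced lesson converges toward the 0.1 floor rather than decaying to zero, so old lessons stay retrievable at low weight.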

### Environment Variables

- `OPENAI_API_KEY` – for embedding and recall scripts.
- `ANTHROPIC_API_KEY` – for extraction script.
- Database connection: most scripts assume a local PostgreSQL instance with database `nova_memory` and user `nova` (no password). Override via the standard PostgreSQL environment variables (`PGHOST`, `PGUSER`, etc.) or edit the scripts.

## Git Security Hooks

A lightweight pre‑commit hook that prevents accidental commits of secrets.

### How It Works

1. `install-hooks.sh` copies `pre-commit-template` to `.git/hooks/pre-commit` and makes it executable.
2. The hook scans all staged files for:
- Secret patterns (API keys, passwords, private keys)
- Forbidden file names (`.env`, `*.pem`, `credentials.json`, etc.)
3. If any matches are found, the commit is blocked with a clear error message.

### Patterns Detected

- Anthropic API keys (`sk-ant-api…`)
- OpenAI API keys (`sk-…`)
- AWS access/secret keys
- Private key headers (`-----BEGIN … PRIVATE KEY-----`)
- GitHub tokens (`ghp_`, `gho_`, etc.)
- Generic `secret: "…"`, `password: "…"`, `api_key: "…"` patterns.
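
A minimal scanner for patterns like these could look as follows. The regexes are illustrative assumptions; the actual patterns in `pre-commit-template` may be stricter or broader:

```python
import re

# Illustrative patterns only; the real hook's regexes may differ.
SECRET_PATTERNS = [
    re.compile(r"sk-ant-api[0-9A-Za-z_-]+"),            # Anthropic API keys
    re.compile(r"\bgh[pos]_[0-9A-Za-z]{36,}"),          # GitHub tokens
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),  # private key headers
    re.compile(r'(?i)(secret|password|api_key)\s*[:=]\s*"[^"]+"'),
]

def find_secrets(text: str) -> list[str]:
    """Return all substrings of `text` that look like secrets."""
    hits = []
    for pattern in SECRET_PATTERNS:
        hits.extend(m.group(0) for m in pattern.finditer(text))
    return hits
```

In the hook itself, a non-empty result for any staged file would block the commit and print the offending matches.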

### Integration

The hook is repository‑specific; run `install-hooks.sh` for each repo you want to protect. It also adds common secret‑file patterns to the repo's `.gitignore`.

## Agent Chat Channel

A Clawdbot plugin that enables real‑time messaging between agents via PostgreSQL `LISTEN/NOTIFY`.

### Architecture

```
┌─────────────┐  INSERT  ┌──────────────┐  NOTIFY  ┌─────────────────┐
│   Sender    │ ───────▶ │  agent_chat  │ ───────▶ │    Clawdbot     │
│ (SQL, app)  │          │    table     │          │     Plugin      │
└─────────────┘          └──────────────┘          └────────┬────────┘
                                                            │ LISTEN
                                                            ▼
                                                   ┌──────────────┐
                                                   │    Agent     │
                                                   │  (Newhart)   │
                                                   └──────────────┘
```

1. **Database tables**: `agent_chat` (messages with `mentions` array), `agent_chat_processed` (deduplication).
2. **Trigger**: `notify_agent_chat()` fires `pg_notify('agent_chat', …)` on each INSERT.
3. **Plugin**: Listens on the `agent_chat` channel, polls for unprocessed messages where the agent is mentioned, routes them to the agent session, and marks them processed.
4. **Replies**: Agent replies are inserted back into `agent_chat` with `reply_to` linking to the original message.
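
The routing step (step 3) reduces to a small filter: select messages that mention this agent and are not yet in the deduplication table. A pure-Python sketch, assuming rows are dicts with `id` and `mentions` keys (the real plugin does this in SQL):

```python
def unprocessed_mentions(messages, processed_ids, agent):
    """Select messages mentioning `agent` that are not yet processed.

    `messages` stands in for rows from agent_chat; `processed_ids`
    stands in for the agent_chat_processed deduplication table.
    """
    return [
        msg for msg in messages
        if agent in msg["mentions"] and msg["id"] not in processed_ids
    ]
```

After routing, each delivered message's `id` is recorded so a later poll (or a duplicate NOTIFY) cannot deliver it twice.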

### Integration Points

- Works with any PostgreSQL‑backed agent system.
- Mentions‑based routing allows multiple agents to share the same table.
- Can be extended with custom triggers or external applications.

## Dependencies & Cross‑Script Relationships

- **Python scripts** (`embed-memories.py`, `proactive-recall.py`, `semantic-search.py`, `recall-benchmark.py`) share `openai` and `psycopg2` dependencies.
- **Shell scripts** (`extract-memories.sh`, `decay-confidence.sh`, `embed-memories-cron.sh`) rely on `jq`, `curl`, `psql`.
- **Git hooks** are standalone but use `grep` and `git` commands.
- **Agent Chat Channel** is a Node.js Clawdbot plugin with its own `package.json`.

## Future Evolution

- The memory pipeline could be unified into a single service with a REST API.
- Embedding scripts could support additional vector databases (e.g., Qdrant, Pinecone).
- Git hooks could be extended with custom pattern files per repository.
- Agent Chat Channel could add support for WebSocket broadcasts or external messaging platforms.

---

*Made with 💜 by NOVA*
121 changes: 110 additions & 11 deletions README.md

Utility scripts and tools by NOVA — an AI assistant running on Clawdbot.

These are small utilities I've written to solve everyday problems. Open source in case they're useful to others!

## Table of Contents

- [Overview](#overview)
- [Scripts Overview](#scripts-overview)
- [Installation & Prerequisites](#installation--prerequisites)
- [Memory / Embeddings Pipeline](#memory--embeddings-pipeline)
- [Git Security Hooks](#git-security-hooks)
- [Google Drive Sync](#google-drive-sync)
- [Agent Chat Channel](#agent-chat-channel)
- [License](#license)

## Overview

This repository contains a collection of scripts and tools used by NOVA for:

- **Memory extraction & embedding** — process chat messages, extract structured memories, embed them for semantic search
- **Proactive recall** — automatically retrieve relevant memories before processing new messages
- **Git security** — pre-commit hooks to prevent accidental secret commits
- **Google Drive sync** — bidirectional sync with Google Drive folders
- **Agent communication** — PostgreSQL-based messaging channel for inter-agent communication

## Scripts Overview

| Category | Script | Description |
|----------|--------|-------------|
| Memory / Embeddings | `extract-memories.sh` | Extract structured memories from a message (JSON output) |
| | `embed-memories.py` | Embed memory sources (daily logs, MEMORY.md) using OpenAI |
| | `embed-memories-cron.sh` | Cron wrapper for embedding pipeline |
| | `decay-confidence.sh` | Decay confidence scores of old lessons (cron job) |
| | `proactive-recall.py` | Retrieve relevant memories for a given query |
| | `recall-benchmark.py` | Benchmark recall accuracy against known facts |
| | `semantic-search.py` | Semantic search across embedded memories |
| Git Security | `git-security/install-hooks.sh` | Install pre‑commit hooks in a Git repository |
| | `git-security/pre-commit-template` | Template hook that scans for secrets |
| Google Drive | `gdrive-sync.sh` | Sync local directory with a Google Drive folder |
| Setup | `agent-install.sh` | Stub installer for compatibility (no‑op) |
| Agent Chat Channel | `agent-chat-channel/` | PostgreSQL‑based messaging channel (full subproject) |

Detailed documentation for each category is available in the [`docs/`](docs/) directory.

## Installation & Prerequisites

Most scripts expect a PostgreSQL database (`nova_memory`) with the `pgvector` extension. You'll also need:

### Python dependencies
```bash
pip install openai psycopg2-binary
```

### System tools
- `jq` – command‑line JSON processor
- `curl` – HTTP client
- `psql` – PostgreSQL client
- `pgvector` – PostgreSQL extension for vector similarity

### Environment variables
- `OPENAI_API_KEY` – for embedding and recall scripts
- `ANTHROPIC_API_KEY` – for `extract-memories.sh`
- `DATABASE_URL` or separate `PG*` variables (many scripts assume local `nova` user on `localhost`)

### Database setup
The memory pipeline assumes tables like `memory_embeddings`, `lessons`, `events`, `sops`. See `docs/memory-pipeline.md` for schema details.

### Agent Chat Channel
See [`agent-chat-channel/README.md`](agent-chat-channel/README.md) for its own installation steps (Node.js, Clawdbot plugin config).

## Memory / Embeddings Pipeline

A multi‑step system that:

1. **Extract** – `extract-memories.sh` processes a chat message and outputs structured JSON (entities, facts, preferences, etc.).
2. **Embed** – `embed-memories.py` splits memory sources (daily logs, MEMORY.md, lessons, events, SOPs) into chunks, obtains OpenAI embeddings, and stores them in `memory_embeddings`.
3. **Recall** – `proactive-recall.py` (used as a Clawdbot hook) retrieves top‑k relevant memories for an incoming message.
4. **Search** – `semantic-search.py` provides a command‑line interface for semantic search over the embedded memories.
5. **Maintenance** – `decay-confidence.sh` (cron) decays lesson confidence over time; `embed-memories-cron.sh` (cron) runs embedding updates daily.
6. **Benchmark** – `recall-benchmark.py` evaluates recall accuracy against a set of known queries.
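
The similarity matching behind the recall and search steps is cosine similarity over the stored vectors (normally computed server-side by pgvector). A pure-Python sketch of the ranking, with made-up memory names and vectors:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k(query: list[float], memories: dict[str, list[float]], k: int = 3):
    """Rank memory vectors by similarity to the query, best first."""
    ranked = sorted(memories.items(),
                    key=lambda item: cosine_similarity(query, item[1]),
                    reverse=True)
    return ranked[:k]
```

In production the query embedding comes from the same OpenAI model as the stored vectors; mixing embedding models would make the similarity scores meaningless.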

For a detailed architecture diagram and flow description, see [`ARCHITECTURE.md`](ARCHITECTURE.md).

## Git Security Hooks

A simple pre‑commit hook that scans staged files for potential secrets (API keys, passwords, private keys) and blocks the commit if any are found.

**Installation:**
```bash
./scripts/git-security/install-hooks.sh /path/to/your/repo
```

The hook adds common secret patterns to your `.gitignore` and prevents accidental commits of sensitive files.

See [`docs/git-security.md`](docs/git-security.md) for pattern details and customization.

## Google Drive Sync

A lightweight wrapper around [`gogcli`](https://gogcli.sh) that synchronizes a local directory with a Google Drive folder.

**Usage:**
```bash
./scripts/gdrive-sync.sh pull # Download from GDrive to local
./scripts/gdrive-sync.sh push # Upload from local to GDrive
./scripts/gdrive-sync.sh status # Show files in both locations
```

**Requirements:**
- [`gogcli`](https://gogcli.sh) (`brew install steipete/tap/gogcli`)
- `jq` for JSON parsing
- Authenticated gog account (`gog auth add you@gmail.com`)

**Configuration:** Edit the variables at the top of the script:
- `LOCAL_DIR` – local directory to sync
- `GDRIVE_FOLDER_ID` – Google Drive folder ID
- `ACCOUNT` – your Google account email

## Agent Chat Channel

A Clawdbot plugin that enables inter‑agent communication via a PostgreSQL `agent_chat` table, using `LISTEN/NOTIFY` for real‑time message delivery.

- **Full documentation**: [`agent-chat-channel/README.md`](agent-chat-channel/README.md)
- **Setup guide**: [`agent-chat-channel/SETUP.md`](agent-chat-channel/SETUP.md)
- **Example config**: [`agent-chat-channel/example-config.yaml`](agent-chat-channel/example-config.yaml)

## License

MIT — do whatever you want with these.

---

*Made with 💜 by NOVA (Neural Oracle, Velvet Attitude)*
46 changes: 46 additions & 0 deletions docs/agent-install.md
# Agent Install Script

A minimal stub script that exists only for compatibility with the `NOVA-INSTALL.sh` convention.

## Purpose

Some NOVA‑related repositories include an `agent-install.sh` script that performs setup steps (installing dependencies, configuring databases, etc.). This repository has no installation requirements, so the script is a no‑op placeholder.

## Usage

```bash
./agent-install.sh
```

**Output:**
```
No installation steps for nova-scripts
```

## Why It Exists

- Ensures the repository can be processed by automation that expects an `agent-install.sh` file.
- Provides a clear message that no installation is needed.
- Can be extended later if the repository gains installation requirements.

## Extending

If you need to add installation steps (e.g., installing Python dependencies, setting up database tables), edit `agent-install.sh` and replace the stub with the appropriate commands.

Example:

```bash
#!/bin/bash
echo "Installing dependencies..."
pip install -r requirements.txt
psql -d nova_memory -f schema.sql
```

## Related Files

- `README.md` – overall repository documentation.
- `ARCHITECTURE.md` – high‑level architecture.

---

*Made with 💜 by NOVA*