`content/en/open_source/modules/mem_chat.md` (180 additions, 0 deletions)
---
title: MemChat
desc: MemChat is your "memory diplomat". It coordinates user input, memory retrieval, and LLM generation to create coherent conversations with long-term memory.
---

## 1. Introduction

**MemChat** is the conversation control center of MemOS.

It is not just a chat interface, but a bridge connecting "instant conversation" and "long-term memory". During interactions with users, MemChat retrieves relevant background information from MemCube (Memory Cube) in real time, builds context, and crystallizes new conversation content into new memories. With it, your Agent no longer has a "goldfish memory"; it becomes a truly intelligent companion that understands the past and grows continuously.

---

## 2. Core Capabilities

### Memory-Augmented Chat
Before answering user questions, MemChat automatically retrieves relevant Textual Memory from MemCube and injects it into the Prompt. This enables the Agent to answer questions based on past interaction history or knowledge bases, rather than relying solely on the LLM's pre-trained knowledge.

### Auto-Memorization
After each conversation round, MemChat uses an Extractor LLM to automatically extract valuable information from the conversation flow (such as user preferences and factual knowledge) and stores it in MemCube. The entire process is fully automated and requires no manual user intervention.
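The extraction step can be pictured with a toy stand-in. Note that this regex heuristic and the `extract_memories` name are purely illustrative: in MemChat this job is done by the Extractor LLM, not pattern matching.

```python
import re

def extract_memories(dialogue: list[dict]) -> list[str]:
    """Toy extractor: keep user sentences that look like stable facts or preferences.
    In MemChat this is an LLM call, not a regex."""
    facts = []
    for msg in dialogue:
        if msg["role"] != "user":
            continue
        for sent in re.split(r"[.!?]", msg["content"]):
            if re.search(r"\b(i like|i prefer|my \w+ is)\b", sent, re.IGNORECASE):
                facts.append(sent.strip())
    return facts

dialogue = [
    {"role": "user", "content": "I like green tea. What's the weather?"},
    {"role": "assistant", "content": "Sunny today!"},
]
print(extract_memories(dialogue))  # ['I like green tea']
```

The real extractor also captures facts that carry no such surface markers, which is why an LLM is used instead of rules.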

### Context Management
Automatically manages the conversation history window (`max_turns_window`). When conversations become too long, it trims old context while relying on retrieved long-term memory to maintain conversation coherence, effectively working around the LLM context-window limitation.
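A minimal sketch of the trimming idea (illustrative only; the actual MemChat implementation may slice differently):

```python
def trim_history(history: list[dict], max_turns_window: int) -> list[dict]:
    """Keep only the most recent turns; one turn = a user message plus an assistant reply.
    Older turns leave the prompt and must be recovered via long-term memory retrieval."""
    return history[-(max_turns_window * 2):]

history = [{"role": "user", "content": f"q{i}"} for i in range(30)]
recent = trim_history(history, max_turns_window=10)
# keeps the last 20 messages (10 turns * 2 messages)
```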

### Flexible Configuration
Supports configurable toggles for different types of memory (textual memory, activation memory, etc.) to adapt to different application scenarios.

---

## 3. Code Structure

Core logic is located under `memos/src/memos/mem_chat/`.

* **`simple.py`**: **Default implementation (SimpleMemChat)**. This is an out-of-the-box REPL (Read-Eval-Print Loop) implementation containing complete "retrieve -> generate -> store" loop logic.
* **`base.py`**: **Interface definition (BaseMemChat)**. Defines the basic behavior of MemChat, such as `run()` and `mem_cube` properties.
* **`factory.py`**: **Factory class**. Responsible for instantiating concrete MemChat objects based on configuration (`MemChatConfig`).

---

## 4. Key Interface

The main interaction entry point is the `MemChat` class (typically created by `MemChatFactory`).

### 4.1 Initialization
You need to first create a configuration object, then create an instance through the factory method. After creation, you must mount the `MemCube` instance to `mem_chat.mem_cube`.

### 4.2 `run()`
Starts an interactive command-line conversation loop. Suitable for development and debugging, it handles user input, calls memory retrieval, generates replies, and prints output.

### 4.3 Properties
* **`mem_cube`**: Associated MemCube object. MemChat reads and writes memories through it.
* **`chat_llm`**: LLM instance used to generate replies.

---

## 5. Workflow

A typical conversation round in MemChat includes the following steps:

1. **Receive Input**: Get user text input.
2. **Memory Recall**: (If `enable_textual_memory` is enabled) Use user input as Query to retrieve Top-K relevant memories from `mem_cube.text_mem`.
3. **Prompt Construction**: Concatenate system prompt, retrieved memories, and recent conversation history into a complete Prompt.
4. **Generate Response**: Call `chat_llm` to generate a reply.
5. **Memorization**: (If `enable_textual_memory` is enabled) Send this round's conversation (User + Assistant) to `mem_cube`'s extractor, extract new memories and store them in the database.
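The five steps above can be sketched end-to-end with stubbed components. The function names, the keyword-overlap retrieval, and the lambda stubs are all invented for illustration; MemChat uses vector search over `mem_cube.text_mem` and real LLM calls.

```python
def chat_turn(user_input, memory_store, history, chat_llm, extractor, top_k=5):
    """One MemChat round: recall -> build prompt -> generate -> memorize."""
    # 2. Memory Recall: keyword overlap stands in for Top-K vector retrieval
    score = lambda mem: len(set(mem.lower().split()) & set(user_input.lower().split()))
    recalled = sorted(memory_store, key=score, reverse=True)[:top_k]
    # 3. Prompt Construction: retrieved memories + recent history + current input
    prompt = "Relevant memories:\n" + "\n".join(recalled)
    prompt += "\nHistory:\n" + "\n".join(h["content"] for h in history)
    prompt += "\nUser: " + user_input
    # 4. Generate Response
    reply = chat_llm(prompt)
    # 5. Memorization: extract new memories from this round and store them
    memory_store.extend(extractor(user_input, reply))
    history += [{"role": "user", "content": user_input},
                {"role": "assistant", "content": reply}]
    return reply

# Stubs so the loop can run without any model
reply = chat_turn(
    "What tea do I like?",
    memory_store=["User likes green tea", "User lives in Berlin"],
    history=[],
    chat_llm=lambda prompt: "You like green tea.",
    extractor=lambda user, assistant: [],
)
print(reply)  # You like green tea.
```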

---

## 6. Development Example

Below is a complete code example showing how to configure MemChat and mount a MemCube based on Qdrant and OpenAI.

### 6.1 Code Implementation

```python
import os
import sys

# Ensure src module can be imported
sys.path.append(os.path.abspath(os.path.join(os.path.dirname(__file__), "../../../src")))

from memos.configs.mem_chat import MemChatConfigFactory
from memos.configs.mem_cube import GeneralMemCubeConfig
from memos.mem_chat.factory import MemChatFactory
from memos.mem_cube.general import GeneralMemCube


def get_mem_chat_config() -> MemChatConfigFactory:
    """Generate MemChat configuration"""
    return MemChatConfigFactory.model_validate(
        {
            "backend": "simple",
            "config": {
                "user_id": "user_123",
                "chat_llm": {
                    "backend": "openai",
                    "config": {
                        "model_name_or_path": os.getenv("MOS_CHAT_MODEL", "gpt-4o"),
                        "temperature": 0.8,
                        "max_tokens": 1024,
                        "api_key": os.getenv("OPENAI_API_KEY"),
                        "api_base": os.getenv("OPENAI_API_BASE"),
                    },
                },
                "max_turns_window": 20,
                "top_k": 5,
                "enable_textual_memory": True,  # Enable explicit memory
            },
        }
    )


def get_mem_cube_config() -> GeneralMemCubeConfig:
    """Generate MemCube configuration"""
    return GeneralMemCubeConfig.model_validate(
        {
            "user_id": "user03alice",
            "cube_id": "user03alice/mem_cube_tree",
            "text_mem": {
                "backend": "general_text",
                "config": {
                    "cube_id": "user03alice/mem_cube_general",
                    "extractor_llm": {
                        "backend": "openai",
                        "config": {
                            "model_name_or_path": os.getenv("MOS_CHAT_MODEL", "gpt-4o"),
                            "api_key": os.getenv("OPENAI_API_KEY"),
                            "api_base": os.getenv("OPENAI_API_BASE"),
                        },
                    },
                    "vector_db": {
                        "backend": "qdrant",
                        "config": {
                            "collection_name": "user03alice_mem_cube_general",
                            "vector_dimension": 1024,
                        },
                    },
                    "embedder": {
                        "backend": os.getenv("MOS_EMBEDDER_BACKEND", "universal_api"),
                        "config": {
                            "provider": "openai",
                            "api_key": os.getenv("MOS_EMBEDDER_API_KEY", "EMPTY"),
                            "model_name_or_path": os.getenv("MOS_EMBEDDER_MODEL", "bge-m3"),
                            "base_url": os.getenv("MOS_EMBEDDER_API_BASE"),
                        },
                    },
                },
            },
        }
    )


def main():
    print("Initializing MemChat...")
    mem_chat = MemChatFactory.from_config(get_mem_chat_config())

    print("Initializing MemCube...")
    mem_cube = GeneralMemCube(get_mem_cube_config())

    # Critical step: mount the memory cube
    mem_chat.mem_cube = mem_cube

    print("Starting Chat Session...")
    try:
        mem_chat.run()
    finally:
        print("Saving memory cube...")
        mem_chat.mem_cube.dump("new_cube_path")


if __name__ == "__main__":
    main()
```

---

## 7. Configuration Description

When configuring `MemChatConfigFactory`, the following parameters are crucial:

* **`user_id`**: Required. Used to identify the current user in the conversation, ensuring memory isolation.
* **`chat_llm`**: Chat model configuration. Recommend using a capable model (such as GPT-4o) for better reply quality and instruction-following ability.
* **`enable_textual_memory`**: `True` / `False`. Whether to enable textual memory. If enabled, the system will perform retrieval before conversation and storage after conversation.
* **`max_turns_window`**: Integer. Number of conversation turns to retain in history. History beyond this limit will be truncated, relying on long-term memory to supplement context.
* **`top_k`**: Integer. The number of most relevant memory fragments retrieved from the memory store and injected into the Prompt on each turn.
`content/en/open_source/modules/mem_cube.md` (41 additions, 34 deletions)
---
title: MemCube
desc: "`MemCube` is your memory container that manages three types of memories: textual memory, activation memory, and parametric memory. It provides a simple interface for loading, saving, and operating on multiple memory modules, making it easy to build, save, and share memory-augmented applications."
---
## What is MemCube?

**MemCube** contains three major types of memory:

- **Textual Memory**: Stores text knowledge, supporting semantic search and knowledge management.
- **Activation Memory**: Stores intermediate reasoning results, accelerating LLM responses.
- **Parametric Memory**: Stores model adaptation weights, used for personalization.

Each memory type can be independently configured and flexibly combined based on application needs.

## Structure

MemCube is defined by a configuration (see `GeneralMemCubeConfig`), which specifies the backend and settings for each memory type. The typical structure is:

```
MemCube
├── user_id
├── cube_id
├── text_mem: TextualMemory
├── act_mem: ActivationMemory
└── para_mem: ParametricMemory
```

Starting from MemOS 2.0, runtime operations (add/search) should go through the **View** architecture.

### SingleCubeView

Use this to manage a single MemCube when you only need one memory space.

```python
from memos.multi_mem_cube.single_cube import SingleCubeView
# …
view.search_memories(search_request)
```

### CompositeCubeView

Use this to manage multiple MemCubes when you need unified operations across multiple memory spaces. It fans out operations to multiple SingleCubeViews and aggregates the results.

```python
from memos.multi_mem_cube.composite_cube import CompositeCubeView
# …
results = composite.search_memories(search_request)
# Results contain cube_id field to identify source
```
```

## API Request Fields

When using the View architecture for add/search operations, specify these parameters:

| Field | Type | Description |
| :--- | :--- | :--- |
| `writable_cube_ids` | `list[str]` | Target cubes for add operations. Can specify multiple; the system will write to all targets in parallel. |
| `readable_cube_ids` | `list[str]` | Target cubes for search operations. Can search across multiple cubes; results include source information. |
| `async_mode` | `str` | Execution mode: `"sync"` for synchronous processing (wait for results), `"async"` for asynchronous processing (push to background queue, return task ID immediately). |
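A sketch of how these fields might drive dispatch. The `submit` helper, the queue, and the request dict shape are hypothetical; only the three field names come from the table above.

```python
import uuid

def submit(request: dict, queue: list) -> dict:
    """Toy dispatcher: honor async_mode by either executing now or enqueueing."""
    if request.get("async_mode") == "async":
        task_id = str(uuid.uuid4())
        queue.append((task_id, request))  # a background scheduler would pick this up
        return {"task_id": task_id}
    # sync: fan out to every writable cube and wait for the results
    return {"written": list(request.get("writable_cube_ids", []))}

queue = []
add_request = {
    "writable_cube_ids": ["user03alice/mem_cube_general"],
    "async_mode": "sync",
    "memory_content": "User prefers green tea",
}
print(submit(add_request, queue))  # {'written': ['user03alice/mem_cube_general']}
```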

## Core Methods (`GeneralMemCube`)

**GeneralMemCube** is the standard implementation of MemCube, managing all system memories through a unified interface. Here are the main methods to complete memory lifecycle management.

### Initialization

```python
# …
mem_cube = GeneralMemCube(config)
```

### Static Data Operations

| Method | Description |
| :--- | :--- |
| `init_from_dir(dir)` | Load a MemCube from a local directory |
| `init_from_remote_repo(repo, base_url)` | Load a MemCube from a remote repository (e.g., Hugging Face) |
| `load(dir)` | Load all memories from a directory into the existing instance |
| `dump(dir)` | Save all memories to a directory for persistence |

## File Structure

A MemCube directory contains the following files, one per memory type plus the configuration:

- `config.json` (MemCube configuration)
- `textual_memory.json` (textual memory)
- `activation_memory.pickle` (activation memory)
- `parametric_memory.adapter` (parametric memory)
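A quick way to sanity-check a dumped cube directory (the helper name is ours; the file list matches the layout above):

```python
import os

EXPECTED_FILES = [
    "config.json",
    "textual_memory.json",
    "activation_memory.pickle",
    "parametric_memory.adapter",
]

def inspect_cube_dir(path: str) -> dict[str, bool]:
    """Report which of the expected MemCube files are present in the directory."""
    return {name: os.path.exists(os.path.join(path, name)) for name in EXPECTED_FILES}
```

After a successful `mem_cube.dump("new_cube_path")`, every value in the report should be `True`.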

## Usage Examples

### Export Example (dump_cube.py)

```python
# …
result = view.add_memories(APIADDRequest(
    # …
))
print(f"✓ Added {len(result)} memories")

# 4. Export data for the specific cube_id
output_dir = "tmp/mem_cube_dump"
if os.path.exists(output_dir):
    shutil.rmtree(output_dir)
os.makedirs(output_dir, exist_ok=True)

# Export graph data (only data for the current cube_id)
json_data = naive.text_mem.graph_store.export_graph(
    include_embedding=True,  # Include embeddings to support semantic search
    user_name=EXAMPLE_CUBE_ID,  # Filter by cube_id
)

# …
print(f"✓ Saved to: {memory_file}")
```

### Import and Search Example (load_cube.py)

> **Embedding Compatibility Note**: The sample data uses the **bge-m3** model with **1024 dimensions**. If your environment uses a different embedding model or dimension, semantic search after import may be inaccurate or fail. Ensure your `.env` configuration matches the embedding settings used during export.

```python
import json
# …
```

The old approach of directly calling `mem_cube.text_mem.get_all()` is deprecated.

## Developer Notes

* MemCube enforces schema consistency to ensure safe loading and dumping
* Each memory type can be independently configured, tested, and extended
* See `/tests/mem_cube/` for integration tests and usage examples
`content/en/open_source/modules/mem_feedback.md` (4 additions, 4 deletions)
---
title: MemFeedback
desc: MemFeedback is your "memory error notebook". It enables your Agent to understand 'You remembered it wrong' and automatically correct the memory database. It is a key component for achieving self-evolving memory.
---

## 1. Introduction

In long-term memory systems, the biggest headache is often not "forgetting," but "remembering it wrong and being unable to change it." When a user says, "No, my birthday is tomorrow" or "Change the project code to X," simple RAG systems are usually helpless.

MemFeedback can understand these natural language instructions, automatically locate conflicting memories in the database, and execute atomic correction operations (such as archiving old memories and writing new ones). With it, your Agent can correct errors and learn continuously during interactions, just like a human.

---

When the user points out a factual error, the system will not brutally delete the old memory, but archives it and writes the corrected version.
### Addition
If the user just supplements new information that does not conflict with old memories, it is simple—directly save it as a new node in the memory database.

### Keyword Replacement (Global Refactor)
Similar to "Global Refactor" in an IDE. For example, if the user says, "Change 'Zhang San' to 'Li Si' in all documents," the system will combine the Reranker to automatically determine the scope of affected documents and update all relevant memories in batches.
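Conceptually, the batch update behaves like this toy in-memory version. This is only an illustration of the outcome: the real implementation scopes the affected set with a Reranker rather than plain substring matching, and none of these names come from the MemFeedback API.

```python
def keyword_replace(memories: dict[str, str], old: str, new: str) -> list[str]:
    """Replace a keyword across all affected memories; return the updated ids."""
    affected = [mid for mid, text in memories.items() if old in text]
    for mid in affected:
        memories[mid] = memories[mid].replace(old, new)
    return affected

memories = {"m1": "Zhang San leads the project", "m2": "Budget approved"}
print(keyword_replace(memories, "Zhang San", "Li Si"))  # ['m1']
```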

### Preference Evolution
---

There is only one main entry point: `process_feedback()`.

| Parameter | Description |
| :--- | :--- |
| `user_id` / `user_name` | User identification and Cube ID. |
| `chat_history` | Conversation history, letting LLM know what you talked about. |
| `feedback_content` | The feedback sentence from the user (e.g., "No, it's 5 o'clock"). |
| **`retrieved_memory_ids`** | **Required (Strongly Recommended)**. Pass in the memory IDs retrieved in the previous RAG round. This gives the system a "target," telling it which memory to correct. If not passed, the system has to search again in the massive memory, which is slow and prone to errors. |