MemGovern enhances SWE-Agent by injecting governance-aware experience memories into the agent's reasoning loop. When facing a new GitHub issue, the agent retrieves similar past experiences and learns from successful resolution patterns.
🐛 New Issue → 🔍 Memory Retrieval → 📚 Experience Injection → 🧠 Enhanced Reasoning → ✅ Better Patches
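The retrieve → inject loop above can be sketched in a few lines of Python. This is purely illustrative: the function names are hypothetical, and naive keyword overlap stands in for the semantic vector search the real system performs via ChromaDB.

```python
# Illustrative sketch of the MemGovern loop (hypothetical names; the real
# system retrieves via ChromaDB semantic search, not keyword overlap).

def retrieve_memories(issue_text, experience_db, top_k=2):
    """Rank stored experiences by keyword overlap with the new issue."""
    issue_words = set(issue_text.lower().split())
    scored = [
        (len(issue_words & set(exp["bug_description"].lower().split())), exp)
        for exp in experience_db
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [exp for score, exp in scored[:top_k] if score > 0]

def inject_experiences(issue_text, memories):
    """Prepend retrieved fix experiences to the agent's problem statement."""
    context = "\n".join(f"- Past fix: {m['fix_experience']}" for m in memories)
    return f"Relevant experience:\n{context}\n\nIssue:\n{issue_text}"

db = [
    {"bug_description": "null pointer crash in parser",
     "fix_experience": "guard against None input"},
    {"bug_description": "timeout in network retry loop",
     "fix_experience": "cap retry backoff"},
]
prompt = inject_experiences("parser crash on None",
                            retrieve_memories("parser crash on None", db))
```

The enriched prompt then flows into the agent's normal reasoning loop in place of the raw issue text.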
| Model | SWE-Agent | MemGovern (Ours) | Improvement |
|---|---|---|---|
| Claude-4-Sonnet | 66.6% | 69.8% | +3.2 |
| GPT5-Medium | 65.0% | 67.4% | +2.4 |
| DeepSeek-V3.1T | 62.8% | 65.8% | +3.0 |
| Qwen3-235B | 47.2% | 55.4% | +8.2 |
| Kimi-K2-Instruct | 43.8% | 51.8% | +8.0 |
| GPT-4o | 23.2% | 32.6% | +9.4 |
| GPT-4o-Mini | 14.0% | 17.2% | +3.2 |
From "Solving from Scratch" → To "Learning from Experience"
MemGovern/
├── data/ # 📦 Experience DB artifacts (Git LFS)
│ └── agentic_exp_data_1220_13w_DSnewPrompt/
│ ├── experience_data.json
│ └── chroma_db_experience/
├── trajectories/ # 🗂️ Model trajectory archives (Git LFS)
│ ├── gpt4o_*.tar.gz
│ ├── gemini3_pro_trajectory.tar.gz
│ └── ...
├── config/ # ⚙️ SWE-Agent compatible YAML configs
│ ├── benchmarks/ # Benchmark sweep configurations
│ ├── demo/ # Lightweight demo presets
│ ├── human/ # Human study protocols
│ └── exotic/ # Ablation experiment settings
├── tools/ # 🔧 Memory pipeline utilities
│ ├── experience_server.py
│ ├── issue_memory_rag/
│ ├── exp_search/
│ └── ...
├── scripts/ # 📜 Data collection scripts
│ ├── github_scraper.py
│ └── experience_process.py
├── figs/ # 🖼️ Publication-ready figures
└── requirements.txt # 📦 Runtime deps (installs SWE-agent + utilities)
MemGovern is implemented as memory tools + configs on top of SWE-agent. A full run uses two terminals:
- Terminal A: start the Experience Server (vector search + experience lookup)
- Terminal B: run SWE-agent on SWE-bench with a MemGovern config that calls the server tools
Requirements: Linux (or WSL2), Python ≥ 3.11, Git, Docker.
WSL2 note: Windows drives are mounted under `/mnt/` (e.g., `E:\` → `/mnt/e/`).
git clone https://github.com/QuantaAlpha/MemGovern.git
cd MemGovern
python3 -m venv SWE
source SWE/bin/activate
pip install -U pip
pip install -r requirements.txt

The Experience Server needs two artifacts:
- `experience_data.json`: governed experience cards (key → structured fields, including `bug_description`/`fix_experience`)
- `chroma_db_experience/`: a persistent ChromaDB store used for semantic retrieval
In this repository, we provide them under:
`data/agentic_exp_data_1220_13w_DSnewPrompt/` (tracked via Git LFS)
Place them in a directory (example layout):
<EXPERIENCE_DATA_DIR>/
├── experience_data.json
└── chroma_db_experience/
├── chroma.sqlite3
└── <uuid>/
├── data_level0.bin
├── header.bin
├── index_metadata.pickle
├── length.bin
└── link_lists.bin
Notes:
- These artifacts are large; we recommend hosting them via Git LFS or a separate dataset release.
- Retrieval quality depends on using the same embedding model at serving time as was used to build the ChromaDB store.
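Before starting the server, it can be worth verifying the layout matches the tree above. The helper below is a convenience sketch, not part of the MemGovern codebase; it checks only the two top-level artifacts named in this README.

```python
from pathlib import Path

# Expected artifact paths from the layout above; this checker is an
# illustrative convenience, not part of the MemGovern codebase.
EXPECTED = [
    "experience_data.json",
    "chroma_db_experience/chroma.sqlite3",
]

def missing_artifacts(exp_dir):
    """Return the expected artifact paths that are absent under exp_dir."""
    root = Path(exp_dir)
    return [rel for rel in EXPECTED if not (root / rel).exists()]
```

Run it against your `<EXPERIENCE_DATA_DIR>` and an empty list means both artifacts are in place.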
In our internal runs, we keep these files under a folder named `agentic_exp_data_1220_13w_DSnewPrompt/`.
cd <MEMGOVERN_ROOT>/data/agentic_exp_data_1220_13w_DSnewPrompt
source <MEMGOVERN_ROOT>/SWE/bin/activate
export DB_DIR="$PWD/chroma_db_experience"
export JSON_DATA_PATH="$PWD/experience_data.json"
export MODEL_PATH="<PATH_OR_MODEL_ID_FOR_SENTENCE_TRANSFORMERS>"
export HOST="0.0.0.0"
export PORT="9030"
python <MEMGOVERN_ROOT>/tools/experience_server.py

How to confirm it is running
In another shell:
curl -s http://localhost:9030/health

You should also see log lines like:

[TOOL] /search ...
[TOOL] /get_experience ...

when the agent invokes the tools (this is the evidence we use to confirm an end-to-end run).
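Conceptually, the `/get_experience` endpoint resolves a key against the JSON store loaded from `JSON_DATA_PATH` and returns the governed card fields. The sketch below is a simplified stand-in (see `tools/experience_server.py` for the actual implementation; the field filtering shown is an assumption based on the card schema described above):

```python
# Simplified sketch of /get_experience semantics (hypothetical; the real
# handler lives in tools/experience_server.py).

def get_experience(store, key):
    """Look up an experience card by key and return its governed fields."""
    card = store.get(key)
    if card is None:
        return {"error": f"unknown experience key: {key}"}
    # Expose only the structured fields the agent is meant to read.
    return {k: card[k] for k in ("bug_description", "fix_experience") if k in card}
```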
Before running, edit config/dsv31t_agenticMemSearch_1220_13w.yaml and replace:
- `agent.model.api_base`: YOUR_API_BASE
- `agent.model.api_key`: YOUR_API_KEY
cd <MEMGOVERN_ROOT>
source SWE/bin/activate
sweagent run-batch \
--config config/dsv31t_agenticMemSearch_1220_13w.yaml \
--instances.type swe_bench \
--instances.subset verified \
--instances.split test \
--num_workers 12 \
  --instances.shuffle=False

About the config → server wiring
config/dsv31t_agenticMemSearch_1220_13w.yaml sets tool endpoints:
- `GRAPH_EXP_SEARCH_URL`: http://host.docker.internal:9030/search
- `GRAPH_EXP_READ_URL`: http://host.docker.internal:9030/get_experience
This is the recommended setup when SWE-agent runs tasks inside Docker and the Experience Server runs on the host.
After the run finishes, evaluate the produced predictions:
python -m swebench.harness.run_evaluation \
--predictions_path <PATH_TO_PREDS_JSON> \
--dataset_name princeton-nlp/SWE-bench_Verified \
--run_id <RUN_ID> \
  --max_workers 8

The predictions file is typically named `preds.json` under your run's `trajectories/` output directory. If `python -m swebench...` is not available in your environment, install the SWE-bench harness following the official SWE-bench instructions.
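A quick sanity check on the predictions file before evaluation can save a wasted harness run. The snippet below is a sketch, assuming each record carries an `instance_id` and a `model_patch` field (the fields the harness consumes); adjust if your `preds.json` layout differs.

```python
import json

# Sanity-check a predictions file before evaluation. Assumes records carry
# "instance_id" and "model_patch"; adapt if your preds.json differs.

def check_preds(path):
    """Report total predictions and instance_ids with empty patches."""
    with open(path) as f:
        preds = json.load(f)
    # Some tools emit a dict keyed by instance_id, others a flat list.
    records = list(preds.values()) if isinstance(preds, dict) else preds
    empty = [r["instance_id"] for r in records if not r.get("model_patch")]
    return {"total": len(records), "empty_patches": empty}
```

A nonzero `empty_patches` list usually means some runs exited without producing a patch and will score as unresolved.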
Scrape GitHub PR data (metadata + patch + comments):
export GITHUB_TOKEN=your_github_token
python scripts/github_scraper.py \
--csv-path <PATH_TO_INPUT_CSV> \
--output-dir <OUTPUT_DIR> \
  --chunk-size 200

We provide experience_process.py to transform issue/PR/patch fields into governed experience cards using an LLM.
It reads an input parquet table and writes JSONL/parquet with the Experience Card fields.
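The transformation can be sketched as below, with the LLM call stubbed out (here `summarize()` just truncates; the real script prompts an LLM). The two field names follow the Experience Card schema noted earlier; everything else is a hypothetical simplification.

```python
# Sketch of the card-building step in experience_process.py, with the LLM
# summarization stubbed out as truncation (hypothetical simplification).

def summarize(text, limit=80):
    """Stand-in for the LLM summarization step."""
    return text if len(text) <= limit else text[: limit - 3] + "..."

def build_experience_card(issue_title, issue_body, patch):
    """Condense raw issue/PR/patch fields into a governed experience card."""
    return {
        "bug_description": summarize(f"{issue_title}: {issue_body}"),
        "fix_experience": summarize(f"Patch applied: {patch}"),
    }
```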
export API_KEY=your_llm_key
export BASE_URL=your_llm_base_url # optional if using OpenAI default
export MODEL=your_model_name
python scripts/experience_process.py \
--input <INPUT_PARQUET> \
--output <OUTPUT_JSONL_OR_PARQUET> \
--output-format jsonl \
  --max-workers 200

Launch the memory retrieval service (see "Reproducing MemGovern" above). The server reads these env vars:

- `DB_DIR`
- `JSON_DATA_PATH`
- `MODEL_PATH`
- `HOST` (default `0.0.0.0`)
- `PORT` (default `9030`)
| Config | Use Case |
|---|---|
| `config/benchmarks/*.yaml` | Full benchmark sweeps with different governance settings |
| `config/demo/*.yaml` | Quick demos with minimal latency |
| `config/human/*.yaml` | Human evaluation study protocols |
| `config/exotic/*.yaml` | Ablation: windowed replace, late reproduction |
We welcome contributions of all kinds—new configs, tools, bug fixes, or documentation improvements!
- 🐛 Bug Reports: Open an issue
- 💡 New Configs: Add timestamped YAML files under `config/`
- 🔧 New Tools: Extend the `tools/` directory with your utilities
- 📊 Trajectories: Share model runs via Git LFS

Note: Large files (>50 MB) should use Git LFS. Run `git lfs ls-files` before committing.
Special thanks to:
- SWE-Agent - The foundation agent framework
- RepoMaster - Autonomous repository exploration
- SWE-Bench - The evaluation benchmark
- ChromaDB - Vector database for memory retrieval
QuantaAlpha was founded in April 2025 by researchers from Tsinghua University, Peking University, CAS, CMU, HKUST, and more.
🌟 Our mission: Explore the "quantum" of intelligence and pioneer the "alpha" frontier of agent research.
✨ Research Directions:
- CodeAgent: End-to-end autonomous task execution
- DeepResearch: Deep reasoning & retrieval-augmented intelligence
- Agentic RL: Agent-based reasoning and reinforcement learning
- Self-evolution: Multi-agent coordination and learning
🔗 Team Homepage: QuantaAlpha
📧 Email: quantaalpha.ai@gmail.com
⭐ If MemGovern helps your research, please give us a star!
Made with ❤️ by the QuantaAlpha Team



