
DeltaLoop

Stop optimizing prompts. Start optimizing models.

Your agents should learn from experience, not just follow instructions.

Prefer Gen Z language? Check out README_GENZ.md for the same info but it hits different fr fr 🔥

DeltaLoop is an open-source continuous fine-tuning layer that automatically converts your AI agent logs into training data and fine-tunes your model to inherently understand your domain.


The Problem

Traditional AI agent optimization is stuck in an endless loop:

Agent fails → Check logs → Rewrite prompt → Deploy → Test → Repeat

This cycle breeds prompt bloat (system prompts ballooning past 1,500 tokens), demands manual labor (100+ hours of prompt tweaking), and through it all the model never actually learns.

The DeltaLoop Solution

Turn logs you're already analyzing into training data instead of manually rewriting prompts.

With DeltaLoop:

Agent runs → Auto-collect logs → Fine-tune model → Deploy adapter → Improve

Automated improvement that compounds over time. No more endless prompt engineering.


Quick Start

Installation

pip install deltaloop

1. Instrument Your Agent (One Line!)

from deltaloop.adapters.langchain import DeltaLoopCallback

agent = create_react_agent(
    llm=llm,
    tools=tools,
    callbacks=[DeltaLoopCallback()]  # That's it!
)

# Run normally - logs auto-save to data/raw_logs/
agent.run(task)

2. Process, Train, Evaluate

# Process logs into training data
deltaloop distill --input data/raw_logs/traces.jsonl --output train.jsonl

# Fine-tune your model (LoRA adapters, only ~17MB!)
deltaloop train --dataset train.jsonl --model mistral-7b --steps 500

# Evaluate improvement
deltaloop eval --adapter data/models/v1

Done! Your model is now specialized for your domain.
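Why are the adapters so small? LoRA stores only low-rank update matrices, not full model weights. A back-of-envelope sketch (the rank and target modules below are illustrative assumptions, not DeltaLoop's exact defaults, and every adapted matrix is treated as d_model × d_model for simplicity):

# Back-of-envelope LoRA adapter size (illustrative numbers, not measured)
layers = 32     # transformer layers in a Mistral-7B-class model
targets = 2     # adapted matrices per layer (e.g. q_proj and v_proj)
rank = 16       # assumed LoRA rank
d_model = 4096  # hidden size

# Each adapted matrix gains two low-rank factors: A (rank x d_model) and B (d_model x rank)
params = layers * targets * 2 * rank * d_model
print(f"{params:,} params = {params * 2 / 1e6:.1f} MB in fp16")  # 8,388,608 params = 16.8 MB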


Key Features

  • Framework Agnostic - Works with LangChain, AutoGen, CrewAI, LlamaIndex, or custom agents
  • Fully Automated - Logs → Training → Deployment in 3 commands
  • Lightweight - LoRA adapters are only ~17MB (not full model weights)
  • Open Source - Apache 2.0, no vendor lock-in
  • Cost Effective - Reduce prompt costs by 80%+ (see the back-of-envelope sketch below)
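The cost figure follows from simple prompt arithmetic: once domain behavior lives in the adapter, the long instruction-heavy system prompt can shrink dramatically. A minimal sketch, assuming a 1,500-token system prompt is replaced by a 200-token one (illustrative counts, not benchmarks):

# Illustrative prompt-cost arithmetic (hypothetical token counts, not benchmarks)
before = 1500  # system-prompt tokens per call before fine-tuning
after = 200    # tokens per call once the behavior is baked into the adapter

print(f"Prompt tokens saved per call: {1 - after / before:.0%}")  # 87%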

How It Works

graph LR
    A[Agent Logs] --> B[Distillation]
    B --> C[Fine-Tuning]
    C --> D[LoRA Adapter]
    D --> E[Production]
    E -.Feedback.-> A
  1. Adapters - Framework-specific log collectors (LangChain, AutoGen, etc.)
  2. Distillation - Convert logs into high-quality training datasets
  3. Training - Fine-tune with Unsloth, Transformers, or DPO
  4. Evaluation - Compare adapted model vs baseline
  5. Deployment - Load improved adapters into production (see the loading sketch below)
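Deployment automation is still on the roadmap (see Project Status), but an adapter directory such as data/models/v1 can be loaded today with the standard PEFT API. A minimal sketch, assuming the adapter was trained from a Hugging Face Mistral-7B checkpoint and saved in PEFT format:

from transformers import AutoModelForCausalLM
from peft import PeftModel

# Assumed base checkpoint - use whichever base model the adapter was trained from
base = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-v0.1")

# Attach the LoRA adapter produced by `deltaloop train`
model = PeftModel.from_pretrained(base, "data/models/v1")

# Optional: merge the adapter into the base weights for inference-only serving
model = model.merge_and_unload()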

Python API

For programmatic workflows:

from deltaloop import Pipeline, PipelineConfig

# One-shot: logs → adapter
pipeline = Pipeline(PipelineConfig(
    raw_logs="data/raw_logs/traces.jsonl",
    base_model="mistral-7b",
    output_dir="data/models/v1"
))

result = pipeline.run()
print(f"Improvement: {result.eval_summary.improvement_percent:.1f}%")

Advanced: Framework Adapters

LangChain

from deltaloop.adapters.langchain import DeltaLoopCallback

agent = create_react_agent(llm=llm, tools=tools, callbacks=[DeltaLoopCallback()])

Custom/Other Frameworks

from deltaloop.adapters.generic import GenericLogger

logger = GenericLogger()

# Manually log each interaction
logger.log(
    prompt="Check order status for #12345",
    output="Order #12345 shipped on 2024-01-15",
    success=True,
    tool_calls=["check_order_status"]
)

logger.save("data/raw_logs/custom.jsonl")

Examples

Check out examples/customer_support_agent.py for a complete end-to-end example:

python examples/customer_support_agent.py --steps 100 --training-method sft

This demonstrates:

  • E-commerce support scenarios
  • Tool usage (order status, refunds, tickets)
  • Policy adherence
  • Before/after performance comparison

Project Status

Current: Alpha (v0.1.0) - Production-ready core, expanding features

  • ✅ Core distillation, training, and evaluation
  • ✅ LangChain adapter + generic logger
  • ✅ CLI with 4 commands
  • ✅ Python API
  • ✅ Comprehensive examples
  • 🚧 Additional framework adapters (AutoGen, CrewAI)
  • 🚧 Deployment automation
  • 🚧 Advanced evaluation tasks

Contributing

We welcome contributions! Priority areas:

  • Framework Adapters - AutoGen, CrewAI, Haystack, Semantic Kernel
  • Evaluation Tasks - Domain-specific benchmarks
  • Examples - Real-world use cases
  • Documentation - Tutorials, guides, videos

See CONTRIBUTING.md for details.


License

Apache 2.0 - See LICENSE for details.


Built for the open-source AI community.

Your agents should learn from experience, not just follow instructions.