# OpenDsStar Architecture

This document describes the improved architecture of the OpenDsStar project, focusing on a clear separation of concerns between agents, tools, and experiments.

## Directory Structure
```
src/
├── agents/                             # Agent implementations
│   ├── base_agent.py                   # Base agent class
│   ├── ds_star/                        # DS-Star agent
│   │   ├── open_ds_star_agent.py       # Main agent class
│   │   ├── ds_star_graph.py            # Graph implementation
│   │   ├── ds_star_state.py            # State definition
│   │   ├── ds_star_execute_env.py      # Execution environment
│   │   ├── ds_star_utils.py            # Utilities
│   │   └── nodes/                      # Graph nodes
│   ├── analyzer/                       # Analyzer agent
│   │   ├── analyzer_graph.py           # Graph implementation
│   │   ├── analyzer_state.py           # State definition
│   │   ├── analyzer_execute_env.py     # Execution environment
│   │   └── nodes/                      # Graph nodes
│   ├── react_langchain/                # ReAct agent (LangChain)
│   │   └── react_agent_langchain.py
│   ├── react_smolagents/               # ReAct agent (SmoLAgents)
│   │   └── react_agent_smolagents.py
│   ├── codeact_smolagents/             # CodeAct agent (SmoLAgents)
│   │   └── codeact_agent_smolagents.py
│   └── utils/                          # Agent-specific utilities
│
├── ingestion/                          # Document ingestion utilities
│   ├── analyzer.py                     # Analyzer-based processor
│   └── docling_analyzer.py             # Docling-based processor
│
├── tools/                              # Shared, reusable tools
│   ├── __init__.py
│   ├── vector_store_tool.py            # Semantic search tool
│   └── analyzer_retriever.py           # Analyzer summary retriever
│
├── experiments/                        # Experiment framework
│   ├── core/                           # Core types and configuration
│   │   ├── config.py                   # Configuration classes
│   │   ├── context.py                  # Pipeline context
│   │   ├── types.py                    # Type definitions
│   │   └── enums.py                    # Enumerations
│   ├── interfaces/                     # Abstract interfaces
│   │   ├── agent_builder.py            # Agent builder interface
│   │   ├── tool_builder.py             # Tool builder interface
│   │   ├── data_reader.py              # Data reader interface
│   │   ├── evaluator.py                # Evaluator interface
│   │   └── agent_runner.py             # Agent runner interface
│   ├── implementations/                # Concrete implementations
│   │   └── invoke_agent_runner.py      # Default agent runner
│   ├── evaluators/                     # Evaluation implementations
│   │   └── unitxt_llm_judge.py         # LLM-as-judge evaluator
│   ├── utils/                          # Utility functions
│   │   ├── cache.py                    # Caching utilities
│   │   ├── evaluation_cache.py         # Evaluation caching
│   │   ├── logging.py                  # Logging utilities
│   │   └── validation.py               # Validation utilities
│   ├── pipeline.py                     # Main experiment pipeline
│   └── experiments/                    # Specific experiments
│       ├── base_experiment.py          # Base experiment class
│       ├── demo/                       # Demo experiment
│       └── hotpotqa/                   # HotpotQA experiment
│
└── runner/                             # Simple runner utilities
    └── simple_qa_loop.py               # Interactive QA loop
```
## Separation of Concerns

The architecture maintains clear boundaries between three main layers.

### Agents Layer (`src/agents/`)
- Responsibility: Agent implementations and their internal logic
- Contains: Agent classes, graph definitions, nodes, and agent-specific utilities
- Does NOT contain: Experiment configuration, evaluation logic, or tool definitions
### Tools Layer (`src/tools/`)
- Responsibility: Reusable tools that can be used by any agent
- Contains: Tool implementations (retrievers, calculators, etc.)
- Key principle: Tools are agent-agnostic and experiment-agnostic
### Experiments Layer (`src/experiments/`)
- Responsibility: Orchestrating experiments, evaluation, and benchmarking
- Contains: Pipeline, interfaces, evaluators, and experiment configurations
- Does NOT contain: Agent implementation details
## Dependency Flow

```
Experiments Layer
    ↓ (uses interfaces)
Agents Layer
    ↓ (uses)
Tools Layer
```
Key principles:
- Experiments depend on agent interfaces, not implementations
- Agents use tools but don't own them (see the sketch after this list)
- Tools are independent and reusable
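The second principle amounts to constructor injection. A schematic sketch of the idea (class and method names here are illustrative, not taken from the codebase):

```python
# Schematic only: the agent receives its tools from outside rather than
# constructing them, so any tool with the expected call interface works.
class SchematicAgent:
    def __init__(self, tools):
        self.tools = {tool.name: tool for tool in tools}  # injected, not owned

    def use_tool(self, tool_name, *args, **kwargs):
        return self.tools[tool_name](*args, **kwargs)
```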
## Configuration System

The new configuration system provides a clear separation between agent, experiment, and tool settings:
```
# Agent-specific configuration
AgentConfig:
  - model
  - temperature
  - max_steps
  - code_timeout
  - code_mode
  - system_prompt
  - task_prompt

# Experiment-specific configuration
ExperimentConfig:
  - run_id
  - fail_fast
  - output_dir
  - cache_dir
  - agent_config    # Nested agent config
  - use_cache
  - log_level

# Tool-specific configuration
ToolConfig:
  - embedding_model
  - chunk_size
  - chunk_overlap
  - top_k
```
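As a rough illustration, such configs could be declared as dataclasses. This is a sketch only: the field names follow the listing above, but the types and defaults are assumptions rather than the definitions in `experiments/core/config.py`.

```python
from dataclasses import dataclass

@dataclass
class AgentConfig:
    # Field names follow the listing above; types and defaults are assumed.
    model: str
    temperature: float = 0.0
    max_steps: int = 5
    code_timeout: int = 60        # assumed unit: seconds
    code_mode: str = "local"      # assumed default
    system_prompt: str = ""
    task_prompt: str = ""

@dataclass
class ExperimentConfig:
    run_id: str
    agent_config: AgentConfig     # nested agent config
    fail_fast: bool = False
    output_dir: str = "outputs"
    cache_dir: str = ".cache"
    use_cache: bool = True
    log_level: str = "INFO"
```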
## Experiment Pipeline

The ExperimentPipeline orchestrates the complete experiment workflow:

1. Read Data: Load the corpus and benchmarks
2. Create Tools: Build tools from the corpus using ToolBuilders
3. Build Agent: Create an agent with those tools using an AgentBuilder
4. Run Agent: Execute the agent on the benchmarks using an AgentRunner
5. Evaluate: Assess the results using Evaluators
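A minimal sketch of how these steps might be wired together. The `build_tools` and `evaluate_one` signatures match the examples later in this document; the other method names (`read`, `run`) are assumptions about the actual interfaces.

```python
# Sketch of the orchestration loop; the real implementation lives in
# src/experiments/pipeline.py and may differ in names and details.
class ExperimentPipeline:
    def __init__(self, ctx, data_reader, tool_builders,
                 agent_builder, agent_runner, evaluators):
        self.ctx = ctx
        self.data_reader = data_reader
        self.tool_builders = tool_builders
        self.agent_builder = agent_builder
        self.agent_runner = agent_runner
        self.evaluators = evaluators

    def run(self):
        corpus, benchmarks = self.data_reader.read(self.ctx)          # 1. read data
        tools = [tool                                                 # 2. create tools
                 for builder in self.tool_builders
                 for tool in builder.build_tools(self.ctx, benchmarks, corpus)]
        agent = self.agent_builder.build_agent(self.ctx, tools)      # 3. build agent
        outputs = self.agent_runner.run(self.ctx, agent, benchmarks) # 4. run agent
        return [evaluator.evaluate_one(self.ctx, output, benchmark)  # 5. evaluate
                for evaluator in self.evaluators
                for output, benchmark in zip(outputs, benchmarks)]
```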
## Interfaces

All interfaces follow the dependency inversion principle:
- AgentBuilder: Creates agents with tools
- ToolBuilder: Creates tools from corpus/benchmarks
- DataReader: Loads data for experiments
- Evaluator: Evaluates agent outputs
- AgentRunner: Executes agents on benchmarks
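As an example, the tool builder interface might look roughly like this. It is a sketch: the `build_tools(ctx, benchmarks, corpus)` signature matches the builder example below, but the exact types in `src/experiments/interfaces/tool_builder.py` are assumed.

```python
from abc import ABC, abstractmethod
from typing import Any, Sequence

class ToolBuilder(ABC):
    """Creates tools from a corpus and/or benchmarks (sketch)."""

    @abstractmethod
    def build_tools(self, ctx: Any, benchmarks: Sequence[Any],
                    corpus: Sequence[Any]) -> Sequence[Any]:
        """Return the tools an agent should be given for this experiment."""
```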
## BaseExperiment

BaseExperiment provides a template for creating new experiments:
```python
# DataReader, ToolBuilder, AgentBuilder, and Evaluator come from
# experiments/interfaces (see the directory layout above).
class MyExperiment(BaseExperiment):
    def get_data_reader(self) -> DataReader:
        ...  # return a data reader implementation

    def get_tools_builder(self) -> Sequence[ToolBuilder]:
        ...  # return tool builders

    def get_agent_builder(self) -> AgentBuilder:
        ...  # return an agent builder

    def get_evaluators(self) -> Sequence[Evaluator]:
        ...  # return evaluators
```

## Design Patterns

### Builder Pattern

Used for constructing complex objects (agents, tools):
```python
# Tool builder
class HotpotQAToolsBuilder(ToolBuilder):
    def build_tools(self, ctx, benchmarks, corpus):
        return [VectorStoreTool(corpus=corpus)]  # further tool config elided

# Agent builder
class DemoAgentBuilder(AgentBuilder):
    def build_agent(self, ctx, tools):
        return OpenDsStarAgent(tools=tools)  # further agent config elided
```

### Strategy Pattern

Used for different evaluation strategies:
```python
class UnitxtLLMJudge(Evaluator):
    def evaluate_one(self, ctx, output, benchmark):
        ...  # LLM-based evaluation logic
```

### Template Method Pattern

Used in BaseExperiment to define the experiment structure:
```python
from abc import ABC

class BaseExperiment(ABC):
    def experiment_main(self):
        # Template method defining the workflow
        data_reader = self.get_data_reader()    # abstract
        tools = self.get_tools_builder()        # abstract
        agent = self.get_agent_builder()        # abstract
        evaluators = self.get_evaluators()      # abstract
        # ... run the pipeline
```

## Benefits

**Modularity**
- Each component has a single, well-defined responsibility
- Components can be developed and tested independently

**Reusability**
- Tools can be shared across different agents and experiments
- Evaluators can be reused for different benchmarks
- Agent implementations are decoupled from experiments

**Testability**
- Clear interfaces make mocking easy (see the sketch after this list)
- Each layer can be unit tested independently
- Integration tests can focus on specific interactions
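For instance, a unit test can stand in a `MagicMock` for an Evaluator, since callers depend only on the `evaluate_one` interface. The helper and test names here are illustrative:

```python
from unittest.mock import MagicMock

def evaluate_all(evaluator, outputs, benchmarks, ctx=None):
    # Example helper under test: it depends only on the Evaluator interface.
    return [evaluator.evaluate_one(ctx, o, b) for o, b in zip(outputs, benchmarks)]

def test_evaluate_all_with_mock_evaluator():
    mock_evaluator = MagicMock()
    mock_evaluator.evaluate_one.return_value = {"correct": True}

    results = evaluate_all(mock_evaluator, ["a"], [{"answer": "a"}])

    assert results == [{"correct": True}]
    mock_evaluator.evaluate_one.assert_called_once_with(None, "a", {"answer": "a"})
```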
**Extensibility**
- New agents can be added without modifying experiments
- New tools can be added without changing agents
- New experiments can reuse existing components

**Maintainability**
- Changes to agents don't affect experiments
- Changes to tools don't affect agents
- Clear boundaries reduce coupling
## Migration Guide

### Imports

Before:
```python
from src.agents.tools.retrievers import AnalyzerSummaryRetrievalTool
from src.experiments.tools import VectorStoreTool
```

After:
```python
from tools import AnalyzerSummaryRetrievalTool, VectorStoreTool
```

### Agent Construction

Before:
```python
agent = OpenDsStarAgent(
    model="watsonx/mistralai/mistral-medium-2505",
    temperature=0.0,
    max_steps=5,
    # ... other parameters passed inline
)
```

After:
```python
from experiments.core.config import AgentConfig

config = AgentConfig(
    model="watsonx/mistralai/mistral-medium-2505",
    temperature=0.0,
    max_steps=5,
)
agent = OpenDsStarAgent(**config.to_dict())
```

## Best Practices

### Agent Development
- Agents should focus on reasoning and execution
- Don't mix tool management with agent logic
- Use configuration objects for parameters
### Tool Development
- Tools should be agent-agnostic
- Avoid hardcoding agent-specific logic in tools
- Use clear, descriptive tool names and descriptions (see the sketch after this list)
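A hypothetical tool following these guidelines. The attribute-based `name`/`description` convention and the call signature are assumptions for illustration, not the project's actual tool API:

```python
# Hypothetical agent-agnostic tool; the project's actual tool base class
# in src/tools/ may define a different interface.
class KeywordSearchTool:
    name = "keyword_search"
    description = (
        "Search the corpus for documents containing the given keywords "
        "and return the top matches."
    )

    def __init__(self, corpus):
        self.corpus = corpus  # no agent-specific state

    def __call__(self, query: str, top_k: int = 5) -> list[str]:
        # Score documents by how many query words they contain.
        scored = [(sum(w in doc.lower() for w in query.lower().split()), doc)
                  for doc in self.corpus]
        return [doc for score, doc in
                sorted(scored, key=lambda p: p[0], reverse=True)[:top_k]
                if score > 0]
```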
### Interface Usage
- Always program to interfaces, not implementations
- This allows easy swapping of components
- Makes testing much easier
### Configuration
- Keep agent config separate from experiment config
- Use dataclasses for type safety
- Validate configuration early (see the sketch after this list)
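For example, early validation can live in a dataclass `__post_init__` hook. This is a sketch using field names from the ToolConfig listing above; the concrete rules (and the helpers in `experiments/utils/validation.py`) are assumptions:

```python
from dataclasses import dataclass

@dataclass
class ToolConfig:
    embedding_model: str
    chunk_size: int = 512
    chunk_overlap: int = 64
    top_k: int = 5

    def __post_init__(self):
        # Fail at construction time rather than mid-experiment.
        if not self.embedding_model:
            raise ValueError("embedding_model must be a non-empty string")
        if self.chunk_overlap >= self.chunk_size:
            raise ValueError("chunk_overlap must be smaller than chunk_size")
        if self.top_k < 1:
            raise ValueError(f"top_k must be >= 1, got {self.top_k}")
```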
### Documentation
- Clear docstrings for all interface methods
- Include examples in documentation
- Specify expected behavior and contracts
## Future Enhancements

- Plugin System: Allow dynamic loading of agents and tools
- Configuration Validation: Add schema validation for configs
- Metrics Collection: Standardized metrics across experiments
- Distributed Execution: Support for parallel experiment runs
- Visualization: Tools for visualizing agent trajectories
## Conclusion

This architecture provides a solid foundation for building, testing, and evaluating AI agents. The clear separation of concerns makes the codebase more maintainable and extensible, while the use of interfaces and configuration objects improves testability and flexibility.