🌐 WorldMind

Aligning Agentic World Models via Knowledgeable Experience Learning


WorldMind Framework

WorldMind is a framework for aligning agentic world models through knowledgeable experience learning, enabling agents to learn directly from the environment.


πŸ“– Overview

WorldMind introduces a paradigm shift in how embodied AI agents learn and adapt. Unlike traditional approaches that rely on extensive environment interaction or domain-specific fine-tuning, WorldMind operates as a training-free framework that enables agents to:

  • Learn from Experience: Extract reusable symbolic knowledge from both successful task completions and prediction errors without gradient updates.
  • Generalize Across Tasks: Apply learned causal rules and heuristics to novel situations through semantic similarity-based retrieval.
  • Continuously Improve: Accumulate and refine the World Knowledge Repository (WKR) throughout deployment.
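
The WKR can be pictured as a store of natural-language experience entries indexed by embeddings and queried by similarity. The minimal Python sketch below illustrates this idea; the class and method names are our own illustrative assumptions, not the repository's actual API (see embodiedbench/worldmind/ and Plugin/ for the real modules).

# Hypothetical sketch of a World Knowledge Repository (WKR): experience entries
# stored as text plus an embedding vector, retrieved by cosine similarity.
# All names are illustrative only.
from dataclasses import dataclass, field

@dataclass
class ExperienceEntry:
    kind: str       # "goal" (heuristic) or "process" (causal rule)
    task: str       # task instruction the experience was distilled from
    content: str    # natural-language rule or heuristic
    embedding: list = field(default_factory=list)

class WorldKnowledgeRepository:
    def __init__(self, embed_fn):
        self.embed_fn = embed_fn   # text -> vector, e.g. any sentence-embedding model
        self.entries = []

    def add(self, kind, task, content):
        self.entries.append(ExperienceEntry(kind, task, content, self.embed_fn(content)))

    def retrieve(self, query, kind, top_k=2):
        q = self.embed_fn(query)
        scored = [(self._cosine(q, e.embedding), e) for e in self.entries if e.kind == kind]
        return [e for _, e in sorted(scored, key=lambda s: s[0], reverse=True)[:top_k]]

    @staticmethod
    def _cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na, nb = sum(x * x for x in a) ** 0.5, sum(x * x for x in b) ** 0.5
        return dot / (na * nb) if na and nb else 0.0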

Key Features

| Feature | Description |
|---------|-------------|
| 🧠 Experience Learning | Combines Goal Experience (heuristics) from successful trajectories with Process Experience (causal boundaries) from prediction errors |
| πŸ”„ Experience-Driven Alignment | Uses State Abstraction and Verifier components to align world model predictions with actual environment dynamics |
| 🌐 Universal Adaptability | Seamlessly generalizes across diverse embodied environments (ALFRED, Habitat, Navigation) and tasks without specific fine-tuning |
| πŸ”Œ Modular Plugin | Standalone plugin for easy integration into existing agent systems |

Method

WorldMind introduces a two-stage approach for world model alignment:

Stage 1 extracts knowledge during task execution (World Knowledge Building):

  • Goal Experience: From successful trajectories, distill procedural heuristics that guide the agent toward optimal task completion.
  • Process Experience: Employ a Predict-Act-Verify loop. When a Verifier detects a semantic discrepancy between the predicted and actual abstract states, a Self-Reflexion mechanism synthesizes corrective causal rules.
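
As a rough illustration of this Predict-Act-Verify loop, the Python sketch below shows where the Verifier and the Self-Reflexion step sit in the control flow; world_model, verifier, reflector, and the env/policy interfaces are hypothetical stand-ins for the LLM-backed components, not the repository's implementation.

# Illustrative Stage 1 loop (Process Experience). Hypothetical component names;
# this sketches the control flow only.
def run_episode(env, policy, world_model, verifier, reflector, wkr, task):
    obs = env.reset()
    for _ in range(env.max_steps):
        action = policy(task, obs)
        predicted = world_model.predict(obs, action)   # predicted abstract next state
        obs, done = env.step(action)                   # actual environment transition
        actual = world_model.abstract(obs)             # State Abstraction of the new observation
        if not verifier.consistent(predicted, actual):
            # Self-Reflexion: turn the prediction error into a corrective causal rule
            rule = reflector.explain(action, predicted, actual)
            wkr.add(kind="process", task=task, content=rule)
        if done:
            break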

Stage 2 applies learned knowledge to new tasks (Inference via Constrained Simulation):

  • Retrieve relevant Process and Goal experiences via semantic similarity.
  • Gated Simulation: Selectively simulate outcomes only when target objects are grounded, enhancing inference efficiency.
  • Augment world model prompts with retrieved knowledge to constrain planning within physical feasibility.
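
A matching sketch of Stage 2 is shown below: retrieve the top-k Goal and Process experiences by similarity, gate simulation on whether an action's target object is grounded in the current observation, and feed the retrieved knowledge into the world-model context. The helper names are again hypothetical.

# Illustrative Stage 2: constrained simulation with retrieved experiences.
# wkr, world_model, and their methods are hypothetical stand-ins.
def plan_with_experience(task, candidate_actions, obs, wkr, world_model, top_k=2):
    goal_exp = wkr.retrieve(task, kind="goal", top_k=top_k)
    proc_exp = wkr.retrieve(task, kind="process", top_k=top_k)
    knowledge = "\n".join(e.content for e in goal_exp + proc_exp)

    best_action, best_score = None, float("-inf")
    for action in candidate_actions:
        # Gated Simulation: only roll out actions whose target object is grounded
        if not world_model.is_grounded(action, obs):
            continue
        outcome = world_model.simulate(obs, action, context=knowledge)
        score = world_model.score(outcome, task)
        if score > best_score:
            best_action, best_score = action, score
    return best_action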

πŸ–₯️ Installation

Note: We need to set up two conda environments:

  • worldmind for EB-ALFRED and EB-Habitat
  • worldmind_nav for EB-Navigation

Please clone via SSH instead of HTTPS to avoid errors during git lfs pull.

Environment Setup

1. Clone Repository

git clone https://github.com/zjunlp/WorldMind.git
cd WorldMind

2. Create Conda Environments

1️⃣ Environment for ALFRED and Habitat (High-Level Planning)

# Create environment named 'worldmind'
conda env create -f conda_envs/environment.yaml
conda activate worldmind
pip install -e .

2️⃣ Environment for Navigation (Low-Level Navigation)

# Create environment named 'worldmind_nav'
conda env create -f conda_envs/environment_eb-nav.yaml
conda activate worldmind_nav
pip install -e .

3. Start Headless Server

For headless servers, start the X server in a separate tmux window:

conda activate worldmind
python -m embodiedbench.envs.eb_alfred.scripts.startx 1

Task-Specific Setup

🏠 EB-ALFRED (Household Tasks)

1. Download Data:

conda activate worldmind
git clone https://huggingface.co/datasets/EmbodiedBench/EB-ALFRED
mv EB-ALFRED embodiedbench/envs/eb_alfred/data/json_2.1.0

2. Verify Installation:

conda activate worldmind

# Remember to start the headless server first!
python -m embodiedbench.envs.eb_alfred.EBAlfEnv

πŸ›‹οΈ EB-Habitat (Rearrangement Tasks)

1. Install Habitat Sim & Lab:

conda activate worldmind

# Install Habitat-Sim with Bullet physics support
conda install -y habitat-sim==0.3.0 withbullet headless -c conda-forge -c aihabitat

# Install Habitat-Lab
cd ./habitat-lab
pip install -e habitat-lab
cd ..

2. Download Data: Download the YCB and ReplicaCAD datasets for the Language Rearrangement task.

conda install -y -c conda-forge git-lfs
python -m habitat_sim.utils.datasets_download --uids rearrange_task_assets
mv data embodiedbench/envs/eb_habitat

Note: After the above step, there should be a data folder under embodiedbench/envs/eb_habitat.

3. Verify Installation: Run the following code to ensure the EB-Habitat environment is working correctly.

python -m embodiedbench.envs.eb_habitat.EBHabEnv

🧭 EB-Navigation (Vision-and-Language Navigation)

Verify Installation: Run the following code to ensure the EB-Navigation environment is working correctly.

conda activate worldmind_nav
python -m embodiedbench.envs.eb_navigation.EBNavEnv

πŸš€ Quick Start

Running Experiments

We provide a universal run script run.sh for easy experiment execution. Simply configure the script and run:

#!/bin/bash
# WorldMind Universal Run Script
# Supports all three environments: Alfred (eb-alf), Habitat (eb-hab), Navigation (eb-nav)

set -e

# ============================================================
# ENVIRONMENT VARIABLES (Export Section)
# ============================================================

export CUDA_VISIBLE_DEVICES=0
export OPENAI_API_KEY="your-openai-api-key"
export OPENAI_BASE_URL="your-openai-base-url"

# ============================================================
# CONFIGURATION PARAMETERS (Edit here)
# ============================================================

MODEL_NAME="gpt-3.5-turbo"   # Choose your model
ENV="eb-hab"              # Options: eb-alf, eb-hab, eb-nav
EXP_NAME="test"       # Your experiment name
ENABLE_WORLDMIND="True"   # True or False

# WorldMind component models (fixed to MODEL_NAME)
export WORLDMIND_DISCRIMINATOR_MODEL="$MODEL_NAME"
export WORLDMIND_SUMMARIZER_MODEL="$MODEL_NAME"
export WORLDMIND_REFLECTOR_MODEL="$MODEL_NAME"
export WORLDMIND_REFINER_MODEL="$MODEL_NAME"

# ============================================================
# VALIDATION
# ============================================================

if [ -z "$OPENAI_API_KEY" ]; then
    echo "=========================================="
    echo "ERROR: OPENAI_API_KEY not set!"
    echo "=========================================="
    exit 1
fi

case "$ENV" in
    eb-alf|eb-hab|eb-nav)
        echo "βœ“ Valid environment: $ENV"
        ;;
    *)
        echo "=========================================="
        echo "ERROR: Invalid environment '$ENV'"
        echo "=========================================="
        echo "Valid options: eb-alf, eb-hab, eb-nav"
        exit 1
        ;;
esac

# ============================================================
# DISPLAY CONFIGURATION
# ============================================================

echo ""
echo "=========================================="
echo "WorldMind Experiment Configuration"
echo "=========================================="
echo "Environment:     $ENV"
echo "Model:           $MODEL_NAME"
echo "Experiment:      $EXP_NAME"
echo "WorldMind:       $ENABLE_WORLDMIND"
echo "----------------------------------------"
echo "GPU Device:      $CUDA_VISIBLE_DEVICES"
echo "Display:         $DISPLAY"
echo "API Base URL:    $OPENAI_BASE_URL"
echo "=========================================="
echo ""

# ============================================================
# RUN EXPERIMENT
# ============================================================

python -m embodiedbench.main \
    env="$ENV" \
    model_name="$MODEL_NAME" \
    exp_name="$EXP_NAME" \
    enable_worldmind="$ENABLE_WORLDMIND"

Usage:

bash run.sh

Configuration

WorldMind uses YAML configuration files for experiment settings. You can find and customize these files in the WorldMind/embodiedbench/configs directory.

πŸ“„ Example configuration (`configs/eb-nav.yaml`)
# configs/eb-nav.yaml
model_name: gpt-4o-mini
model_type: remote
exp_name: navigation_baseline

# WorldMind Settings
enable_worldmind: True
use_vision_discriminator: false
use_experience_trajectory: true
detailed_output: true

# Goal Experience Settings
enable_goal_experience: true
goal_experience_top_k: 2

# Process Experience Settings
enable_process_experience: true
process_experience_top_k: 2

# Experience Refinement
enable_experience_refine: true
use_worldmind_template: true

Key Configuration Options

| Parameter | Description | Default |
|-----------|-------------|---------|
| `enable_worldmind` | Enable WorldMind components | True |
| `enable_goal_experience` | Enable goal experience retrieval | True |
| `goal_experience_top_k` | Number of goal experiences to retrieve | 2 |
| `enable_process_experience` | Enable process experience retrieval | True |
| `process_experience_top_k` | Number of process experiences to retrieve | 2 |
| `enable_experience_refine` | Enable LLM-based experience refinement | True |
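
If you want to inspect or tweak these options programmatically, the config files are plain YAML. The snippet below is a generic illustration using PyYAML, with the path assumed relative to the repository root; the framework loads its configs internally, so this is only for quick inspection.

import yaml  # PyYAML: pip install pyyaml

# Generic illustration: read a WorldMind config and check the options listed above.
with open("embodiedbench/configs/eb-nav.yaml") as f:
    cfg = yaml.safe_load(f)

print(cfg["enable_worldmind"], cfg.get("goal_experience_top_k", 2))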

🌍 Environments

🏠 EB-ALFRED (Household Tasks)

A benchmark for grounded language learning in 3D household environments. Tasks require agents to execute multi-step instructions involving object manipulation.

Evaluation Metrics: Success Rate (SR) and Goal Condition (GC)

Evaluation Sets: Base, Common, Complex, Visual, Spatial

πŸ›‹οΈ EB-Habitat (Rearrangement Tasks)

A simulation platform for embodied AI research focusing on object rearrangement tasks in realistic indoor environments.

Evaluation Metrics: Success Rate (SR) and Goal Condition (GC)

Evaluation Sets: Base, Common, Complex, Visual, Spatial

🧭 EB-Navigation (Vision-and-Language Navigation)

A discrete navigation environment where agents must reach target locations through natural language instructions.

Evaluation Metrics: Success Rate (SR)

Evaluation Sets: Base, Common, Complex, Visual
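
Across these environments, Success Rate counts an episode as successful only if every goal condition is met, while Goal Condition also credits partially completed episodes by the fraction of conditions satisfied. The snippet below is a generic illustration of that arithmetic on a hypothetical episode-summary format, not the benchmark's evaluation code.

# Generic illustration of SR vs. GC on a hypothetical episode summary format.
def success_rate(episodes):
    # SR: fraction of episodes in which all goal conditions are satisfied
    return sum(ep["satisfied"] == ep["total"] for ep in episodes) / len(episodes)

def goal_condition(episodes):
    # GC: fraction of goal conditions satisfied per episode, averaged over episodes
    return sum(ep["satisfied"] / ep["total"] for ep in episodes) / len(episodes)

episodes = [{"satisfied": 3, "total": 3}, {"satisfied": 1, "total": 2}]
print(success_rate(episodes), goal_condition(episodes))  # 0.5 0.75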


πŸ”Œ WorldMind Plugin

To support seamless integration across diverse domains, we provide a universal, standalone plugin with a modular architecture. It lets you add WorldMind's core capabilities, such as experience extraction and memory retrieval, to your own environments or projects with minimal effort.

from worldmind_plugin import (
    WorldMindConfig,
    ProcessExperienceModule,
    GoalExperienceModule,
    ExperienceRetrievalModule,
    ProcessTrajectoryStep,
    GoalTrajectoryStep
)

# Create configuration
config = WorldMindConfig(
    api_key="your-api-key",
    save_path="./worldmind_output"
)

# Initialize modules independently
process_module = ProcessExperienceModule(config)
goal_module = GoalExperienceModule(config)
retrieval_module = ExperienceRetrievalModule(config)

# Extract goal experience from successful trajectory
trajectory = [
    GoalTrajectoryStep(
        action="navigate_to(kitchen)",
        env_feedback="Arrived at kitchen",
        observation="Kitchen counter visible"
    ),
    # ... more steps
]

experience = goal_module.extract_experience(
    task_instruction="Go to the kitchen and get an apple",
    trajectory=trajectory
)

# Retrieve experiences for a new task
result = retrieval_module.retrieve(
    task_instruction="Find the coffee mug",
    enable_refine=True
)

# Use in agent prompt
agent_prompt = f"""You are a helpful assistant.

{result['formatted_prompt']}

Task: Find the coffee mug
"""

See Plugin/README.md for detailed documentation.


πŸ“ Project Structure

WorldMind/
β”œβ”€β”€ πŸ“‚ embodiedbench/
β”‚   β”œβ”€β”€ πŸ“‚ envs/                    # Environment implementations
β”‚   β”‚   β”œβ”€β”€ eb_alfred/              # ALFRED environment
β”‚   β”‚   β”œβ”€β”€ eb_habitat/             # Habitat environment
β”‚   β”‚   └── eb_navigation/          # Navigation environment
β”‚   β”œβ”€β”€ πŸ“‚ evaluator/               # Evaluation scripts
β”‚   └── πŸ“‚ worldmind/               # WorldMind core modules
β”‚       β”œβ”€β”€ alfred/                 # ALFRED integration
β”‚       β”œβ”€β”€ habitat/                # Habitat integration
β”‚       └── navigation/             # Navigation integration
β”œβ”€β”€ πŸ“‚ Plugin/                      # Standalone WorldMind Plugin
β”œβ”€β”€ πŸ“‚ assets/                      # Images and resources
└── πŸ“„ README.md

πŸ“Š Results

EB-ALFRED Results

All values are percentages. SR = Success Rate, GC = Goal Condition.

| Model | SR Avg | SR Base | SR Common | SR Complex | SR Visual | SR Spatial | GC Avg | GC Base | GC Common | GC Complex | GC Visual | GC Spatial |
|-------|--------|---------|-----------|------------|-----------|------------|--------|---------|-----------|------------|-----------|------------|
| **Open-source and Proprietary Models** | | | | | | | | | | | | |
| GPT-4o | 56.8 | 64.0 | 54.0 | 68.0 | 46.0 | 52.0 | 65.1 | 74.0 | 60.3 | 74.0 | 58.3 | 61.3 |
| GPT-4o-mini | 28.8 | 34.0 | 28.0 | 36.0 | 24.0 | 22.0 | 34.3 | 47.8 | 35.3 | 43.5 | 33.3 | 29.0 |
| Claude-3.7-Sonnet | 67.2 | 68.0 | 68.0 | 70.0 | 68.0 | 62.0 | 65.3 | 72.0 | 66.0 | 76.7 | 63.0 | 59.7 |
| Gemini-1.5-Pro | 63.2 | 70.0 | 64.0 | 72.0 | 58.0 | 52.0 | 67.4 | 74.3 | 66.7 | 76.5 | 62.8 | 59.0 |
| Llama-3.2-90B-Vis | 35.2 | 38.0 | 34.0 | 44.0 | 28.0 | 32.0 | 37.6 | 43.7 | 37.3 | 49.2 | 35.3 | 36.0 |
| InternVL2.5-78B | 37.0 | 41.0 | 40.0 | 39.0 | 16.0 | 49.0 | 41.0 | 42.3 | 35.3 | 43.3 | 35.7 | 40.3 |
| **GPT-3.5-turbo Based Methods** | | | | | | | | | | | | |
| ReAct | 44.4 | 52.0 | 48.0 | 52.0 | 32.0 | 38.0 | 50.4 | 55.3 | 53.5 | 55.3 | 42.7 | 45.0 |
| BoN | 42.8 | 46.0 | 42.0 | 50.0 | 42.0 | 34.0 | 50.4 | 54.2 | 46.5 | 56.5 | 52.0 | 42.8 |
| SimuRA | 45.2 | 50.0 | 42.0 | 54.0 | 38.0 | 42.0 | 53.6 | 57.8 | 47.8 | 59.7 | 48.5 | 54.3 |
| ReasoningBank | 41.6 | 50.0 | 36.0 | 44.0 | 36.0 | 42.0 | 47.6 | 57.5 | 41.5 | 47.0 | 44.2 | 48.0 |
| Synapse | 38.8 | 38.0 | 46.0 | 40.0 | 36.0 | 34.0 | 43.6 | 42.5 | 51.3 | 42.7 | 42.0 | 39.7 |
| AWM | 40.0 | 46.0 | 32.0 | 48.0 | 40.0 | 34.0 | 46.2 | 53.2 | 39.2 | 50.7 | 47.0 | 41.0 |
| WorldMind | 48.0 | 58.0 | 48.0 | 56.0 | 34.0 | 44.0 | 54.1 | 63.0 | 52.7 | 61.0 | 41.7 | 52.0 |
| **GPT-4.1-mini Based Methods** | | | | | | | | | | | | |
| ReAct | 41.2 | 50.0 | 40.0 | 46.0 | 38.0 | 32.0 | 47.5 | 55.3 | 42.8 | 52.2 | 47.2 | 39.8 |
| BoN | 44.4 | 46.0 | 44.0 | 50.0 | 42.0 | 40.0 | 49.5 | 50.8 | 48.3 | 54.7 | 48.7 | 45.0 |
| SimuRA | 45.6 | 52.0 | 44.0 | 54.0 | 38.0 | 40.0 | 52.2 | 61.0 | 50.3 | 58.2 | 45.3 | 46.3 |
| ReasoningBank | 38.0 | 42.0 | 36.0 | 42.0 | 34.0 | 36.0 | 42.6 | 46.7 | 38.8 | 45.8 | 41.5 | 40.3 |
| Synapse | 37.2 | 40.0 | 32.0 | 44.0 | 36.0 | 34.0 | 42.2 | 41.2 | 37.5 | 49.5 | 41.3 | 41.7 |
| AWM | 41.2 | 44.0 | 36.0 | 48.0 | 38.0 | 40.0 | 46.0 | 48.3 | 42.0 | 52.5 | 44.3 | 42.7 |
| WorldMind | 49.2 | 50.0 | 58.0 | 54.0 | 42.0 | 42.0 | 55.7 | 61.0 | 61.0 | 58.8 | 48.0 | 49.7 |

EB-Habitat Results

All values are percentages. SR = Success Rate, GC = Goal Condition.

| Model | SR Avg | SR Base | SR Common | SR Complex | SR Visual | SR Spatial | GC Avg | GC Base | GC Common | GC Complex | GC Visual | GC Spatial |
|-------|--------|---------|-----------|------------|-----------|------------|--------|---------|-----------|------------|-----------|------------|
| **Open-source and Proprietary Models** | | | | | | | | | | | | |
| GPT-4o | 56.8 | 64.0 | 54.0 | 68.0 | 46.0 | 52.0 | 65.1 | 74.0 | 60.3 | 74.0 | 58.3 | 61.3 |
| GPT-4o-mini | 28.8 | 34.0 | 28.0 | 36.0 | 24.0 | 22.0 | 34.3 | 47.8 | 35.3 | 43.5 | 33.3 | 29.0 |
| Claude-3.7-Sonnet | 67.2 | 68.0 | 68.0 | 70.0 | 68.0 | 62.0 | 65.3 | 72.0 | 66.0 | 76.7 | 63.0 | 59.7 |
| Gemini-1.5-Pro | 63.2 | 70.0 | 64.0 | 72.0 | 58.0 | 52.0 | 67.4 | 74.3 | 66.7 | 76.5 | 62.8 | 59.0 |
| Llama-3.2-90B-Vis | 35.2 | 38.0 | 34.0 | 44.0 | 28.0 | 32.0 | 37.6 | 43.7 | 37.3 | 49.2 | 35.3 | 36.0 |
| InternVL2.5-78B | 37.0 | 41.0 | 40.0 | 39.0 | 16.0 | 49.0 | 41.0 | 42.3 | 35.3 | 43.3 | 35.7 | 40.3 |
| **GPT-3.5-turbo Based Methods** | | | | | | | | | | | | |
| ReAct | 44.4 | 52.0 | 48.0 | 52.0 | 32.0 | 38.0 | 50.4 | 55.3 | 53.5 | 55.3 | 42.7 | 45.0 |
| BoN | 42.8 | 46.0 | 42.0 | 50.0 | 42.0 | 34.0 | 50.4 | 54.2 | 46.5 | 56.5 | 52.0 | 42.8 |
| SimuRA | 45.2 | 50.0 | 42.0 | 54.0 | 38.0 | 42.0 | 53.6 | 57.8 | 47.8 | 59.7 | 48.5 | 54.3 |
| ReasoningBank | 41.6 | 50.0 | 36.0 | 44.0 | 36.0 | 42.0 | 47.6 | 57.5 | 41.5 | 47.0 | 44.2 | 48.0 |
| Synapse | 38.8 | 38.0 | 46.0 | 40.0 | 36.0 | 34.0 | 43.6 | 42.5 | 51.3 | 42.7 | 42.0 | 39.7 |
| AWM | 40.0 | 46.0 | 32.0 | 48.0 | 40.0 | 34.0 | 46.2 | 53.2 | 39.2 | 50.7 | 47.0 | 41.0 |
| WorldMind | 48.0 | 58.0 | 48.0 | 56.0 | 34.0 | 44.0 | 54.1 | 63.0 | 52.7 | 61.0 | 41.7 | 52.0 |
| **GPT-4.1-mini Based Methods** | | | | | | | | | | | | |
| ReAct | 41.2 | 50.0 | 40.0 | 46.0 | 38.0 | 32.0 | 47.5 | 55.3 | 42.8 | 52.2 | 47.2 | 39.8 |
| BoN | 44.4 | 46.0 | 44.0 | 50.0 | 42.0 | 40.0 | 49.5 | 50.8 | 48.3 | 54.7 | 48.7 | 45.0 |
| SimuRA | 45.6 | 52.0 | 44.0 | 54.0 | 38.0 | 40.0 | 52.2 | 61.0 | 50.3 | 58.2 | 45.3 | 46.3 |
| ReasoningBank | 38.0 | 42.0 | 36.0 | 42.0 | 34.0 | 36.0 | 42.6 | 46.7 | 38.8 | 45.8 | 41.5 | 40.3 |
| Synapse | 37.2 | 40.0 | 32.0 | 44.0 | 36.0 | 34.0 | 42.2 | 41.2 | 37.5 | 49.5 | 41.3 | 41.7 |
| AWM | 41.2 | 44.0 | 36.0 | 48.0 | 38.0 | 40.0 | 46.0 | 48.3 | 42.0 | 52.5 | 44.3 | 42.7 |
| WorldMind | 49.2 | 50.0 | 58.0 | 54.0 | 42.0 | 42.0 | 55.7 | 61.0 | 61.0 | 58.8 | 48.0 | 49.7 |

Detailed results and ablation studies are available in our paper.


πŸ“ Citation

If you find this work useful, please cite:

@article{ren2026aligning,
  title={Aligning Agentic World Models via Knowledgeable Experience Learning},
  author={Ren, Baochang and Yao, Yunzhi and Sun, Rui and Qiao, Shuofei and Zhang, Ningyu and Chen, Huajun},
  journal={arXiv preprint arXiv:2601.13247},
  year={2026}
}

πŸ™ Acknowledgments

We thank the following projects and teams for their open-source contributions:

  • EmbodiedBench for the evaluation tasks
  • ALFRED and AI2-THOR for the household task benchmark and simulation environment
  • Habitat for the rearrangement simulation platform
  • vLLM for efficient LLM inference and serving