This project is the code implementation of the AAAI 2026 paper "Adaptive Theory of Mind for LLM-based Multi-Agent Coordination".

Adaptive Theory of Mind for LLM-based Multi-Agent Coordination

This project is the code implementation of the AAAI 2026 paper "Adaptive Theory of Mind for LLM-based Multi-Agent Coordination". It implements 0-, 1-, and 2-order ToM agents as well as Adaptive ToM agents, and provides batch experiment scripts and result-analysis scripts. Experiments are implemented in three environments:

  • Coordination Game (coordination_game)
  • Grid World Navigation (grid_world)
  • Overcooked Environment (overcooked)

Environment Setup

Dependencies

  • Operating System: Ubuntu offers the best compatibility; the parallel scripts do not run on Windows.
  • Python Version: Python 3.7 is the default interpreter, because the reinforcement-learning policies depend on TensorFlow 1.x and an older version of the Overcooked environment.
  • Install required packages via:
    pip install -r requirements.txt

LLM API Configuration

Our paper reports results on Llama3.3-70B-Instruct. You can also try other high-performing models, such as the GPT series. Set the LLM API configurations in the following three files:

  • coordination_game/LLM_agent/api_key.py
  • grid_world/LLM_agent/api_key.py
  • overcooked/LLM_agent/api_key.py
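The README does not document what these `api_key.py` files contain, so the sketch below is only a plausible shape for an OpenAI-compatible endpoint; every variable name in it is a hypothetical placeholder — check the actual file in each subproject:

```python
# Hypothetical sketch of */LLM_agent/api_key.py -- the real variable names
# may differ, so inspect the actual file in each subproject before editing.
API_KEY = "sk-your-key-here"             # credential for your LLM provider
BASE_URL = "https://api.example.com/v1"  # OpenAI-compatible endpoint URL
MODEL_NAME = "llama-3.3-70b-instruct"    # identifier of the model served there
```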

Quick Start

1) Coordination Game (coordination_game)

  • Single run:

      python coordination_game/main.py --player1_type=adaptive --player2_type=adaptive --model_name=llama --exp_name=demo --horizon=20 --use_non_coordiantion_opening --adaptive_alg=Hedge
    
  • Batch run (recommended for quickly reproducing the paper's results):

      apt install parallel
      cd coordination_game
      chmod +x run.sh
      ./run.sh
    
  • Output directory:
    results/<exp_name>/<player1>_vs_<player2>_<model_name>[_flags]/<pid>/

  • Output files:

    • action.csv: actions of both players for each round
    • player*_prediction_vs_true_action.csv: predicted partner actions vs. real partner actions (extra output for 0/1/2-order ToM and Adaptive ToM agents)
  • Result analysis:

    • Run: python coordination_game/analyze.py <horizon> <exp_name>
    • Average score and standard deviation are saved in:
      results/<exp_name>/score.txt
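The `--adaptive_alg=Hedge` option refers to the Hedge (multiplicative-weights) algorithm. As a rough illustration only — not the project's actual implementation — one Hedge step reweights a pool of expert predictors (here, three hypothetical 0/1/2-order ToM predictors) by their per-round losses:

```python
import math

def hedge_update(weights, losses, eta=0.5):
    """One Hedge (multiplicative-weights) step: each expert's weight is
    scaled by exp(-eta * loss), then the weights are renormalized, so
    experts with lower loss gain relative weight."""
    scaled = [w * math.exp(-eta * l) for w, l in zip(weights, losses)]
    total = sum(scaled)
    return [w / total for w in scaled]

# Three hypothetical experts: 0-, 1-, and 2-order ToM predictors,
# starting from a uniform prior over the experts.
weights = [1 / 3] * 3
# Loss 0.0 = the expert predicted the partner's action correctly this round.
weights = hedge_update(weights, losses=[1.0, 0.0, 1.0])
```

After the update, the correct predictor (index 1) holds the largest weight, while the two mistaken predictors keep equal, smaller shares.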

2) Grid World Navigation (grid_world)

  • Single run:

      python grid_world/main.py --player1_type=adaptive --player2_type=adaptive --game_name=game1 --model_name=llama --exp_name=demo --horizon=20 --adaptive_alg=Hedge
    
  • Batch run (recommended for quickly reproducing the paper's results):

      apt install parallel
      cd grid_world
      chmod +x run.sh
      ./run.sh
    
  • Output directory:
    results/<exp_name>/<player1>_vs_<player2>_<model_name>/<pid>/

  • Output files:

    • player*_log.txt and public_log.txt: prompt/response traces and environment renderings
    • score.txt: number of steps the agents required (or total horizon if unfinished)
    • game_end.txt: True if both players reached their targets, otherwise absent/False
    • player*_loss*.txt: Adaptive ToM training loss
    • *_prediction_candidate_history.txt: prediction history of Adaptive ToM
  • Result analysis:

    • Run: python grid_world/analyze.py <exp_name>
    • Aggregated success rate and score CSVs are saved under results/<exp_name>/

3) Overcooked (overcooked)

  • Single run:

      python overcooked/main.py --player1_type=adaptive_tom --player2_type=adaptive_tom --model_name=llama --exp_name=demo_overcooked --horizon=80 --cook_time=20 --use_counter
    
  • Batch run (recommended for quickly reproducing the paper's results):

      apt install parallel
      cd overcooked
      chmod +x run.sh
      ./run.sh
    
  • Output directory:
    results/<exp_name>/<player1>_vs_<player2>_<model_name>[_flags]/<pid>/

  • Output files:

    • env_log.txt: environment rendering text
    • score.txt: total score (fixed horizon) or steps needed to finish (depending on --use_score_of_fixed_horizon)
    • game_end.txt: True if task completed within the time limit, otherwise False
    • player*_loss.txt: Adaptive ToM training loss
    • *_prediction_candidate_history.txt: prediction history of Adaptive ToM

Directory Structure

  • coordination_game/
    • coordination_game/main.py: entry point for coordination-game experiments
    • coordination_game/LLM_agent/: implementations of 0/1/2-order ToM and Adaptive ToM agents
    • coordination_game/analyze.py: result analysis and summary
    • coordination_game/run.sh: parallel experiment script
  • grid_world/
    • grid_world/main.py: entry point for grid-world navigation experiments
    • grid_world/LLM_agent/: implementations of 0/1/2-order ToM and Adaptive ToM agents
    • grid_world/analyze.py: result analysis and summary
    • grid_world/run.sh: parallel experiment script
  • staghut_game/
    • staghut_game/main.py: entry point for stag-hunt experiments
    • staghut_game/LLM_agent/: implementations of 0/1/2-order ToM and Adaptive ToM agents
    • staghut_game/analyze.py: result analysis and summary
    • staghut_game/run.sh: batch experiment script
  • overcooked/
    • overcooked/main.py: entry point for Overcooked experiments
    • overcooked/overcooked_env/, overcooked/overcooked_ai_py/: environment implementation and dependencies
    • overcooked/LLM_agent/: ToM agents and prompt templates adapted for Overcooked
    • overcooked/rl_agent.py: RL baseline
    • overcooked/game_prompts/: prompt templates for different layouts
    • overcooked/requirements.txt: dependencies for this subproject
    • overcooked/run.sh: batch experiment script
