This project is the code implementation of the paper *Adaptive Theory of Mind for LLM-based Multi-Agent Coordination*. It implements 0/1/2-order ToM agents and Adaptive ToM agents, and provides batch-experiment scripts and result-analysis scripts. Experiments are implemented in the following three environments:
- Coordination Game (`coordination_game`)
- Grid World Navigation (`grid_world`)
- Overcooked Environment (`overcooked`)
## Requirements

- Operating System: Ubuntu offers the best compatibility. The parallel experiment scripts are not supported on Windows.
- Python Version: the default interpreter is Python 3.7, required to support the TensorFlow 1.x reinforcement-learning policies and the older versions of the Overcooked environment.
- Install the required packages:

```shell
pip install -r requirements.txt
```
## LLM Configuration

Our paper reports results with Llama3.3-70B-Instruct; other high-performing models, such as the GPT series, can also be used. Set the LLM API configuration in the following three files:

- `coordination_game/LLM_agent/api_key.py`
- `grid_world/LLM_agent/api_key.py`
- `overcooked/LLM_agent/api_key.py`
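The exact interface of these `api_key.py` files is project-specific; below is only a hedged sketch of what such a file typically contains, with every variable name (`API_KEY`, `BASE_URL`, `MODEL_NAME`) assumed for illustration rather than taken from the repository:

```python
# Hypothetical api_key.py layout -- all names here are assumptions;
# mirror whatever the existing file in each subproject actually defines.
API_KEY = "sk-..."                             # your provider's API key
BASE_URL = "https://api.your-provider.com/v1"  # OpenAI-compatible endpoint (assumed)
MODEL_NAME = "llama3.3-70b-instruct"           # model id matching the --model_name flag
```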
## Coordination Game

- Single run:

```shell
python coordination_game/main.py --player1_type=adaptive --player2_type=adaptive --model_name=llama --exp_name=demo --horizon=20 --use_non_coordiantion_opening --adaptive_alg=Hedge
```
- Batch run (quickly reproduces the paper's results):

```shell
apt install parallel
cd coordination_game
chmod +x run.sh
./run.sh
```
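The `--adaptive_alg=Hedge` flag selects the Hedge (multiplicative-weights) update for the Adaptive ToM agent. The project's own implementation lives in `LLM_agent/`; the following is only a generic sketch of the Hedge update rule, with the function name and learning rate chosen for illustration:

```python
import math

def hedge_update(weights, losses, eta=0.5):
    """One Hedge step: scale each candidate's weight by exp(-eta * loss), then renormalize."""
    scaled = [w * math.exp(-eta * l) for w, l in zip(weights, losses)]
    total = sum(scaled)
    return [w / total for w in scaled]

# Two candidate partner models; the second incurred less prediction loss this round,
# so its weight grows after the update.
weights = hedge_update([0.5, 0.5], [1.0, 0.0])
```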
- Output directory: `results/<exp_name>/<player1>_vs_<player2>_<model_name>[_flags]/<pid>/`
- Output files:
  - `action.csv`: actions of both players for each round
  - `player*_prediction_vs_true_action.csv`: predicted partner actions vs. real partner actions (extra output for 0/1/2-order ToM and Adaptive ToM agents)
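For a quick sanity check of a single run, the per-round coordination rate can be computed from `action.csv`. The column names below are assumptions for illustration; check the actual header of your output file:

```python
import csv
import io

# Hypothetical action.csv contents -- real column names may differ.
sample = "round,player1,player2\n1,A,B\n2,A,A\n3,B,B\n"
rows = list(csv.DictReader(io.StringIO(sample)))
# Fraction of rounds in which the two players chose the same action.
coordination_rate = sum(r["player1"] == r["player2"] for r in rows) / len(rows)
```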
- Result analysis: run

```shell
python coordination_game/analyze.py <horizon> <exp_name>
```

- The average score and standard deviation are saved in `results/<exp_name>/score.txt`.
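The aggregation performed by the analysis script boils down to a mean and standard deviation over per-run scores. A minimal sketch with made-up values (the script itself reads the real scores from the run directories):

```python
import statistics

# Illustrative per-run scores -- values are made up for demonstration.
scores = [14.0, 11.0, 16.0, 13.0]
mean_score = statistics.mean(scores)
std_score = statistics.stdev(scores)  # sample standard deviation
```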
## Grid World Navigation

- Single run:

```shell
python grid_world/main.py --player1_type=adaptive --player2_type=adaptive --game_name=game1 --model_name=llama --exp_name=demo --horizon=20 --adaptive_alg=Hedge
```
- Batch run (quickly reproduces the paper's results):

```shell
apt install parallel
cd grid_world
chmod +x run.sh
./run.sh
```
- Output directory: `results/<exp_name>/<player1>_vs_<player2>_<model_name>/<pid>/`
- Output files:
  - `player*_log.txt` and `public_log.txt`: prompt/response traces and environment renderings
  - `score.txt`: number of steps the agents required (or the total horizon if unfinished)
  - `game_end.txt`: `True` if both players reached their targets, otherwise absent/`False`
  - `player*_loss*.txt`: Adaptive ToM training loss
  - `*_prediction_candidate_history.txt`: prediction history of Adaptive ToM
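Since `game_end.txt` is absent or `False` for failed runs, the aggregated success rate is simply the fraction of run directories whose file reads `True`. A hedged sketch over illustrative flags, with `None` standing in for a missing file:

```python
# Illustrative per-run contents of game_end.txt (None means the file is absent).
run_flags = ["True", None, "True", "False"]
# A run counts as a success only if the file exists and reads "True".
successes = sum(flag == "True" for flag in run_flags)
success_rate = successes / len(run_flags)
```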
- Result analysis: run

```shell
python grid_world/analyze.py <exp_name>
```

- Aggregated success-rate and score CSVs are saved under `results/<exp_name>/`.
## Overcooked Environment

- Single run:

```shell
python overcooked/main.py --player1_type=adaptive_tom --player2_type=adaptive_tom --model_name=llama --exp_name=demo_overcooked --horizon=80 --cook_time=20 --use_counter
```
- Batch run (quickly reproduces the paper's results):

```shell
apt install parallel
cd overcooked
chmod +x run.sh
./run.sh
```
- Output directory: `results/<exp_name>/<player1>_vs_<player2>_<model_name>[_flags]/<pid>/`
- Output files:
  - `env_log.txt`: environment rendering text
  - `score.txt`: total score (fixed horizon) or steps needed to finish (depending on `--use_score_of_fixed_horizon`)
  - `game_end.txt`: `True` if the task was completed within the time limit, otherwise `False`
  - `player*_loss.txt`: Adaptive ToM training loss
  - `*_prediction_candidate_history.txt`: prediction history of Adaptive ToM
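The `player*_loss.txt` files record the Adaptive ToM training loss; assuming one loss value per line (the exact format is an assumption), a minimal sketch of reading such a file from an in-memory sample:

```python
# Hypothetical player*_loss.txt contents: one loss value per training update.
sample = "0.92\n0.61\n0.43\n0.30\n"
losses = [float(line) for line in sample.splitlines()]
final_loss = losses[-1]  # the most recent training loss
```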
## Project Structure

- `coordination_game/`
  - `coordination_game/main.py`: entry point for coordination-game experiments
  - `coordination_game/LLM_agent/`: implementations of 0/1/2-order ToM and Adaptive ToM agents
  - `coordination_game/analyze.py`: result analysis and summary
  - `coordination_game/run.sh`: parallel experiment script
- `staghut_game/`
  - `staghut_game/main.py`: entry point for stag-hunt experiments
  - `staghut_game/LLM_agent/`: implementations of 0/1/2-order ToM and Adaptive ToM agents
  - `staghut_game/analyze.py`: result analysis and summary
  - `staghut_game/run.sh`: batch experiment script
- `overcooked/`
  - `overcooked/main.py`: entry point for Overcooked experiments
  - `overcooked/overcooked_env/`, `overcooked/overcooked_ai_py/`: environment implementation and dependencies
  - `overcooked/LLM_agent/`: ToM agents and prompt templates adapted for Overcooked
  - `overcooked/rl_agent.py`: RL baseline
  - `overcooked/game_prompts/`: prompt templates for different layouts
  - `overcooked/requirements.txt`: dependencies for this subproject
  - `overcooked/run.sh`: batch experiment script