A configurable Monte Carlo Tree Search (MCTS) engine for two-player zero-sum games, demonstrated on Tic-Tac-Toe. The engine is decoupled from game-specific logic and can be adapted to any game that exposes a standard state interface.
Developed in collaboration with Elvina Fahlgren.
alpha-toe/
├── config/
│ └── example.config.json # experiment parameters
├── docs/ # project report
├── src/
│ ├── board.py # Tic-Tac-Toe game state
│ ├── mcts.py # MCTS engine
│ ├── rollout_strategies.py # simulation policies
│ └── run.py # experiment runner
├── requirements.txt
└── README.md
Monte Carlo Tree Search repeats four phases until the compute budget is exhausted.
Starting from the root, the tree is traversed by repeatedly selecting the child with the highest UCB1 score:
| Symbol | Meaning |
|---|---|
| Cumulative reward at node |
|
| Visit count at node |
|
| Exploration constant (default |
The first term exploits high-value nodes; the second explores less-visited ones.
When a node is not fully expanded (i.e. some legal moves have not yet been tried), one untried move is sampled uniformly at random and added to the tree as a new child node.
A game is played out from the newly expanded node to a terminal state. Two policies are available:
- heuristic — prioritises winning moves, then blocking opponent wins, then center; falls back to random.
- random — selects actions uniformly at random.
The terminal reward propagates back through the tree. Each ancestor's visit count
After the budget is exhausted the engine returns the robust child: the root's child with the highest visit count
To use this engine with a different game, implement a class with the following methods:
class MyGame:
def available_moves(self) -> list: ...
def make_move(self, move) -> None: ...
def clone(self) -> \'MyGame\': ...
def game_over(self) -> bool: ...
def get_winner(self): ... # returns player id or Nonepip install -r requirements.txtCopy the example config and adjust as needed:
cp config/example.config.json config/config.json
# edit config/config.json
python src/run.pyYou can also pass a config path explicitly:
python src/run.py config/example.config.json| Key | Type | Description |
|---|---|---|
board_size |
int | Board dimension (3 for 3x3) |
mcts_iterations |
int | MCTS simulations per move |
player_type |
string |
"mcts" or "random" for X |
rollout_policy |
string |
"heuristic" or "random" for X |
selection_policy |
string |
"uct", "greedy", or "epsilon_greedy" for X |
opponent_type |
string |
"mcts" or "random" for O |
opponent_rollout_policy |
string | Rollout policy for O |
opponent_selection_policy |
string | Selection policy for O |
exploration_constant |
float | UCB1 exploration constant |
num_games |
int | Number of games to run |
verbose |
bool | Print board and move trace |