1 change: 1 addition & 0 deletions llm/.gitignore
@@ -0,0 +1 @@
.env
14 changes: 14 additions & 0 deletions llm/llm_prisoners_dilemma/.env.example
@@ -0,0 +1,14 @@
# LLM Prisoner's Dilemma - API Keys
# Copy this file to .env and fill in your API key

# Google Gemini (default)
GEMINI_API_KEY=your_gemini_api_key_here

# OpenAI
OPENAI_API_KEY=your_openai_api_key_here

# Anthropic
ANTHROPIC_API_KEY=your_anthropic_api_key_here

# Ollama (local, no key needed)
# Set llm_model to "ollama/llama3"
74 changes: 74 additions & 0 deletions llm/llm_prisoners_dilemma/README.md
@@ -0,0 +1,74 @@
# LLM Prisoner's Dilemma

## Summary

An iterated Prisoner's Dilemma simulation where agents use **LLM Chain-of-Thought reasoning** to decide whether to cooperate or defect each round — instead of following fixed strategies like tit-for-tat or always-defect.

## The Game

Each round, agents are randomly paired. Both must simultaneously choose:

- **cooperate** 🤝 — work together for mutual benefit
- **defect** 🗡️ — betray partner for personal gain

**Payoff matrix:**

| | Partner Cooperates | Partner Defects |
|--|-------------------|-----------------|
| **You Cooperate** | 3, 3 | 0, 5 |
| **You Defect** | 5, 0 | 1, 1 |
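
As a minimal sketch, the matrix above can be written as a lookup table and checked against the two standard Prisoner's Dilemma conditions (temptation > reward > punishment > sucker's payoff, and twice the reward exceeding temptation plus sucker's payoff). The helper name `score_round` is hypothetical and used only for illustration.

```python
# Payoffs keyed by (your action, partner's action) -> (your score, partner's score).
PAYOFFS = {
    ("cooperate", "cooperate"): (3, 3),
    ("cooperate", "defect"): (0, 5),
    ("defect", "cooperate"): (5, 0),
    ("defect", "defect"): (1, 1),
}


def score_round(my_action: str, partner_action: str) -> tuple[int, int]:
    """Return (my score, partner's score) for one round. Hypothetical helper."""
    return PAYOFFS[(my_action, partner_action)]


# Conventional names for the four payoffs, read from the "you" side of the table.
reward = PAYOFFS[("cooperate", "cooperate")][0]      # both cooperate
temptation = PAYOFFS[("defect", "cooperate")][0]     # you exploit
punishment = PAYOFFS[("defect", "defect")][0]        # both defect
sucker = PAYOFFS[("cooperate", "defect")][0]         # you are exploited

# Defecting is individually tempting, yet mutual cooperation beats mutual defection.
is_dilemma = temptation > reward > punishment > sucker
# Taking turns exploiting each other does not beat steady mutual cooperation.
cooperation_beats_alternation = 2 * reward > temptation + sucker
```

With the values in the table, both conditions hold, so the game is a genuine Prisoner's Dilemma in both the one-shot and the iterated sense.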

## What makes this different from classical models

Classical agent-based models (ABMs) of the Prisoner's Dilemma use fixed strategies such as always defect, always cooperate, tit-for-tat, or random. The outcome is determined entirely by the strategy rules.
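
For contrast, here is a rough sketch of what such fixed strategies look like in code. The function names and the `partner_history` argument are assumptions made for this illustration; they are not part of this example's code.

```python
def always_defect(partner_history: list[str]) -> str:
    """Ignore history and defect every round."""
    return "defect"


def always_cooperate(partner_history: list[str]) -> str:
    """Ignore history and cooperate every round."""
    return "cooperate"


def tit_for_tat(partner_history: list[str]) -> str:
    """Cooperate first, then copy the partner's previous action."""
    if not partner_history:
        return "cooperate"
    return partner_history[-1]


# Each strategy is a pure function of the partner's past actions.
first_move = tit_for_tat([])
after_betrayal = tit_for_tat(["cooperate", "defect"])
```

Every decision here follows mechanically from the rule, which is the behavior the LLM-driven agents replace with step-by-step reasoning.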

Here, agents **reason** at each step:

> "My partner cooperated last round. That signals trustworthiness.
> If I defect now I gain 5 points but destroy the trust we've built.
> Over many rounds, mutual cooperation (3+3+3...) beats cycles of
> defection (1+1+1...). I'll cooperate."

This can produce **emergent negotiation dynamics** (reputation building, trust signaling, strategic exploitation) that a fixed rule set does not express explicitly.
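
The arithmetic behind the quoted reasoning can be checked with a short sketch. It assumes a simplified scenario that the README does not spell out: a single defection against a cooperating partner is followed by mutual defection in every later round. The function names are hypothetical.

```python
# Payoffs taken from the matrix above.
REWARD, TEMPTATION, PUNISHMENT = 3, 5, 1


def always_cooperate_total(rounds: int) -> int:
    """Total score when both players cooperate in every round."""
    return REWARD * rounds


def defect_then_punished_total(rounds: int) -> int:
    """Total score for exploiting once, then mutual defection afterwards.

    Assumption: the partner cooperates in round 1 and defects in all later rounds.
    """
    if rounds == 0:
        return 0
    return TEMPTATION + PUNISHMENT * (rounds - 1)


# Compare the two totals for a few game lengths.
for n in (1, 2, 5, 10):
    print(n, always_cooperate_total(n), defect_then_punished_total(n))
```

Under this assumption the two totals are equal at two rounds (6 each), and sustained cooperation pulls ahead for any longer game, for example 15 versus 9 over five rounds.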

## Visualization

- **Cooperation rate plot** — fraction of cooperative actions per round
- **Cumulative plot** — total cooperations vs defections over time
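
The two plotted quantities could be computed along the following lines. This is only a sketch: the model's actual data collection code is not shown in this README, and the function names below are hypothetical.

```python
def cooperation_rate(actions: list[str]) -> float:
    """Fraction of 'cooperate' actions in one round (0.0 if nobody played)."""
    if not actions:
        return 0.0
    return sum(action == "cooperate" for action in actions) / len(actions)


def cumulative_totals(rounds: list[list[str]]) -> list[tuple[int, int]]:
    """Running (total cooperations, total defections) after each round."""
    total_coop = 0
    total_defect = 0
    totals = []
    for actions in rounds:
        coop = sum(action == "cooperate" for action in actions)
        total_coop += coop
        total_defect += len(actions) - coop
        totals.append((total_coop, total_defect))
    return totals


# Two agents, two rounds: one cooperator in round 1, none in round 2.
example_rounds = [["cooperate", "defect"], ["defect", "defect"]]
rates = [cooperation_rate(actions) for actions in example_rounds]
totals = cumulative_totals(example_rounds)
```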

**Initial state (Step 0):**

![Initial state — no history, agents have not yet played](prisoners_dilemma_initial.png)

**After 5 rounds of LLM-driven reasoning:**

![5 rounds — cooperation collapses after exploitation, mutual defection locks in](prisoners_dilemma_dashboard.png)

**What this run demonstrates — emergent game theory from pure LLM reasoning:**

| Round | Cooperation Rate | What happened |
|-------|-----------------|---------------|
| 1 | 0.5 | One agent tried cooperation to build trust; the other defected and exploited it |
| 2 | 0.5 | Cooperating agent gave a second chance, exploiter defected again |
| 3 | 0.0 | Exploited agent switched to permanent defection: "I cooperated twice, got burned twice, I'm done" |
| 4–5 | 0.0 | Stable mutual defection — the Nash equilibrium lock-in |
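
To make the cost of this lock-in concrete, the run can be replayed against the payoff matrix. The sketch below assumes a single pair of agents whose actions follow the table row by row; the per-round action list is reconstructed from the descriptions, not taken from logged output.

```python
PAYOFFS = {
    ("cooperate", "cooperate"): (3, 3),
    ("cooperate", "defect"): (0, 5),
    ("defect", "cooperate"): (5, 0),
    ("defect", "defect"): (1, 1),
}

# (exploited agent's action, exploiter's action) for rounds 1 to 5.
# Assumption: reconstructed from the table's "What happened" column.
run = [
    ("cooperate", "defect"),
    ("cooperate", "defect"),
    ("defect", "defect"),
    ("defect", "defect"),
    ("defect", "defect"),
]

exploited_score = 0
exploiter_score = 0
for exploited_action, exploiter_action in run:
    gained, partner_gained = PAYOFFS[(exploited_action, exploiter_action)]
    exploited_score += gained
    exploiter_score += partner_gained

# Benchmark: what each agent would have earned from five rounds of mutual cooperation.
mutual_cooperation_score = 5 * PAYOFFS[("cooperate", "cooperate")][0]
```

Under these assumptions the exploited agent finishes with 3 points and the exploiter with 13, both below the 15 points each would have earned from five rounds of mutual cooperation.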

**Why this matters:** The pattern seen here (try cooperation, get exploited, retaliate with defection) is the kind of reciprocal behavior analyzed in Axelrod's *The Evolution of Cooperation* (1984), and it appears with **zero hardcoded strategy**: no tit-for-tat rule, no punishment parameter, no threshold. The LLM arrived at this behavior by reflecting on its interaction history at each step.

This is something a fixed-strategy model simply cannot do: the agent articulates *why* it switched, references past betrayal in its reasoning chain, and makes a strategic decision grounded in language rather than math.

## How to Run

```bash
cp .env.example .env # fill in your API key
pip install -r requirements.txt
solara run app.py
```

## Supported LLM Providers

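Gemini, OpenAI, Anthropic, and Ollama (local). API keys are read from `.env`; the model itself is selected through the `llm_model` parameter (for example `ollama/llama3` for a local Ollama model, which needs no key).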

## Reference

Axelrod, R. (1984). *The Evolution of Cooperation*. Basic Books.
66 changes: 66 additions & 0 deletions llm/llm_prisoners_dilemma/app.py
@@ -0,0 +1,66 @@
from dotenv import load_dotenv
from llm_prisoners_dilemma.model import PrisonersDilemmaModel
from mesa.visualization import SolaraViz, make_plot_component

load_dotenv()


def agent_portrayal(agent):
    """Color agents by their last action and size them by score."""
    if not hasattr(agent, "last_action"):
        return {"color": "gray", "size": 40}

    color_map = {
        "cooperate": "#2ecc71",  # Green
        "defect": "#e74c3c",  # Red
        "none": "#95a5a6",  # Gray (first round)
    }
    color = color_map.get(agent.last_action, "gray")

    # Size reflects score: higher score = bigger circle (growth capped at +80)
    size = 30 + min(agent.score * 2, 80)

    return {"color": color, "size": size}


model_params = {
    "num_agents": {
        "type": "SliderInt",
        "value": 6,
        "label": "Number of Agents",
        "min": 2,
        "max": 20,
        "step": 2,
    },
    "llm_model": {
        "type": "Select",
        "value": "gemini/gemini-2.0-flash",
        "label": "LLM Model",
        "values": [
            "gemini/gemini-2.0-flash",
            "gpt-4o-mini",
            "gpt-4o",
        ],
    },
}

CoopPlot = make_plot_component(
    {
        "cooperation_rate": "#2ecc71",
    }
)

ScorePlot = make_plot_component(
    {
        "total_cooperations": "#2ecc71",
        "total_defections": "#e74c3c",
    }
)

model = PrisonersDilemmaModel()

page = SolaraViz(
    model,
    components=[CoopPlot, ScorePlot],
    model_params=model_params,
    name="LLM Prisoner's Dilemma",
)
Empty file.
126 changes: 126 additions & 0 deletions llm/llm_prisoners_dilemma/llm_prisoners_dilemma/agent.py
@@ -0,0 +1,126 @@
import re

from mesa_llm.llm_agent import LLMAgent
from mesa_llm.reasoning.cot import CoTReasoning

SYSTEM_PROMPT = """You are a player in an iterated Prisoner's Dilemma game.

Each round you interact with a partner. You must choose one of two actions:
- cooperate: Work together for mutual benefit. If both cooperate, both gain moderately.
- defect: Betray your partner for personal gain. If you defect and they cooperate,
you gain a lot and they gain nothing. If both defect, both gain very little.

Payoff matrix (your score, partner score):
- Both cooperate: (3, 3) — mutual benefit
- You defect, they cooperate: (5, 0) — you exploit them
- You cooperate, they defect: (0, 5) — they exploit you
- Both defect: (1, 1) — mutual punishment

Your goal is to maximize your total score over multiple rounds.
Consider your partner's history when making decisions.
Think carefully about trust, reputation, and long-term strategy.

IMPORTANT: You MUST end your response with exactly one of these two lines:
<ACTION>: COOPERATE
<ACTION>: DEFECT"""


class PrisonerAgent(LLMAgent):
    """
    An agent in an iterated Prisoner's Dilemma simulation that uses
    LLM Chain-of-Thought reasoning to decide whether to cooperate or defect.

    Unlike fixed-strategy agents (always defect, tit-for-tat, etc.),
    this agent reasons about its partner's history, trust, and long-term
    payoff to make nuanced decisions.

    Attributes:
        score (int): Cumulative score across all rounds.
        last_action (str): Action taken in the most recent round.
        cooperation_count (int): Total number of times agent cooperated.
        defection_count (int): Total number of times agent defected.
        round_history (list): List of (action, partner_action, payoff) tuples.
    """

    # Payoff matrix
    PAYOFFS = {
        ("cooperate", "cooperate"): (3, 3),
        ("cooperate", "defect"): (0, 5),
        ("defect", "cooperate"): (5, 0),
        ("defect", "defect"): (1, 1),
    }

    def __init__(self, model, llm_model: str = "gemini/gemini-2.0-flash") -> None:
        super().__init__(
            model=model,
            reasoning=CoTReasoning,
            llm_model=llm_model,
            system_prompt=SYSTEM_PROMPT,
            vision=0,
            internal_state=["score:0", "last_action:none", "rounds_played:0"],
            step_prompt=(
                "You are about to play a round of Prisoner's Dilemma. "
                "Review your history and your partner's behavior. "
                "Should you cooperate or defect this round? "
                "Think carefully about trust, reputation, and long-term strategy."
            ),
        )
        self.score: int = 0
        self.last_action: str = "none"
        self.cooperation_count: int = 0
        self.defection_count: int = 0
        self.round_history: list = []

    def _update_internal_state(self) -> None:
        """Sync internal state for LLM observation."""
        total_rounds = self.cooperation_count + self.defection_count
        coop_rate = (
            round(self.cooperation_count / total_rounds, 2) if total_rounds > 0 else 0
        )
        self.internal_state = [
            f"score:{self.score}",
            f"last_action:{self.last_action}",
            f"cooperation_rate:{coop_rate}",
            f"rounds_played:{total_rounds}",
        ]

    def _parse_action(self, plan_content: str) -> str:
        """
        Extract cooperate/defect decision from LLM response.

        Looks for the mandatory <ACTION>: COOPERATE or <ACTION>: DEFECT tag
        that the system prompt requires at the end of every response.
        Falls back to cooperate if the tag is missing or malformed.

        Args:
            plan_content: Raw LLM response text.

        Returns:
            Either 'cooperate' or 'defect'.
        """
        match = re.search(
            r"<ACTION>\s*:\s*(COOPERATE|DEFECT)", plan_content, re.IGNORECASE
        )
        if match:
            return match.group(1).lower()
        return "cooperate"

    def apply_decision(self, action: str, partner_action: str) -> None:
        """
        Apply the outcome of a round given both agents' actions.

        Args:
            action: This agent's action ('cooperate' or 'defect').
            partner_action: Partner's action ('cooperate' or 'defect').
        """
        my_payoff, _ = self.PAYOFFS.get((action, partner_action), (0, 0))
        self.score += my_payoff
        self.last_action = action

        if action == "cooperate":
            self.cooperation_count += 1
        else:
            self.defection_count += 1

        self.round_history.append((action, partner_action, my_payoff))
        self._update_internal_state()
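
The tag parsing in `_parse_action` can be exercised on its own, without an LLM call. The sketch below copies the same regular expression into a standalone function (`parse_action` is a hypothetical name used only here) to show three behaviors worth knowing: matching is case-insensitive, a missing tag falls back to `cooperate`, and because `re.search` returns the first match, a response that mentions both tags resolves to whichever appears first.

```python
import re

# Same pattern as in PrisonerAgent._parse_action, compiled once for reuse.
ACTION_PATTERN = re.compile(r"<ACTION>\s*:\s*(COOPERATE|DEFECT)", re.IGNORECASE)


def parse_action(plan_content: str) -> str:
    """Standalone copy of the parsing logic, for illustration only."""
    match = ACTION_PATTERN.search(plan_content)
    if match:
        return match.group(1).lower()
    # Missing or malformed tag: default to cooperation, as the agent does.
    return "cooperate"


well_formed = parse_action("They betrayed me twice.\n<ACTION>: DEFECT")
lowercase_tag = parse_action("<action> : cooperate")
missing_tag = parse_action("I think I will defect.")
both_tags = parse_action("<ACTION>: DEFECT or maybe <ACTION>: COOPERATE")
```

One consequence of the fallback is that an untagged response is counted as cooperation, which may slightly inflate the reported cooperation rate if a model often omits the tag.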