Add cogames-watch-replay skill and headless frame capture script by SolbiatiAlessandro · Pull Request #7 · Metta-AI/cogames

SolbiatiAlessandro · 2026-03-29T07:00:04Z

What this adds

scripts/capture_frames.py — runs an episode headlessly and saves emoji grid snapshots to a text file at regular intervals. No GUI, no TTY, no interactive input. Works with any policy; defaults to StarterPolicy (no LLM required).

.claude/skills/cogames-watch-replay/SKILL.md — a Claude Code skill that invokes the script and guides structured analysis of the output.

Why

Watching what the policy is actually doing spatially is the highest-leverage debugging tool — are agents stuck, are they reaching gear stations, are they spreading across the map? The existing unicode renderer requires interactive keyboard input (SPACE to unpause), which makes it unusable by autonomous Claude agents or in CI.

This script hooks into Rollout.event_handlers directly, runs headlessly, and writes a plain text file that Claude (or a human) can read and parse.
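The interval-snapshot idea can be sketched in isolation. This is a minimal illustration only: `SnapshotWriter`, `on_step`, and the `render` callable are hypothetical names, not the `Rollout.event_handlers` interface or anything from mettagrid, and the loop at the bottom stands in for a real episode.

```python
import io
from typing import Callable, TextIO


class SnapshotWriter:
    """Write a rendered grid to `out` every `every` steps (hypothetical helper)."""

    def __init__(self, out: TextIO, render: Callable[[], str], every: int = 50) -> None:
        if every <= 0:
            raise ValueError("every must be positive")
        self.out = out
        self.render = render
        self.every = every
        self.frames_written = 0

    def on_step(self, step: int) -> None:
        # Skip steps that are not on the snapshot interval.
        if step % self.every != 0:
            return
        self.out.write(f"=== step {step} ===\n{self.render()}\n\n")
        self.frames_written += 1


# Stand-in for the episode loop; a real run would have the rollout call on_step.
buffer = io.StringIO()
writer = SnapshotWriter(buffer, render=lambda: "⬛⬛⬛\n⬛🟦⬛\n⬛⬛⬛", every=50)
for step in range(0, 151):
    writer.on_step(step)
```

Writing to any text stream rather than a fixed path keeps the handler testable without touching the filesystem.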

Real example: StarterPolicy gets stuck after step 50

Running the script on the default machina_1 mission with 4 agents reveals an immediate problem in the starter policy:

A0: moved 4 cells step 0→50, then STUCK for all 250 remaining steps (row=47, col≈41)
A1: moved 2 cells step 0→50, then STUCK for all 250 remaining steps (row=48, col≈41)
A2: moved 2 cells step 0→50, then STUCK for all 250 remaining steps (row=48, col≈42)
A3: moved 14 cells step 0→50, then STUCK for all 250 remaining steps (row=55, col≈46)
Total reward: 0.0000 across all 300 steps

All 4 agents freeze near their spawn point around step 50 and never move again. This is exactly the kind of spatial insight the script is designed to surface — numbers alone (reward=0) don't tell you why, but the frame sequence makes it unambiguous.
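A summary like the one above could be derived from per-snapshot positions roughly as follows. This is a sketch, not the script's actual code: `manhattan` and `stuck_since` are hypothetical helpers, and the coordinates in `tracks` are made-up illustrative values, not data from the run.

```python
from typing import Dict, List, Optional, Tuple

Position = Tuple[int, int]  # (row, col)


def manhattan(a: Position, b: Position) -> int:
    """Grid distance between two cells."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def stuck_since(steps: List[int], track: List[Position]) -> Optional[int]:
    """Return the earliest snapshot step after which the agent never moves again,
    or None if it still moved between the last two snapshots."""
    if len(track) < 2 or track[-1] != track[-2]:
        return None
    i = len(track) - 1
    # Walk back to the start of the final run of identical positions.
    while i > 0 and track[i - 1] == track[-1]:
        i -= 1
    return steps[i]


# Illustrative positions only (not taken from the machina_1 run).
steps = [0, 50, 100, 150]
tracks: Dict[str, List[Position]] = {
    "A0": [(10, 10), (12, 12), (12, 12), (12, 12)],
    "A1": [(5, 5), (5, 6), (5, 7), (5, 8)],
}
for name, track in tracks.items():
    moved = manhattan(track[0], track[1])
    print(name, "moved", moved, "cells in first interval; stuck since:", stuck_since(steps, track))
```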

How to use

# Default: StarterPolicy, 500 steps, snapshot every 50
python scripts/capture_frames.py

# Gear-up phase (watch first 200 steps closely)
python scripts/capture_frames.py --steps 200 --every 10

# Full episode
python scripts/capture_frames.py --steps 1000 --every 100

# Single agent to isolate behavior
python scripts/capture_frames.py --agents 1 --steps 500 --every 50

# Your own policy
python scripts/capture_frames.py --policy class=cogames.policy.my_policy.MyPolicy

# Output to a specific file
python scripts/capture_frames.py --out docs/replay_frames.txt

From Claude Code: /cogames-watch-replay --steps 500 --every 50

What the skill teaches Claude to do

The skill guides structured analysis beyond just reading the grid visually:

Extract agent positions programmatically — search for 🟦🟧🟩🟨 symbols, record (row, col) per frame
Compute movement deltas — Manhattan distance between frames; flag agents stuck for >30% of episode
Track reward growth rate — deceleration signals hub depletion or navigation failure
Zoom into stuck areas — extract 15×15 subgrid around frozen agent to identify blocker (wall, extractor, wrong-gear station)
Compare configs — run 1/3/8-agent and compare per-agent reward to distinguish individual vs. contention problems
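The first, second, and fourth steps above can be sketched as below. The sketch assumes each grid cell is exactly one character in the text file; if the renderer pads narrow symbols, column indices would need adjusting. The helper names (`find_agents`, `stuck_fraction`, `subgrid`) and the tiny frames are hypothetical and are not taken from SKILL.md.

```python
from typing import Dict, List, Tuple

Position = Tuple[int, int]  # (row, col)
AGENT_SYMBOLS = "🟦🟧🟩🟨"  # the four agent symbols named above


def find_agents(frame: str) -> Dict[str, Position]:
    """Map each agent symbol found in the frame to its (row, col)."""
    found: Dict[str, Position] = {}
    for row, line in enumerate(frame.splitlines()):
        for col, cell in enumerate(line):
            if cell in AGENT_SYMBOLS:
                found[cell] = (row, col)
    return found


def stuck_fraction(track: List[Position]) -> float:
    """Fraction of consecutive frame pairs in which the agent did not move."""
    if len(track) < 2:
        return 0.0
    still = sum(1 for a, b in zip(track, track[1:]) if a == b)
    return still / (len(track) - 1)


def subgrid(frame: str, center: Position, size: int = 15) -> str:
    """Cut a size x size window around center, clipped at the grid edges."""
    half = size // 2
    r0 = max(center[0] - half, 0)
    c0 = max(center[1] - half, 0)
    rows = frame.splitlines()[r0 : center[0] + half + 1]
    return "\n".join(line[c0 : center[1] + half + 1] for line in rows)


# Tiny made-up frames for illustration.
frame_a = "⬛⬛⬛⬛⬛\n⬛🟦··⬛\n⬛··🟧⬛\n⬛⬛⬛⬛⬛"
frame_b = "⬛⬛⬛⬛⬛\n⬛·🟦·⬛\n⬛··🟧⬛\n⬛⬛⬛⬛⬛"

tracks: Dict[str, List[Position]] = {}
for frame in [frame_a, frame_b, frame_b]:
    for sym, pos in find_agents(frame).items():
        tracks.setdefault(sym, []).append(pos)

# Flag agents that were motionless for more than 30% of the sampled intervals.
flagged = sorted(sym for sym, t in tracks.items() if stuck_fraction(t) > 0.3)
window = subgrid(frame_b, tracks["🟧"][-1], size=3)
```

The stuck fraction here is measured over snapshot intervals, so a coarse `--every` value can hide short movements between snapshots.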

🤖 Generated with Claude Code

scripts/capture_frames.py: runs an episode headlessly and saves emoji grid snapshots to a text file at regular intervals. Works with any policy (defaults to StarterPolicy, no LLM required). Useful for diagnosing navigation, gear acquisition, and routing without a GUI.

.claude/skills/cogames-watch-replay/SKILL.md: Claude skill that invokes the script and guides analysis: extract agent coordinates programmatically, detect stuck agents by movement delta, zoom into blocked areas with a 15x15 subgrid, compare 1/3/8-agent configs.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>


SolbiatiAlessandro · 2026-04-04T05:43:23Z

let's ship this? @daveey

desiorac · 2026-04-28T15:01:56Z

The symbol legend (line 119) is hardcoded to 4 agents (🟦🟧🟩🟨), but --agents 8 is a documented usage. DEFAULT_SYMBOL_MAP likely assigns symbols for agents 4-7, but they won't appear in the file header, so the stuck-agent heuristic in the skill then silently misidentifies unlabeled cells.

Worth making the legend dynamic:

agent_syms = " ".join(
    f"{self._map_buffer._symbol_map.get(f'agent{i}', '?')}=agent{i}"
    for i in range(self._sim.num_agents)
)
f.write(f"# Symbols: {agent_syms}  ⬛=wall  · =empty\n\n")
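For checking the suggestion outside the script, the same loop can be written as a standalone function. This is a sketch under assumptions: `build_legend` is a hypothetical name, and the plain dict stands in for whatever `_symbol_map` actually holds.

```python
from typing import Mapping


def build_legend(symbol_map: Mapping[str, str], num_agents: int) -> str:
    """Build the header legend for however many agents the run has.

    Agents with no entry in symbol_map are shown as '?' instead of being omitted.
    """
    agent_syms = " ".join(
        f"{symbol_map.get(f'agent{i}', '?')}=agent{i}" for i in range(num_agents)
    )
    return f"# Symbols: {agent_syms}  ⬛=wall  · =empty\n\n"


# Hypothetical two-entry map with three agents: the third falls back to '?'.
legend = build_legend({"agent0": "🟦", "agent1": "🟧"}, num_agents=3)
```

Taking the map and agent count as parameters avoids reaching into private attributes, which also makes the legend easy to unit test.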

Side note: flipping mettagrid from workspace to git-pinned in pyproject.toml breaks local dev for contributors who have it checked out as a sibling workspace. Intentional for standalone distribution?

desiorac · 2026-04-28T15:40:47Z

Depends on whether --agents 8 is a real use case today or just documented future scope. If nobody is actually running 8 agents right now, ship it - the bug only surfaces when you exceed 4 agents and the legend mismatch causes the stuck-agent heuristic to flag false positives. If 8-agent runs are happening in CI, the two-liner fix is worth doing first: replace the hardcoded legend with a loop over self._sim.num_agents before merging.

relh and others added 3 commits March 26, 2026 14:13

chore: update cogames mettagrid to 0.21.1

7ca906c

fix: use machina_1 default mission, handle missing render_symbol grac…

ef1cccc

…efully

nishu-builder force-pushed the main branch 5 times, most recently from 0454f65 to 46ca0e3 Compare April 3, 2026 01:26

chore: update cogames mettagrid to 0.23.3

2e79514

daveey force-pushed the main branch from 46ca0e3 to 2e79514 Compare April 3, 2026 03:59

Merge branch 'main' into pr/cogames-watch-replay-skill

4afd30d

nishu-builder force-pushed the main branch 6 times, most recently from cbb95f7 to 5340aa9 Compare April 13, 2026 22:14

nishu-builder force-pushed the main branch 6 times, most recently from 16e7904 to a1623f6 Compare April 16, 2026 20:58

nishu-builder force-pushed the main branch from a1623f6 to 983b526 Compare April 23, 2026 23:34

relh force-pushed the main branch from 983b526 to 5fd9458 Compare April 24, 2026 11:55

nishu-builder force-pushed the main branch 2 times, most recently from 26f2bef to 71ab990 Compare April 24, 2026 22:07

relh force-pushed the main branch 2 times, most recently from ec17523 to 87261c8 Compare April 26, 2026 00:40

relh force-pushed the main branch 7 times, most recently from faac734 to ed54be8 Compare April 28, 2026 02:25

nishu-builder force-pushed the main branch from ed54be8 to f5b146a Compare April 28, 2026 04:54

nishu-builder force-pushed the main branch from f5b146a to b47d25c Compare April 28, 2026 15:33

relh force-pushed the main branch 5 times, most recently from 20578ae to 80a3706 Compare May 5, 2026 21:34

nishu-builder force-pushed the main branch 2 times, most recently from 750885c to bd5f6be Compare May 7, 2026 22:04

SolbiatiAlessandro wants to merge 5 commits into Metta-AI:main from SolbiatiAlessandro:pr/cogames-watch-replay-skill


4 participants
