biru-codeastromer/WorldModel-Gym

worldmodel-gym

WorldModel Gym is a reproducible long-horizon planning benchmark and evaluation platform for imagination-based agents.

Quickstart (30 seconds)

make setup
make demo

make demo will:

  • start the API + web stack with Docker when available
  • fall back to local API execution when the Docker daemon is unavailable
  • run one benchmark evaluation
  • upload artifacts and populate leaderboard data
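
The Docker-or-local fallback described above can be illustrated with a short Python sketch. This is not the Makefile's actual logic, which this README does not show. It assumes that "Docker available" means the `docker` CLI is on `PATH` and `docker info` exits successfully. The function names `docker_daemon_available` and `choose_demo_mode` are hypothetical.

```python
import shutil
import subprocess


def docker_daemon_available(timeout: float = 10.0) -> bool:
    """Return True if the docker CLI exists and `docker info` succeeds.

    Assumption: a zero exit code from `docker info` is treated as
    "daemon reachable". The real `make demo` check may differ.
    """
    if shutil.which("docker") is None:
        return False
    try:
        result = subprocess.run(
            ["docker", "info"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
            check=False,
        )
    except (subprocess.TimeoutExpired, OSError):
        # A hung or unlaunchable CLI counts as "unavailable".
        return False
    return result.returncode == 0


def choose_demo_mode() -> str:
    """Pick 'docker' when the daemon responds, otherwise 'local' (hypothetical labels)."""
    return "docker" if docker_daemon_available() else "local"


if __name__ == "__main__":
    print(f"demo mode: {choose_demo_mode()}")
```

Running the script prints which path a wrapper of this kind would take on the current machine.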

Open:

Run a single evaluation

.venv/bin/python -m worldmodel_gym.eval.run \
  --agent random \
  --env memory_maze \
  --track test \
  --seeds 211,223 \
  --max-episodes 2
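
To script several evaluations, the same command line can be assembled in Python. The sketch below only builds the argument list from the flags shown above. The helper name `build_eval_command` is hypothetical, and it assumes `--seeds` takes a comma-separated list exactly as in the example.

```python
import shlex
from typing import List, Sequence


def build_eval_command(
    agent: str,
    env: str,
    track: str,
    seeds: Sequence[int],
    max_episodes: int,
    python: str = ".venv/bin/python",
) -> List[str]:
    """Assemble the argv for `worldmodel_gym.eval.run` using only the flags shown above."""
    return [
        python, "-m", "worldmodel_gym.eval.run",
        "--agent", agent,
        "--env", env,
        "--track", track,
        # Seeds are joined with commas, mirroring `--seeds 211,223`.
        "--seeds", ",".join(str(s) for s in seeds),
        "--max-episodes", str(max_episodes),
    ]


if __name__ == "__main__":
    cmd = build_eval_command("random", "memory_maze", "test", [211, 223], 2)
    # Print rather than execute; pass `cmd` to subprocess.run(cmd, check=True) to launch it.
    print(shlex.join(cmd))
```

Passing an argument list to `subprocess.run` instead of a shell string avoids quoting problems if agent or environment names ever contain unusual characters.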

Artifacts are written to runs/<run_id>/:

  • metrics.json
  • trace.jsonl
  • config.yaml
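
A minimal sketch for reading these artifacts back, using only the standard library. The schema of `metrics.json` and `trace.jsonl` is not documented here, so the code assumes only that the former is a single JSON document and the latter holds one JSON value per line. `load_run_artifacts` is a hypothetical helper, and `config.yaml` is skipped because parsing it needs a third-party YAML library.

```python
import json
from pathlib import Path
from typing import Any, List, Tuple


def load_run_artifacts(run_dir: str) -> Tuple[Any, List[Any]]:
    """Load metrics.json and trace.jsonl from a runs/<run_id>/ directory.

    Assumption: trace.jsonl is standard JSON Lines (one JSON value per line).
    """
    run_path = Path(run_dir)

    with open(run_path / "metrics.json", encoding="utf-8") as f:
        metrics = json.load(f)

    trace: List[Any] = []
    with open(run_path / "trace.jsonl", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:  # tolerate blank lines
                trace.append(json.loads(line))

    return metrics, trace
```

For example, `metrics, trace = load_run_artifacts("runs/<run_id>")` returns the parsed metrics and a list of trace records, with `<run_id>` replaced by an actual run directory name.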

Monorepo layout

  • core/: environments, traces, eval harness
  • planners/: MCTS, MPC-CEM, trajectory sampling
  • worldmodels/: deterministic/stochastic/ensemble latent models
  • agents/: baseline agents and registry
  • server/: FastAPI leaderboard + run artifact service
  • web/: Next.js dashboard
  • mobile/: Expo viewer
  • paper/: draft PDF + LaTeX sources

Dev targets

make lint
make test
make paper
make deploy
make stop
make deploy-public
make stop-public
make deploy-vercel

Free Cloud Deploy

  • API: deploy render.yaml as a Render Blueprint (free web service).
  • Web: deploy web/ on Vercel Hobby with NEXT_PUBLIC_API_BASE set to the Render API URL.
  • Full steps: docs/DEPLOYMENT.md.