WorldModel Gym is a reproducible long-horizon planning benchmark + evaluation platform for imagination-based agents.
make setup
make demo

make demo will:
- start the API + web stack with Docker when available
- fall back to local API execution when the Docker daemon is unavailable
- run one benchmark evaluation
- upload artifacts and populate leaderboard data
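
The Docker fallback in the list above could be approximated as follows. This is a minimal sketch, not the project's actual Makefile logic: the function name `docker_daemon_available` and its injectable `which`/`run` parameters are hypothetical, and the check relies only on `docker info` exiting non-zero when the daemon cannot be reached.

```python
import shutil
import subprocess


def docker_daemon_available(which=shutil.which, run=subprocess.run, timeout=10):
    """Return True if the docker CLI exists and `docker info` succeeds.

    Hypothetical helper for illustration; `which` and `run` are injectable
    so the check can be exercised without Docker installed.
    """
    if which("docker") is None:
        return False  # CLI is not installed at all
    try:
        result = run(
            ["docker", "info"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False  # CLI present but could not be executed, or it hung
    return result.returncode == 0


if __name__ == "__main__":
    # Assumption: "docker" vs "local" mirrors the two modes listed above.
    mode = "docker" if docker_daemon_available() else "local"
    print(f"demo would start the API in {mode} mode")
```

Checking the daemon rather than only the CLI matters because the `docker` binary can be installed while the daemon is stopped.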
Open:
- http://localhost:3000 (web dashboard)
- http://localhost:8000/docs (FastAPI docs)
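
To confirm that both services are up before opening a browser, a small reachability check can help. The sketch below uses only the Python standard library; the helper name `is_reachable` is hypothetical, and the two URLs are the ones listed above.

```python
import urllib.request


def is_reachable(url, timeout=3.0):
    """Return True if an HTTP GET to `url` yields a 2xx/3xx response."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except OSError:
        # URLError, HTTPError, refused connections, and socket timeouts
        # are all OSError subclasses.
        return False


if __name__ == "__main__":
    endpoints = {
        "web dashboard": "http://localhost:3000",
        "FastAPI docs": "http://localhost:8000/docs",
    }
    for name, url in endpoints.items():
        status = "up" if is_reachable(url) else "not reachable"
        print(f"{name}: {url} is {status}")
```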
.venv/bin/python -m worldmodel_gym.eval.run \
--agent random \
--env memory_maze \
--track test \
--seeds 211,223 \
--max-episodes 2

Artifacts are written to runs/<run_id>/:
- metrics.json
- trace.jsonl
- config.yaml
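
The run artifacts can be inspected with a few lines of Python. This is a minimal sketch under stated assumptions: `load_run` is a hypothetical helper, metrics.json is assumed to hold a single JSON document, and trace.jsonl is assumed to hold one JSON value per line. No particular field names are assumed, and config.yaml is returned as raw text so that no third-party YAML parser is needed.

```python
import json
from pathlib import Path


def load_run(run_dir):
    """Load metrics, trace records, and raw config text from a run directory."""
    run_dir = Path(run_dir)
    metrics = json.loads((run_dir / "metrics.json").read_text(encoding="utf-8"))

    trace = []
    with (run_dir / "trace.jsonl").open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:  # tolerate blank lines between records
                trace.append(json.loads(line))

    # Kept as text on purpose; parse it with a YAML library if one is available.
    config_text = (run_dir / "config.yaml").read_text(encoding="utf-8")
    return metrics, trace, config_text
```

For example, `metrics, trace, config_text = load_run("runs/<run_id>")` with the placeholder replaced by an actual run directory, after which `len(trace)` gives the number of recorded trace entries.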
- core/: environments, traces, eval harness
- planners/: MCTS, MPC-CEM, trajectory sampling
- worldmodels/: deterministic/stochastic/ensemble latent models
- agents/: baseline agents and registry
- server/: FastAPI leaderboard + run artifact service
- web/: Next.js dashboard
- mobile/: Expo viewer
- paper/: draft PDF + LaTeX sources
make lint
make test
make paper
make deploy
make stop
make deploy-public
make stop-public
make deploy-vercel

- API: deploy render.yaml on Render Blueprint (free web service).
- Web: deploy web/ on Vercel Hobby with NEXT_PUBLIC_API_BASE set to the Render API URL.
- Full steps: docs/DEPLOYMENT.md.