Demo repository for showcasing CodexOpt on intentionally messy instruction assets.
This demo is the companion repository for the main CodexOpt project.
This repo contains:

- `AGENTS.md` with duplicate and conflicting guidance
- `SKILL.md` examples:
  - missing frontmatter
  - verbose/redundant text
  - duplicated lines
- `tasks.md` with 5 evaluation tasks
- `issues.md` with recurring feedback themes
- a tiny Python package under `src/codexopt_demo`
- GEPA local/cloud setup guide: `docs/gepa-local-and-cloud.md`
Setup and checks:

```shell
uv lock
uv sync --extra dev
uv run --no-sync pytest -q
uv run --no-sync ruff check src tests
```

From this repo root:
```shell
codexopt init
codexopt scan
codexopt benchmark
codexopt optimize agents --file AGENTS.md
codexopt optimize skills --glob ".codex/skills/**/SKILL.md"
codexopt apply --kind skills --dry-run
codexopt report --output codexopt-report.md
```

This demo is meant to mirror how a team would use CodexOpt in a real repository.
Inputs in this demo:

- `AGENTS.md`
- demo skills under `.codex/skills/`
- repo task evidence in `tasks.md`
- recurring feedback themes in `issues.md`
Suggested flow:

- Run `benchmark` to get a baseline score plus feedback.
- Run `optimize agents` and `optimize skills`.
- Review `.codexopt/runs/*/optimize.json` and generated reports.
- Use `apply --dry-run` before writing any changes.
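The review step can be scripted. Here is a minimal sketch, assuming run directories under `.codexopt/runs/` carry sortable (e.g. timestamped) names; that naming is an assumption about CodexOpt's layout, not documented behavior:

```shell
# Sketch: locate the newest run directory and its optimize.json.
# Assumes run directories under .codexopt/runs/ sort chronologically
# by name (e.g. timestamps) -- an assumption, not a documented contract.
latest=$(ls -1d .codexopt/runs/*/ 2>/dev/null | sort | tail -n 1)
if [ -n "$latest" ]; then
  echo "newest run artifact: ${latest}optimize.json"
else
  echo "no runs yet"
fi
```

In a fresh checkout (no runs recorded yet) this simply prints `no runs yet`.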
Example:
```shell
cp codexopt.gepa.example.yaml codexopt.yaml
codexopt --config codexopt.yaml benchmark
codexopt --config codexopt.yaml optimize agents
codexopt --config codexopt.yaml optimize skills
codexopt apply --kind agents --dry-run
codexopt --config codexopt.yaml report --output codexopt-report.md
```

Command reference used in the demo:
```shell
cd /path/to/codexopt-demo
export GEMINI_API_KEY="YOUR_REAL_KEY"
export GOOGLE_API_KEY="$GEMINI_API_KEY"

rm -rf .codexopt codexopt-report.md
ls

codexopt --config codexopt.gepa.example.yaml benchmark
codexopt --config codexopt.gepa.example.yaml optimize agents --engine heuristic --file AGENTS.md
codexopt --config codexopt.gepa.example.yaml optimize skills --engine heuristic --glob ".codex/skills/**/SKILL.md"
codexopt apply --kind agents --dry-run
codexopt apply --kind skills --dry-run
codexopt --config codexopt.gepa.example.yaml report --output codexopt-report.md
sed -n '1,120p' codexopt-report.md

codexopt --config codexopt.gepa.example.yaml optimize agents \
  --engine gepa \
  --reflection-model gemini/gemini-2.5-pro \
  --max-metric-calls 2 \
  --file AGENTS.md
```

What each command does:

- `benchmark`: baseline score plus evidence-aware feedback
- `optimize agents`: optimize `AGENTS.md`
- `optimize skills`: optimize demo skill files
- `apply --dry-run`: preview changes without writing files
- `report`: generate a markdown summary from the latest runs
- `optimize ... --engine gepa`: optional low-budget GEPA example with Gemini 2.5 Pro
To benchmark against repo tasks and issue themes, copy the demo config first:
```shell
cp codexopt.gepa.example.yaml codexopt.yaml
codexopt --config codexopt.yaml benchmark
```

That config enables:

- `tasks.md` as task evidence
- `issues.md` as recurring feedback evidence
The benchmark and report artifacts will then include:
- criterion sub-scores
- natural-language feedback
- task/issue evidence counts
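If you want to eyeball those values from the shell, here is a minimal sketch; the stand-in file below exists only to keep the example self-contained, and its `- <criterion>: <score>` line shape is an assumed report layout, not a documented format:

```shell
# Build a tiny stand-in report so this snippet is self-contained;
# in practice you would point the greps at codexopt-report.md.
# The line shapes here are assumptions about the report layout.
printf '%s\n' '- clarity: 0.62' '- consistency: 0.41' '- evidence: tasks=5 issues=3' > sample-report.md

# Count criterion sub-score lines and show the evidence-counts line.
score_lines=$(grep -c '^- [a-z]*: 0\.' sample-report.md)
echo "criterion sub-scores: $score_lines"
grep '^- evidence:' sample-report.md
rm -f sample-report.md
```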
The current demo shows evidence-aware instruction optimization. It does not yet run full agent task simulations from `tasks.md`; those tasks currently shape scoring and feedback.
Use this example file: `codexopt.gepa.example.yaml`

```shell
cp codexopt.gepa.example.yaml codexopt.yaml
```

Edit `codexopt.yaml`:
```yaml
evidence:
  task_files:
    - tasks.md
  issue_files:
    - issues.md

optimization:
  engine: "gepa"
  max_metric_calls: 120
  reflection_model: "your-provider/your-reflection-model"
```

GEPA in CodexOpt is model-agnostic. You can use OpenAI, Gemini, local models, or other GEPA/LiteLLM-compatible providers for reflection and candidate feedback.
OpenAI example:
```shell
export OPENAI_API_KEY="YOUR_KEY"
```

```yaml
optimization:
  engine: "gepa"
  reflection_model: "openai/gpt-5-mini"
```

Gemini example:
```shell
export GEMINI_API_KEY="YOUR_KEY"
export GOOGLE_API_KEY="$GEMINI_API_KEY"
```

```yaml
optimization:
  engine: "gepa"
  reflection_model: "gemini/gemini-2.5-pro"
```

Run the optimizers with the config:

```shell
codexopt --config codexopt.yaml optimize agents
codexopt --config codexopt.yaml optimize skills
```

Or pass the GEPA settings directly on the command line:

```shell
codexopt optimize skills \
  --engine gepa \
  --reflection-model your-provider/your-reflection-model \
  --max-metric-calls 200
```

Current CodexOpt exposes GEPA tuning via `max_metric_calls` and `reflection_model`.
A direct `iterations` field is not exposed yet; use `max_metric_calls` as the primary search-budget control.
If GEPA is unavailable or the requested model path fails, CodexOpt records that fallback in the optimization artifact and report.
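To spot such a fallback without opening each file by hand, here is a minimal grep-based sketch; the `"fallback"` key name is an assumption about the artifact's contents, not a documented schema:

```shell
# Sketch: scan run artifacts for a recorded GEPA fallback.
# Assumes artifacts at .codexopt/runs/*/optimize.json (this repo's layout);
# the "fallback" key name is an assumed field, not a documented schema.
hits=$(grep -ls '"fallback"' .codexopt/runs/*/optimize.json 2>/dev/null || true)
if [ -n "$hits" ]; then
  echo "fallback recorded in:"
  echo "$hits"
else
  echo "no fallback recorded"
fi
```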
For step-by-step local and cloud GEPA setup (including low-budget runs), see:
docs/gepa-local-and-cloud.md