CodeScaleBench includes a set of AI agent skill definitions in the skills/ directory. These are structured markdown runbooks that encode operational knowledge for common benchmark workflows, enabling any AI coding agent to operate the benchmark suite reliably.
Skills solve a practical problem: running a benchmark involves many multi-step workflows (infrastructure checks, task validation, run monitoring, failure triage, report generation) that are tedious to re-explain each session. By encoding these as structured files, any agent — Claude Code, Cursor, Copilot, or others — can follow them autonomously.
Project-specific skills for operating the CodeScaleBench pipeline:
| File | Skills | Purpose |
|---|---|---|
| `pre-run.md` | Check Infrastructure, Validate Tasks, Run Benchmark | Pre-launch readiness and execution |
| `monitoring.md` | Run Status, Watch Benchmarks | Active run monitoring |
| `triage-rerun.md` | Triage Failure, Quick Rerun | Failure investigation and fix verification |
| `analysis.md` | Compare Configs, MCP Audit, IR Analysis, Cost Report, Evaluate Traces | Post-run analysis |
| `maintenance.md` | Repo Health, Sync Metadata, Re-extract Metrics, Archive Run, Generate Report, What's Next | Data hygiene, health gate, reporting |
| `task-authoring.md` | Scaffold Task, Score Tasks, Benchmark Audit | Task creation and quality assurance |
Reusable skills applicable to any software project:
| File | Skills | Purpose |
|---|---|---|
| `workflow-tools.md` | Session Handoff, Strategic Compact, PRD Generator, Ralph Agent, Eval Harness | Session and workflow management |
| `agent-delegation.md` | Delegate, Codex/Cursor/Copilot/Gemini CLI Guides | Multi-agent task routing |
| `deep-search-clickhouse.md` | Deep Search CLI, ClickHouse Patterns | Semantic search and analytics |
| `dev-practices.md` | Security Review, Coding Standards, TDD, Verification Loop, Frontend/Backend Patterns | Development best practices |
Skills originated as .cursor/rules/*.mdc files. To use them with Cursor, copy into .cursor/rules/ and add YAML front-matter with description and optional globs fields. See skills/README.md for details.
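The copy-and-add-front-matter step for Cursor could be scripted roughly as follows. This is a minimal sketch, not part of CodeScaleBench: the helper names `to_cursor_rule` and `install_skill` are hypothetical, and the assumption that `globs` is written as a single unquoted pattern string should be checked against skills/README.md and the Cursor documentation.

```python
import json
from pathlib import Path
from typing import Optional


def to_cursor_rule(skill_markdown: str, description: str,
                   globs: Optional[str] = None) -> str:
    """Prepend YAML front-matter (description, optional globs) to a skill body.

    Hypothetical helper: the exact front-matter layout Cursor expects is an
    assumption here, based only on the two fields named in the text.
    """
    # json.dumps yields a double-quoted string, which is also a valid YAML scalar.
    header = ["---", f"description: {json.dumps(description)}"]
    if globs:
        # Assumption: globs is emitted as-is, as one pattern string.
        header.append(f"globs: {globs}")
    header.append("---")
    return "\n".join(header) + "\n\n" + skill_markdown.lstrip("\n")


def install_skill(skill_path: Path, rules_dir: Path, description: str,
                  globs: Optional[str] = None) -> Path:
    """Write skills/<name>.md to <rules_dir>/<name>.mdc with front-matter added."""
    rules_dir.mkdir(parents=True, exist_ok=True)
    target = rules_dir / (skill_path.stem + ".mdc")
    body = skill_path.read_text(encoding="utf-8")
    target.write_text(to_cursor_rule(body, description, globs), encoding="utf-8")
    return target
```

For example, `install_skill(Path("skills/pre-run.md"), Path(".cursor/rules"), "Pre-launch readiness and execution")` would write `.cursor/rules/pre-run.mdc`; the description text here is taken from the table above, and any `globs` value is left for the user to choose.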
Reference skill files from CLAUDE.md or AGENTS.md. The agent reads referenced files on demand.
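One way to keep those references in sync is to generate the list of skill paths from the directory itself. The sketch below assumes the skills live as `*.md` files directly under one directory and that the README should be left out of the list; the function name `build_skill_references` is hypothetical, and how the resulting list is worded inside CLAUDE.md or AGENTS.md is up to the project.

```python
from pathlib import Path


def build_skill_references(skills_dir: Path) -> str:
    """Return a markdown bullet list with one entry per skill file.

    Hypothetical helper: it only lists paths; it does not read or
    validate the skill files themselves.
    """
    entries = []
    for path in sorted(skills_dir.glob("*.md")):
        # The README is the index, not a skill, so it is skipped.
        if path.name.lower() == "readme.md":
            continue
        entries.append(f"- `{skills_dir.name}/{path.name}`")
    return "\n".join(entries)
```

The output can be pasted into CLAUDE.md or AGENTS.md under whatever heading the project already uses for skill references.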
Skills are plain markdown — any file-reading agent can use them directly.
See the Adapting for Your Own Project section in the skills README for guidance on writing skills for your own workflows.
- `skills/README.md`: Full skill index and usage guide
- `CLAUDE.md` / `AGENTS.md`: Operational quick-reference (references skills)
- `docs/QA_PROCESS.md`: Quality assurance pipeline (skills automate parts of this)
- `docs/ERROR_CATALOG.md`: Known error patterns (used by triage skill)