Releases: EntityProcess/agentv
v4.21.0
What's Changed
- feat(cli): auto-update when version mismatch detected by @christso in #1126
- feat(cli): skip self-update when already on latest version by @christso in #1130
- feat(cli): add version check to studio/serve commands by @christso in #1131
- feat(cli): self update preserves install scope (local vs global) by @christso in #1132
- docs(AGENTS.md): add design principle #3 — maximize feature surface through composition by @christso in #1138
- feat(core): auto-discover test cases from directory structure by @christso in #1142
- feat(studio)!: benchmarks.yaml as single source of truth, live-reloaded by @christso in #1145
- docs(targets): add CLI Provider page + oracle-validation pattern by @christso in #1146
Full Changelog: v4.20.0...v4.21.0
v4.21.0-next.1
What's Changed
- feat(cli): auto-update when version mismatch detected by @christso in #1126
- feat(cli): skip self-update when already on latest version by @christso in #1130
- feat(cli): add version check to studio/serve commands by @christso in #1131
- feat(cli): self update preserves install scope (local vs global) by @christso in #1132
- docs(AGENTS.md): add design principle #3 — maximize feature surface through composition by @christso in #1138
- feat(core): auto-discover test cases from directory structure by @christso in #1142
- feat(studio)!: benchmarks.yaml as single source of truth, live-reloaded by @christso in #1145
- docs(targets): add CLI Provider page + oracle-validation pattern by @christso in #1146
Full Changelog: v4.20.0...v4.21.0-next.1
v4.20.0
What's Changed
- feat(cli): add --budget-usd run-level cost cap by @christso in #1118
- feat(bench): autoresearch optimization loop (#958, #746, #748) by @christso in #1112
- refactor(bench): extract autoresearch to reference file by @christso in #1124
- feat(core): expose {{ tool_calls }} template variable for LLM graders by @christso in #1123
Full Changelog: v4.19.0...v4.20.0
v4.20.0-next.1
What's Changed
- feat(cli): add --budget-usd run-level cost cap by @christso in #1118
- feat(bench): autoresearch optimization loop (#958, #746, #748) by @christso in #1112
- refactor(bench): extract autoresearch to reference file by @christso in #1124
- feat(core): expose {{ tool_calls }} template variable for LLM graders by @christso in #1123
Full Changelog: v4.19.0...v4.20.0-next.1
v4.19.0
What's Changed
- refactor(core): rename Evaluator to Grader across codebase by @christso in #1111
- feat(cli): incremental eval runs — resume, append, and aggregate by @christso in #1110
- feat(core): rename total_budget_usd to budget_usd by @christso in #1117
- feat(core): add beforeAll, budgetUsd, turns, aggregation to programmatic API by @christso in #1119
- feat(cli): add *.eval.ts auto-discovery by @christso in #1120
Full Changelog: v4.17.1...v4.19.0
v4.19.0-next.1
What's Changed
- feat(core): add beforeAll, budgetUsd, turns, aggregation to programmatic API by @christso in #1119
- feat(cli): add *.eval.ts auto-discovery by @christso in #1120
Full Changelog: v4.18.0-next.1...v4.19.0-next.1
v4.18.0-next.1
v4.17.1
What's Changed
- feat(pipeline): add --target and --targets flags to pipeline run and pipeline input by @christso in #1108
Full Changelog: v4.17.0...v4.17.1
v4.17.1-next.1
What's Changed
- feat(pipeline): add --target and --targets flags to pipeline run and pipeline input by @christso in #1108
Full Changelog: v4.17.0...v4.17.1-next.1
v4.17.0
What's Changed
- feat(compare): add normalized gain metric by @christso in #1101
- docs(agents): add self-describing rules for headers and test contracts by @christso in #1103
- feat(studio): comparison analytics charts for skills/workflow benchmarking by @christso in #1104
- docs: rename evaluators to graders by @christso in #1106
- feat(cli): add results report subcommand by @christso in #1105
Full Changelog: v4.16.0...v4.17.0
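The "normalized gain metric" entry in v4.17.0 (#1101) does not spell out a formula. As a rough illustration only, one widely used definition (Hake's normalized gain) divides the improvement over a baseline by the headroom the baseline left available. The sketch below assumes that definition and a score scale with a known maximum; the function name `normalizedGain` and its parameters are hypothetical and are not taken from the agentv codebase.

```typescript
// Hypothetical helper, not agentv's API: Hake-style normalized gain.
// Assumption: scores lie on a scale whose ceiling is `maxScore` (1 by default).
function normalizedGain(
  baseline: number,
  candidate: number,
  maxScore: number = 1,
): number | null {
  // Headroom is how far the baseline could still improve.
  const headroom = maxScore - baseline;
  // With no headroom the ratio is undefined, so report null instead of dividing by zero.
  if (headroom <= 0) return null;
  // Fraction of the available headroom actually gained (negative on regression).
  return (candidate - baseline) / headroom;
}
```

Under these assumptions, a baseline of 0.4 and a candidate of 0.7 yield a gain of about 0.5, meaning half of the remaining headroom was recovered. Returning `null` at the ceiling is one possible design choice; a real implementation might report the case differently.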