Add eval-harness cost attribution estimator#29

Closed
orban wants to merge 1 commit into nightshift/bus-factor from cost-attribution

Conversation

orban (Owner) commented Apr 4, 2026

Summary

  • carry Claude-reported USD cost through task and AGENTbench execution results
  • add component-level cost attribution to eval-harness JSON/Markdown reports and resume summaries
  • add focused tests and README docs for fix vs skill_generation cost estimates
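A minimal sketch of what component-level attribution could look like, assuming each execution result is a mapping with a `component` label (such as `fix` or `skill_generation`) and an optional `cost_usd` value. The function name `attribute_costs` and both field names are hypothetical and are not taken from the eval-harness code in this PR.

```python
from collections import defaultdict
from typing import Any, Iterable, Mapping


def attribute_costs(results: Iterable[Mapping[str, Any]]) -> dict:
    """Sum reported USD cost per component and compute each component's share.

    Hypothetical result shape: {"component": str, "cost_usd": float | None}.
    Results with no reported cost are counted separately, not treated as $0.
    """
    totals: dict = defaultdict(float)
    unreported = 0
    for result in results:
        cost = result.get("cost_usd")
        if cost is None:
            unreported += 1
            continue
        totals[str(result.get("component", "unknown"))] += float(cost)

    grand_total = sum(totals.values())
    return {
        "total_usd": grand_total,
        "unreported_runs": unreported,
        "components": {
            name: {
                "usd": usd,
                # Guard against division by zero when nothing reported a cost.
                "share": (usd / grand_total) if grand_total else 0.0,
            }
            for name, usd in sorted(totals.items())
        },
    }


if __name__ == "__main__":
    sample = [
        {"component": "fix", "cost_usd": 0.25},
        {"component": "skill_generation", "cost_usd": 0.75},
        {"component": "fix", "cost_usd": None},  # cost was not reported
    ]
    print(attribute_costs(sample))
```

Counting unreported runs separately keeps a missing cost from being mistaken for a free run, which matters when the totals are labelled as estimates.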

Testing

  • pytest eval-harness/tests/test_reporter.py eval-harness/tests/test_task_runner.py eval-harness/tests/test_resume.py eval-harness/tests/test_agentbench.py -q
  • pytest eval-harness/tests -q (fails only in docker-dependent test_docker_runner because the local Docker socket is unavailable in this environment)

Nightshift-Task: cost-attribution
Nightshift-Ref: https://github.com/marcus/nightshift

orban (Owner, Author) commented Apr 15, 2026

Superseded by #31, which replays this implementation on top of the edge-case and resumed-run median fixes. Closing to consolidate review.

orban closed this Apr 15, 2026