feat: remove evaluation infrastructure (moved to openadapt-evals) #25
Merged
Conversation
All evaluation infrastructure (~13,000 lines) has been migrated to openadapt-evals (PR #29). This PR removes the now-redundant code from openadapt-ml, making it a pure ML package.

Deleted files:
- benchmarks/cli.py (8,503 lines - VM/pool CLI)
- benchmarks/azure_vm.py (AzureVMManager)
- benchmarks/pool.py (PoolManager)
- benchmarks/vm_monitor.py, azure_ops_tracker.py, resource_tracker.py
- benchmarks/azure.py, viewer.py, pool_viewer.py, trace_export.py
- benchmarks/waa_deploy/ (Docker agent deployment)
- tests/test_quota_auto_detection.py, test_demo_persistence.py
- tests/benchmarks/test_api_agent.py, test_waa.py

Updated:
- benchmarks/__init__.py: Only exports ML agents (PolicyAgent, etc.)
- pyproject.toml: Removed azure-ai-ml, azureml-core, azure-mgmt-*
- CLAUDE.md: Removed CLI/VM/pool docs, added migration guide

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Update all remaining references to deleted benchmark modules across source code, scripts, and tests:
- cloud/local.py: azure_ops_tracker, session_tracker, CLI subprocess calls
- scripts/: p0/p1 validation scripts, screenshot generators, quota checker
- training/benchmark_viewer.py: HTML template CLI references
- experiments/waa_demo/runner.py: docstring and print references
- deprecated/waa_deploy/__init__.py: import path

All now point to openadapt_evals equivalents.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
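The diff itself is not reproduced on this page. As an illustration of the last item, a redirecting shim for deprecated/waa_deploy/__init__.py could look roughly like this; the re-exported names and the error message are assumptions, not the actual code:

```python
# deprecated/waa_deploy/__init__.py (illustrative sketch, not the actual diff).
# Re-export from the new home in openadapt-evals and fail with a useful hint
# when that package is not installed.
try:
    from openadapt_evals.waa_deploy import *  # noqa: F401,F403
except ImportError as exc:
    raise ImportError(
        "openadapt_ml's waa_deploy has moved to openadapt-evals; "
        "install openadapt-evals to keep using it."
    ) from exc
```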
Summary
- Removed azure-ai-ml, azureml-core, and azure-mgmt-* from dependencies

Deleted files
- benchmarks/cli.py (8,503 lines) - VM/pool management CLI
- benchmarks/azure_vm.py - AzureVMManager
- benchmarks/pool.py - PoolManager
- benchmarks/vm_monitor.py, azure_ops_tracker.py, resource_tracker.py
- benchmarks/azure.py, viewer.py, pool_viewer.py, trace_export.py
- benchmarks/waa_deploy/ - Docker agent deployment

Kept in openadapt-ml
- benchmarks/agent.py - PolicyAgent, APIBenchmarkAgent, UnifiedBaselineAgent (ML model wrappers)

Migration Guide
| Old (openadapt-ml) | New (openadapt-evals) |
| --- | --- |
| openadapt_ml.benchmarks.cli | openadapt_evals.benchmarks.vm_cli (or the oa-vm CLI) |
| openadapt_ml.benchmarks.azure_vm.AzureVMManager | openadapt_evals.infrastructure.azure_vm.AzureVMManager |
| openadapt_ml.benchmarks.pool.PoolManager | openadapt_evals.infrastructure.pool.PoolManager |
| openadapt_ml.benchmarks.vm_monitor.VMMonitor | openadapt_evals.infrastructure.vm_monitor.VMMonitor |
| openadapt_ml.benchmarks.azure_ops_tracker | openadapt_evals.infrastructure.azure_ops_tracker |
| openadapt_ml.benchmarks.resource_tracker | openadapt_evals.infrastructure.resource_tracker |
| openadapt_ml.benchmarks.pool_viewer | openadapt_evals.benchmarks.pool_viewer |
| openadapt_ml.benchmarks.trace_export | openadapt_evals.benchmarks.trace_export |
| openadapt_ml.benchmarks.waa_deploy | openadapt_evals.waa_deploy |
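For example, downstream code that imported the pool manager would change along these lines (a minimal sketch assuming openadapt-evals is installed; the class path is taken from the table above, and the no-argument constructor is an assumption):

```python
# Before this PR, the infrastructure lived under openadapt_ml.benchmarks:
# from openadapt_ml.benchmarks.pool import PoolManager

# After the move, the same class is imported from openadapt-evals.
from openadapt_evals.infrastructure.pool import PoolManager

pool = PoolManager()  # constructor arguments assumed unchanged by the move
```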
CLI migration

| Old command | New command |
| --- | --- |
| python -m openadapt_ml.benchmarks.cli pool-create | oa-vm pool-create |
| python -m openadapt_ml.benchmarks.cli pool-run | oa-vm pool-run |
| python -m openadapt_ml.benchmarks.cli pool-status | oa-vm pool-status |
| python -m openadapt_ml.benchmarks.cli pool-cleanup | oa-vm pool-cleanup |
| python -m openadapt_ml.benchmarks.cli create | oa-vm create |
| python -m openadapt_ml.benchmarks.cli status | oa-vm status |
| python -m openadapt_ml.benchmarks.cli vm monitor | oa-vm vm monitor |

Other subcommands follow the same pattern: oa-vm <command>.

What stays the same
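The ML agent wrappers kept in openadapt-ml are imported exactly as before. A minimal sketch of the unchanged usage, using the classes listed under "Kept in openadapt-ml" above:

```python
# These imports are unchanged by this PR; the classes are the ML model
# wrappers that remain in openadapt_ml.benchmarks.
from openadapt_ml.benchmarks import (
    APIBenchmarkAgent,
    PolicyAgent,
    UnifiedBaselineAgent,
)
```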
Test plan
- uv run pytest tests/ -v: 253 passed, 6 skipped
- from openadapt_ml.benchmarks import PolicyAgent, APIBenchmarkAgent, UnifiedBaselineAgent works
- ruff check passes

🤖 Generated with Claude Code
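A smoke test along the lines of the second check above could look like this; it is a hypothetical illustration, not part of this PR, and the assertion that the old CLI module is gone assumes no compatibility shim was left behind for it:

```python
# Hypothetical smoke test (not in this PR): the ML agents remain importable,
# while the removed VM/pool CLI module no longer does.
import importlib

import pytest


def test_ml_agents_still_exported():
    from openadapt_ml.benchmarks import (  # noqa: F401
        APIBenchmarkAgent,
        PolicyAgent,
        UnifiedBaselineAgent,
    )


def test_vm_cli_module_removed():
    with pytest.raises(ModuleNotFoundError):
        importlib.import_module("openadapt_ml.benchmarks.cli")
```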