Decision: prioritize GitHub Action + CLI over web UI. Reason: strongest immediate value for startup teams and highest hiring signal.
Decision: optimize only RAG QA workflows in v1. Reason: clear product identity beats broad generic evaluation.
Decision: rank with weighted score after enforcing quality/latency/cost thresholds. Reason: mirrors real shipping criteria where safety rails must be non-negotiable.
Decision: include deterministic mock provider first. Reason: keeps project reproducible, low-cost, and testable in CI without external API keys.
Decision: persist run artifacts as JSON in artifacts/. Reason: easy regression diffing, reproducibility, and auditability.
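
Sketch: the ranking, mock-provider, and artifact decisions above could fit together roughly as follows. This is a minimal illustration under stated assumptions, not the project's actual implementation: every function name, metric field, threshold value, weight, and the artifact file layout below is hypothetical. The mock derives metrics from a hash of the config name so repeated runs give identical results without API keys.

```python
import hashlib
import json
from pathlib import Path

# Hypothetical gate values and weights, chosen only for illustration.
THRESHOLDS = {"min_quality": 0.70, "max_latency_ms": 2000.0, "max_cost_usd": 0.01}
WEIGHTS = {"quality": 0.6, "latency": 0.2, "cost": 0.2}


def mock_metrics(config_name: str) -> dict:
    """Deterministic stand-in for a provider call: metrics come from a hash of the name."""
    digest = hashlib.sha256(config_name.encode("utf-8")).digest()
    return {
        "quality": round(0.5 + digest[0] / 510, 4),        # 0.5 to 1.0
        "latency_ms": round(200.0 + digest[1] * 10.0, 1),  # 200 to 2750
        "cost_usd": round(0.001 + digest[2] / 20000, 6),   # 0.001 to 0.01375
    }


def passes_thresholds(m: dict) -> bool:
    """Hard gates: a candidate failing any one of them is never ranked."""
    return (
        m["quality"] >= THRESHOLDS["min_quality"]
        and m["latency_ms"] <= THRESHOLDS["max_latency_ms"]
        and m["cost_usd"] <= THRESHOLDS["max_cost_usd"]
    )


def weighted_score(m: dict) -> float:
    """Weighted sum; latency and cost are scored as headroom below their limits."""
    latency_score = 1.0 - m["latency_ms"] / THRESHOLDS["max_latency_ms"]
    cost_score = 1.0 - m["cost_usd"] / THRESHOLDS["max_cost_usd"]
    return (
        WEIGHTS["quality"] * m["quality"]
        + WEIGHTS["latency"] * latency_score
        + WEIGHTS["cost"] * cost_score
    )


def rank(candidates: dict) -> list:
    """Enforce thresholds first, then sort the survivors by weighted score, best first."""
    eligible = {name: m for name, m in candidates.items() if passes_thresholds(m)}
    return sorted(eligible, key=lambda name: weighted_score(eligible[name]), reverse=True)


def write_artifact(run_id: str, candidates: dict, ranking: list, out_dir: str = "artifacts") -> Path:
    """Persist one run as JSON; sorted keys keep diffs between runs stable."""
    path = Path(out_dir) / f"{run_id}.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    payload = {"run_id": run_id, "candidates": candidates, "ranking": ranking}
    path.write_text(json.dumps(payload, indent=2, sort_keys=True) + "\n", encoding="utf-8")
    return path
```

Gating before scoring means a candidate with excellent quality but an over-budget cost is dropped outright instead of being rescued by its other metrics, which is the "non-negotiable" behavior the ranking decision describes.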