Surface red team scan errors in run results #45772

Draft
slister1001 wants to merge 2 commits into Azure:main from slister1001:fix/redteam-error-surfacing
@slister1001
Member

When all attacks fail due to a configuration error (e.g., unavailable model), the run previously completed with 0 results and no error message. Users had no way to understand what went wrong.

Changes:

  • Add error field to RedTeamRun TypedDict for run-level error reporting
  • Add _aggregate_run_errors() to ResultProcessor to collect per-category errors from red_team_info into a structured run-level error
  • Classify HTTP 400 as CONFIGURATION in ExceptionHandler so systemic config errors (unavailable_model, bad credentials) are detected
  • Add early abort in Foundry execution manager after 2 consecutive configuration errors to avoid wasting time on remaining categories
  • Add _is_configuration_error() helper to detect systemic issues
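
For reviewers who want to see how these pieces could fit together, here is a minimal sketch, not the code in this PR. It assumes `red_team_info` is a flat mapping from risk category to a dict with `error` and `status_code` keys, and that a run-level error carries `code`, `message`, and `details`. `ErrorCategory`, `RunError`, `classify_status`, `run_categories`, and the `execute` callback are hypothetical names introduced only for illustration; the real `RedTeamRun`, `ResultProcessor`, and `ExceptionHandler` may be shaped differently.

```python
from enum import Enum
from typing import Any, Callable, Dict, Iterable, List, Optional, TypedDict


class ErrorCategory(Enum):
    # Hypothetical categories; the real ExceptionHandler may define others.
    CONFIGURATION = "configuration"
    UNKNOWN = "unknown"


class RunError(TypedDict):
    # Assumed shape of the run-level error (not taken from the PR diff).
    code: str
    message: str
    details: List[Dict[str, str]]


class RedTeamRun(TypedDict, total=False):
    # Only the fields this sketch needs; the real TypedDict has more.
    status: str
    results: List[Dict[str, Any]]
    error: Optional[RunError]


def classify_status(status_code: Optional[int]) -> ErrorCategory:
    """Treat HTTP 400 as a configuration problem, as the description states."""
    if status_code == 400:
        return ErrorCategory.CONFIGURATION
    return ErrorCategory.UNKNOWN


def is_configuration_error(entry: Dict[str, Any]) -> bool:
    """True when a per-category entry looks like a systemic config failure."""
    return classify_status(entry.get("status_code")) is ErrorCategory.CONFIGURATION


def aggregate_run_errors(red_team_info: Dict[str, Dict[str, Any]]) -> Optional[RunError]:
    """Fold per-category errors into one run-level error; None if nothing failed."""
    failed = {cat: e for cat, e in red_team_info.items() if e.get("error")}
    if not failed:
        return None
    all_config = all(is_configuration_error(e) for e in failed.values())
    return {
        "code": "configuration_error" if all_config else "scan_error",
        "message": f"{len(failed)} of {len(red_team_info)} risk categories failed",
        "details": [
            {"category": cat, "message": str(e["error"])} for cat, e in failed.items()
        ],
    }


def run_categories(
    categories: Iterable[str],
    execute: Callable[[str], Dict[str, Any]],
    max_consecutive_config_errors: int = 2,
) -> Dict[str, Dict[str, Any]]:
    """Run each category, aborting after N consecutive configuration errors."""
    red_team_info: Dict[str, Dict[str, Any]] = {}
    consecutive = 0
    for category in categories:
        entry = execute(category)
        red_team_info[category] = entry
        if entry.get("error") and is_configuration_error(entry):
            consecutive += 1
            if consecutive >= max_consecutive_config_errors:
                break  # remaining categories would very likely fail the same way
        else:
            consecutive = 0
    return red_team_info


# Usage: every category fails with HTTP 400, so the loop stops early and the
# run carries a structured error instead of silently reporting 0 results.
info = run_categories(
    ["violence", "hate_unfairness", "self_harm"],
    lambda category: {"error": "unavailable_model", "status_code": 400},
)
run: RedTeamRun = {"status": "failed", "results": [], "error": aggregate_run_errors(info)}
```

Resetting the counter on any non-configuration outcome keeps a single transient failure from triggering the abort; only back-to-back configuration errors do.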

Description

Please add an informative description that covers the changes made by the pull request and link all relevant issues.

If an SDK is being regenerated based on a new API spec, a link to the pull request containing these API spec changes should be included above.

All SDK Contribution checklist:

  • The pull request does not introduce [breaking changes]
  • CHANGELOG is updated for new features, bug fixes or other significant changes.
  • I have read the contribution guidelines.

General Guidelines and Best Practices

  • Title of the pull request is clear and informative.
  • There are a small number of commits, each of which has an informative message. This means that previously merged commits do not appear in the history of the PR. For more information on cleaning up the commits in your PR, see this page.

Testing Guidelines

  • Pull request includes test coverage for the included changes.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@github-actions github-actions bot added the Evaluation Issues related to the client library for Azure AI Evaluation label Mar 18, 2026
