Merged
25 changes: 14 additions & 11 deletions messages/agent.test.run-eval.md
@@ -1,24 +1,27 @@
# summary

Run evaluation tests against an Agentforce agent.
Run rich evaluation tests against an Agentforce agent.

# description

Execute rich evaluation tests against an Agentforce agent using the Einstein Evaluation API. Supports both YAML test specs (same format as `sf agent generate test-spec`) and JSON payloads.
Specify the tests you want to run with one of these inputs to the --spec flag:

When you provide a YAML test spec, the command automatically translates test cases into Evaluation API calls and infers the agent name from the spec's `subjectName` field. This means you can use the same test spec with both `sf agent test run` and `sf agent test run-eval`. YAML test specs also support contextVariables, which allow you to inject contextual data (such as CaseId or RoutableId) into agent sessions for testing with different contexts.
- YAML test spec generated by the `agent generate test-spec` CLI command
- JSON payload

When you provide a JSON payload, it's sent directly to the API with optional normalization. The normalizer auto-corrects common field name mistakes, converts shorthand references to JSONPath, and injects defaults. Use `--no-normalize` to disable this auto-normalization. JSON payloads can also include context_variables on agent.create_session steps for the same contextual testing capabilities.
When you provide a YAML test spec, this command automatically translates test cases into internal state-based evaluation framework calls and infers the agent name from the test spec's `subjectName` field. As a result, you can use the same test spec with both the `agent test run` and `agent test run-eval` commands. YAML test specs also support context variables, which allow you to inject contextual data (such as CaseId or RoutableId) into agent sessions for testing with different contexts.

Supports 8+ evaluator types, including topic routing assertions, action invocation checks, string/numeric assertions, semantic similarity scoring, and LLM-based quality ratings.
When you provide a JSON payload, it's sent directly to the evaluation framework with optional normalization. The normalizer auto-corrects common field name mistakes, converts shorthand references to JSONPath, and injects defaults. Use `--no-normalize` to disable this auto-normalization. JSON payloads can also include context_variables on agent.create_session steps for the same contextual testing capabilities as when you use a YAML test spec.
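The three normalization steps named above (field-name correction, shorthand-to-JSONPath conversion, default injection) can be illustrated with a minimal sketch. This is not the command's actual normalizer: the alias table, the default values, the `actual` field, and the shorthand syntax are all hypothetical, chosen only to show the shape of such a pass.

```python
import copy

# Hypothetical alias table; the help text does not list the real corrections.
FIELD_ALIASES = {"evaluatorType": "type", "expectedValue": "expected"}

# Hypothetical defaults injected when a step omits them.
DEFAULTS = {"operator": "equals"}


def to_jsonpath(ref):
    """Turn an assumed shorthand such as 'step1.response' into '$.step1.response'."""
    if isinstance(ref, str) and ref and not ref.startswith("$"):
        return "$." + ref
    return ref


def normalize_step(step):
    """Apply the three passes to one step and return a new dict."""
    # 1. Correct common field-name mistakes.
    out = {FIELD_ALIASES.get(key, key): value for key, value in step.items()}
    # 2. Convert shorthand references to JSONPath (assumed to live in 'actual').
    if "actual" in out:
        out["actual"] = to_jsonpath(out["actual"])
    # 3. Inject defaults without overwriting explicit values.
    for key, value in DEFAULTS.items():
        out.setdefault(key, value)
    return out


def normalize_payload(payload, normalize=True):
    """Return a normalized copy; normalize=False mirrors the idea of --no-normalize."""
    if not normalize:
        return payload
    result = copy.deepcopy(payload)
    if "steps" in result:
        result["steps"] = [normalize_step(step) for step in result["steps"]]
    return result
```

Returning a copy rather than mutating the input keeps the original payload available for comparison, which is useful when you want to see what the normalization changed.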

This command supports more than 8 evaluator types, including subagent routing assertions, action invocation checks, string/numeric assertions, semantic similarity scoring, and LLM-based quality ratings.

# flags.spec.summary

Path to test spec file (YAML or JSON). Supports reading from stdin when piping content.

# flags.api-name.summary

Agent DeveloperName (also called API name) to resolve agent_id and agent_version_id. Auto-inferred from the YAML spec's subjectName.
Agent API name (also called DeveloperName) used to resolve agent_id and agent_version_id. Auto-inferred from the YAML spec's subjectName.

# flags.result-format.summary

@@ -36,23 +39,23 @@ Disable auto-normalization of field names and shorthand references.

- Run tests using a YAML test spec on the org with alias "my-org":

<%= config.bin %> <%= command.id %> --spec tests/my-agent-testSpec.yaml --target-org my-org
<%= config.bin %> <%= command.id %> --spec specs/my-agent-testSpec.yaml --target-org my-org

- Run tests using a YAML spec with explicit agent name override; use your default org:

<%= config.bin %> <%= command.id %> --spec tests/my-agent-testSpec.yaml --api-name My_Agent --target-org my-org
<%= config.bin %> <%= command.id %> --spec specs/my-agent-testSpec.yaml --api-name My_Agent

- Run tests using a JSON payload:

<%= config.bin %> <%= command.id %> --spec tests/eval-payload.json --target-org my-org
<%= config.bin %> <%= command.id %> --spec specs/eval-payload.json --target-org my-org

- Run tests and output results in JUnit format; useful for continuous integration and deployment (CI/CD):

<%= config.bin %> <%= command.id %> --spec tests/my-agent-testSpec.yaml --target-org my-org --result-format junit
<%= config.bin %> <%= command.id %> --spec specs/my-agent-testSpec.yaml --target-org my-org --result-format junit

- Run tests with contextVariables to inject contextual data into agent sessions (add contextVariables to test cases in your YAML spec):

<%= config.bin %> <%= command.id %> --spec tests/agent-with-context.yaml --target-org my-org
<%= config.bin %> <%= command.id %> --spec specs/agent-with-context.yaml --target-org my-org

- Pipe JSON payload from stdin (--spec flag is automatically populated from stdin):
