Add SDK Guide for Critic Feature (Experimental) #263

xingyaoww · 2026-01-15T21:54:59Z

Summary

This PR adds comprehensive documentation for the new experimental Critic feature in the OpenHands SDK.

What's Added

A new guide at sdk/guides/critic.mdx that covers:

Core Concepts

What is a Critic? - Explanation of the LLM-based evaluation system
When to Use Critics - Use cases for quality monitoring, early intervention, and performance analysis
Evaluation Modes - Two modes: finish_and_message (default) and all_actions
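
As a rough illustration of how the two modes might differ, the sketch below gates evaluation on the kind of event. It does not use the SDK: `EvaluationMode`, `should_evaluate`, and the event kind strings are hypothetical, and the assumption that `finish_and_message` covers only finish actions and agent messages is inferred from the mode name rather than confirmed by the guide.

```python
from enum import Enum


class EvaluationMode(str, Enum):
    # Mode names come from the PR description; the enum itself is hypothetical.
    FINISH_AND_MESSAGE = "finish_and_message"
    ALL_ACTIONS = "all_actions"


def should_evaluate(mode: EvaluationMode, event_kind: str) -> bool:
    """Decide whether a critic would score an event (assumed semantics)."""
    if mode is EvaluationMode.ALL_ACTIONS:
        # Assumption: every agent action, finish action, and message is scored.
        return event_kind in {"action", "finish", "message"}
    # Assumption: the default mode only scores finish actions and messages.
    return event_kind in {"finish", "message"}
```

Under these assumptions, the default mode issues fewer critic calls per conversation, which is the kind of trade-off the Performance Considerations section of the guide is described as covering.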

Implementation Guide

Setting Up APIBasedCritic - Complete example with configuration
Configuration Options - All parameters explained (server_url, api_key, model_name, mode)
Understanding Results - How to interpret CriticResult scores and feedback
Visualizing Results - Color-coded output in the conversation visualizer
Programmatic Access - How to access critic results in callbacks
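
To make the four configuration parameters concrete, here is a minimal standalone sketch. `CriticConfig` is a hypothetical stand-in written with the standard library, not the SDK's `APIBasedCritic`; only the field names (`server_url`, `api_key`, `model_name`, `mode`) and the default mode come from this PR description, and the URL and model name in the usage note are placeholders.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class CriticConfig:
    """Hypothetical stand-in for the critic's settings, not the SDK class."""

    server_url: str
    model_name: str
    # repr=False keeps the key out of repr() output and therefore out of logs.
    api_key: str = field(repr=False)
    # finish_and_message is the default mode per the PR description.
    mode: str = "finish_and_message"

    def __post_init__(self) -> None:
        # Reject anything other than the two documented mode names.
        if self.mode not in ("finish_and_message", "all_actions"):
            raise ValueError(f"unknown mode: {self.mode!r}")
```

A value such as `CriticConfig(server_url="https://critic.example.invalid", model_name="placeholder-model", api_key="...")` would then fall back to the default mode. The guide itself is described as using `SecretStr` for the API key; the `repr=False` field here only approximates that behavior.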

Technical Details

How It Works - Step-by-step evaluation flow
Chat Template Format - Qwen3-4B-Instruct-2507 template explanation
Security - API key handling with SecretStr
Performance Considerations - Latency, cost, and parallelization details

Advanced Usage

Custom Critic Implementations - Extending CriticBase with custom logic
Built-in Critics - PassCritic, AgentFinishedCritic, EmptyPatchCritic
Troubleshooting - Common issues and solutions
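
The general shape of a custom critic can be sketched without the SDK. Everything below is an assumption made for illustration: the local `CriticBase` and `CriticResult` classes only borrow the names mentioned above, the `evaluate(patch)` signature and the 0.0 to 1.0 score range are hypothetical, and `NonEmptyPatchCritic` is an invented example loosely modeled on what the name `EmptyPatchCritic` suggests.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class CriticResult:
    """Local stand-in; the SDK's CriticResult may have different fields."""

    score: float  # assumed range: 0.0 (bad) to 1.0 (good)
    message: str


class CriticBase(ABC):
    """Local stand-in for the SDK base class; the signature is assumed."""

    @abstractmethod
    def evaluate(self, patch: str) -> CriticResult:
        """Score the agent's output, here represented as a patch string."""


class NonEmptyPatchCritic(CriticBase):
    """Hypothetical rule-based critic: passes only if the patch has content."""

    def evaluate(self, patch: str) -> CriticResult:
        if patch.strip():
            return CriticResult(score=1.0, message="patch is non-empty")
        return CriticResult(score=0.0, message="patch is empty")
```

A rule-based critic like this needs no network call, so it can serve as a cheap baseline next to an LLM-based one.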

Example Code

Includes the full example from examples/01_standalone_sdk/34_critic_model_example.py with:

Auto-configuration for All-Hands LLM proxy
Manual configuration fallback
Running instructions

⚠️ Experimental Status

The guide includes prominent warnings that this feature is:

Highly experimental and subject to change
Not recommended for production without thorough testing
Subject to API and behavior changes based on feedback

Related PR

This documentation corresponds to OpenHands/software-agent-sdk#1269, which implements the Critic feature.

Preview

The guide follows the same structure and style as existing SDK guides, including:

Clear warnings about experimental status
Code examples with syntax highlighting
Step-by-step instructions
Troubleshooting section
Links to related guides

Checklist

Added comprehensive guide for Critic feature
Included clear experimental warnings
Provided complete code examples
Added troubleshooting section
Documented all configuration options
Linked to example code in repository
Followed existing documentation style and format

This guide documents the experimental API-based Critic feature for real-time evaluation of agent actions and messages using an external LLM.

Key topics covered:
- Overview of what critics are and when to use them
- Two evaluation modes: finish_and_message and all_actions
- Configuration and setup with APIBasedCritic
- Understanding and visualizing critic results
- Technical details including chat template format
- Custom critic implementations
- Built-in critic types
- Troubleshooting common issues

The guide includes clear warnings that this is an experimental feature subject to change and not recommended for production use without thorough testing.

Co-authored-by: openhands <openhands@all-hands.dev>

openhands-ai · 2026-01-15T21:56:00Z

Looks like there are a few issues preventing this PR from being merged!

GitHub Actions are failing:
- .github/workflows/sync-docs-code-blocks.yml
- .github/workflows/sync-agent-sdk-openapi.yml

If you'd like me to help, just leave a comment, like

@OpenHands please fix the failing actions on PR #263 at branch `xw/critic-model`

Feel free to include any additional details that might help me get this PR into a better state.


xingyaoww requested a review from enyst as a code owner January 15, 2026 21:55

mintlify bot deployed to staging January 15, 2026 21:55 View deployment

openhands-ai bot mentioned this pull request Jan 15, 2026

Add API-Based Critic for Real-Time Agent Action Evaluation (Experimental) OpenHands/software-agent-sdk#1269

Draft

xingyaoww marked this pull request as draft January 15, 2026 22:04
