Add SDK Guide for Critic Feature (Experimental) #263

xingyaoww · 2026-01-15T21:54:59Z

Summary

This PR adds comprehensive documentation for the new experimental Critic feature in the OpenHands SDK.

What's Added

A new guide at sdk/guides/critic.mdx that covers:

Core Concepts

What is a Critic? - Explanation of the LLM-based evaluation system
When to Use Critics - Use cases for quality monitoring, early intervention, and performance analysis
Evaluation Modes - Two modes: finish_and_message (default) and all_actions
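
As a rough illustration of the two evaluation modes listed above, the sketch below shows one way a mode could decide which events are sent to the critic. The `Event` class, its `kind` values, and `should_evaluate` are hypothetical names invented for this example, not the SDK's API. It also assumes that all_actions evaluates every action in addition to finish and message events.

```python
from dataclasses import dataclass
from typing import Literal

# The two mode names come from the guide; everything else here is illustrative.
Mode = Literal["finish_and_message", "all_actions"]


@dataclass
class Event:
    """Hypothetical stand-in for an agent event."""
    kind: str  # assumed values: "action", "finish", or "message"


def should_evaluate(event: Event, mode: Mode = "finish_and_message") -> bool:
    """Decide whether the critic should score this event under the given mode."""
    if mode == "all_actions":
        # Assumption: every action is scored, plus finish and message events.
        return event.kind in {"action", "finish", "message"}
    # Default mode: only score when the agent finishes or sends a message.
    return event.kind in {"finish", "message"}


if __name__ == "__main__":
    for kind in ("action", "finish", "message"):
        event = Event(kind)
        print(kind, should_evaluate(event), should_evaluate(event, "all_actions"))
```

Under these assumptions the default mode makes fewer critic calls per conversation, while all_actions gives finer-grained feedback at a higher request count.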

Implementation Guide

Setting Up APIBasedCritic - Complete example with configuration
Configuration Options - All parameters explained (server_url, api_key, model_name, mode)
Understanding Results - How to interpret CriticResult scores and feedback
Visualizing Results - Color-coded output in the conversation visualizer
Programmatic Access - How to access critic results in callbacks

Technical Details

How It Works - Step-by-step evaluation flow
Chat Template Format - Qwen3-4B-Instruct-2507 template explanation
Security - API key handling with SecretStr
Performance Considerations - Latency, cost, and parallelization details

Advanced Usage

Custom Critic Implementations - Extending CriticBase with custom logic
Built-in Critics - PassCritic, AgentFinishedCritic, EmptyPatchCritic
Troubleshooting - Common issues and solutions

Example Code

Includes the full example from examples/01_standalone_sdk/34_critic_model_example.py with:

Auto-configuration for All-Hands LLM proxy
Manual configuration fallback
Running instructions

⚠️ Experimental Status

The guide includes prominent warnings that this feature is:

Highly experimental and subject to change
Not recommended for production without thorough testing
Subject to API and behavior changes based on feedback

Related PR

This documentation corresponds to OpenHands/software-agent-sdk#1269, which implements the Critic feature.

Preview

The guide follows the same structure and style as existing SDK guides, including:

Clear warnings about experimental status
Code examples with syntax highlighting
Step-by-step instructions
Troubleshooting section
Links to related guides

Checklist

Added comprehensive guide for Critic feature
Included clear experimental warnings
Provided complete code examples
Added troubleshooting section
Documented all configuration options
Linked to example code in repository
Followed existing documentation style and format

This guide documents the experimental API-based Critic feature for real-time evaluation of agent actions and messages using an external LLM.

Key topics covered:
- Overview of what critics are and when to use them
- Two evaluation modes: finish_and_message and all_actions
- Configuration and setup with APIBasedCritic
- Understanding and visualizing critic results
- Technical details including chat template format
- Custom critic implementations
- Built-in critic types
- Troubleshooting common issues

The guide includes clear warnings that this is an experimental feature subject to change and not recommended for production use without thorough testing.

Co-authored-by: openhands <openhands@all-hands.dev>

sdk/guides/critic.mdx

- Remove 'When to Use Critics' section
- Remove implementation details (evaluation modes, how it works, technical details)
- Remove custom critic implementations and other critic types
- Simplify troubleshooting section
- Add note that critic is hosted by OpenHands LLM Provider and is free
- Rename 'Setting Up' to 'Quick Start' with emphasis on auto-configuration

- Explain what a critic is and what it provides
- Add example use case: triggering agent to reflect/fix when critic indicates poor performance
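
The use case named in this commit, asking the agent to reflect or fix its work when the critic indicates poor performance, could look roughly like the sketch below. `CriticResult` here is a hypothetical stand-in with assumed `score` and `message` fields, and `reflection_prompt` and the 0.5 threshold are invented for illustration. The real SDK class and a sensible threshold may differ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CriticResult:
    """Hypothetical stand-in; the real SDK class may expose different fields."""
    score: float                   # assumed: predicted success in [0, 1]
    message: Optional[str] = None  # assumed: optional free-text feedback


def reflection_prompt(result: CriticResult, threshold: float = 0.5) -> Optional[str]:
    """Return a follow-up prompt when the score falls below the threshold."""
    if result.score >= threshold:
        return None  # Score is acceptable, so no intervention is needed.
    feedback = f" Critic feedback: {result.message}" if result.message else ""
    return (
        f"The critic scored the last attempt {result.score:.2f}, below the "
        f"{threshold:.2f} threshold. Please review your work and fix any problems."
        + feedback
    )


if __name__ == "__main__":
    print(reflection_prompt(CriticResult(score=0.9)))
    print(reflection_prompt(CriticResult(score=0.2, message="tests still fail")))
```

A caller would send the returned prompt back to the agent as a new user message, and skip the follow-up when the function returns `None`.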

…l report

sdk/guides/critic.mdx

…ple.py

xingyaoww · 2026-01-20T18:20:41Z

@OpenHands can we link this docs page to docs.json?

openhands-ai · 2026-01-20T18:20:51Z

I'm on it! xingyaoww can track my progress at all-hands.dev

Link the sdk/guides/critic.mdx page to the Agent Features section in docs.json navigation.

openhands-ai · 2026-01-20T18:22:25Z

I've linked the critic docs page to docs.json by adding "sdk/guides/critic" to the "Agent Features" group in the SDK navigation.

Changes made:

Added "sdk/guides/critic" to the "Agent Features" section in docs.json (line 271)

The change has been committed and pushed to the xw/critic-model branch. The critic guide will now appear in the SDK documentation navigation under SDK → Guides → Agent Features → Critic.
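
For readers who want to script this kind of navigation change, here is a minimal sketch. It assumes docs.json nests groups as objects with `"group"` and `"pages"` keys; the sample structure below is made up for illustration and is not the repository's actual file. `add_page_to_group` is a hypothetical helper name.

```python
import json


def add_page_to_group(node, group, page):
    """Append `page` to the first nav group named `group`; return True if found."""
    if isinstance(node, dict):
        if node.get("group") == group and isinstance(node.get("pages"), list):
            if page not in node["pages"]:  # Avoid duplicate entries on re-runs.
                node["pages"].append(page)
            return True
        return any(add_page_to_group(value, group, page) for value in node.values())
    if isinstance(node, list):
        return any(add_page_to_group(item, group, page) for item in node)
    return False  # Strings and other leaves cannot contain groups.


# Made-up sample that only mirrors the assumed shape of docs.json.
docs = {
    "navigation": {
        "tabs": [
            {
                "tab": "SDK",
                "groups": [
                    {
                        "group": "Guides",
                        "pages": [
                            {"group": "Agent Features", "pages": ["sdk/guides/skill"]},
                        ],
                    },
                ],
            },
        ]
    }
}

found = add_page_to_group(docs, "Agent Features", "sdk/guides/critic")

if __name__ == "__main__":
    print(json.dumps(docs, indent=2))
```

The helper stops at the first matching group, so it would need adjusting if several groups shared the same name.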

xingyaoww · 2026-01-20T18:35:24Z

@OpenHands can we create a new page for https://docs.openhands.dev/openhands/usage/cli/critic and put it under CLI's extension group in docs.json.

It should link to the current SDK page and explain that we have an experimental critic feature enabled for people who are already using the OpenHands LLM Provider (https://docs.openhands.dev/openhands/usage/llms/openhands-llms).

It is currently free during the public beta phase.

Experimental Critic Feature Enabled
We've detected that you're using the OpenHands LLM provider. An experimental critic feature is now active (free) to predict task success. We may collect your feedback rating. You can disable this in settings.

openhands-ai · 2026-01-20T18:35:33Z

I'm on it! xingyaoww can track my progress at all-hands.dev

- Create new page at openhands/usage/cli/critic.mdx explaining the experimental critic feature
- Add page to CLI Extensions group in docs.json
- Link to SDK critic guide for detailed information
- Note that feature is free during public beta phase

Co-authored-by: openhands <openhands@all-hands.dev>

openhands-ai · 2026-01-20T18:37:16Z

I've created the new CLI critic page and added it to the documentation navigation. Here's a summary of the changes:

Changes Made

1. Created new page: `openhands/usage/cli/critic.mdx`

The page includes:

✅ Warning about experimental status
✅ Link to the OpenHands LLM Provider page (/openhands/usage/llms/openhands-llms)
✅ The exact message provided about the experimental critic feature being enabled
✅ Note that it's free during public beta phase
✅ Brief explanation of what the critic does
✅ Link to the SDK Critic Guide (/sdk/guides/critic) for detailed information

2. Updated `docs.json`

✅ Added "openhands/usage/cli/critic" to the CLI "Extensions" group

The page will be accessible at https://docs.openhands.dev/openhands/usage/cli/critic and will appear in the CLI documentation under the Extensions section alongside MCP Servers.

The changes have been committed and pushed to the xw/critic-model branch.

- Add critic-cli-output.png showing critic score in CLI
- Add critic-cli-settings.png showing CLI settings panel
- Add critic-sdk-visualizer.png showing SDK visualizer output
- Update CLI critic docs to embed screenshots
- Update SDK critic guide to embed visualizer screenshot

Co-authored-by: openhands <openhands@all-hands.dev>

openhands/usage/cli/critic.mdx

Move screenshots to local folders next to their respective mdx files and use relative paths (./screenshots/) instead of absolute paths, following the pattern used in local-llms.mdx.

Co-authored-by: openhands <openhands@all-hands.dev>

jpelletier1

LGTM!

enyst · 2026-01-21T16:26:52Z

sdk/guides/critic.mdx

+</Warning>
+
+<Note>
+The critic model is hosted by the OpenHands LLM Provider and is currently free to use. This example is available on GitHub: [examples/01_standalone_sdk/34_critic_example.py](https://github.com/OpenHands/software-agent-sdk/blob/main/examples/01_standalone_sdk/34_critic_example.py)


@OpenHands run sync lock workflow and update PR

I'm on it! enyst can track my progress at all-hands.dev

Final summary (since last update)

What I checked

Request addressed: ran the “sync/lock” equivalent locally (code-block sync + API doc generation) and updated the existing PR branch.

Instructions followed: worked on branch xw/critic-model, committed changes, pushed to the same branch, and did not comment on the PR.

Branch is up to date / pushed: HEAD equals origin/xw/critic-model (no unpushed commits).

Conciseness review: changes are the direct output of running the repo’s sync steps; no unrelated manual edits.

Changes made

Synced MDX code blocks from OpenHands/software-agent-sdk

Updated:

sdk/guides/critic.mdx

sdk/guides/hooks.mdx

sdk/guides/plugins.mdx

sdk/guides/skill.mdx

Regenerated SDK API reference pages

Updated:

sdk/api-reference/openhands.sdk.agent.mdx

sdk/api-reference/openhands.sdk.conversation.mdx

sdk/api-reference/openhands.sdk.event.mdx

sdk/api-reference/openhands.sdk.llm.mdx

sdk/api-reference/openhands.sdk.tool.mdx

sdk/api-reference/openhands.sdk.workspace.mdx

Git details

Commit on PR branch: f9575fa docs: sync code blocks and regenerate API reference

Push status: already pushed to origin/xw/critic-model (no further action needed).

Co-authored-by: openhands <openhands@all-hands.dev>

openhands-ai · 2026-01-21T21:48:32Z

Looks like there are a few issues preventing this PR from being merged!

GitHub Actions are failing:
- .github/workflows/sync-docs-code-blocks.yml
- .github/workflows/sync-agent-sdk-openapi.yml

If you'd like me to help, just leave a comment, like

@OpenHands please fix the failing actions on PR #263 at branch `xw/critic-model`

Feel free to include any additional details that might help me get this PR into a better state.

xingyaoww requested a review from enyst as a code owner January 15, 2026 21:55

mintlify bot deployed to staging January 15, 2026 21:55 View deployment

openhands-ai bot mentioned this pull request Jan 15, 2026

Add API-Based Critic for Real-Time Agent Action Evaluation (Experimental) OpenHands/software-agent-sdk#1269

Merged

xingyaoww marked this pull request as draft January 15, 2026 22:04

xingyaoww commented Jan 20, 2026

sdk/guides/critic.mdx (resolved)

mintlify bot deployed to staging January 20, 2026 14:54 View deployment

Add 'What is a Critic?' section explaining use cases

97437c7

- Explain what a critic is and what it provides
- Add example use case: triggering agent to reflect/fix when critic indicates poor performance

mintlify bot deployed to staging January 20, 2026 14:58 View deployment

Add reference to SWE-Bench blog post and mention forthcoming technica…

79110c4

…l report

mintlify bot deployed to staging January 20, 2026 15:00 View deployment

Change 'evaluation model' to 'evaluator' in critic description

5372457

mintlify bot deployed to staging January 20, 2026 15:01 View deployment

Update critic score visualization example to match actual output

9e77c4c

mintlify bot deployed to staging January 20, 2026 15:03 View deployment

xingyaoww commented Jan 20, 2026

sdk/guides/critic.mdx (outdated, resolved)

xingyaoww and others added 2 commits January 20, 2026 23:04

Apply suggestion from @xingyaoww

26a9826

Rename example file from 34_critic_model_example.py to 34_critic_exam…

450766e

…ple.py

mintlify bot deployed to staging January 20, 2026 15:06 View deployment

mintlify bot deployed to staging January 20, 2026 15:10 View deployment

xingyaoww mentioned this pull request Jan 20, 2026

[Feature]: Retry on failure functionality OpenHands/OpenHands#2221

Open

xingyaoww and others added 2 commits January 20, 2026 13:21

Merge branch 'main' into xw/critic-model

86ea0cb

Add critic guide to docs.json navigation

9cef6bc

Link the sdk/guides/critic.mdx page to the Agent Features section in docs.json navigation.

mintlify bot deployed to staging January 20, 2026 18:21 View deployment

mintlify bot deployed to staging January 20, 2026 18:22 View deployment

mintlify bot deployed to staging January 20, 2026 18:37 View deployment

xingyaoww marked this pull request as ready for review January 20, 2026 19:02

xingyaoww requested a review from mamoodi as a code owner January 20, 2026 19:02

xingyaoww requested a review from jpelletier1 January 20, 2026 19:02

xingyaoww commented Jan 20, 2026

openhands/usage/cli/critic.mdx (outdated, resolved)

Apply suggestion from @xingyaoww

0e67fdf

xingyaoww commented Jan 20, 2026

openhands/usage/cli/critic.mdx (outdated, resolved)

xingyaoww commented Jan 20, 2026

openhands/usage/cli/critic.mdx (resolved)

xingyaoww added 2 commits January 21, 2026 03:03

Apply suggestion from @xingyaoww

a60c2b8

Apply suggestion from @xingyaoww

fe76ba3

mintlify bot deployed to staging January 20, 2026 19:04 View deployment

mintlify bot deployed to staging January 20, 2026 19:06 View deployment

Fix screenshot paths using relative paths

f273030

Move screenshots to local folders next to their respective mdx files and use relative paths (./screenshots/) instead of absolute paths, following the pattern used in local-llms.mdx.

Co-authored-by: openhands <openhands@all-hands.dev>

mintlify bot deployed to staging January 20, 2026 19:08 View deployment

jpelletier1 approved these changes Jan 20, 2026

enyst reviewed Jan 21, 2026

docs: sync code blocks and regenerate API reference

f9575fa

Co-authored-by: openhands <openhands@all-hands.dev>

mintlify bot deployed to staging January 21, 2026 16:29 View deployment

Add critic demo video to CLI documentation

38300ec

Co-authored-by: openhands <openhands@all-hands.dev>

xingyaoww merged commit f5516da into main Jan 21, 2026
1 of 2 checks passed

xingyaoww deleted the xw/critic-model branch January 21, 2026 21:52

5 participants
