Add SDK Guide for Critic Feature (Experimental) #263
This guide documents the experimental API-based Critic feature for real-time evaluation of agent actions and messages using an external LLM.
Key topics covered:
- Overview of what critics are and when to use them
- Two evaluation modes: finish_and_message and all_actions
- Configuration and setup with APIBasedCritic
- Understanding and visualizing critic results
- Technical details including chat template format
- Custom critic implementations
- Built-in critic types
- Troubleshooting common issues
The guide includes clear warnings that this is an experimental feature subject to change and not recommended for production use without thorough testing.
Co-authored-by: openhands <openhands@all-hands.dev>
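As a rough illustration of the two evaluation modes named above, the sketch below filters events by mode. Only the mode names `finish_and_message` and `all_actions` come from this PR; the `CriticMode` enum, the `should_evaluate` helper, and the event kind strings are hypothetical, and the assumption that `finish_and_message` skips intermediate actions is inferred from the mode names, not confirmed by the SDK.

```python
from enum import Enum


class CriticMode(str, Enum):
    """Hypothetical enum; only the two string values appear in the PR."""

    FINISH_AND_MESSAGE = "finish_and_message"
    ALL_ACTIONS = "all_actions"


def should_evaluate(event_kind: str, mode: CriticMode) -> bool:
    """Decide whether a critic would score an event (assumed semantics).

    event_kind is a hypothetical label: "action", "finish", or "message".
    """
    if mode is CriticMode.ALL_ACTIONS:
        # Assumed: every agent action, plus finish and message events.
        return event_kind in {"action", "finish", "message"}
    # Assumed: only the finish action and agent messages are scored.
    return event_kind in {"finish", "message"}
```

Under these assumptions, `finish_and_message` trades coverage for fewer critic calls, which may be why the PR description lists it as the default.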
- Remove 'When to Use Critics' section
- Remove implementation details (evaluation modes, how it works, technical details)
- Remove custom critic implementations and other critic types
- Simplify troubleshooting section
- Add note that critic is hosted by OpenHands LLM Provider and is free
- Rename 'Setting Up' to 'Quick Start' with emphasis on auto-configuration
- Explain what a critic is and what it provides
- Add example use case: triggering agent to reflect/fix when critic indicates poor performance
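The reflect/fix use case mentioned in this commit could be sketched roughly as follows. This is a minimal, library-agnostic sketch and not the SDK's API: `CriticResult`, `run_with_reflection`, the `agent_step` and `critic` callables, the 0-to-1 score scale, and the threshold are all hypothetical names and assumptions introduced for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class CriticResult:
    """Hypothetical critic output: score assumed to be in [0.0, 1.0]."""

    score: float
    message: str = ""


def run_with_reflection(
    agent_step: Callable[[str], str],
    critic: Callable[[str], CriticResult],
    task: str,
    threshold: float = 0.5,
    max_retries: int = 2,
) -> Tuple[str, CriticResult, int]:
    """Run the agent, then re-prompt it while the critic score is low."""
    output = agent_step(task)
    result = critic(output)
    attempts = 1
    while result.score < threshold and attempts <= max_retries:
        # Feed the critic's verdict back so the agent can reflect and fix.
        prompt = (
            f"{task}\n\nYour previous attempt scored {result.score:.2f}. "
            "Reflect on what went wrong and fix it."
        )
        output = agent_step(prompt)
        result = critic(output)
        attempts += 1
    return output, result, attempts
```

Capping retries keeps a persistently low score from looping forever; the caller still receives the last critic result and can decide how to proceed.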
@OpenHands can we link this docs page to
I'm on it! xingyaoww can track my progress at all-hands.dev
Link the sdk/guides/critic.mdx page to the Agent Features section in docs.json navigation.
I've linked the critic docs page to Changes made:
The change has been committed and pushed to the
@OpenHands can we create a new page for It should link to the current SDK page, and it should explain that we have an experimental critic feature enabled for people who are already using the OpenHands LLM Provider (https://docs.openhands.dev/openhands/usage/llms/openhands-llms). It is currently free during the public beta phase.
I'm on it! xingyaoww can track my progress at all-hands.dev
- Create new page at openhands/usage/cli/critic.mdx explaining the experimental critic feature
- Add page to CLI Extensions group in docs.json
- Link to SDK critic guide for detailed information
- Note that feature is free during public beta phase
Co-authored-by: openhands <openhands@all-hands.dev>
I've created the new CLI critic page and added it to the documentation navigation. Here's a summary of the changes:
Changes Made
1. Created new page:
- Add critic-cli-output.png showing critic score in CLI
- Add critic-cli-settings.png showing CLI settings panel
- Add critic-sdk-visualizer.png showing SDK visualizer output
- Update CLI critic docs to embed screenshots
- Update SDK critic guide to embed visualizer screenshot
Co-authored-by: openhands <openhands@all-hands.dev>
Move screenshots to local folders next to their respective mdx files and use relative paths (./screenshots/) instead of absolute paths, following the pattern used in local-llms.mdx. Co-authored-by: openhands <openhands@all-hands.dev>
jpelletier1
left a comment
LGTM!
| </Warning> | ||
| <Note> | ||
| The critic model is hosted by the OpenHands LLM Provider and is currently free to use. This example is available on GitHub: [examples/01_standalone_sdk/34_critic_example.py](https://github.com/OpenHands/software-agent-sdk/blob/main/examples/01_standalone_sdk/34_critic_example.py) |
@OpenHands run sync lock workflow and update PR
I'm on it! enyst can track my progress at all-hands.dev
Final summary (since last update)
What I checked
- Request addressed: ran the “sync/lock” equivalent locally (code-block sync + API doc generation) and updated the existing PR branch.
- Instructions followed: worked on branch `xw/critic-model`, committed changes, pushed to the same branch, and did not comment on the PR.
- Branch is up to date / pushed: `HEAD` equals `origin/xw/critic-model` (no unpushed commits).
- Conciseness review: changes are the direct output of running the repo’s sync steps; no unrelated manual edits.
Changes made
- Synced MDX code blocks from `OpenHands/software-agent-sdk`
  - Updated:
    - `sdk/guides/critic.mdx`
    - `sdk/guides/hooks.mdx`
    - `sdk/guides/plugins.mdx`
    - `sdk/guides/skill.mdx`
- Regenerated SDK API reference pages
  - Updated:
    - `sdk/api-reference/openhands.sdk.agent.mdx`
    - `sdk/api-reference/openhands.sdk.conversation.mdx`
    - `sdk/api-reference/openhands.sdk.event.mdx`
    - `sdk/api-reference/openhands.sdk.llm.mdx`
    - `sdk/api-reference/openhands.sdk.tool.mdx`
    - `sdk/api-reference/openhands.sdk.workspace.mdx`
Git details
- Commit on PR branch: `f9575fa docs: sync code blocks and regenerate API reference`
- Push status: already pushed to `origin/xw/critic-model` (no further action needed).
Co-authored-by: openhands <openhands@all-hands.dev>
Looks like there are a few issues preventing this PR from being merged!
If you'd like me to help, just leave a comment, like Feel free to include any additional details that might help me get this PR into a better state.
Summary
This PR adds comprehensive documentation for the new experimental Critic feature in the OpenHands SDK.
What's Added
A new guide at `sdk/guides/critic.mdx` that covers:
Core Concepts
- `finish_and_message` (default) and `all_actions`
Implementation Guide
Technical Details
Advanced Usage
Example Code
Includes the full example from `examples/01_standalone_sdk/34_critic_model_example.py` with:
The guide includes prominent warnings that this feature is:
Related PR
This documentation corresponds to OpenHands/software-agent-sdk#1269 which implements the Critic feature.
Preview
The guide follows the same structure and style as existing SDK guides, including:
Checklist