OpenHands · xingyaoww · Jan 21, 2026 · Jan 15, 2026 · Jan 20, 2026 · Jan 20, 2026
diff --git a/docs.json b/docs.json
@@ -215,7 +215,8 @@
           {
             "group": "Extensions",
             "pages": [
-              "openhands/usage/cli/mcp-servers"
+              "openhands/usage/cli/mcp-servers",
+              "openhands/usage/cli/critic"
             ]
           },
           {
@@ -268,7 +269,8 @@
                   "sdk/guides/agent-custom",
                   "sdk/guides/convo-custom-visualizer",
                   "sdk/guides/agent-stuck-detector",
-                  "sdk/guides/agent-tom-agent"
+                  "sdk/guides/agent-tom-agent",
+                  "sdk/guides/critic"
                 ]
               },
               {

@@ -0,0 +1,41 @@
+---
+title: Critic (Experimental)
+description: Automatic task success prediction for OpenHands LLM Provider users
+---
+
+<Warning>
+**This feature is highly experimental** and subject to change. The API, configuration, and behavior may evolve significantly based on feedback and testing.
+</Warning>
+
+## Overview
+
+If you're using the [OpenHands LLM Provider](/openhands/usage/llms/openhands-llms), an experimental **critic feature** is automatically enabled to predict task success in real time.
+
+For detailed information about the critic feature, including programmatic access and advanced usage, see the [SDK Critic Guide](/sdk/guides/critic).
+
+
+## What is the Critic?
+
+The critic is an LLM-based evaluator that analyzes agent actions and conversation history to predict the quality or success probability of agent decisions. It provides:
+
+- **Quality scores**: Scores between 0.0 and 1.0 giving the predicted probability of success
+- **Real-time feedback**: Scores computed during agent execution, not just at completion
+
+<video
+  controls
+  className="w-full aspect-video"
+  src="/openhands/usage/cli/critic-demo.mp4"
+></video>
+
+![Critic output in CLI](./screenshots/critic-cli-output.png)
+
+## Pricing
+
+The critic feature is **free during the public beta phase** for all OpenHands LLM Provider users.
+
+## Disabling the Critic
+
+If you prefer not to use the critic feature, you can disable it in your settings.
+
+![Critic settings in CLI](./screenshots/critic-cli-settings.png)
+
@@ -26,18 +26,8 @@ AgentBase and implements the agent execution logic.
 
 #### Properties
 
-- `agent_context`: AgentContext | None
-- `condenser`: CondenserBase | None
-- `filter_tools_regex`: str | None
-- `include_default_tools`: list[str]
-- `llm`: LLM
-- `mcp_config`: dict[str, Any]
-- `model_config`: ClassVar[ConfigDict] = (configuration object)
+- `model_config`: = (configuration object)
   Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
-- `security_policy_filename`: str
-- `system_prompt_filename`: str
-- `system_prompt_kwargs`: dict[str, object]
-- `tools`: list[Tool]
 
 #### Methods
 
@@ -94,11 +84,12 @@ agent implementations must follow.
 
 - `agent_context`: AgentContext | None
 - `condenser`: CondenserBase | None
+- `critic`: CriticBase | None
 - `filter_tools_regex`: str | None
 - `include_default_tools`: list[str]
 - `llm`: LLM
 - `mcp_config`: dict[str, Any]
-- `model_config`: ClassVar[ConfigDict] = (configuration object)
+- `model_config`: = (configuration object)
   Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
 - `name`: str
   Returns the name of the Agent.

@@ -126,6 +126,10 @@ Send a message to the agent.
 
 Set the confirmation policy for the conversation.
 
+#### abstractmethod set_security_analyzer()
+
+Set the security analyzer for the conversation.
+
 #### abstractmethod update_secrets()
 
 ### class Conversation
@@ -197,8 +201,6 @@ Bases: `OpenHandsModel`
 - `execution_status`: [ConversationExecutionStatus](#class-conversationexecutionstatus)
 - `id`: UUID
 - `max_iterations`: int
-- `model_config`: ClassVar[ConfigDict] = (configuration object)
-  Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
 - `persistence_dir`: str | None
 - `secret_registry`: [SecretRegistry](#class-secretregistry)
 - `security_analyzer`: SecurityAnalyzerBase | None
@@ -280,6 +282,10 @@ actions that are pending confirmation or execution.
 
 Return True if the lock is currently held by any thread.
 
+#### model_config = (configuration object)
+
+Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
+
 #### model_post_init()
 
 This function is meant to behave like a BaseModel method to initialise private attributes.
@@ -352,7 +358,25 @@ Conversation will then call MyVisualizer() followed by initialize(state)
 
 Initialize the visualizer base.
 
-#### initialize()
+#### create_sub_visualizer()
+
+Create a visualizer for a sub-agent during delegation.
+
+Override this method to support sub-agent visualization in multi-agent
+delegation scenarios. The sub-visualizer will be used to display events
+from the spawned sub-agent.
+
+By default, returns None which means sub-agents will not have visualization.
+Subclasses that support delegation (like DelegationVisualizer) should
+override this method to create appropriate sub-visualizers.
+
+* Parameters:
+  `agent_id` – The identifier of the sub-agent being spawned
+* Returns:
+  A visualizer instance for the sub-agent, or None if sub-agent
+  visualization is not supported
+
+#### final initialize()
 
 Initialize the visualizer with conversation state.
 
@@ -772,8 +796,6 @@ even when callable secrets fail on subsequent calls.
 
 #### Properties
 
-- `model_config`: ClassVar[ConfigDict] = (configuration object)
-  Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
 - `secret_sources`: dict[str, SecretSource]
 
 #### Methods
@@ -808,6 +830,10 @@ fresh values from callables to ensure comprehensive masking.
 * Returns:
   Text with secret values replaced by `<secret-hidden>`
 
+#### model_config = (configuration object)
+
+Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
+
 #### model_post_init()
 
 This function is meant to behave like a BaseModel method to initialise private attributes.

@@ -12,8 +12,9 @@ Bases: [`LLMConvertibleEvent`](#class-llmconvertibleevent)
 #### Properties
 
 - `action`: Action | None
+- `critic_result`: CriticResult | None
 - `llm_response_id`: str
-- `model_config`: ClassVar[ConfigDict] = (configuration object)
+- `model_config`: = (configuration object)
   Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
 - `reasoning_content`: str | None
 - `responses_reasoning_item`: ReasoningItemModel | None
@@ -47,7 +48,7 @@ represents an error produced by the agent/scaffold, not model output.
 #### Properties
 
 - `error`: str
-- `model_config`: ClassVar[ConfigDict] = (configuration object)
+- `model_config`: = (configuration object)
   Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
 - `source`: Literal['agent', 'user', 'environment']
 - `visualize`: Text
@@ -68,7 +69,7 @@ This action indicates a condensation of the conversation history is happening.
 
 - `forgotten_event_ids`: list[str]
 - `llm_response_id`: str
-- `model_config`: ClassVar[ConfigDict] = (configuration object)
+- `model_config`: = (configuration object)
   Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
 - `source`: Literal['agent', 'user', 'environment']
 - `summary`: str | None
@@ -86,7 +87,7 @@ This action is used to request a condensation of the conversation history.
 
 #### Properties
 
-- `model_config`: ClassVar[ConfigDict] = (configuration object)
+- `model_config`: = (configuration object)
   Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
 - `source`: Literal['agent', 'user', 'environment']
 - `visualize`: Text
@@ -112,7 +113,7 @@ This event represents a summary generated by a condenser.
 
 #### Properties
 
-- `model_config`: ClassVar[ConfigDict] = (configuration object)
+- `model_config`: = (configuration object)
   Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
 - `source`: Literal['agent', 'user', 'environment']
 - `summary`: str
@@ -138,7 +139,7 @@ to ensure compatibility with websocket transmission.
 #### Properties
 
 - `key`: str
-- `model_config`: ClassVar[ConfigDict] = (configuration object)
+- `model_config`: = (configuration object)
   Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
 - `source`: Literal['agent', 'user', 'environment']
 - `value`: Any
@@ -194,7 +195,7 @@ instead of writing it to a file inside the Docker container.
 
 - `filename`: str
 - `log_data`: str
-- `model_config`: ClassVar[ConfigDict] = (configuration object)
+- `model_config`: = (configuration object)
   Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
 - `model_name`: str
 - `source`: Literal['agent', 'user', 'environment']
@@ -208,11 +209,8 @@ Base class for events that can be converted to LLM messages.
 
 #### Properties
 
-- `id`: EventID
-- `model_config`: ClassVar[ConfigDict] = (configuration object)
+- `model_config`: = (configuration object)
   Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
-- `source`: SourceType
-- `timestamp`: str
 
 #### Methods
 
@@ -234,8 +232,8 @@ This was originally the “MessageAction”, but it is not supposed to be a tool call.
 #### Properties
 
 - `activated_skills`: list[str]
+- `critic_result`: CriticResult | None
 - `extended_content`: list[TextContent]
-- `id`: EventID
 - `llm_message`: Message
 - `llm_response_id`: str | None
 - `model_config`: ClassVar[ConfigDict] = (configuration object)
@@ -245,7 +243,6 @@ This was originally the “MessageAction”, but it is not supposed to be a tool call.
 - `source`: Literal['agent', 'user', 'environment']
 - `thinking_blocks`: Sequence[ThinkingBlock | RedactedThinkingBlock]
   Return the Anthropic thinking blocks from the LLM message.
-- `timestamp`: str
 - `visualize`: Text
   Return Rich Text representation of this message event.
 
@@ -264,7 +261,7 @@ Examples include tool execution, error, user reject.
 
 #### Properties
 
-- `model_config`: ClassVar[ConfigDict] = (configuration object)
+- `model_config`: = (configuration object)
   Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
 - `source`: Literal['agent', 'user', 'environment']
 - `tool_call_id`: str
@@ -277,7 +274,7 @@ Bases: [`ObservationBaseEvent`](#class-observationbaseevent)
 #### Properties
 
 - `action_id`: str
-- `model_config`: ClassVar[ConfigDict] = (configuration object)
+- `model_config`: = (configuration object)
   Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
 - `observation`: Observation
 - `visualize`: Text
@@ -296,7 +293,7 @@ Event indicating that the agent execution was paused by user request.
 
 #### Properties
 
-- `model_config`: ClassVar[ConfigDict] = (configuration object)
+- `model_config`: = (configuration object)
   Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
 - `source`: Literal['agent', 'user', 'environment']
 - `visualize`: Text
@@ -310,7 +307,7 @@ System prompt added by the agent.
 
 #### Properties
 
-- `model_config`: ClassVar[ConfigDict] = (configuration object)
+- `model_config`: = (configuration object)
   Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
 - `source`: Literal['agent', 'user', 'environment']
 - `system_prompt`: TextContent
@@ -331,7 +328,7 @@ Event from VLLM representing token IDs used in LLM interaction.
 
 #### Properties
 
-- `model_config`: ClassVar[ConfigDict] = (configuration object)
+- `model_config`: = (configuration object)
   Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
 - `prompt_token_ids`: list[int]
 - `response_token_ids`: list[int]
@@ -346,7 +343,7 @@ Observation when user rejects an action in confirmation mode.
 #### Properties
 
 - `action_id`: str
-- `model_config`: ClassVar[ConfigDict] = (configuration object)
+- `model_config`: = (configuration object)
   Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
 - `rejection_reason`: str
 - `visualize`: Text