DRAFT: Critic agent that forks generator settings for PR-style review #1706

xingyaoww · 2026-01-12T20:56:56Z

Summary

Add AgentReviewCritic, a CriticBase implementation that spawns a separate OpenHands agent to review the current git diff.
The critic forks the current agent settings (LLM + agent config) from the running conversation rather than reading critic-specific environment variables.
Includes unit tests for critic output parsing.

Fixes #1704

Checklist

If the PR is changing/adding functionality, are there tests to reflect this?
If there is an example, have you run the example to make sure that it works?
If there are instructions on how to run the code, have you followed the instructions and made sure that it works?
If the feature is significant enough to require documentation, is there a PR open on the OpenHands/docs repository with the same branch name?
Is the github CI passing?

@xingyaoww can click here to continue refining the PR

Agent Server images for this PR

• GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server

Variants & Base Images

Variant	Architectures	Base Image	Docs / Tags
java	amd64, arm64	`eclipse-temurin:17-jdk`	Link
python	amd64, arm64	`nikolaik/python-nodejs:python3.12-nodejs22`	Link
golang	amd64, arm64	`golang:1.21-bookworm`	Link

Pull (multi-arch manifest)

# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:8010218-python

Run

docker run -it --rm \
  -p 8000:8000 \
  --name agent-server-8010218-python \
  ghcr.io/openhands/agent-server:8010218-python

All tags pushed for this build

ghcr.io/openhands/agent-server:8010218-golang-amd64
ghcr.io/openhands/agent-server:8010218-golang_tag_1.21-bookworm-amd64
ghcr.io/openhands/agent-server:8010218-golang-arm64
ghcr.io/openhands/agent-server:8010218-golang_tag_1.21-bookworm-arm64
ghcr.io/openhands/agent-server:8010218-java-amd64
ghcr.io/openhands/agent-server:8010218-eclipse-temurin_tag_17-jdk-amd64
ghcr.io/openhands/agent-server:8010218-java-arm64
ghcr.io/openhands/agent-server:8010218-eclipse-temurin_tag_17-jdk-arm64
ghcr.io/openhands/agent-server:8010218-python-amd64
ghcr.io/openhands/agent-server:8010218-nikolaik_s_python-nodejs_tag_python3.12-nodejs22-amd64
ghcr.io/openhands/agent-server:8010218-python-arm64
ghcr.io/openhands/agent-server:8010218-nikolaik_s_python-nodejs_tag_python3.12-nodejs22-arm64
ghcr.io/openhands/agent-server:8010218-golang
ghcr.io/openhands/agent-server:8010218-java
ghcr.io/openhands/agent-server:8010218-python

About Multi-Architecture Support

Each variant tag (e.g., 8010218-python) is a multi-arch manifest supporting both amd64 and arm64
Docker automatically pulls the correct architecture for your platform
Individual architecture tags (e.g., 8010218-python-amd64) are also available if needed

github-actions · 2026-01-12T21:00:12Z

Coverage Report •

File	Stmts	Miss	Cover	Missing
openhands-sdk/openhands/sdk/critic/impl
agent_review.py	90	48	46%	33, 35, 41, 102–103, 105–106, 108–111, 113–114, 117, 120–122, 127, 139–149, 168–170, 207, 209, 211, 219–220, 226, 228–229, 235, 243–245, 250–252, 257
openhands-sdk/openhands/sdk/hooks
config.py	128	13	89%	75, 103, 115–116, 122, 124, 202, 207, 268, 317–318, 320–321
executor.py	90	11	87%	35, 64, 76, 95, 110, 159–160, 174–175, 180–181
TOTAL	16416	5071	69%

openhands-sdk/openhands/sdk/critic/impl/agent_review.py

xingyaoww · 2026-01-13T17:39:34Z

@OpenHands do a /codereview-roasted and help me address all review comments in this PR.

When you are done, you should help me write a script in examples/0 folder that demonstrate this critic work in action. You can refer to the existing delagation example, as well as the example introduced in #1269

openhands-ai · 2026-01-13T17:39:44Z

I'm on it! xingyaoww can track my progress at all-hands.dev

Co-authored-by: openhands <openhands@all-hands.dev>

openhands-ai · 2026-01-13T17:45:44Z

Final summary (double-checked)

✅ Checklist against your request

Address all PR review comments found on PR DRAFT: Critic agent that forks generator settings for PR-style review #1706
- Switched parsing/output to use CriticResult directly (no custom output model).
- Added llm: LLM | None field to AgentReviewCritic so callers can pass the LLM explicitly (removes the need for the “extract_llm hack” as the primary path).
Add an example script demonstrating the critic working in action
- Added examples/00_critic/00_agent_review_critic_stop_hook.py showing a Stop-hook gating flow with AgentReviewCritic.

🔍 Conciseness / extraneous changes check

The diff vs origin/main contains only:
- New critic implementation + exports
- New unit tests for parsing
- New example script
No unrelated refactors or drive-by formatting changes outside those files.

📌 What changed (since the last state on main)

New critic: openhands-sdk/openhands/sdk/critic/impl/agent_review.py
- AgentReviewCritic.llm: LLM | None = None
- _parse_output() now returns CriticResult
- evaluate() returns parsed CriticResult directly
Exports wired up
- openhands-sdk/openhands/sdk/critic/__init__.py
- openhands-sdk/openhands/sdk/critic/impl/__init__.py
New tests: tests/sdk/critic/test_agent_review_critic.py
- Updated assertions to match CriticResult
New example: examples/00_critic/00_agent_review_critic_stop_hook.py
- Creates a temp git repo, generates a diff, runs the critic, and uses a Stop hook to gate stopping until approval.

🚀 Push status

Changes were pushed to the remote branch openhands/critic-agent-reviewer (PR DRAFT: Critic agent that forks generator settings for PR-style review #1706 updated).

View full conversation

openhands-ai · 2026-01-13T17:51:10Z

Looks like there are a few issues preventing this PR from being merged!

GitHub Actions are failing:
- [Optional] Docs example

If you'd like me to help, just leave a comment, like

@OpenHands please fix the failing actions on PR #1706 at branch `openhands/critic-agent-reviewer`

Feel free to include any additional details that might help me get this PR into a better state.

_{^{You can manage your notification settings}}

xingyaoww · 2026-01-14T15:52:21Z

examples/00_critic/00_agent_review_critic_stop_hook.py

We should put it under https://github.com/OpenHands/software-agent-sdk/tree/main/examples/01_standalone_sdk

xingyaoww · 2026-01-14T15:55:54Z

examples/00_critic/00_agent_review_critic_stop_hook.py

+
+if __name__ == "__main__":
+    raise SystemExit(main())
+"""


This seems not ideal. Should we make a hook ideally accepting a callback function as well?

xingyaoww · 2026-01-14T15:56:13Z

openhands-sdk/openhands/sdk/critic/impl/agent_review.py

+    def _extract_llm(self, events: Sequence[LLMConvertibleEvent]) -> LLM | None:
+        for event in events:
+            agent = getattr(event, "agent", None)
+            llm = getattr(agent, "llm", None)
+            if isinstance(llm, LLM):
+                return llm
+
+        for event in events:
+            if not isinstance(event, SystemPromptEvent):
+                continue
+            agent = getattr(event, "agent", None)
+            llm = getattr(agent, "llm", None)
+            if isinstance(llm, LLM):
+                return llm
+
+        return None


We don't need this, since self.llm is already there

xingyaoww · 2026-01-14T15:56:30Z

openhands-sdk/openhands/sdk/critic/impl/agent_review.py

+    `not_pass`.
+    """
+
+    llm: LLM | None = None


This should be llm: LLM

It is a required arg

xingyaoww · 2026-01-14T15:57:07Z

openhands-sdk/openhands/sdk/critic/impl/agent_review.py

+    def _extract_agent(self, events: Sequence[LLMConvertibleEvent]) -> Agent | None:
+        for event in events:
+            agent = getattr(event, "agent", None)
+            if isinstance(agent, Agent):
+                return agent
+
+        for event in events:
+            if not isinstance(event, SystemPromptEvent):
+                continue
+            agent = getattr(event, "agent", None)
+            if isinstance(agent, Agent):
+                return agent
+
+        return None


Similar to the get planning agent function in the openhands.tools.preset - we should reconstruct an Agent instance ourselves

xingyaoww · 2026-01-20T01:37:51Z

@OpenHands review all current PR reviews and modify the code to address all the review comments. If you check examples/01_standalone_sdk/33_hooks/33_hooks.py example, you can see how we can set a Hook at Finish event. Can you extend the hook system, so we can pass in custom callback function python for the finish event hook, then we should modify the critic agent's implementation so critic agent runs in a callback function in a finish hook?

openhands-ai · 2026-01-20T01:37:59Z

I'm on it! xingyaoww can track my progress at all-hands.dev

Address PR review comments: - Make llm field required in AgentReviewCritic (not optional) - Remove _extract_llm and _extract_agent methods - Add agent_factory parameter to allow custom agent creation - Add create_critic_stop_hook() helper for callback-based stop hooks - Create get_critic_agent in openhands.tools.preset.critic Extend hook system: - Add HookType.CALLBACK for Python callback functions - Update HookDefinition to support callback field - Update HookExecutor to execute callback hooks - Add validation to ensure callback is provided for CALLBACK type Move example: - Move from examples/00_critic/ to examples/01_standalone_sdk/34_critic_agent_hook.py - Update example to use callback-based hook with agent_factory Add tests: - Add tests for callback hook functionality - Update critic tests to use mock LLM fixture Co-authored-by: openhands <openhands@all-hands.dev>

openhands-ai · 2026-01-20T01:51:36Z

Summary

I've successfully addressed all the PR review comments and implemented the requested features for PR #1706. The changes have been pushed to the openhands/critic-agent-reviewer branch.

Changes Made

1. PR Review Comments Addressed ✅

Made llm field required in AgentReviewCritic (no longer optional)
Removed _extract_llm method - no longer needed since LLM is now required
Removed _extract_agent method - replaced with agent_factory parameter pattern
Added agent_factory parameter - allows users to provide custom agent creation (similar to get_planning_agent pattern)
Created get_critic_agent in openhands.tools.preset.critic - provides a pre-configured critic agent with file viewing tools
Moved example from examples/00_critic/ to examples/01_standalone_sdk/34_critic_agent_hook.py

2. Hook System Extended for Python Callbacks ✅

Added HookType.CALLBACK - new hook type for Python callback functions
Updated HookDefinition - added callback field for Python functions
Updated HookExecutor - handles callback execution alongside command execution
Added validation - ensures callback is provided for CALLBACK type hooks

3. Critic Stop Hook Helper ✅

Created create_critic_stop_hook() - helper function that creates a callback-based stop hook
The callback:
- Gets the current git diff from the workspace
- Runs the critic agent to review the diff
- Returns allow/deny based on the critic's decision
- Provides feedback to the agent if denied

4. Tests Added ✅

Added tests for callback hook functionality (6 new tests)
Updated critic tests to use mock LLM fixture
Added test to verify llm is required

Files Changed

openhands-sdk/openhands/sdk/critic/impl/agent_review.py - Refactored critic implementation
openhands-sdk/openhands/sdk/hooks/config.py - Added CALLBACK hook type
openhands-sdk/openhands/sdk/hooks/executor.py - Added callback execution support
openhands-tools/openhands/tools/preset/critic.py - New file with get_critic_agent
examples/01_standalone_sdk/34_critic_agent_hook.py - New example demonstrating callback hooks
tests/sdk/critic/test_agent_review_critic.py - Updated tests
tests/sdk/hooks/test_executor.py - Added callback hook tests

Example Usage

from openhands.sdk.critic.impl.agent_review import (
    AgentReviewCritic,
    create_critic_stop_hook,
)
from openhands.tools.preset.critic import get_critic_agent

# Create critic with custom agent factory
critic = AgentReviewCritic(
    llm=llm,
    agent_factory=get_critic_agent,  # Optional: uses default if not provided
    review_style="roasted",
)

# Create callback-based stop hook
hook_config = HookConfig(
    stop=[
        HookMatcher(
            hooks=[create_critic_stop_hook(critic, str(workspace))]
        )
    ]
)

The PR is now updated at: #1706

View full conversation

openhands-sdk/openhands/sdk/hooks/config.py



+if TYPE_CHECKING:
+    from openhands.sdk.hooks.executor import HookResult


openhands-sdk/openhands/sdk/hooks/config.py


+if TYPE_CHECKING:
+    from openhands.sdk.hooks.executor import HookResult
+    from openhands.sdk.hooks.types import HookEvent


xingyaoww · 2026-01-20T02:16:12Z

examples/01_standalone_sdk/34_critic_agent_hook.py

+    return subprocess.check_output(["git", "diff"], cwd=workspace, text=True)
+
+
+def main() -> None:


@OpenHands please remove the main function and write it just as a script. /remember this in AGENTS.md when you finish modifying this script

I'm on it! xingyaoww can track my progress at all-hands.dev

I've completed the requested changes and pushed them to the PR branch. Here's a summary:

Changes Made

1. Refactored examples/01_standalone_sdk/34_critic_agent_hook.py

✅ Removed the main() function wrapper

✅ Converted the example to run as a direct script at module level

✅ Kept helper functions (_git, _git_patch) as they are reusable utilities

✅ Consistent with other examples in the 01_standalone_sdk/ folder (e.g., 01_hello_world.py)

2. Updated AGENTS.md with the convention

✅ Added new <EXAMPLES_STYLE> section documenting that:

Examples in examples/01_standalone_sdk/ should be written as direct scripts

NOT wrapped in a main() function

Script code should run at module level after imports and helper function definitions

Commits Pushed

bce1736a - Refactor critic agent hook example to script style

4f2ad9af - Document examples script style convention in AGENTS.md

Both commits have been pushed to the openhands/critic-agent-reviewer branch and the PR has been updated.

View full conversation

Remove the main() function and write the example as a direct script, consistent with other examples in the 01_standalone_sdk folder. Co-authored-by: openhands <openhands@all-hands.dev>

Add EXAMPLES_STYLE section noting that examples in 01_standalone_sdk should be written as direct scripts without a main() function wrapper. Co-authored-by: openhands <openhands@all-hands.dev>

Add AgentReviewCritic (forks agent settings)

78cfae6

openhands-ai bot mentioned this pull request Jan 12, 2026

Interleave codereview-roasted with agent task: add critic-agent model that reviews git diff and returns pass/fail #1704

Open

Merge branch 'main' into openhands/critic-agent-reviewer

7248273

xingyaoww commented Jan 13, 2026

View reviewed changes

openhands-sdk/openhands/sdk/critic/impl/agent_review.py Outdated Show resolved Hide resolved

xingyaoww commented Jan 13, 2026

View reviewed changes

openhands-sdk/openhands/sdk/critic/impl/agent_review.py Outdated Show resolved Hide resolved

Fix example lint/type issues

6180b26

Co-authored-by: openhands <openhands@all-hands.dev>

xingyaoww commented Jan 14, 2026

View reviewed changes

Merge branch 'main' into openhands/critic-agent-reviewer

7961dc8

github-code-quality bot found potential problems Jan 20, 2026

View reviewed changes

openhands-sdk/openhands/sdk/hooks/config.py

if TYPE_CHECKING:

from openhands.sdk.hooks.executor import HookResult

openhands-sdk/openhands/sdk/hooks/config.py

if TYPE_CHECKING:

from openhands.sdk.hooks.executor import HookResult

from openhands.sdk.hooks.types import HookEvent

xingyaoww commented Jan 20, 2026

View reviewed changes

openhands-agent added 2 commits January 20, 2026 02:18

Refactor critic agent hook example to script style

bce1736

Remove the main() function and write the example as a direct script, consistent with other examples in the 01_standalone_sdk folder. Co-authored-by: openhands <openhands@all-hands.dev>

Document examples script style convention in AGENTS.md

4475dfc

Add EXAMPLES_STYLE section noting that examples in 01_standalone_sdk should be written as direct scripts without a main() function wrapper. Co-authored-by: openhands <openhands@all-hands.dev>

xingyaoww force-pushed the openhands/critic-agent-reviewer branch from 4f2ad9a to 4475dfc Compare January 20, 2026 02:23



		if TYPE_CHECKING:
		from openhands.sdk.hooks.executor import HookResult

		return subprocess.check_output(["git", "diff"], cwd=workspace, text=True)


		def main() -> None:

DRAFT: Critic agent that forks generator settings for PR-style review #1706

Are you sure you want to change the base?

DRAFT: Critic agent that forks generator settings for PR-style review #1706

Uh oh!

Conversation

xingyaoww commented Jan 12, 2026 • edited by github-actions bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Checklist

Uh oh!

github-actions bot commented Jan 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Uh oh!

xingyaoww commented Jan 13, 2026

Uh oh!

openhands-ai bot commented Jan 13, 2026

Uh oh!

openhands-ai bot commented Jan 13, 2026

Final summary (double-checked)

✅ Checklist against your request

🔍 Conciseness / extraneous changes check

📌 What changed (since the last state on main)

🚀 Push status

Uh oh!

openhands-ai bot commented Jan 13, 2026

Uh oh!

xingyaoww Jan 14, 2026

Choose a reason for hiding this comment

Uh oh!

xingyaoww Jan 14, 2026

Choose a reason for hiding this comment

Uh oh!

xingyaoww Jan 14, 2026

Choose a reason for hiding this comment

Uh oh!

xingyaoww Jan 14, 2026

Choose a reason for hiding this comment

Uh oh!

xingyaoww Jan 14, 2026

Choose a reason for hiding this comment

Uh oh!

xingyaoww commented Jan 20, 2026

Uh oh!

openhands-ai bot commented Jan 20, 2026

Uh oh!

openhands-ai bot commented Jan 20, 2026

Summary

Changes Made

1. PR Review Comments Addressed ✅

2. Hook System Extended for Python Callbacks ✅

3. Critic Stop Hook Helper ✅

4. Tests Added ✅

Files Changed

Example Usage

Uh oh!

xingyaoww Jan 20, 2026

Choose a reason for hiding this comment

Uh oh!

openhands-ai bot Jan 20, 2026

Choose a reason for hiding this comment

Uh oh!

openhands-ai bot Jan 20, 2026

Choose a reason for hiding this comment

Changes Made

1. Refactored examples/01_standalone_sdk/34_critic_agent_hook.py

2. Updated AGENTS.md with the convention

Commits Pushed

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

xingyaoww commented Jan 12, 2026 •

edited by github-actions bot

Loading

github-actions bot commented Jan 12, 2026 •

edited

Loading

1. Refactored `examples/01_standalone_sdk/34_critic_agent_hook.py`

2. Updated `AGENTS.md` with the convention