Skip to content

Conversation

@xingyaoww
Copy link
Collaborator

@xingyaoww xingyaoww commented Jan 12, 2026

Summary

  • Add AgentReviewCritic, a CriticBase implementation that spawns a separate OpenHands agent to review the current git diff.
  • The critic forks the current agent settings (LLM + agent config) from the running conversation rather than reading critic-specific environment variables.
  • Includes unit tests for critic output parsing.

Fixes #1704

Checklist

  • If the PR is changing/adding functionality, are there tests to reflect this?
  • If there is an example, have you run the example to make sure that it works?
  • If there are instructions on how to run the code, have you followed the instructions and made sure that it works?
  • If the feature is significant enough to require documentation, is there a PR open on the OpenHands/docs repository with the same branch name?
  • Is the github CI passing?

@xingyaoww can click here to continue refining the PR


Agent Server images for this PR

GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server

Variants & Base Images

Variant Architectures Base Image Docs / Tags
java amd64, arm64 eclipse-temurin:17-jdk Link
python amd64, arm64 nikolaik/python-nodejs:python3.12-nodejs22 Link
golang amd64, arm64 golang:1.21-bookworm Link

Pull (multi-arch manifest)

# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:8010218-python

Run

docker run -it --rm \
  -p 8000:8000 \
  --name agent-server-8010218-python \
  ghcr.io/openhands/agent-server:8010218-python

All tags pushed for this build

ghcr.io/openhands/agent-server:8010218-golang-amd64
ghcr.io/openhands/agent-server:8010218-golang_tag_1.21-bookworm-amd64
ghcr.io/openhands/agent-server:8010218-golang-arm64
ghcr.io/openhands/agent-server:8010218-golang_tag_1.21-bookworm-arm64
ghcr.io/openhands/agent-server:8010218-java-amd64
ghcr.io/openhands/agent-server:8010218-eclipse-temurin_tag_17-jdk-amd64
ghcr.io/openhands/agent-server:8010218-java-arm64
ghcr.io/openhands/agent-server:8010218-eclipse-temurin_tag_17-jdk-arm64
ghcr.io/openhands/agent-server:8010218-python-amd64
ghcr.io/openhands/agent-server:8010218-nikolaik_s_python-nodejs_tag_python3.12-nodejs22-amd64
ghcr.io/openhands/agent-server:8010218-python-arm64
ghcr.io/openhands/agent-server:8010218-nikolaik_s_python-nodejs_tag_python3.12-nodejs22-arm64
ghcr.io/openhands/agent-server:8010218-golang
ghcr.io/openhands/agent-server:8010218-java
ghcr.io/openhands/agent-server:8010218-python

About Multi-Architecture Support

  • Each variant tag (e.g., 8010218-python) is a multi-arch manifest supporting both amd64 and arm64
  • Docker automatically pulls the correct architecture for your platform
  • Individual architecture tags (e.g., 8010218-python-amd64) are also available if needed

@github-actions
Copy link
Contributor

github-actions bot commented Jan 12, 2026

Coverage

Coverage Report •
FileStmtsMissCoverMissing
openhands-sdk/openhands/sdk/critic/impl
   agent_review.py904846%33, 35, 41, 102–103, 105–106, 108–111, 113–114, 117, 120–122, 127, 139–149, 168–170, 207, 209, 211, 219–220, 226, 228–229, 235, 243–245, 250–252, 257
openhands-sdk/openhands/sdk/hooks
   config.py1281389%75, 103, 115–116, 122, 124, 202, 207, 268, 317–318, 320–321
   executor.py901187%35, 64, 76, 95, 110, 159–160, 174–175, 180–181
TOTAL16416507169% 

@xingyaoww
Copy link
Collaborator Author

@OpenHands do a /codereview-roasted and help me address all review comments in this PR.

When you are done, you should help me write a script in examples/0 folder that demonstrate this critic work in action. You can refer to the existing delagation example, as well as the example introduced in #1269

@openhands-ai
Copy link

openhands-ai bot commented Jan 13, 2026

I'm on it! xingyaoww can track my progress at all-hands.dev

Co-authored-by: openhands <openhands@all-hands.dev>
@openhands-ai
Copy link

openhands-ai bot commented Jan 13, 2026

Final summary (double-checked)

✅ Checklist against your request

  • Address all PR review comments found on PR DRAFT: Critic agent that forks generator settings for PR-style review #1706
    • Switched parsing/output to use CriticResult directly (no custom output model).
    • Added llm: LLM | None field to AgentReviewCritic so callers can pass the LLM explicitly (removes the need for the “extract_llm hack” as the primary path).
  • Add an example script demonstrating the critic working in action
    • Added examples/00_critic/00_agent_review_critic_stop_hook.py showing a Stop-hook gating flow with AgentReviewCritic.

🔍 Conciseness / extraneous changes check

  • The diff vs origin/main contains only:
    • New critic implementation + exports
    • New unit tests for parsing
    • New example script
  • No unrelated refactors or drive-by formatting changes outside those files.

📌 What changed (since the last state on main)

  • New critic: openhands-sdk/openhands/sdk/critic/impl/agent_review.py
    • AgentReviewCritic.llm: LLM | None = None
    • _parse_output() now returns CriticResult
    • evaluate() returns parsed CriticResult directly
  • Exports wired up
    • openhands-sdk/openhands/sdk/critic/__init__.py
    • openhands-sdk/openhands/sdk/critic/impl/__init__.py
  • New tests: tests/sdk/critic/test_agent_review_critic.py
    • Updated assertions to match CriticResult
  • New example: examples/00_critic/00_agent_review_critic_stop_hook.py
    • Creates a temp git repo, generates a diff, runs the critic, and uses a Stop hook to gate stopping until approval.

🚀 Push status

View full conversation

@openhands-ai
Copy link

openhands-ai bot commented Jan 13, 2026

Looks like there are a few issues preventing this PR from being merged!

  • GitHub Actions are failing:
    • [Optional] Docs example

If you'd like me to help, just leave a comment, like

@OpenHands please fix the failing actions on PR #1706 at branch `openhands/critic-agent-reviewer`

Feel free to include any additional details that might help me get this PR into a better state.

You can manage your notification settings

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if __name__ == "__main__":
raise SystemExit(main())
"""
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems not ideal. Should we make a hook ideally accepting a callback function as well?

Comment on lines 61 to 76
def _extract_llm(self, events: Sequence[LLMConvertibleEvent]) -> LLM | None:
for event in events:
agent = getattr(event, "agent", None)
llm = getattr(agent, "llm", None)
if isinstance(llm, LLM):
return llm

for event in events:
if not isinstance(event, SystemPromptEvent):
continue
agent = getattr(event, "agent", None)
llm = getattr(agent, "llm", None)
if isinstance(llm, LLM):
return llm

return None
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need this, since self.llm is already there

`not_pass`.
"""

llm: LLM | None = None
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be llm: LLM

It is a required arg

Comment on lines 78 to 91
def _extract_agent(self, events: Sequence[LLMConvertibleEvent]) -> Agent | None:
for event in events:
agent = getattr(event, "agent", None)
if isinstance(agent, Agent):
return agent

for event in events:
if not isinstance(event, SystemPromptEvent):
continue
agent = getattr(event, "agent", None)
if isinstance(agent, Agent):
return agent

return None
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to the get planning agent function in the openhands.tools.preset - we should reconstruct an Agent instance ourselves

@xingyaoww
Copy link
Collaborator Author

@OpenHands review all current PR reviews and modify the code to address all the review comments. If you check examples/01_standalone_sdk/33_hooks/33_hooks.py example, you can see how we can set a Hook at Finish event. Can you extend the hook system, so we can pass in custom callback function python for the finish event hook, then we should modify the critic agent's implementation so critic agent runs in a callback function in a finish hook?

@openhands-ai
Copy link

openhands-ai bot commented Jan 20, 2026

I'm on it! xingyaoww can track my progress at all-hands.dev

Address PR review comments:
- Make llm field required in AgentReviewCritic (not optional)
- Remove _extract_llm and _extract_agent methods
- Add agent_factory parameter to allow custom agent creation
- Add create_critic_stop_hook() helper for callback-based stop hooks
- Create get_critic_agent in openhands.tools.preset.critic

Extend hook system:
- Add HookType.CALLBACK for Python callback functions
- Update HookDefinition to support callback field
- Update HookExecutor to execute callback hooks
- Add validation to ensure callback is provided for CALLBACK type

Move example:
- Move from examples/00_critic/ to examples/01_standalone_sdk/34_critic_agent_hook.py
- Update example to use callback-based hook with agent_factory

Add tests:
- Add tests for callback hook functionality
- Update critic tests to use mock LLM fixture

Co-authored-by: openhands <openhands@all-hands.dev>
@openhands-ai
Copy link

openhands-ai bot commented Jan 20, 2026

Summary

I've successfully addressed all the PR review comments and implemented the requested features for PR #1706. The changes have been pushed to the openhands/critic-agent-reviewer branch.

Changes Made

1. PR Review Comments Addressed ✅

  • Made llm field required in AgentReviewCritic (no longer optional)
  • Removed _extract_llm method - no longer needed since LLM is now required
  • Removed _extract_agent method - replaced with agent_factory parameter pattern
  • Added agent_factory parameter - allows users to provide custom agent creation (similar to get_planning_agent pattern)
  • Created get_critic_agent in openhands.tools.preset.critic - provides a pre-configured critic agent with file viewing tools
  • Moved example from examples/00_critic/ to examples/01_standalone_sdk/34_critic_agent_hook.py

2. Hook System Extended for Python Callbacks ✅

  • Added HookType.CALLBACK - new hook type for Python callback functions
  • Updated HookDefinition - added callback field for Python functions
  • Updated HookExecutor - handles callback execution alongside command execution
  • Added validation - ensures callback is provided for CALLBACK type hooks

3. Critic Stop Hook Helper ✅

  • Created create_critic_stop_hook() - helper function that creates a callback-based stop hook
  • The callback:
    • Gets the current git diff from the workspace
    • Runs the critic agent to review the diff
    • Returns allow/deny based on the critic's decision
    • Provides feedback to the agent if denied

4. Tests Added ✅

  • Added tests for callback hook functionality (6 new tests)
  • Updated critic tests to use mock LLM fixture
  • Added test to verify llm is required

Files Changed

  • openhands-sdk/openhands/sdk/critic/impl/agent_review.py - Refactored critic implementation
  • openhands-sdk/openhands/sdk/hooks/config.py - Added CALLBACK hook type
  • openhands-sdk/openhands/sdk/hooks/executor.py - Added callback execution support
  • openhands-tools/openhands/tools/preset/critic.py - New file with get_critic_agent
  • examples/01_standalone_sdk/34_critic_agent_hook.py - New example demonstrating callback hooks
  • tests/sdk/critic/test_agent_review_critic.py - Updated tests
  • tests/sdk/hooks/test_executor.py - Added callback hook tests

Example Usage

from openhands.sdk.critic.impl.agent_review import (
    AgentReviewCritic,
    create_critic_stop_hook,
)
from openhands.tools.preset.critic import get_critic_agent

# Create critic with custom agent factory
critic = AgentReviewCritic(
    llm=llm,
    agent_factory=get_critic_agent,  # Optional: uses default if not provided
    review_style="roasted",
)

# Create callback-based stop hook
hook_config = HookConfig(
    stop=[
        HookMatcher(
            hooks=[create_critic_stop_hook(critic, str(workspace))]
        )
    ]
)

The PR is now updated at: #1706

View full conversation



if TYPE_CHECKING:
from openhands.sdk.hooks.executor import HookResult

if TYPE_CHECKING:
from openhands.sdk.hooks.executor import HookResult
from openhands.sdk.hooks.types import HookEvent
return subprocess.check_output(["git", "diff"], cwd=workspace, text=True)


def main() -> None:
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@OpenHands please remove the main function and write it just as a script. /remember this in AGENTS.md when you finish modifying this script

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm on it! xingyaoww can track my progress at all-hands.dev

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've completed the requested changes and pushed them to the PR branch. Here's a summary:

Changes Made

1. Refactored examples/01_standalone_sdk/34_critic_agent_hook.py

  • ✅ Removed the main() function wrapper
  • ✅ Converted the example to run as a direct script at module level
  • ✅ Kept helper functions (_git, _git_patch) as they are reusable utilities
  • ✅ Consistent with other examples in the 01_standalone_sdk/ folder (e.g., 01_hello_world.py)

2. Updated AGENTS.md with the convention

  • ✅ Added new <EXAMPLES_STYLE> section documenting that:
    • Examples in examples/01_standalone_sdk/ should be written as direct scripts
    • NOT wrapped in a main() function
    • Script code should run at module level after imports and helper function definitions

Commits Pushed

  1. bce1736a - Refactor critic agent hook example to script style
  2. 4f2ad9af - Document examples script style convention in AGENTS.md

Both commits have been pushed to the openhands/critic-agent-reviewer branch and the PR has been updated.

View full conversation

Remove the main() function and write the example as a direct script,
consistent with other examples in the 01_standalone_sdk folder.

Co-authored-by: openhands <openhands@all-hands.dev>
Add EXAMPLES_STYLE section noting that examples in 01_standalone_sdk
should be written as direct scripts without a main() function wrapper.

Co-authored-by: openhands <openhands@all-hands.dev>
@xingyaoww xingyaoww force-pushed the openhands/critic-agent-reviewer branch from 4f2ad9a to 4475dfc Compare January 20, 2026 02:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Interleave codereview-roasted with agent task: add critic-agent model that reviews git diff and returns pass/fail

3 participants