fix: Use Fleet SDK Task.verify_detailed() for verifier execution #4

dzorlu · 2026-01-26T02:00:59Z

Summary

- Replace custom _execute_verifier_local() with Fleet SDK's Task.verify_detailed()
- Properly sets up verifier namespace with Environment type and helper functions
- Fixes "name 'Environment' is not defined" errors during verifier execution

Problem

The previous implementation used a bare exec() with an empty namespace, which caused verifiers using Environment type annotations to fail with:

NameError: name 'Environment' is not defined

Fleet SDK's verifier_from_string() properly injects Environment: object and other helpers into the namespace.
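To make the failure mode concrete, here is a minimal sketch that reproduces it without the Fleet SDK. The verifier source, the `run_verifier` helper, and the `normalized_contains` stand-in are all hypothetical and written only for illustration; the SDK's real helper and injection logic may differ. Which missing name is reported first depends on when the interpreter evaluates annotations, but either way the bare namespace ends in a NameError.

```python
from typing import Any, Dict

# A verifier as it might arrive from a task definition (illustrative only).
VERIFIER_SOURCE = '''
def verify(env: Environment, final_answer: str = "") -> float:
    return 1.0 if normalized_contains("done", final_answer) else 0.0
'''


def normalized_contains(needle: str, haystack: str) -> bool:
    """Hypothetical stand-in; the real Fleet SDK helper may behave differently."""
    return needle.strip().lower() in haystack.strip().lower()


def run_verifier(source: str, namespace: Dict[str, Any]) -> str:
    """Exec the verifier in `namespace`, call verify(), and describe the outcome."""
    try:
        exec(source, namespace)
        score = namespace["verify"](None, "Done!")
    except NameError as exc:
        return f"NameError: {exc}"
    return f"score={score}"


# Empty namespace: neither Environment nor the helper can be resolved.
bare_outcome = run_verifier(VERIFIER_SOURCE, {})

# Prepared namespace: the names the verifier expects are injected up front.
prepared_outcome = run_verifier(
    VERIFIER_SOURCE,
    {"Environment": object, "normalized_contains": normalized_contains},
)

print(bare_outcome)
print(prepared_outcome)
```

The second call succeeds only because the namespace was populated before exec() ran, which is the work this PR delegates to the SDK.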

Changes

- _compute_reward: Create Fleet SDK Task object and call verify_detailed(fleet_env)
- Support both verifier_code (OpenEnv) and verifier_func (Fleet SDK) field names
- Add comprehensive logging for debugging verifier execution
- Remove broken _execute_verifier_local method
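The dual field-name support could look roughly like the sketch below. The function name `get_verifier_source`, the shape of `task_config`, and the precedence (verifier_code checked before verifier_func) are assumptions made for illustration, not taken from the PR's diff.

```python
import logging
from typing import Any, Mapping, Optional

logger = logging.getLogger(__name__)

# Field names from the PR description; the lookup order is an assumption.
VERIFIER_FIELDS = ("verifier_code", "verifier_func")


def get_verifier_source(task_config: Mapping[str, Any]) -> Optional[str]:
    """Return the first non-empty verifier source found, or None."""
    for field in VERIFIER_FIELDS:
        source = task_config.get(field)
        if source:
            logger.debug("Using verifier from %r (%d chars)", field, len(source))
            return source
    logger.warning("No verifier found; looked for %s", ", ".join(VERIFIER_FIELDS))
    return None
```

Logging which field supplied the verifier makes it easier to tell an OpenEnv-style task from a Fleet SDK-style one when debugging a zero reward.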

Test plan

- All 23 unit tests pass
- Test verifier success returns score from response.result
- Test verifier failure returns 0.0
- Test missing verifier returns 0.0
- Test missing orchestrator returns 0.0
- Test verifier exception returns 0.0
- Test success with None result returns 1.0
- Test support for verifier_func field name
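The reward rules listed above can be summarized in a short sketch. It does not call the Fleet SDK: `StubResponse`, its `success` attribute, and the `run_verifier` callable are hypothetical stand-ins for whatever Task.verify_detailed() actually returns and how it is invoked. Only `response.result` and the 0.0 / 1.0 outcomes come from the test plan.

```python
import logging
from dataclasses import dataclass
from typing import Any, Callable, Optional

logger = logging.getLogger(__name__)


@dataclass
class StubResponse:
    """Hypothetical stand-in for the object returned by verify_detailed()."""
    success: bool                    # assumed attribute name
    result: Optional[float] = None   # score reported by the verifier, if any


def compute_reward(
    verifier_source: Optional[str],
    orchestrator: Optional[Any],
    run_verifier: Callable[[str, Any], StubResponse],
) -> float:
    """Map a verifier run onto a scalar reward, following the test plan."""
    if not verifier_source:
        logger.warning("No verifier configured; reward is 0.0")
        return 0.0
    if orchestrator is None:
        logger.warning("No orchestrator available; reward is 0.0")
        return 0.0
    try:
        response = run_verifier(verifier_source, orchestrator)
    except Exception:
        logger.exception("Verifier raised; reward is 0.0")
        return 0.0
    if not response.success:
        return 0.0
    # A successful run without an explicit score counts as full credit.
    return 1.0 if response.result is None else float(response.result)
```

Passing the verifier runner in as a callable keeps the mapping testable without a live environment, which mirrors the PR's approach of mocking Task.verify_detailed() in the unit tests.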

🤖 Generated with Claude Code

Replace custom _execute_verifier_local() with Fleet SDK's Task.verify_detailed() which properly sets up the verifier namespace with:
- Environment type annotation
- Helper functions (normalized_contains, etc.)
- Proper function discovery (not just "verify" function)

This fixes "name 'Environment' is not defined" errors during verifier execution.

Changes:
- _compute_reward: Create Fleet SDK Task and call verify_detailed()
- Support both 'verifier_code' and 'verifier_func' field names
- Add comprehensive logging for debugging
- Remove broken _execute_verifier_local method

Tests:
- Update all verifier tests to mock Fleet SDK Task.verify_detailed()
- Add tests for various edge cases (no verifier, no orch, exceptions)
- Fix fixture to avoid asyncio.run() conflicts with pytest-asyncio

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

dzorlu · 2026-01-26T02:02:57Z

Merging into PR #1 instead

dzorlu closed this Jan 26, 2026

dzorlu deleted the fix/use-fleet-sdk-verifier branch January 26, 2026 02:03
