fix: prevent non-convergent stuck loops on trivial tasks #13

Merged: AbirAbbas merged 2 commits into main from fix/stuck-loop-convergence on Feb 23, 2026

@AbirAbbas
Collaborator

Summary

Fixes #12: trivial repo_path smoke tasks stall without converging and exit with code -2.

  • Adds _detect_stuck_loop() to the default (reviewer-only) path, which previously had no stuck detection; the flagged path already gets it from the synthesizer
  • When the loop is stuck or exhausted, the reviewer feedback is non-blocking, and code changes are present, returns COMPLETED_WITH_DEBT instead of FAILED_UNRECOVERABLE, so the issue flows through the debt gate rather than triggering replanning, an abort, or a stall
  • Preserves FAILED_UNRECOVERABLE for genuinely blocking failures and cases where no code was produced

Root cause

The default-path coding loop had no mechanism to detect repetitive non-convergent fix cycles. When a reviewer repeatedly returned approved=False / blocking=False (e.g. requesting minor polish on trivial tasks), the loop burned all 5 iterations + 2 advisor rounds without converging, eventually causing an exit -2 abort with completed_issues: [] and the issue stuck in in_flight_issues.
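To illustrate the kind of check the root cause calls for, here is a minimal sketch of a stuck-loop detector. It is not the merged implementation: `ReviewResult`, `detect_stuck_loop_sketch`, and the `window` parameter are hypothetical names, and the window of 3 is assumed from the "3+ consecutive non-blocking fix cycles" wording in this PR.

```python
from dataclasses import dataclass
from typing import Sequence

# Assumed from the PR text: 3+ consecutive non-blocking fix cycles count as stuck.
STUCK_WINDOW = 3


@dataclass(frozen=True)
class ReviewResult:
    """Hypothetical record of one reviewer verdict (not the project's real type)."""
    approved: bool
    blocking: bool


def detect_stuck_loop_sketch(
    history: Sequence[ReviewResult], window: int = STUCK_WINDOW
) -> bool:
    """Return True when the last `window` reviews were all rejected but non-blocking."""
    if len(history) < window:
        return False
    recent = history[-window:]
    return all(not r.approved and not r.blocking for r in recent)
```

A blocking or approving review anywhere in the trailing window resets the verdict to "not stuck", so only an unbroken run of minor-polish requests triggers the early exit.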

Behavior change

| Scenario | Before | After |
| --- | --- | --- |
| 3+ consecutive non-blocking fix cycles | Runs all 5 iters → FAILED_UNRECOVERABLE | Breaks at iter 3 → COMPLETED_WITH_DEBT |
| Loop exhausted, reviewer non-blocking, files changed | FAILED_UNRECOVERABLE | COMPLETED_WITH_DEBT |
| Loop exhausted, reviewer blocking | FAILED_UNRECOVERABLE | FAILED_UNRECOVERABLE (unchanged) |
| Loop exhausted, no files changed | FAILED_UNRECOVERABLE | FAILED_UNRECOVERABLE (unchanged) |
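The "After" column can be read as a small decision function applied once the loop is stuck or exhausted. The sketch below is an assumption-labeled illustration of that mapping; the `Outcome` enum and `resolve_outcome_sketch` are hypothetical names, and only the two status names come from this PR.

```python
from enum import Enum


class Outcome(Enum):
    """Hypothetical enum; only the two member names appear in this PR."""
    COMPLETED_WITH_DEBT = "COMPLETED_WITH_DEBT"
    FAILED_UNRECOVERABLE = "FAILED_UNRECOVERABLE"


def resolve_outcome_sketch(reviewer_blocking: bool, files_changed: bool) -> Outcome:
    """Pick the final status for a loop that is stuck or has run out of iterations.

    Debt is accepted only when the reviewer's remaining feedback is
    non-blocking AND the coder actually produced changes.
    """
    if not reviewer_blocking and files_changed:
        return Outcome.COMPLETED_WITH_DEBT
    return Outcome.FAILED_UNRECOVERABLE
```

Keeping the failure case as the default means any combination not explicitly cleared for debt still fails, which matches the two "unchanged" rows.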

Test plan

  • Unit test: _detect_stuck_loop() with various history patterns
  • Integration test: mock coding loop with perpetual non-blocking fix → confirms COMPLETED_WITH_DEBT at iteration 3
  • Integration test: blocking reviewer → confirms FAILED_UNRECOVERABLE unchanged
  • Integration test: no files changed → confirms FAILED_UNRECOVERABLE unchanged
  • Existing test suite passes (test_model_config.py)
  • Smoke test: run a trivial repo_path build and verify it converges
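As a rough picture of what the integration tests above exercise, the following sketch drives a toy loop with a scripted reviewer. It does not call the real run_coding_loop; `run_loop_sketch`, the `(approved, blocking)` tuple shape, and the "COMPLETED" status string are hypothetical, and the sketch assumes files were changed on every iteration.

```python
from typing import Callable, Tuple

Verdict = Tuple[bool, bool]  # hypothetical shape: (approved, blocking)


def run_loop_sketch(
    review_fn: Callable[[int], Verdict],
    max_iterations: int = 5,  # the PR mentions 5 iterations
    window: int = 3,          # assumed stuck-detection window
) -> Tuple[str, int]:
    """Toy coding loop: returns (status, iteration at which it stopped)."""
    non_blocking_streak = 0
    for iteration in range(1, max_iterations + 1):
        approved, blocking = review_fn(iteration)
        if approved:
            return "COMPLETED", iteration  # hypothetical success status
        if blocking:
            return "FAILED_UNRECOVERABLE", iteration
        non_blocking_streak += 1
        if non_blocking_streak >= window:
            # Stuck on non-blocking feedback: accept the work as debt.
            return "COMPLETED_WITH_DEBT", iteration
    # Loop exhausted with only non-blocking feedback (files assumed changed).
    return "COMPLETED_WITH_DEBT", max_iterations


def test_perpetual_non_blocking_breaks_at_window():
    assert run_loop_sketch(lambda i: (False, False)) == ("COMPLETED_WITH_DEBT", 3)


def test_blocking_review_fails_immediately():
    assert run_loop_sketch(lambda i: (False, True)) == ("FAILED_UNRECOVERABLE", 1)
```

Scripting the reviewer as a function of the iteration number keeps each scenario to a single lambda, which is one way to cover many history patterns without standing up real agents.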

🤖 Generated with Claude Code

AbirAbbas and others added 2 commits February 23, 2026 14:19
The default-path coding loop (reviewer-only) had no stuck-loop detection,
unlike the flagged path which gets it from the synthesizer. When a reviewer
repeatedly returned approved=False / blocking=False (e.g. requesting minor
polish), the loop burned all 5 iterations + 2 advisor rounds without
converging, eventually causing an exit -2 abort.

Changes:
- Add _detect_stuck_loop() that detects 3+ consecutive non-blocking fix
  cycles on the default path
- When stuck or loop-exhausted with non-blocking reviewer feedback AND
  code changes present, return COMPLETED_WITH_DEBT instead of
  FAILED_UNRECOVERABLE — the issue flows through the debt gate rather
  than triggering replanning/abort
- Preserve FAILED_UNRECOVERABLE for genuinely blocking failures and
  cases where no code was produced

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…gradation

25 tests exercising run_coding_loop end-to-end with scripted call_fn
(only AI agent calls are mocked). Covers:
- Happy path (approved first/second iteration)
- Non-blocking stuck loop → COMPLETED_WITH_DEBT at window boundary
- Blocking review → immediate FAILED_UNRECOVERABLE
- Stagnation (no files changed) → FAILED_UNRECOVERABLE
- Loop exhaustion with non-blocking → COMPLETED_WITH_DEBT
- Flagged path (QA + reviewer + synthesizer) scenarios
- Coder exception handling
- Artifact/checkpoint persistence to disk
- File accumulation across iterations
- note_fn tag observability

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@AbirAbbas AbirAbbas merged commit 2bbb18b into main Feb 23, 2026
1 check passed
@AbirAbbas AbirAbbas deleted the fix/stuck-loop-convergence branch February 23, 2026 19:45