Skip to content

Conversation

@bsgustavo
Copy link

Summary

This PR introduces security hardening, new tool support, and documentation improvements to Ralph.

I performed a security review of ralph.sh to block malicious command execution. The unsafe eval path was removed, override variables are now validated against dangerous characters, expected binaries per tool (amp, claude, codex, gh) are verified, and commands must exist in the environment before execution.

The execution flow was hardened against false completion and prompt injection. Ralph now only accepts COMPLETE when the runner exits successfully (exit code 0) and the tag appears as an exact output line.

Runner error handling was fixed so failures are not masked. I added set -o pipefail, centralized execution in execute_runner, preserved RUN_EXIT_CODE correctly, and ensured per-iteration failures are logged instead of silently ignored.

This PR also adds native support for --tool codex and --tool copilot in Ralph’s main loop, while keeping amp and claude. It includes tool validation, automatic prompt selection per tool, and centralized execution via run_with_prompt.

CLI commands are now configurable through environment variables (RALPH_AMP_CMD, RALPH_CLAUDE_CMD, RALPH_CODEX_CMD, RALPH_COPILOT_CMD), allowing different runtime environments without modifying the script.

New instruction templates CODEX.md and COPILOT.md were added. They mirror Ralph’s autonomous iterative flow using PRD, progress.txt, checks, and the COMPLETE termination condition.

Documentation was expanded to explicitly describe the security hardening (anti-injection behavior and override restrictions), prerequisites, usage examples for Codex/Copilot, template copies, and environment-variable command overrides.

AGENTS.md was updated to reflect the new runners and the hardened security pattern for future iterations of the project.

Testing

✅ bash -n ralph.sh
✅ ./ralph.sh --tool nope 1 >/tmp/test_nope.out 2>&1
✅ RALPH_AMP_CMD='amp;echo pwned' ./ralph.sh --tool amp 1 >/tmp/test_malicious.out 2>&1
✅ RALPH_COPILOT_CMD='gh copilot agent run' ./ralph.sh --tool copilot 1 >/tmp/test_gh.out 2>&1
✅ curl -L --max-time 15 -s https://docs.github.com/en/copilot
| head -n 5
✅ curl -L --max-time 15 -s https://developers.openai.com/codex
| head -n 5
✅ ./ralph.sh --tool nope 1 >/tmp/ralph_invalid.out 2>&1; echo $?; cat /tmp/ralph_invalid.out

@greptile-apps
Copy link

greptile-apps bot commented Feb 10, 2026

Greptile Overview

Greptile Summary

This PR extends Ralph’s loop to support two additional runners (--tool codex and --tool copilot) and adds corresponding prompt templates (CODEX.md/COPILOT.md), along with README/AGENTS.md updates documenting usage and environment-variable overrides.

The main functional change is in ralph.sh, where runner invocation is centralized via run_with_prompt and per-tool commands can be overridden via RALPH_AMP_CMD, RALPH_CLAUDE_CMD, RALPH_CODEX_CMD, and RALPH_COPILOT_CMD.

However, the current implementation reintroduces an unsafe eval execution path and also weakens iteration success/completion handling (runner failures are masked and completion detection is overly permissive). These issues should be addressed before merging.

Confidence Score: 2/5

  • This PR is not safe to merge as-is due to a clear command-injection path and unreliable completion/failure handling in ralph.sh.
  • The new run_with_prompt uses eval on user-configurable environment variables, enabling shell metacharacter injection. In addition, runner failures are suppressed (|| true) and completion detection matches anywhere in output without requiring a successful exit, making the loop behavior unreliable.
  • ralph.sh

Important Files Changed

Filename Overview
AGENTS.md Documentation updates to mention Codex/Copilot support and new prompt templates.
CODEX.md Adds Codex-specific agent instruction template mirroring existing prompt structure.
COPILOT.md Adds GitHub Copilot-specific agent instruction template mirroring existing prompt structure.
README.md Docs expanded for Codex/Copilot usage and environment variable overrides; depends on ralph.sh behavior being safely implemented.
ralph.sh Adds codex/copilot tool support and env var command overrides but introduces unsafe eval, masks runner failures, and has a lax completion detector.

Sequence Diagram

sequenceDiagram
  participant User as User
  participant Ralph as ralph.sh
  participant FS as Filesystem
  participant Runner as Tool runner (amp/claude/codex/copilot)

  User->>Ralph: ./ralph.sh --tool <tool> [max_iterations]
  Ralph->>FS: Read prd.json/.last-branch (archive if branch changed)
  Ralph->>FS: Ensure progress.txt exists
  loop Iterations 1..MAX_ITERATIONS
    Ralph->>FS: Read prompt template (prompt.md/CLAUDE.md/CODEX.md/COPILOT.md)
    Ralph->>Runner: Execute runner command (possibly from RALPH_*_CMD)
    Runner-->>Ralph: Combined stdout/stderr output
    Ralph->>Ralph: Search output for <promise>COMPLETE</promise>
    alt COMPLETE found
      Ralph-->>User: Exit 0 (completed)
    else Not complete
      Ralph-->>User: Sleep then next iteration
    end
  end
  Ralph-->>User: Exit 1 (max iterations reached)
Loading

Copy link

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

5 files reviewed, 3 comments

Edit Code Review Agent Settings | Greptile

Comment on lines +37 to +47
run_with_prompt() {
local prompt_file="$1"
local command="$2"

if [[ ! -f "$prompt_file" ]]; then
echo "Error: Prompt file not found: $prompt_file"
return 1
fi

cat "$prompt_file" | eval "$command"
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Command injection via eval

run_with_prompt pipes the prompt into eval "$command", which allows arbitrary command execution if RALPH_*_CMD contains shell metacharacters (e.g. RALPH_AMP_CMD='amp; echo pwned'). Since these env vars are explicitly documented as user-configurable, this is a real injection path. Prefer executing via an array (no eval) and/or strict validation (reject ;|&$()\n etc.) before execution, and ensure the intended binary exists before running.

Comment on lines 5 to 6
set -e

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Masked runner failures

The loop runs commands as OUTPUT=$(... ) || true, so any runner failure (non-zero exit) is ignored and the script continues as if the iteration succeeded. That can cause false progress and makes diagnosing failures harder. If the goal is to keep iterating, capture $? and log/handle it explicitly (and don’t treat <promise>COMPLETE</promise> as valid unless the runner exited 0).

@greptile-apps
Copy link

greptile-apps bot commented Feb 10, 2026

Additional Comments (1)

ralph.sh
Completion check too loose

Completion is detected with grep -q "<promise>COMPLETE</promise>" anywhere in output, regardless of exit status and even if it appears inside other text. This can prematurely exit (e.g., echoed in logs or included in a code block). If you want a robust stop condition, require runner exit code 0 and match an exact line (e.g. grep -x on a normalized output line).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant