Ralph Wiggum Technique – Complete Overview

What's this about?

The Ralph Wiggum technique is the simplest yet most productive agentic trick in the 2025/2026 toolbox: let the AI coding agent run the same prompt in a loop until the task is done. Geoffrey Huntley popularized the pattern in his blog post "Ralph Wiggum as a Software Engineer." There are now at least six noteworthy implementations, ranging from a 200-line Bash script to a multi-phase Go pipeline with a web dashboard and a Rust orchestrator with a Telegram bot. This guide is the map.

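1. The technique in one sentence

while true; do
  # Fresh agent process per iteration; tee keeps a copy of its output for the check
  agent --prompt "$PROMPT" | tee output.log
  # Leave the loop once the agent has printed the completion promise
  grep -q "<promise>COMPLETE</promise>" output.log && break
done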

There's no more magic to it. It works because:

  • The agent starts fresh every time – no context degradation, no drift
  • The files from the previous iteration persist – git history + filesystem are the memory
  • Tests are the truth – when the prompt says "green = done," the agent iterates against reality
  • Failures are data – every wrong iteration helps the next one improve

Quote

"Ralph is a Bash loop." – Geoffrey Huntley

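The same loop can be written with an explicit iteration cap. What follows is a minimal Python sketch rather than any project's actual code: `run_agent` is a hypothetical callable that starts a fresh agent process and returns what it printed, and the default cap of 10 is an arbitrary choice.

```python
from typing import Callable, Optional

PROMISE = "<promise>COMPLETE</promise>"


def ralph_loop(run_agent: Callable[[str], str], prompt: str,
               max_iterations: int = 10) -> Optional[int]:
    """Re-run the same prompt until the promise appears.

    Returns the iteration that completed, or None if the cap was reached.
    `run_agent` is a hypothetical hook: it should start a fresh agent
    process and return everything that process printed.
    """
    for iteration in range(1, max_iterations + 1):
        output = run_agent(prompt)  # no state is passed between iterations
        if PROMISE in output:       # completion is a plain substring match
            return iteration
    return None
```

In practice `run_agent` could wrap `subprocess.run(["agent", "--prompt", prompt], capture_output=True, text=True).stdout`, where `agent` is the same placeholder command used in the shell loop.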

2. Why it actually works

| Problem with traditional AI sessions | Ralph's answer |
| --- | --- |
| Context window fills up | Fresh session per iteration → context reset |
| Agent loses the thread | Prompt is always the same; state lives in the filesystem |
| "Hallucinates that it's done" | Tests/linters determine completion, not the model |
| One model hits a dead end | Rotation or multiple reviews correct it |
| Human has to keep clicking "yes" | Auto-approve in a dedicated sandbox |

3. The 6 major implementations

3.1 snarktank/ralph – the classic Bash loop

Repo: snarktank/ralph · Language: Bash + TypeScript skills · Stars: 18.7k · License: MIT

Author: Ryan Carson – explicitly based on Geoffrey Huntley's pattern.

snarktank/ralph is the variant cited most often in tutorials, Y Combinator threads, and Twitter. It is deliberately simple: one Bash script, one prompt file per supported tool, three persistence files.

Architecture

    • ralph.sh – the Bash loop (Amp + Claude)
    • prompt.md – instructions for Amp
    • CLAUDE.md – instructions for Claude Code
    • prd.json – user stories with pass/fail
    • progress.txt – append-only lessons learned
    • AGENTS.md – patterns/gotchas maintained by the agent

Workflow

  1. Generate the PRD – the prd skill starts a conversation: the human describes the requirements and the agent writes a markdown PRD.
  2. Convert to prd.json – the ralph skill structures the user stories as a list, each with passes: false.
  3. Start the loop – ./scripts/ralph/ralph.sh [max_iterations] (default: 10).
  4. Per iteration:
    • Create a branch from the spec
    • Pick the highest-priority story whose passes is still false
    • Implement the story in isolation
    • Run typecheck + tests
    • On success: commit, update prd.json, append learnings to progress.txt
  5. When every story has passes: true, the agent outputs <promise>COMPLETE</promise> and the loop ends
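
The story-selection step can be sketched in a few lines of Python. This is an illustration under assumptions, not the code snarktank/ralph ships: the sample data is invented, and the `priority` field (1 = most urgent) is a hypothetical way to express "highest-priority".

```python
import json
from typing import Optional

# Hypothetical prd.json content; "priority" (1 = most urgent) is an assumption.
PRD_JSON = """
[
  {"id": "US-1", "title": "Login endpoint", "priority": 2, "passes": false},
  {"id": "US-2", "title": "Database migration", "priority": 1, "passes": true},
  {"id": "US-3", "title": "Refresh flow", "priority": 3, "passes": false}
]
"""


def next_story(stories: list) -> Optional[dict]:
    """Return the most urgent story that has not passed yet, or None."""
    pending = [story for story in stories if not story["passes"]]
    if not pending:
        return None  # every story passes: time to emit the promise
    return min(pending, key=lambda story: story["priority"])


stories = json.loads(PRD_JSON)
print(next_story(stories)["id"])  # US-1
```

Because the choice depends only on the file contents, a fresh session with no memory of earlier iterations still picks up exactly where the last one stopped.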

Installation – three paths

# 1. Copy directly into the project (assumes ralph.sh and prompt.md are in the current directory)
mkdir -p scripts/ralph && cp ralph.sh prompt.md scripts/ralph/

# 2. As global Amp skills: the skills live in these directories
#    ~/.config/amp/skills/prd/
#    ~/.config/amp/skills/ralph/

# 3. As a Claude Code plugin (marketplace); run these inside Claude Code
/plugin marketplace add snarktank/ralph
/plugin install ralph-skills@ralph-marketplace

Supported tools

  • Amp CLI (default) – with amp.experimental.autoHandoff: { context: 90 } for auto-handoff when context is full
  • Claude Code – via --tool claude

Killer feature: AGENTS.md auto-update

After every iteration the agent itself records discovered patterns, gotchas, and conventions in AGENTS.md. Later iterations (and human developers) benefit from this accumulating knowledge. This is the simplest form of self-improvement I've seen in any Ralph implementation.
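
The append-only character of such a memory file is easy to show. A minimal sketch, assuming a one-line-per-learning format that is purely hypothetical (the real files are written by the agent in whatever form the prompt asks for):

```python
from datetime import date
from pathlib import Path


def append_learning(path: Path, iteration: int, note: str) -> None:
    """Append one dated entry; earlier entries are never rewritten."""
    entry = f"- [{date.today().isoformat()}] iteration {iteration}: {note}\n"
    with path.open("a", encoding="utf-8") as handle:  # "a" = append mode
        handle.write(entry)
```

Opening the file in append mode means a confused iteration can add noise but cannot erase what earlier iterations recorded.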

When to pick it

  • You want the original experience with minimal dependencies
  • You work with Amp or Claude Code, not GPT/Cursor
  • You want to see how the pattern works internally (Bash is easy to read)
  • You need the largest community + example repos

3.2 ralph-loop – the official Claude Code plugin

Repo: anthropics/claude-plugins-official → ralph-loop · Full guide: ralph-loop deep dive

Anthropic's official zero-dependency variant – a plugin that runs inside Claude Code itself.

What it is

Instead of an external Bash loop, the plugin uses a stop hook: when Claude tries to end the session, the hook intercepts and feeds the same prompt again into the running model. The loop happens inside the current session – no second terminal, no subprocess, no script.

Plugin structure

    • README.md

Usage

# Activate (in Claude Code)
/plugin install ralph-loop

# Start the loop
/ralph-loop "Build a REST API for todos. CRUD, validation, tests > 80% coverage. Output <promise>COMPLETE</promise> when done." \
--completion-promise "COMPLETE" \
--max-iterations 50

# Cancel the loop
/cancel-ralph

How the stop hook works

1. /ralph-loop invokes Claude with the prompt
2. Claude works, tries to end the session
3. The stop hook (hooks/stop-hook.sh) intercepts the exit
4. The hook injects the SAME prompt again
5. Claude reads its own diff + git log and continues
6. The loop ends on: COMPLETE in output / max iterations / /cancel-ralph

Biggest advantage: zero setup

No Bun, no Go, no Cargo, no Rust. If you already use Claude Code, this is the fastest option. One plugin install and you have Ralph.
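
The decision the stop hook makes in steps 3 to 6 can be sketched as a pure function. This is not the plugin's code; the function and parameter names are hypothetical, and the returned shape assumes the Claude Code hook convention, as I understand it, in which a Stop hook that prints a JSON object with "decision": "block" keeps the session going and passes the "reason" text back to the model.

```python
import json
from typing import Optional


def stop_hook_decision(last_output: str, prompt: str, iteration: int,
                       max_iterations: int,
                       promise: str = "COMPLETE") -> Optional[dict]:
    """Decide whether to let the session end (None) or re-inject the prompt."""
    if f"<promise>{promise}</promise>" in last_output:
        return None  # promise found: allow the stop
    if iteration >= max_iterations:
        return None  # cap reached: allow the stop
    # Assumed hook output shape: block the stop and hand the prompt back.
    return {"decision": "block", "reason": prompt}


decision = stop_hook_decision("tests still failing", "Build the API", 3, 50)
print(json.dumps(decision))
```

A real hook would additionally have to persist the iteration counter between invocations, for example in a small state file, because each hook call is a separate process.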

Windows pitfall

On Windows the stop hook occasionally fails with wsl: Unknown key 'automount.crossDistro' or execvpe(/bin/bash) failed. Workaround: edit ~/.claude/plugins/cache/.../hooks/hooks.json so the hook explicitly uses Git Bash:

"command": "\"C:/Program Files/Git/bin/bash.exe\" ${CLAUDE_PLUGIN_ROOT}/hooks/stop-hook.sh"

Important: Git/bin/bash.exe (with PATH wrapper), not Git/usr/bin/bash.exe (raw MinGW).

When to pick it

  • You use Claude Code and want to invest 5 minutes, not 5 hours
  • Your task fits in a single session without an external orchestration layer
  • You don't need multi-model comparison, Telegram steering, or a web UI

3.3 open-ralph-wiggum – the multi-agent CLI with rotation

Repo: Th0rgal/open-ralph-wiggum · Language: Bun + TypeScript

Full guide: Open Ralph Wiggum

Profile

  • 5 agents supported: Claude Code, Codex, Copilot CLI, Cursor Agent, OpenCode
  • --rotation as the killer feature: a different agent is used per iteration
    ralph "..." --rotation "claude-code:opus,codex:gpt-5,opencode:gpt-4o"
  • Tasks mode: .ralph/ralph-tasks.md, one task per iteration
  • Live steering from a second terminal: --add-context "Hint", --status, --list-tasks
  • Status dashboard with tool counts and struggle indicators
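
How a rotation string might map to iterations can be sketched as follows. The round-robin policy is an assumption on my part (the tool only promises "a different agent per iteration"), and both function names are hypothetical.

```python
from typing import List, Tuple


def parse_rotation(spec: str) -> List[Tuple[str, str]]:
    """Split "agent:model,agent:model" into (agent, model) pairs."""
    pairs = []
    for entry in spec.split(","):
        agent, model = entry.strip().split(":", 1)
        pairs.append((agent, model))
    return pairs


def agent_for(iteration: int, rotation: List[Tuple[str, str]]) -> Tuple[str, str]:
    """Round-robin choice for a 1-based iteration number (assumed policy)."""
    return rotation[(iteration - 1) % len(rotation)]


rotation = parse_rotation("claude-code:opus,codex:gpt-5,opencode:gpt-4o")
for i in range(1, 5):
    print(i, agent_for(i, rotation))
```

With three entries, iteration 4 wraps around to the first agent again, so every model gets to react to the others' commits.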

When to pick it

  • You want to compare models directly without launching 3 tools
  • You want a lean, focused CLI without a heavyweight web UI/backend
  • You need live steering but don't want to interrupt the loop

3.4 ralphex – multi-phase reviews and plan pipelines

Repo: umputun/ralphex · Language: Go (73%) + Python/Shell · Stars: 1.1k+

Profile

ralphex is the most enterprise-y variant: a 4-phase pipeline per plan.

| Phase | What happens |
| --- | --- |
| 1. Task Execution | A markdown plan with ### Task N headings is processed sequentially, each task in a fresh Claude session; validation commands run automatically after each task |
| 2. First Review | 5 parallel review agents check: quality, implementation, testing, simplification, documentation |
| 3. External Review (optional) | An external tool (default: Codex) gives an independent second verdict; Claude evaluates the findings and corrects |
| 4. Second Review | Focused final check with 2 agents on critical/major issues |
| Finalize (optional) | Rebase, notifications, plan archival |

Plan format

# Plan: User Authentication

## Overview
JWT-based auth with refresh tokens

## Validation Commands
- `pnpm test`
- `pnpm typecheck`

### Task 1: Login endpoint
- [ ] POST /login accepts email+password
- [ ] returns a bearer token
- [ ] Tests green

### Task 2: Refresh flow
- [ ] POST /refresh rotates the token
- [ ] Tests green

Highlights

  • Worktree isolation (--worktree) for parallel plans in the same repo
  • Web dashboard with SSE streaming, phase filters, multi-session watch
  • Mid-run steering: Ctrl+\ (SIGQUIT) pauses the task, you can edit the plan and restart the session
  • Stalemate detection: --review-patience aborts when N review rounds no longer change anything
  • Rate-limit handling: --wait retries automatically
  • Notifications: Telegram, Slack, email, webhook after the loop ends
  • VCS backend: git by default, Mercurial via translation scripts
  • Docker wrapper: read-only on credentials, RW only on the workspace
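
Stalemate detection of the kind described for --review-patience can be approximated by fingerprinting the repository state after each review round. This is a sketch of the idea only; how ralphex actually decides that "nothing changed" is not covered here, and both function names are hypothetical.

```python
import hashlib
from typing import List


def fingerprint(diff_text: str) -> str:
    """Stable hash of the working-tree diff after a review round."""
    return hashlib.sha256(diff_text.encode("utf-8")).hexdigest()


def is_stalemate(fingerprints: List[str], patience: int) -> bool:
    """True when the last `patience` rounds left the fingerprint unchanged."""
    if patience < 1 or len(fingerprints) < patience + 1:
        return False
    recent = fingerprints[-(patience + 1):]  # baseline plus `patience` rounds
    return len(set(recent)) == 1
```

Comparing `patience + 1` fingerprints is deliberate: N unchanged rounds means the state after each of them equals the state before the first.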

When to pick it

  • You want automated code review before the merge
  • You need plan pipelines with clearly separated phases
  • Your team wants a web UI for monitoring
  • You want heterogeneity – Claude-implemented + Codex-reviewed

3.5 ralph-orchestrator – hat system, MCP stack, human-in-the-loop

Repo: mikeyobrien/ralph-orchestrator · Language: Rust (82%) + TS · Stars: 2.8k

Profile

ralph-orchestrator is the most architecturally ambitious tool: a Rust engine with a React frontend, an MCP server, a Telegram bot, and a hat system.

Hat system

Hats are specialized AI personas that coordinate in sequence:

| Hat | Job |
| --- | --- |
| code-assist | Implements features |
| debug | Finds/fixes bugs |
| research | Collects context, reads repos |
| review | Quality gates |
| pdd-to-code-assist | Translates PDD specs into implementation |

Hats are defined via YAML configs (ralph.yml, ralph.qa.yml, ralph.bot.yml).

Stack

  • Web dashboard (alpha): Rust RPC API + React, ports 3000/5173
  • MCP server: workspace-scoped, one server per repo, deterministic config/task persistence
  • Terminal UI: built in ratatui
  • RObot (Telegram bot): /status, /tasks, /restart for live steering from your phone

Backends

Claude Code, Gemini CLI, Kiro, Codex, Amp, Copilot CLI, OpenCode (seven).

Workflow

ralph init --backend claude-code
ralph plan "Add OAuth login" # PDD session: specs + designs + plan
ralph run -p specs/oauth.md # Loop until done
ralph web # Dashboard
ralph mcp serve --workspace-root . # MCP for other tools
ralph bot onboard --telegram # Steering from your phone

When to pick it

  • You want Prompt-Driven Development (PDD): specs before code
  • You work with other MCP clients and want to plug Ralph in as an MCP server
  • You want to steer from your phone (Telegram)
  • You like the hat concept (clear separation of roles)

3.6 ralphy – parallel worktrees + automatic PRs

Repo: michaelshimeles/ralphy · Language: TS (76%) + Bash · Stars: 2.8k

Profile

ralphy focuses on parallelism and GitHub integration.

ralphy --prd PRD.md --parallel --max-parallel 5 \
--branch-per-task --create-pr

What happens:

  1. ralphy reads the PRD (markdown / folder / YAML / JSON / GitHub issues)
  2. Spawns 5 parallel agents, each in its own worktree with its own branch
  3. Each agent solves its task in isolation, runs tests
  4. On success: auto-merge into the base branch or auto-PR via gh
  5. Merge conflicts are resolved by the AI agent itself
  6. On worktree problems: fallback to sandbox mode (symlinks for node_modules/.git/vendor → fast even in monorepos)
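
Steps 2 and 3 combine two ingredients: one git worktree plus branch per task, and a bounded worker pool. The sketch below shows both under assumptions. The `ralph/<slug>` branch naming and the `.worktrees` directory are hypothetical, and `worker` stands in for whatever runs the agent; only the `git worktree add -b <branch> <path> <base>` form is standard git.

```python
import re
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def slugify(title: str) -> str:
    """Lowercase and replace non-alphanumeric runs with hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def worktree_command(title: str, base: str = "main",
                     root: str = ".worktrees") -> List[str]:
    """git command that creates a branch plus a worktree for one task."""
    slug = slugify(title)
    return ["git", "worktree", "add", "-b", f"ralph/{slug}",
            f"{root}/{slug}", base]


def run_parallel(tasks: List[str], worker: Callable[[str], str],
                 max_parallel: int = 5) -> List[str]:
    """Run `worker` on every task with at most `max_parallel` in flight."""
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        return list(pool.map(worker, tasks))  # results keep task order


print(worktree_command("Login endpoint"))
```

Separate worktrees give each agent its own checkout, so parallel edits cannot overwrite each other's files until the merge step.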

Supported agents

Claude Code, OpenCode, Cursor, Codex, Qwen-Code, Factory Droid, Copilot, Gemini CLI – eight, more than any other.

Highlights

  • Browser automation: --browser for UI tests via agent-browser
  • Branch-per-task + auto-PR – ideal for gh workflows
  • Sandbox mode with symlinks – significantly faster in large repos
  • Task sources: MD, MD folder, YAML, JSON, GitHub issues
  • Webhook notifications in .ralphy/config.yaml

When to pick it

  • You want to work on multiple tasks simultaneously
  • You want automatic PRs instead of direct commits
  • You work in a large monorepo (sandbox mode helps)
  • UI tests should run automatically

4. The big comparison table

| | snarktank/ralph | ralph-loop (plugin) | open-ralph-wiggum | ralphex | ralph-orchestrator | ralphy |
| --- | --- | --- | --- | --- | --- | --- |
| Language | Bash | Bash hook | Bun + TS | Go | Rust + TS | TS + Bash |
| Install effort | minimal | zero | npm/bun | go install / brew | npm / cargo | npm |
| Agents | 2 (Amp, Claude) | Claude Code only | 5 | 1 (+ wrapper) | 7 | 8 |
| Loop mechanism | external Bash loop | stop hook (in-session) | external Bun process | multi-phase pipeline | hat sequence | parallel worktrees |
| Agent rotation | – | – | ✅ | – | – | – (parallel) |
| Multi-phase reviews | – | – | – | ✅ (5 agents) | – | – |
| Parallel tasks | – | – | – | ✅ worktrees | – | ✅ explicit |
| PRD format | prd.json | free-form prompt | free-form prompt + tasks mode | markdown plan | YAML / PDD | MD/YAML/JSON/issues |
| Persistence | prd.json, progress.txt, AGENTS.md | filesystem + git | .ralph/*.json/.md | .ralphex/progress/ | workspace + MCP | worktrees + PRs |
| Web UI | – | – | – | ✅ SSE | ✅ alpha | – |
| MCP server | – | – | – | – | ✅ | – |
| Telegram steering | – | – | – | notification only | ✅ RObot | – |
| Browser tests | – | – | – | – | – | ✅ |
| Auto-PRs | – | – | – | – | – | ✅ |
| Sandboxing | – | – | – | Docker wrapper | – | symlink sandbox |
| Main audience | tinkerers, OG fans | Claude Code users | model comparers | enterprise reviewers | MCP / PDD stack | monorepo + GitHub |
| Stars (May 2026) | 18.7k | (plugin, n/a) | – | 1.1k | 2.8k | 2.8k |

5. Decision tree

Are you a Claude Code user and want minimal effort?
└─ YES → ralph-loop plugin
└─ NO ↓

Do you want to compare different models per iteration?
└─ YES → open-ralph-wiggum (--rotation)
└─ NO ↓

Do you need automated multi-phase code review?
└─ YES → ralphex
└─ NO ↓

Do you want parallel tasks → auto-PRs?
└─ YES → ralphy (--parallel --create-pr)
└─ NO ↓

Do you want a PDD workflow, an MCP server, Telegram steering?
└─ YES → ralph-orchestrator
└─ NO ↓

→ snarktank/ralph (classic, 200 lines of Bash, biggest community)

6. Shared best practices (applies to ALL Ralph variants)

6.1 ALWAYS set an iteration limit

ralph "..." --max-iterations 20

The --completion-promise is just a string match. Without an iteration limit, a forgotten loop can run for hours and burn tokens.

6.2 Phrase acceptance criteria so they are VERIFIABLE

❌ Bad: "Build a nice login page."
✅ Good:
1. POST /login → 200 with bearer token on correct credentials
2. POST /login → 401 on wrong credentials
3. `pnpm test login.spec.ts` is green
4. When all three are satisfied → output COMPLETE

Ralph spins until tests are green. Without tests it spins until the iteration limit.
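
A loop can enforce verifiability itself by refusing the promise unless the test command also succeeds. This is a stricter variant than the plain string match the tools use, offered here as a sketch: `verified_complete` is a hypothetical helper, and the Python one-liner merely stands in for a real test runner.

```python
import subprocess
import sys
from typing import List

PROMISE = "<promise>COMPLETE</promise>"


def verified_complete(agent_output: str, test_command: List[str]) -> bool:
    """Accept the promise only if the test command also exits with 0."""
    if PROMISE not in agent_output:
        return False
    result = subprocess.run(test_command, capture_output=True)
    return result.returncode == 0


# Stand-in for a real test runner such as `pnpm test login.spec.ts`.
failing_tests = [sys.executable, "-c", "raise SystemExit(1)"]
print(verified_complete(PROMISE, failing_tests))  # False
```

With this gate, an agent that claims completion while tests are red simply triggers another iteration.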

6.3 Story sizing

"If a task is too big, the LLM runs out of context before finishing and produces poor code." – snarktank/ralph

Rule of thumb: one user story should fit into the context window of the model you're using. Concretely:

  • Database migration: yes
  • UI component with tests: yes
  • Server action: yes
  • "Complete e-commerce system": no – split into a PRD with 50+ small stories

6.4 JSON PRDs beat markdown PRDs

For larger feature lists, a structured JSON schema (a list with id, title, acceptance: [], passes: false) reduces the agent's tendency to rewrite existing tests instead of satisfying them. snarktank/ralph does this by default with prd.json.
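
A structured PRD can also be checked before the loop starts. The sketch below validates the four fields named above; the error messages and the rule that `acceptance` must be non-empty are my own assumptions, not part of any tool.

```python
import json
from typing import List

REQUIRED = {"id": str, "title": str, "acceptance": list, "passes": bool}


def load_prd(text: str) -> List[dict]:
    """Parse a JSON story list and reject stories that break the schema."""
    stories = json.loads(text)
    for story in stories:
        for field, expected_type in REQUIRED.items():
            if not isinstance(story.get(field), expected_type):
                raise ValueError(f"{story.get('id', '?')}: bad or missing {field!r}")
        if not story["acceptance"]:
            raise ValueError(f"{story['id']}: no acceptance criteria")
    return stories


def all_done(stories: List[dict]) -> bool:
    """True once every story has been marked as passing."""
    return all(story["passes"] for story in stories)
```

Rejecting a story without acceptance criteria up front is cheaper than discovering after twenty iterations that the agent had nothing to verify against.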

6.5 Sandboxing is mandatory

Auto-approve = the agent can do anything you could do in the shell. Mitigations:

  • Dedicated repo in a VM or container
  • ralphex: Docker wrapper · ralphy: symlink sandbox · others: manually with Daytona/sandboxed.sh
  • API keys via .env, never in prompts
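
For the manual route, a container invocation with a read-write workspace and read-only credentials might be assembled like this. It is a sketch under assumptions: the image name `ralph-sandbox:latest` and the mount points are hypothetical, and this is not how the ralphex wrapper is implemented. The Docker flags themselves (`--rm`, `--env-file`, `-v` with `:ro`, `-w`) are standard.

```python
from typing import List


def sandbox_command(workspace: str, credentials: str, agent_cmd: List[str],
                    image: str = "ralph-sandbox:latest") -> List[str]:
    """Build a `docker run` call: workspace read-write, credentials read-only."""
    return [
        "docker", "run", "--rm",
        "--env-file", ".env",                    # API keys stay out of the prompt
        "-v", f"{workspace}:/workspace",         # read-write project checkout
        "-v", f"{credentials}:/credentials:ro",  # read-only credentials
        "-w", "/workspace",
        image, *agent_cmd,
    ]
```

Returning an argument list rather than a shell string avoids quoting problems when the result is handed to `subprocess.run`.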

6.6 Maintain persistent memory

The most successful implementations (snarktank/ralph) let the agent itself update AGENTS.md every iteration – patterns, gotchas, conventions. This accumulating memory is worth its weight in gold.


7. Pitfalls

| Symptom | Cause | Fix |
| --- | --- | --- |
| Loop runs endlessly without progress | no verifiable acceptance criteria | tests/linters as a gate, clear completion promise |
| Agent rewrites tests so they pass | soft acceptance criterion | JSON PRD, mark tests-as-spec explicitly as "do not change" |
| Token costs explode | model too large, too many iterations | smaller model for routine work, stricter --max-iterations |
| Agent forgets patterns from earlier iterations | no AGENTS.md/progress.txt | maintain a memory file, reference it explicitly in the prompt |
| Auto-commits pollute history | every iteration commits | snarktank: it's a feature; otherwise --no-commit + squash |
| Stop-hook plugin fails on Windows | wsl: Unknown key bug | set Git Bash explicitly in hooks.json (see §3.2) |
| Multiple worktrees crash | branch conflicts | ralphy has conflict resolution; otherwise use ralphex's worktree mode |
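
The "agent rewrites tests" pitfall can also be caught mechanically by checking which files an iteration touched. A minimal sketch, assuming the protected patterns below (they are hypothetical) and a file list obtained from something like `git diff --name-only`:

```python
from fnmatch import fnmatchcase
from typing import Iterable, List, Sequence

PROTECTED = ("tests/*", "*.spec.ts")  # hypothetical "do not change" patterns


def touched_protected(changed_files: Iterable[str],
                      patterns: Sequence[str] = PROTECTED) -> List[str]:
    """Return the changed files that match a protected pattern."""
    return [path for path in changed_files
            if any(fnmatchcase(path, pattern) for pattern in patterns)]
```

If the returned list is non-empty, the loop can reject the iteration's commit instead of trusting a suddenly green test run.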

8. Example stack

A productive setup for 2026:

┌──────────────────────────────────────────────────────────────┐
│ Daily coding                                                 │
│  └─ Claude Code + ralph-loop plugin                          │
│     → small, well-defined loops while I do something else    │
├──────────────────────────────────────────────────────────────┤
│ Larger features                                              │
│  └─ open-ralph-wiggum with --rotation                        │
│     → complex refactoring, pitting models against each other │
├──────────────────────────────────────────────────────────────┤
│ Pre-merge review                                             │
│  └─ ralphex --review                                         │
│     → 5 review agents on one branch                          │
├──────────────────────────────────────────────────────────────┤
│ Mass tickets                                                 │
│  └─ ralphy --parallel --create-pr from GitHub issues         │
│     → 5 tickets simultaneously into draft PRs                │
└──────────────────────────────────────────────────────────────┘

Quote

"Failures are predictable and informative. Persistence wins. Operator skill matters." – The Ralph Wiggum philosophy in three sentences