Ralph Wiggum Technique: Complete Overview
The Ralph Wiggum technique is the simplest yet most productive agentic trick in the 2025/2026 toolbox: let the AI coding agent run the same prompt in a loop until the task is done. Geoffrey Huntley popularized the pattern in his blog post "Ralph Wiggum as a Software Engineer." By now there are at least 6 noteworthy implementations, ranging from a 200-line Bash file to a multi-phase Go tool with a web dashboard and Telegram notifications. This guide is the map.
1. The technique in one sentence
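while true; do
  # Run the agent with the same prompt and capture what it prints
  agent --prompt "$PROMPT" | tee output.log
  # Stop as soon as the agent has emitted the completion promise
  grep -q "<promise>COMPLETE</promise>" output.log && break
done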
There's no more magic to it. It works because:
- The agent starts fresh every time: no context degradation, no drift
- The files from the previous iteration persist: git history + filesystem are the memory
- Tests are the truth: when the prompt says "green = done," the agent iterates against reality
- Failures are data: every wrong iteration helps the next one improve
"Ralph is a Bash loop." - Geoffrey Huntley
2. Why it actually works
| Problem of traditional AI sessions | Ralph's answer |
|---|---|
| Context window fills up | Fresh session per iteration → context reset |
| Agent loses the thread | Prompt is always the same; state lives in the filesystem |
| "Hallucinates that it's done" | Tests/linters determine completion, not the model |
| One model hits a dead end | Rotation or multiple reviews correct it |
| Human has to keep clicking "yes" | Auto-approve in a dedicated sandbox |
3. The 6 major implementations
3.1 snarktank/ralph: the popular Bash variant
Repo: snarktank/ralph · Language: Bash + TypeScript skills · Stars: 18.7k · License: MIT
Author: Ryan Carson, explicitly based on Geoffrey Huntley's pattern.
snarktank/ralph is the variant cited most often in tutorials, Y Combinator threads, and Twitter. It is deliberately simple: one Bash script, one prompt file, three persistence files.
Architecture
- ralph.sh: the Bash loop (Amp + Claude)
- prompt.md: instructions for Amp
- CLAUDE.md: instructions for Claude Code
- prd.json: user stories with pass/fail
- progress.txt: append-only lessons learned
- AGENTS.md: patterns/gotchas maintained by the agent
Workflow
- Generate the PRD: the prd skill loads a conversation, the human describes requirements, the agent writes a markdown PRD.
- Convert to prd.json: the ralph skill structures user stories as a list with passes: false.
- Start the loop: ./scripts/ralph/ralph.sh [max_iterations] (default: 10).
- Per iteration:
  - Create a branch from the spec
  - Pick the highest-priority story whose passes is still false
  - Implement the story in isolation
  - Run typecheck + tests
  - On success commit, update prd.json, append learnings to progress.txt
- When all stories are passes: true → output <promise>COMPLETE</promise> → the loop ends
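The "pick the next story" and "update prd.json" steps can be sketched with jq. This is only an illustration: the schema used here (a top-level stories array whose entries carry id, priority, and passes, with a lower priority number meaning "do first") and the function names next_story and mark_passed are assumptions, not necessarily what snarktank/ralph uses.

```shell
# Sketch of story selection against prd.json using jq.
# Assumed (hypothetical) schema:
#   {"stories":[{"id":"S1","priority":1,"passes":false}, ...]}
# where a lower priority number means "do this first".

next_story() {
    # $1 = path to prd.json
    # Prints the id of the highest-priority story that has not passed yet;
    # prints nothing when every story passes.
    jq -r '[.stories[] | select(.passes == false)]
           | sort_by(.priority) | .[0].id // empty' "$1"
}

mark_passed() {
    # $1 = path to prd.json, $2 = story id
    # Sets passes=true for that story, rewriting the file via a temp file.
    tmp=$(mktemp)
    jq --arg id "$2" \
       '(.stories[] | select(.id == $id) | .passes) = true' "$1" > "$tmp" \
        && mv "$tmp" "$1"
}
```

A loop would call next_story at the start of each iteration and mark_passed only after typecheck and tests succeed; an empty result from next_story is the signal to emit the completion promise.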
Installation: three paths
# 1. Copy directly into the project
mkdir -p scripts/ralph && cp ralph.sh prompt.md scripts/ralph/
# 2. As global Amp skills
~/.config/amp/skills/prd/
~/.config/amp/skills/ralph/
# 3. As a Claude Code plugin (marketplace)
/plugin marketplace add snarktank/ralph
/plugin install ralph-skills@ralph-marketplace
Supported tools
- Amp CLI (default), with amp.experimental.autoHandoff: { context: 90 } for auto-handoff when context is full
- Claude Code, via --tool claude
Killer feature: AGENTS.md auto-update
After every iteration the agent itself records discovered patterns, gotchas, and conventions in AGENTS.md. Later iterations (and human developers) benefit from this accumulating knowledge. This is the simplest form of self-improvement I've seen in any Ralph implementation.
When to pick it
- You want the original experience with minimal dependencies
- You work with Amp or Claude Code, not GPT/Cursor
- You want to see how the pattern works internally (Bash is easy to read)
- You need the largest community + example repos
3.2 ralph-loop: the official Claude Code plugin
Repo: anthropics/claude-plugins-official → ralph-loop · Full guide: ralph-loop deep dive
Anthropic's official zero-dependency variant: a plugin that runs inside Claude Code itself.
What it is
Instead of an external Bash loop, the plugin uses a stop hook: when Claude tries to end the session, the hook intercepts the exit and feeds the same prompt back into the running model. The loop happens inside the current session: no second terminal, no subprocess, no script.
Plugin structure
- README.md
Usage
# Activate (in Claude Code)
/plugin install ralph-loop
# Start the loop
/ralph-loop "Build a REST API for todos. CRUD, validation, tests > 80% coverage. Output <promise>COMPLETE</promise> when done." \
--completion-promise "COMPLETE" \
--max-iterations 50
# Cancel the loop
/cancel-ralph
How the stop hook works
1. /ralph-loop invokes Claude with the prompt
2. Claude works, tries to end the session
3. The stop hook (hooks/stop-hook.sh) intercepts the exit
4. The hook injects the SAME prompt again
5. Claude reads its own diff + git log and continues
6. The loop ends on: COMPLETE in output / max iterations / /cancel-ralph
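The decision in steps 3 to 6 can be sketched as a small function. This is not the plugin's hooks/stop-hook.sh; the function name and arguments are hypothetical, and it assumes the host treats a JSON object of the form {"decision":"block","reason":...} on the hook's stdout as "do not stop, continue with this reason".

```shell
# Sketch of the decision a stop hook has to make (not the plugin's code).
# Assumption: the host reads JSON from the hook's stdout and interprets
# {"decision":"block","reason":"..."} as "keep going with this text".

stop_hook_decision() {
    # $1 = last output, $2 = iteration, $3 = max iterations, $4 = prompt
    last_output=$1 iteration=$2 max_iterations=$3 prompt=$4

    # Let the session end once the promise appears or the budget is spent.
    case $last_output in
        *"<promise>COMPLETE</promise>"*) return 0 ;;
    esac
    if [ "$iteration" -ge "$max_iterations" ]; then
        return 0
    fi

    # Otherwise block the exit and feed the same prompt back in.
    # Only backslashes and double quotes are escaped, so this sketch
    # handles single-line prompts only.
    escaped=$(printf '%s' "$prompt" | sed 's/\\/\\\\/g; s/"/\\"/g')
    printf '{"decision":"block","reason":"%s"}\n' "$escaped"
}
```

Printing nothing means "allow the stop", which covers both the success case and the iteration limit.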
No Bun, no Go, no Cargo, no Rust. If you already use Claude Code, this is the fastest option. One plugin install and you have Ralph.
On Windows the stop hook occasionally fails with wsl: Unknown key 'automount.crossDistro' or execvpe(/bin/bash) failed. Workaround: edit ~/.claude/plugins/cache/.../hooks/hooks.json so the hook explicitly uses Git Bash:
"command": "\"C:/Program Files/Git/bin/bash.exe\" ${CLAUDE_PLUGIN_ROOT}/hooks/stop-hook.sh"
Important: use Git/bin/bash.exe (with PATH wrapper), not Git/usr/bin/bash.exe (raw MinGW).
When to pick it
- You use Claude Code and want to invest 5 minutes, not 5 hours
- Your task fits in a single session without an external orchestration layer
- You don't need multi-model comparison, Telegram steering, or a web UI
3.3 open-ralph-wiggum: the multi-agent CLI with rotation
Repo: Th0rgal/open-ralph-wiggum · Language: Bun + TypeScript
Full guide: Open Ralph Wiggum
Profile
- 5 agents supported: Claude Code, Codex, Copilot CLI, Cursor Agent, OpenCode
- --rotation as the killer feature: a different agent is used per iteration
  ralph "..." --rotation "claude-code:opus,codex:gpt-5,opencode:gpt-4o"
- Tasks mode: .ralph/ralph-tasks.md, one task per iteration
- Live steering from a second terminal: --add-context "Hint", --status, --list-tasks
- Status dashboard with tool counts and struggle indicators
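One plausible reading of "a different agent per iteration" is plain round-robin over the rotation string. The sketch below illustrates that idea only; pick_agent is a hypothetical helper, and the way open-ralph-wiggum actually schedules agents may differ.

```shell
# Round-robin agent selection over a comma-separated rotation list.
# The "agent:model" entries mirror the --rotation string shown above;
# round-robin ordering is an assumption made for this sketch.

pick_agent() {
    # $1 = 1-based iteration number, $2 = comma-separated rotation list
    count=$(printf '%s\n' "$2" | tr ',' '\n' | grep -c '')
    index=$(( ($1 - 1) % count + 1 ))
    printf '%s\n' "$2" | cut -d',' -f"$index"
}

ROTATION="claude-code:opus,codex:gpt-5,opencode:gpt-4o"
for i in 1 2 3 4; do
    echo "iteration $i -> $(pick_agent "$i" "$ROTATION")"
done
```

With three entries, iteration 4 wraps around to the first agent again, so every model sees the state the previous ones left behind in the filesystem.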
When to pick it
- You want to compare models directly without launching 3 tools
- You want a lean, focused CLI without a heavyweight web UI/backend
- You need live steering but don't want to interrupt the loop
3.4 ralphex: multi-phase reviews and plan pipelines
Repo: umputun/ralphex · Language: Go (73%) + Python/Shell · Stars: 1.1k+
Profile
ralphex is the most enterprise-y variant: a 4-phase pipeline per plan.
| Phase | What happens |
|---|---|
| 1. Task Execution | A markdown plan with ### Task N sections is processed sequentially, each task in a fresh Claude session; validation commands run automatically after each task |
| 2. First Review | 5 parallel review agents check: quality, implementation, testing, simplification, documentation |
| 3. External Review (optional) | An external tool (default: Codex) gives an independent second verdict; Claude evaluates the findings and corrects |
| 4. Second Review | Focused final check with 2 agents on critical/major issues |
| Finalize (optional) | Rebase, notifications, plan archival |
Plan format
# Plan: User Authentication
## Overview
JWT-based auth with refresh tokens
## Validation Commands
- `pnpm test`
- `pnpm typecheck`
### Task 1: Login endpoint
- [ ] POST /login accepts email+password
- [ ] returns a bearer token
- [ ] Tests green
### Task 2: Refresh flow
- [ ] POST /refresh rotates the token
- [ ] Tests green
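Because the plan is plain markdown, the remaining work can be read straight from the checkboxes. The sketch below lists tasks that still have unchecked items; it relies only on the "### Task N" headings and "- [ ]" items shown above, open_tasks is a hypothetical helper, and ralphex's own parser may behave differently.

```shell
# List tasks in a ralphex-style plan that still have unchecked boxes.
# Assumes only the "### Task N" headings and "- [ ]" / "- [x]" items
# from the plan format above.

open_tasks() {
    # $1 = path to the plan file
    awk '
        # Remember the current task title (text after "### ").
        /^### Task / { task = substr($0, 5); next }
        # Report a task once, the first time an unchecked item appears.
        /^- \[ \]/ {
            if (task != "" && !(task in seen)) {
                print task
                seen[task] = 1
            }
        }
    ' "$1"
}
```

An empty result means every checkbox under every task heading is ticked.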
Highlights
- Worktree isolation (--worktree) for parallel plans in the same repo
- Web dashboard with SSE streaming, phase filters, multi-session watch
- Mid-run steering: Ctrl+\ (SIGQUIT) pauses the task; you can edit the plan and restart the session
- Stalemate detection: --review-patience aborts when N review rounds no longer change anything
- Rate-limit handling: --wait retries automatically
- Notifications: Telegram, Slack, email, webhook after the loop ends
- VCS backend: git by default, Mercurial via translation scripts
- Docker wrapper: read-only on credentials, RW only on the workspace
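The stalemate idea (stop when N review rounds change nothing) can be sketched independently of ralphex. How ralphex itself measures "no change" is not stated here; the sketch assumes each round yields a fingerprint of the work (for example a hash of the diff) and that stalled_after is a hypothetical helper.

```shell
# Sketch of stalemate detection: stop once N consecutive review rounds
# leave the fingerprint of the work unchanged. Fingerprints are passed in
# as plain strings; in practice they could be hashes of `git diff`.

stalled_after() {
    # $1 = patience (N); remaining args = one fingerprint per round.
    patience=$1; shift
    previous="" unchanged=0 round=0
    for fingerprint in "$@"; do
        round=$((round + 1))
        if [ "$round" -gt 1 ] && [ "$fingerprint" = "$previous" ]; then
            unchanged=$((unchanged + 1))
        else
            unchanged=0
        fi
        if [ "$unchanged" -ge "$patience" ]; then
            echo "stalemate at round $round"
            return 0
        fi
        previous=$fingerprint
    done
    echo "no stalemate"
}
```

For example, `stalled_after 2 a b b b c` reports a stalemate at round 4, because rounds 3 and 4 both repeat the fingerprint from round 2.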
When to pick it
- You want automated code review before the merge
- You need plan pipelines with clearly separated phases
- Your team wants a web UI for monitoring
- You want heterogeneity โ Claude-implemented + Codex-reviewed
3.5 ralph-orchestrator: hat system, MCP stack, human-in-the-loop
Repo: mikeyobrien/ralph-orchestrator · Language: Rust (82%) + TS · Stars: 2.8k
Profile
ralph-orchestrator is the most architecturally ambitious tool: a Rust engine with a React frontend, an MCP server, a Telegram bot, and a hat system.
Hat system
Hats are specialized AI personas that coordinate in sequence:
| Hat | Job |
|---|---|
| code-assist | Implements features |
| debug | Finds/fixes bugs |
| research | Collects context, reads repos |
| review | Quality gates |
| pdd-to-code-assist | Translates PDD specs into implementation |
Hats are defined via YAML configs (ralph.yml, ralph.qa.yml, ralph.bot.yml).
Stack
- Web dashboard (alpha): Rust RPC API + React, ports 3000/5173
- MCP server: workspace-scoped, one server per repo, deterministic config/task persistence
- Terminal UI: built with ratatui
- RObot (Telegram bot): /status, /tasks, /restart for live steering from your phone
Backends
Claude Code, Gemini CLI, Kiro, Codex, Amp, Copilot CLI, OpenCode (seven).
Workflow
ralph init --backend claude-code
ralph plan "Add OAuth login" # PDD session: specs + designs + plan
ralph run -p specs/oauth.md # Loop until done
ralph web # Dashboard
ralph mcp serve --workspace-root . # MCP for other tools
ralph bot onboard --telegram # Steering from your phone
When to pick it
- You want Prompt-Driven Development (PDD): specs before code
- You work with other MCP clients and want to plug Ralph in as an MCP server
- You want to steer from your phone (Telegram)
- You like the hat concept (clear separation of roles)
3.6 ralphy: parallel worktrees + automatic PRs
Repo: michaelshimeles/ralphy · Language: TS (76%) + Bash · Stars: 2.8k
Profile
ralphy focuses on parallelism and GitHub integration.
ralphy --prd PRD.md --parallel --max-parallel 5 \
--branch-per-task --create-pr
What happens:
- ralphy reads the PRD (markdown / folder / YAML / JSON / GitHub issues)
- Spawns up to 5 parallel agents (per --max-parallel 5), each in its own worktree with its own branch
- Each agent solves its task in isolation, runs tests
- On success: auto-merge into the base branch or auto-PR via gh
- Merge conflicts are resolved by the AI agent itself
- On worktree problems: fallback to sandbox mode (symlinks for node_modules, .git, vendor; fast even in monorepos)
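The worktree-per-task idea can be reproduced with plain git. The sketch below is not ralphy's implementation: run_task stands in for launching an agent, and the ralph/ branch prefix and the "-wt-" directory naming are made up for the example.

```shell
# Sketch: one git worktree and one branch per task, processed in parallel.
# run_task is a hypothetical stand-in for launching an agent.

run_task() {
    # $1 = worktree path, $2 = task name
    echo "handled $2" > "$1/RESULT.txt"
}

spawn_worktrees() {
    # $1 = repo path (needs at least one commit); remaining args = task names
    repo=$1; shift
    for task in "$@"; do
        tree="$repo-wt-$task"
        # New branch from HEAD, checked out into its own directory.
        git -C "$repo" worktree add -b "ralph/$task" "$tree" >/dev/null 2>&1 \
            || return 1
        run_task "$tree" "$task" &
    done
    wait   # block until every background task has finished
}
```

Each task works in its own directory and branch, so parallel agents never touch the same checkout; merging or opening a PR per branch is a separate step afterwards.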
Supported agents
Claude Code, OpenCode, Cursor, Codex, Qwen-Code, Factory Droid, Copilot, Gemini CLI: eight, more than any other.
Highlights
- Browser automation: --browser for UI tests via agent-browser
- Branch-per-task + auto-PR: ideal for gh workflows
- Sandbox mode with symlinks: significantly faster in large repos
- Task sources: MD, MD folder, YAML, JSON, GitHub issues
- Webhook notifications in .ralphy/config.yaml
When to pick it
- You want to work on multiple tasks simultaneously
- You want automatic PRs instead of direct commits
- You work in a large monorepo (sandbox mode helps)
- UI tests should run automatically
4. The big comparison table
| | snarktank/ralph | ralph-loop (plugin) | open-ralph-wiggum | ralphex | ralph-orchestrator | ralphy |
|---|---|---|---|---|---|---|
| Language | Bash | Bash hook | Bun + TS | Go | Rust + TS | TS + Bash |
| Install effort | minimal | zero | npm/bun | go install / brew | npm / cargo | npm |
| Agents | 2 (Amp, Claude) | Claude Code only | 5 | 1 (+ wrapper) | 7 | 8 |
| Loop mechanism | external Bash loop | stop hook (in-session) | external Bun process | multi-phase pipeline | hat sequence | parallel worktrees |
| Agent rotation | no | no | yes | no | no | no (parallel) |
| Multi-phase reviews | no | no | no | yes (5 agents) | no | no |
| Parallel tasks | no | no | no | yes, worktrees | no | yes, explicit |
| PRD format | prd.json | free-form prompt | free-form prompt + tasks mode | markdown plan | YAML / PDD | MD/YAML/JSON/issues |
| Persistence | prd.json, progress.txt, AGENTS.md | filesystem + git | .ralph/*.json/.md | .ralphex/progress/ | workspace + MCP | worktrees + PRs |
| Web UI | no | no | no | yes, SSE | yes, alpha | no |
| MCP server | no | no | no | no | yes | no |
| Telegram steering | no | no | no | notification only | yes, RObot | no |
| Browser tests | no | no | no | no | no | yes |
| Auto-PRs | no | no | no | no | no | yes |
| Sandboxing | no | no | no | Docker wrapper | no | symlink sandbox |
| Main audience | tinkerers, OG fans | Claude Code users | model comparers | enterprise reviewers | MCP / PDD stack | monorepo + GitHub |
| Stars (May 2026) | 18.7k | (plugin, n/a) | - | 1.1k | 2.8k | 2.8k |
5. Decision tree
Are you a Claude Code user and want minimal effort?
├─ YES → ralph-loop plugin
└─ NO ↓
Do you want to compare different models per iteration?
├─ YES → open-ralph-wiggum (--rotation)
└─ NO ↓
Do you need automated multi-phase code review?
├─ YES → ralphex
└─ NO ↓
Do you want parallel tasks and auto-PRs?
├─ YES → ralphy (--parallel --create-pr)
└─ NO ↓
Do you want a PDD workflow, an MCP server, Telegram steering?
├─ YES → ralph-orchestrator
└─ NO ↓
→ snarktank/ralph (classic, 200 lines of Bash, biggest community)
6. Shared best practices (applies to ALL Ralph variants)
6.1 ALWAYS set an iteration limit
ralph "..." --max-iterations 20
The --completion-promise is just a string match. Without an iteration limit, a forgotten loop can run for hours and burn tokens.
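As a rough sketch of what the limit buys you, the loop below stops either on the completion string or after a fixed number of iterations. fake_agent is a hypothetical stand-in for a real agent CLI so the script runs anywhere; MAX_ITERATIONS and the file names are likewise illustrative.

```shell
# Sketch of a bounded Ralph loop. fake_agent stands in for a real agent
# CLI (hypothetical): it "finishes" on its third run by reading a counter
# file, which mimics state that persists in the filesystem between runs.

WORKDIR=$(mktemp -d)
MAX_ITERATIONS=10
PROMISE='<promise>COMPLETE</promise>'

fake_agent() {
    runs=$(cat "$WORKDIR/runs" 2>/dev/null || echo 0)
    runs=$((runs + 1))
    echo "$runs" > "$WORKDIR/runs"
    echo "iteration $runs: working on the task"
    if [ "$runs" -ge 3 ]; then
        echo "$PROMISE"
    fi
}

ralph_loop() {
    i=1
    while [ "$i" -le "$MAX_ITERATIONS" ]; do
        # Fresh run per iteration; only files carry over.
        fake_agent > "$WORKDIR/output.log"
        if grep -qF "$PROMISE" "$WORKDIR/output.log"; then
            echo "done after $i iterations"
            return 0
        fi
        i=$((i + 1))
    done
    echo "gave up after $MAX_ITERATIONS iterations"
    return 1
}

RESULT=$(ralph_loop)
echo "$RESULT"
```

With the stub finishing on its third run this prints "done after 3 iterations"; setting MAX_ITERATIONS to 2 makes the loop give up instead, which is exactly the safety net a forgotten loop needs.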
6.2 Phrase acceptance criteria so they are VERIFIABLE
❌ Bad: "Build a nice login page."
✅ Good:
1. POST /login → 200 with bearer token on correct credentials
2. POST /login → 401 on wrong credentials
3. `pnpm test login.spec.ts` is green
4. When all three are satisfied → output COMPLETE
Ralph spins until tests are green. Without tests it spins until the iteration limit.
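One way to make "green = done" mechanical is a gate that emits the promise only when every check command succeeds. The sketch below is illustrative; completion_gate is a hypothetical helper and the check commands are whatever your project uses.

```shell
# Sketch of a completion gate: the promise is printed only if every
# check command passed by the caller exits with status 0.

completion_gate() {
    # Each argument is one shell command to run as a check.
    for check in "$@"; do
        if ! sh -c "$check" >/dev/null 2>&1; then
            echo "FAILED: $check"
            return 1
        fi
    done
    echo "<promise>COMPLETE</promise>"
}
```

For the login example this could be called as `completion_gate "pnpm test login.spec.ts" "pnpm typecheck"`, so the completion string reflects what the checks reported rather than what the model claims.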
6.3 Story sizing
"If a task is too big, the LLM runs out of context before finishing and produces poor code." - snarktank/ralph
Rule of thumb: one user story should fit into the context window of the model you're using. Concretely:
- Database migration: yes
- UI component with tests: yes
- Server action: yes
- "Complete e-commerce system": no, split into a PRD with 50+ small stories
6.4 JSON PRDs beat markdown PRDs
For larger feature lists, a structured JSON schema (a list with id, title, acceptance: [], passes: false) reduces the agent's tendency to rewrite existing tests instead of satisfying them. snarktank/ralph does this by default with prd.json.
6.5 Sandboxing is mandatory
Auto-approve means the agent can do anything you could do in the shell. Mitigations:
- Dedicated repo in a VM or container
- ralphex: Docker wrapper · ralphy: symlink sandbox · others: manually with Daytona/sandboxed.sh
- API keys via .env, never in prompts
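For the container route, the core idea is a read-write mount for the workspace and a read-only mount for credentials. The sketch below only prints such a docker command instead of running it; the image name, the container paths /workspace and /creds, and the helper sandbox_command are placeholders, not the setup any of the tools above ship.

```shell
# Sketch: print (not run) a docker command that mounts the workspace
# read-write and a credentials directory read-only. Image name and
# container paths are placeholders; paths with spaces are not handled.

sandbox_command() {
    # $1 = workspace dir, $2 = credentials dir, $3 = image, $4 = command
    printf 'docker run --rm -v %s:/workspace -v %s:/creds:ro -w /workspace %s %s\n' \
        "$1" "$2" "$3" "$4"
}

sandbox_command "$PWD" "$HOME/.agent-creds" agent-image "agent --prompt-file prompt.md"
```

Printing the command first makes it easy to review exactly what the agent will be able to write to before any loop starts.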
6.6 Maintain persistent memory
The most successful implementations (snarktank/ralph) let the agent itself update AGENTS.md every iteration with patterns, gotchas, and conventions. This accumulating memory is worth its weight in gold.
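A minimal sketch of that memory loop, assuming learnings are stored one per line and prepended to every prompt. The helper names record_learning and build_prompt and the prompt layout are assumptions for illustration; the real tools leave the file format to the agent.

```shell
# Sketch: keep learnings in a memory file and prepend it to every prompt.

record_learning() {
    # $1 = memory file, $2 = one-line learning; exact duplicates are skipped.
    touch "$1"
    grep -qxF -- "- $2" "$1" || printf '%s\n' "- $2" >> "$1"
}

build_prompt() {
    # $1 = memory file, $2 = task prompt
    printf 'Known patterns and gotchas:\n'
    cat "$1" 2>/dev/null
    printf '\nTask:\n%s\n' "$2"
}
```

Calling `record_learning AGENTS.md "use pnpm, not npm"` after an iteration and `build_prompt AGENTS.md "$PROMPT"` before the next one means a fresh session still starts with everything earlier iterations discovered.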
7. Pitfalls
| Symptom | Cause | Fix |
|---|---|---|
| Loop runs endlessly without progress | no verifiable acceptance criteria | tests/linters as a gate, clear completion promise |
| Agent rewrites tests so they pass | soft acceptance criterion | JSON PRD, mark tests-as-spec explicitly as "do not change" |
| Token costs explode | model too large, too many iterations | smaller model for routine work, stricter --max-iterations |
| Agent forgets patterns from earlier iterations | no AGENTS.md/progress.txt | maintain a memory file, reference it explicitly in the prompt |
| Auto-commits pollute history | every iteration commits | snarktank: it's a feature; otherwise --no-commit + squash |
| Stop-hook plugin fails on Windows | wsl: Unknown key bug | set Git Bash explicitly in hooks.json (see §3.2) |
| Multiple worktrees crash | branch conflicts | ralphy has conflict resolution; otherwise use ralphex's worktree mode |
8. Example stack
A productive setup for 2026:
Daily coding
└─ Claude Code + ralph-loop plugin
   → small, well-defined loops while I do something else
Larger features
└─ open-ralph-wiggum with --rotation
   → complex refactoring, pitting models against each other
Pre-merge review
└─ ralphex --review
   → 5 review agents on one branch
Mass tickets
└─ ralphy --parallel --create-pr from GitHub issues
   → 5 tickets simultaneously into draft PRs
9. Further reading
Sources & originals
- Geoffrey Huntley: "Ralph Wiggum as a Software Engineer" (the original pattern)
- snarktank/ralph: popular Bash implementation (Ryan Carson)
- anthropics/claude-plugins-official → ralph-loop: official Claude plugin
Implementations
- Th0rgal/open-ralph-wiggum (see dedicated guide)
- umputun/ralphex
- mikeyobrien/ralph-orchestrator
- michaelshimeles/ralphy
Related topics
- Agent comparison (Claude vs. ChatGPT vs. Copilot): the overarching agent map
- sandboxed.sh: container workspaces for autonomous loops
- MCP: Model Context Protocol
"Failures are predictable and informative. Persistence wins. Operator skill matters." - The Ralph Wiggum philosophy in three sentences