Ralph Wiggum Technique – Complete Overview

What's this about?

The Ralph Wiggum technique is the simplest yet most productive agentic trick in the 2025/2026 toolbox: let the AI coding agent run the same prompt in a loop until the task is done. Geoffrey Huntley popularized the pattern in his blog post "Ralph Wiggum as a Software Engineer." By now there are at least 6 noteworthy implementations – from a 200-line Bash file to a multi-phase Go tool with a web dashboard and Telegram bot. This guide is the map.

1. The technique in one sentence

while true; do
  # "agent" is a placeholder for your agent CLI; tee keeps a copy of the output
  agent --prompt "$PROMPT" | tee output.log
  grep -q "<promise>COMPLETE</promise>" output.log && break
done

There's no more magic to it. It works because:

The agent starts fresh every time – no context degradation, no drift
The files from the previous iteration persist – git history + filesystem are the memory
Tests are the truth – when the prompt says "green = done," the agent iterates against reality
Failures are data – every wrong iteration helps the next one improve
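
The four points above fit into a slightly more defensive version of the loop. This is a minimal sketch, not any specific tool's script: `AGENT_CMD`, the log path, and the default limit of 10 are assumptions chosen for illustration.

```shell
#!/usr/bin/env bash
# Sketch of a bounded Ralph loop. AGENT_CMD is a hypothetical array holding
# your agent CLI and its prompt flag, e.g. AGENT_CMD=(agent --prompt).

ralph_loop() {
  local prompt_file=$1 max_iterations=${2:-10} log=${3:-output.log}
  local i
  for ((i = 1; i <= max_iterations; i++)); do
    echo "--- iteration $i/$max_iterations ---" >&2
    # Fresh process per iteration: the only state carried over is on disk.
    # A failing run is treated as data for the next pass, not as a stop.
    "${AGENT_CMD[@]}" "$(cat "$prompt_file")" >"$log" 2>&1 || true
    if grep -q "<promise>COMPLETE</promise>" "$log"; then
      echo "done after $i iteration(s)" >&2
      return 0
    fi
  done
  echo "iteration limit reached without completion" >&2
  return 1
}
```

Called as `ralph_loop prompt.md 20`, it returns 0 once the promise string shows up in the log and 1 if the iteration budget runs out first.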

Quote

"Ralph is a Bash loop." — Geoffrey Huntley

2. Why it actually works

Problem of traditional AI sessions	Ralph's answer
Context window fills up	Fresh session per iteration → context reset
Agent loses the thread	Prompt is always the same; state lives in the filesystem
"Hallucinates that it's done"	Tests/linters determine completion, not the model
One model hits a dead end	Rotation or multiple reviews correct it
Human has to keep clicking "yes"	Auto-approve in a dedicated sandbox

3. The 6 major implementations

3.1 snarktank/ralph – the popular Bash variant

Repo: snarktank/ralph · Language: Bash + TypeScript skills · Stars: 18.7k · License: MIT

Author: Ryan Carson – explicitly based on Geoffrey Huntley's pattern.

snarktank/ralph is the variant cited most often in tutorials, Y Combinator threads, and Twitter. It is deliberately simple: one Bash script, one prompt file, three persistence files.

Architecture

- ralph.sh — the Bash loop (Amp + Claude)
- prompt.md — instructions for Amp
- CLAUDE.md — instructions for Claude Code
- prd.json — user stories with pass/fail
- progress.txt — append-only lessons learned
- AGENTS.md — patterns/gotchas maintained by the agent

Workflow

1. Generate the PRD – load the prd skill, describe the requirements, and the agent writes a markdown PRD.
2. Convert to prd.json – the ralph skill structures the user stories as a list, each with passes: false.
3. Start the loop – ./scripts/ralph/ralph.sh [max_iterations] (default: 10).
4. Per iteration:
- Create a branch from the spec
- Pick the highest-priority story whose passes is still false
- Implement the story in isolation
- Run typecheck + tests
- On success commit, update prd.json, append learnings to progress.txt
5. When every story has passes: true → the agent outputs <promise>COMPLETE</promise> → the loop ends
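
The completion rule at the end of the workflow can be pictured as a small check over prd.json. This is a rough sketch only: it assumes the file is pretty-printed with one "passes" field per line, and the function names are made up for this example. A real JSON parser is the safer choice.

```shell
#!/usr/bin/env bash
# Count stories that still have "passes": false.
# Assumes one "passes" field per line (a simplification).
remaining_stories() {
  local prd=${1:-prd.json}
  # grep -c prints 0 but exits non-zero when nothing matches, hence || true.
  grep -c '"passes"[[:space:]]*:[[:space:]]*false' "$prd" || true
}

# Print the completion promise only when no failing story is left.
iteration_status() {
  local prd=${1:-prd.json} remaining
  remaining=$(remaining_stories "$prd")
  if [ "$remaining" -eq 0 ]; then
    echo "<promise>COMPLETE</promise>"
  else
    echo "stories remaining: $remaining"
  fi
}
```

The point of the design is that "done" is read from a file the tests have to earn, not from the model's own impression of its progress.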

Installation – three paths

# 1. Copy directly into the project
mkdir -p scripts/ralph && cp ralph.sh prompt.md scripts/ralph/

# 2. As global Amp skills: place the prd and ralph skill folders here
~/.config/amp/skills/prd/
~/.config/amp/skills/ralph/

# 3. As a Claude Code plugin (marketplace): type these inside Claude Code
/plugin marketplace add snarktank/ralph
/plugin install ralph-skills@ralph-marketplace

Supported tools

Amp CLI (default) – with amp.experimental.autoHandoff: { context: 90 } for auto-handoff when context is full
Claude Code – via --tool claude

Killer feature: `AGENTS.md` auto-update

After every iteration the agent itself records discovered patterns, gotchas, and conventions in AGENTS.md. Later iterations (and human developers) benefit from this accumulating knowledge. This is the simplest form of self-improvement I've seen in any Ralph implementation.

When to pick it

You want the original experience with minimal dependencies
You work with Amp or Claude Code, not GPT/Cursor
You want to see how the pattern works internally (Bash is easy to read)
You need the largest community + example repos

3.2 ralph-loop – the official Claude Code plugin

Repo: anthropics/claude-plugins-official → ralph-loop · Full guide: ralph-loop deep dive

Anthropic's official zero-dependency variant – a plugin that runs inside Claude Code itself.

What it is

Instead of an external Bash loop, the plugin uses a stop hook: when Claude tries to end the session, the hook intercepts and feeds the same prompt again into the running model. The loop happens inside the current session – no second terminal, no subprocess, no script.

Plugin structure

- README.md

Usage

# Activate (in Claude Code)
/plugin install ralph-loop

# Start the loop
/ralph-loop "Build a REST API for todos. CRUD, validation, tests > 80% coverage. Output <promise>COMPLETE</promise> when done." \
  --completion-promise "COMPLETE" \
  --max-iterations 50

# Cancel the loop
/cancel-ralph

How the stop hook works

1. /ralph-loop invokes Claude with the prompt
2. Claude works, tries to end the session
3. The stop hook (hooks/stop-hook.sh) intercepts the exit
4. The hook injects the SAME prompt again
5. Claude reads its own diff + git log and continues
6. The loop ends on: COMPLETE in output / max iterations / /cancel-ralph
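
The decision the hook makes at each stop attempt can be sketched in a few lines. This is not the plugin's actual script: the state directory layout and file names below are hypothetical, and the last output is passed in as a file instead of being read from the hook's JSON input. What it does rely on is Claude Code's documented behavior that a Stop hook printing a "block" decision keeps the session going and feeds the reason back to the model.

```shell
#!/usr/bin/env bash
# Hypothetical stop-hook logic. The state dir holds four small files:
# max_iterations, iteration, promise, prompt (names invented for this sketch).
stop_hook_decision() {
  local state_dir=$1 last_output_file=$2
  local max iter promise
  max=$(cat "$state_dir/max_iterations")
  iter=$(cat "$state_dir/iteration")
  promise=$(cat "$state_dir/promise")

  # Allow the stop (print nothing) once the promise has been emitted...
  if grep -qF "<promise>$promise</promise>" "$last_output_file"; then
    return 0
  fi
  # ...or once the iteration budget is spent.
  if [ "$iter" -ge "$max" ]; then
    return 0
  fi

  echo $((iter + 1)) >"$state_dir/iteration"
  # Block the stop and hand the same prompt back as the reason.
  # The prompt must already be JSON-safe (no quotes or newlines) here.
  printf '{"decision": "block", "reason": "%s"}\n' "$(cat "$state_dir/prompt")"
}
```

Empty output means "let Claude stop"; the JSON line means "go around again with the same prompt".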

Biggest advantage: zero setup

No Bun, no Go, no Cargo, no Rust. If you already use Claude Code, this is the fastest option. One plugin install and you have Ralph.

Windows pitfall

On Windows the stop hook occasionally fails with wsl: Unknown key 'automount.crossDistro' or execvpe(/bin/bash) failed. Workaround: edit ~/.claude/plugins/cache/.../hooks/hooks.json so the hook explicitly uses Git Bash:

"command": "\"C:/Program Files/Git/bin/bash.exe\" ${CLAUDE_PLUGIN_ROOT}/hooks/stop-hook.sh"

Important: Git/bin/bash.exe (with PATH wrapper), not Git/usr/bin/bash.exe (raw MinGW).

When to pick it

You use Claude Code and want to invest 5 minutes, not 5 hours
Your task fits in a single session without an external orchestration layer
You don't need multi-model comparison, Telegram steering, or a web UI

3.3 open-ralph-wiggum – the multi-agent CLI with rotation

Repo: Th0rgal/open-ralph-wiggum · Language: Bun + TypeScript

Full guide: Open Ralph Wiggum

Profile

5 agents supported: Claude Code, Codex, Copilot CLI, Cursor Agent, OpenCode

--rotation as the killer feature: a different agent is used per iteration

ralph "..." --rotation "claude-code:opus,codex:gpt-5,opencode:gpt-4o"

Tasks mode: .ralph/ralph-tasks.md, one task per iteration
Live steering from a second terminal: --add-context "Hint", --status, --list-tasks
Status dashboard with tool counts and struggle indicators
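
To make the rotation idea concrete, here is a small sketch that picks one agent:model pair per iteration from a rotation string like the one above. It assumes plain round-robin order, which is a guess about the scheduling and not taken from the tool's source. The helper names are invented.

```shell
#!/usr/bin/env bash
# Pick the rotation entry for a 1-based iteration number (round-robin).
pick_agent() {
  local rotation=$1 iteration=$2 idx
  local -a entries
  IFS=',' read -r -a entries <<<"$rotation"
  idx=$(( (iteration - 1) % ${#entries[@]} ))
  echo "${entries[idx]}"
}

# Split an "agent:model" entry into its two halves.
agent_of() { echo "${1%%:*}"; }
model_of() { echo "${1#*:}"; }
```

With three entries, iteration 4 wraps around to the first entry again, so every model gets a turn at the state left behind by the others.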

When to pick it

You want to compare models directly without launching 3 tools
You want a lean, focused CLI without a heavyweight web UI/backend
You need live steering but don't want to interrupt the loop

3.4 ralphex – multi-phase reviews and plan pipelines

Repo: umputun/ralphex · Language: Go (73%) + Python/Shell · Stars: 1.1k+

Profile

ralphex is the most enterprise-y variant: a 4-phase pipeline per plan.

Phase	What happens
1. Task Execution	A markdown plan with `### Task N` is processed sequentially, each task in a fresh Claude session, validation commands run automatically after each task
2. First Review	5 parallel review agents check: `quality`, `implementation`, `testing`, `simplification`, `documentation`
3. External Review (optional)	An external tool (default: Codex) gives an independent second verdict; Claude evaluates the findings and corrects
4. Second Review	Focused final check with 2 agents on critical/major issues
Finalize (optional)	Rebase, notifications, plan archival

Plan format

# Plan: User Authentication

## Overview
JWT-based auth with refresh tokens

## Validation Commands
- `pnpm test`
- `pnpm typecheck`

### Task 1: Login endpoint
- [ ] POST /login accepts email+password
- [ ] returns a bearer token
- [ ] Tests green

### Task 2: Refresh flow
- [ ] POST /refresh rotates the token
- [ ] Tests green
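
Part of the appeal of this plan format is that standard tools can read it. The sketch below pulls the task list and the validation commands out of a plan file. It only illustrates the format, the function names are made up, and it says nothing about how ralphex parses plans internally.

```shell
#!/usr/bin/env bash
# List tasks as "N: title" from the "### Task N: title" headings.
list_tasks() {
  grep -E '^### Task [0-9]+' "$1" | sed -E 's/^### Task //'
}

# Print the backtick-quoted commands listed under "## Validation Commands".
list_validation_commands() {
  awk '
    /^## Validation Commands/ { in_section = 1; next }
    /^#/                      { in_section = 0 }
    in_section && /^- `.*`$/  { sub(/^- `/, ""); sub(/`$/, ""); print }
  ' "$1"
}
```

A runner built on top of this would loop over the tasks, start a fresh agent session for each, and execute every validation command before moving on.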

Highlights

Worktree isolation (--worktree) for parallel plans in the same repo
Web dashboard with SSE streaming, phase filters, multi-session watch
Mid-run steering: Ctrl+\ (SIGQUIT) pauses the task, you can edit the plan and restart the session
Stalemate detection: --review-patience aborts when N review rounds no longer change anything
Rate-limit handling: --wait automatic retry
Notifications: Telegram, Slack, email, webhook after the loop ends
VCS backend: git by default, Mercurial via translation scripts
Docker wrapper: read-only on credentials, RW only on the workspace
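
Stalemate detection is easy to picture with a small sketch. Assume the harness appends one fingerprint line per review round to a log (for instance a hash of the working tree after the round). The loop is stuck when the last N fingerprints are identical. The log format and function name are assumptions for illustration, not ralphex internals.

```shell
#!/usr/bin/env bash
# Succeeds (exit 0) when the last $patience rounds left identical fingerprints.
is_stalemate() {
  local log=$1 patience=$2 rounds distinct
  rounds=$(wc -l <"$log")
  # Not enough rounds yet to judge.
  (( rounds >= patience )) || return 1
  distinct=$(tail -n "$patience" "$log" | sort -u | wc -l)
  (( distinct == 1 ))
}
```

A review loop would call `is_stalemate rounds.log 3` after each round and abort as soon as it succeeds.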

When to pick it

You want automated code review before the merge
You need plan pipelines with clearly separated phases
Your team wants a web UI for monitoring
You want heterogeneity – Claude-implemented + Codex-reviewed

3.5 ralph-orchestrator – hat system, MCP stack, human-in-the-loop

Repo: mikeyobrien/ralph-orchestrator · Language: Rust (82%) + TS · Stars: 2.8k

Profile

ralph-orchestrator is the most architecturally ambitious tool: a Rust engine with a React frontend, an MCP server, a Telegram bot, and a hat system.

Hat system

Hats are specialized AI personas that coordinate in sequence:

Hat	Job
`code-assist`	Implements features
`debug`	Finds/fixes bugs
`research`	Collects context, reads repos
`review`	Quality gates
`pdd-to-code-assist`	Translates PDD specs into implementation

Hats are defined via YAML configs (ralph.yml, ralph.qa.yml, ralph.bot.yml).

Stack

Web dashboard (alpha): Rust RPC API + React, ports 3000/5173
MCP server: workspace-scoped, one server per repo, deterministic config/task persistence
Terminal UI: built in ratatui
RObot (Telegram bot): /status, /tasks, /restart for live steering from your phone

Backends

Claude Code, Gemini CLI, Kiro, Codex, Amp, Copilot CLI, OpenCode (seven).

Workflow

ralph init --backend claude-code
ralph plan "Add OAuth login"          # PDD session: specs + designs + plan
ralph run -p specs/oauth.md            # Loop until done
ralph web                              # Dashboard
ralph mcp serve --workspace-root .     # MCP for other tools
ralph bot onboard --telegram           # Steering from your phone

When to pick it

You want Prompt-Driven Development (PDD): specs before code
You work with other MCP clients and want to plug Ralph in as an MCP server
You want to steer from your phone (Telegram)
You like the hat concept (clear separation of roles)

3.6 ralphy – parallel worktrees + automatic PRs

Repo: michaelshimeles/ralphy · Language: TS (76%) + Bash · Stars: 2.8k

Profile

ralphy focuses on parallelism and GitHub integration.

ralphy --prd PRD.md --parallel --max-parallel 5 \
       --branch-per-task --create-pr

What happens:

ralphy reads the PRD (markdown / folder / YAML / JSON / GitHub issues)
Spawns up to 5 parallel agents (the --max-parallel value), each in its own worktree with its own branch
Each agent solves its task in isolation, runs tests
On success: auto-merge into the base branch or auto-PR via gh
Merge conflicts are resolved by the AI agent itself
On worktree problems: fallback to sandbox mode (symlinks for node_modules/.git/vendor → fast even in monorepos)
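
The symlink fallback in the last step is worth a closer look, because it explains why sandbox mode stays fast in big repos. The sketch below is one hypothetical way to build such a sandbox, not ralphy's code: heavy directories are linked, everything else is copied. The function name is invented and the directory list mirrors the examples above.

```shell
#!/usr/bin/env bash
# Build a lightweight sandbox copy of a repo.
make_sandbox() {
  local repo sandbox=$2 entry name
  repo=$(cd "$1" && pwd)                  # absolute path so symlinks resolve
  mkdir -p "$sandbox"
  for entry in "$repo"/* "$repo"/.[!.]*; do
    [ -e "$entry" ] || continue           # skip globs that matched nothing
    name=$(basename "$entry")
    case $name in
      node_modules|.git|vendor) ln -s "$entry" "$sandbox/$name" ;;  # link, no copy
      *)                        cp -R "$entry" "$sandbox/$name" ;;  # private copy
    esac
  done
}
```

The trade-off is visible in the code: linked directories are shared with the original checkout, so only the copied files are truly isolated.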

Supported agents

Claude Code, OpenCode, Cursor, Codex, Qwen-Code, Factory Droid, Copilot, Gemini CLI – eight, more than any other.

Highlights

Browser automation: --browser for UI tests via agent-browser
Branch-per-task + auto-PR – ideal for gh workflows
Sandbox mode with symlinks – significantly faster in large repos
Task sources: MD, MD folder, YAML, JSON, GitHub issues
Webhook notifications in .ralphy/config.yaml

When to pick it

You want to work on multiple tasks simultaneously
You want automatic PRs instead of direct commits
You work in a large monorepo (sandbox mode helps)
UI tests should run automatically

4. The big comparison table

	snarktank/ralph	ralph-loop (plugin)	open-ralph-wiggum	ralphex	ralph-orchestrator	ralphy
Language	Bash	Bash hook	Bun + TS	Go	Rust + TS	TS + Bash
Install effort	minimal	zero	npm/bun	go install / brew	npm / cargo	npm
Agents	2 (Amp, Claude)	Claude Code only	5	1 (+ wrapper)	7	8
Loop mechanism	external Bash loop	stop hook (in-session)	external Bun process	multi-phase pipeline	hat sequence	parallel worktrees
Agent rotation	–	–	✅	–	–	– (parallel)
Multi-phase reviews	–	–	–	✅ (5 agents)	–	–
Parallel tasks	–	–	–	✅ worktrees	–	✅ explicit
PRD format	`prd.json`	free-form prompt	free-form prompt + tasks mode	markdown plan	YAML / PDD	MD/YAML/JSON/issues
Persistence	`prd.json`, `progress.txt`, `AGENTS.md`	filesystem + git	`.ralph/*.json/.md`	`.ralphex/progress/`	workspace + MCP	worktrees + PRs
Web UI	–	–	–	✅ SSE	✅ alpha	–
MCP server	–	–	–	–	✅	–
Telegram steering	–	–	–	notification only	✅ RObot	–
Browser tests	–	–	–	–	–	✅
Auto-PRs	–	–	–	–	–	✅
Sandboxing	–	–	–	Docker wrapper	–	symlink sandbox
Main audience	tinkerers, OG fans	Claude Code users	model comparers	enterprise reviewers	MCP / PDD stack	monorepo + GitHub
Stars (May 2026)	18.7k	(plugin, n/a)	–	1.1k	2.8k	2.8k

5. Decision tree

Are you a Claude Code user and want minimal effort?
└─ YES → ralph-loop plugin
└─ NO ↓

Do you want to compare different models per iteration?
└─ YES → open-ralph-wiggum (--rotation)
└─ NO ↓

Do you need automated multi-phase code review?
└─ YES → ralphex
└─ NO ↓

Do you want parallel tasks → auto-PRs?
└─ YES → ralphy (--parallel --create-pr)
└─ NO ↓

Do you want a PDD workflow, an MCP server, Telegram steering?
└─ YES → ralph-orchestrator
└─ NO ↓

→ snarktank/ralph (classic, 200 lines of Bash, biggest community)

6. Shared best practices (applies to ALL Ralph variants)

6.1 ALWAYS set an iteration limit

ralph "..." --max-iterations 20

The --completion-promise is just a string match. Without an iteration limit, a forgotten loop can run for hours and burn tokens.

6.2 Phrase acceptance criteria so they are VERIFIABLE

❌ Bad: "Build a nice login page."
✅ Good:
  1. POST /login → 200 with bearer token on correct credentials
  2. POST /login → 401 on wrong credentials
  3. `pnpm test login.spec.ts` is green
  4. When all three are satisfied → output COMPLETE

Ralph spins until tests are green. Without tests it spins until the iteration limit.
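
One way to make "green = done" hard to fake is to move the check out of the model and into the harness. The sketch below is a hypothetical helper, not part of any of the tools above: it runs each check command and prints the promise only if all of them pass.

```shell
#!/usr/bin/env bash
# Emit the completion promise only if every check command succeeds.
verify_then_promise() {
  local check
  for check in "$@"; do
    # Each argument is a full command line, run in its own shell.
    if ! bash -c "$check" >/dev/null 2>&1; then
      echo "FAILED: $check"
      return 1
    fi
  done
  echo "<promise>COMPLETE</promise>"
}
```

For the login example this would be `verify_then_promise "pnpm test login.spec.ts"`, with further checks such as a typecheck passed as additional arguments.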

6.3 Story sizing

"If a task is too big, the LLM runs out of context before finishing and produces poor code." — snarktank/ralph

Rule of thumb: one user story = fits into the context window of the model you're using. Concretely:

Database migration: yes
UI component with tests: yes
Server action: yes
"Complete e-commerce system": no – split into a PRD with 50+ small stories

6.4 JSON PRDs beat markdown PRDs

For larger feature lists, a structured JSON schema (a list with id, title, acceptance: [], passes: false) reduces the agent's tendency to rewrite existing tests instead of satisfying them. snarktank/ralph does this by default with prd.json.

6.5 Sandboxing is mandatory

Auto-approve = the agent can do anything you could do in the shell. Mitigations:

Dedicated repo in a VM or container
ralphex: Docker wrapper · ralphy: symlink sandbox · others: manually with Daytona/sandboxed.sh
API keys via .env, never in prompts

6.6 Maintain persistent memory

The most popular implementation (snarktank/ralph) lets the agent itself update AGENTS.md every iteration – patterns, gotchas, conventions. Because every fresh session starts without memory, this accumulating file is what stops later iterations from repeating earlier mistakes.
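
If your variant does not maintain a memory file on its own, a harness-side version takes only a few lines. This is a sketch under assumptions: `PROGRESS_FILE`, the timestamp format, and both function names are invented here, and snarktank/ralph leaves this bookkeeping to the agent instead.

```shell
#!/usr/bin/env bash
# Append one timestamped learning to an append-only memory file.
record_learning() {
  local file=${PROGRESS_FILE:-progress.txt}
  printf '[%s] %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$1" >>"$file"
}

# Print the base prompt followed by the most recent learnings, if any.
build_prompt() {
  local prompt_file=$1 memory_file=$2 lines=${3:-40}
  cat "$prompt_file"
  if [ -s "$memory_file" ]; then
    printf '\n\nLearnings from earlier iterations (most recent last):\n'
    tail -n "$lines" "$memory_file"
  fi
}
```

Capping the tail at a fixed number of lines keeps the memory from eating the context window that the fresh session was supposed to free up.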

7. Pitfalls

Symptom	Cause	Fix
Loop runs endlessly without progress	no verifiable acceptance criteria	tests/linters as a gate, clear completion promise
Agent rewrites tests so they pass	soft acceptance criterion	JSON PRD, mark tests-as-spec explicitly as "do not change"
Token costs explode	model too large, too many iterations	smaller model for routine work, stricter `--max-iterations`
Agent forgets patterns from earlier iterations	no `AGENTS.md`/`progress.txt`	maintain a memory file, reference it explicitly in the prompt
Auto-commits pollute history	every iteration commits	snarktank: it's a feature; otherwise `--no-commit` + squash
Stop-hook plugin fails on Windows	`wsl: Unknown key` bug	set Git Bash explicitly in `hooks.json` (see §3.2)
Multiple worktrees crash	branch conflicts	ralphy has conflict resolution; otherwise use ralphex's worktree mode

8. Example stack

A productive setup for 2026:

┌──────────────────────────────────────────────────────────────────┐
│ Daily coding                                                     │
│   └─ Claude Code + ralph-loop plugin                             │
│      → small, well-defined loops while I do something else       │
├──────────────────────────────────────────────────────────────────┤
│ Larger features                                                  │
│   └─ open-ralph-wiggum with --rotation                           │
│      → complex refactoring, pitting models against each other    │
├──────────────────────────────────────────────────────────────────┤
│ Pre-merge review                                                 │
│   └─ ralphex --review                                            │
│      → 5 review agents on one branch                             │
├──────────────────────────────────────────────────────────────────┤
│ Mass tickets                                                     │
│   └─ ralphy --parallel --create-pr from GitHub issues            │
│      → 5 tickets simultaneously into draft PRs                   │
└──────────────────────────────────────────────────────────────────┘

9. Further reading

Sources & originals

Geoffrey Huntley: "Ralph Wiggum as a Software Engineer" – the original pattern
snarktank/ralph – popular Bash implementation (Ryan Carson)
anthropics/claude-plugins-official → ralph-loop – official Claude plugin

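Implementations

Th0rgal/open-ralph-wiggum – multi-agent CLI with rotation (Bun + TypeScript)
umputun/ralphex – multi-phase reviews and plan pipelines (Go)
mikeyobrien/ralph-orchestrator – hat system, MCP server, Telegram bot (Rust + TS)
michaelshimeles/ralphy – parallel worktrees and automatic PRs (TS + Bash)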

Related topics

Agent comparison – Claude vs. ChatGPT vs. Copilot – the overarching agent map
sandboxed.sh – container workspaces for autonomous loops
MCP – Model Context Protocol

Quote

"Failures are predictable and informative. Persistence wins. Operator skill matters." — The Ralph Wiggum philosophy in three sentences
