Rewrite Ralph loop recipes: split into simple vs ideal versions

Align all 4 language recipes (Node.js, Python, .NET, Go) with the Ralph
Playbook architecture:

- Simple version: minimal outer loop with fresh session per iteration
- Ideal version: planning/building modes, backpressure, git integration
- Fresh context isolation instead of in-session context accumulation
- Disk-based shared state via IMPLEMENTATION_PLAN.md
- Example prompt templates (PROMPT_plan.md, PROMPT_build.md, AGENTS.md)
- Updated cookbook README descriptions
2026-02-22 03:15:13 +00:00 · 2026-02-11 11:28:41 -08:00
parent ab82accc08
commit 952372c1ec
9 changed files with 1052 additions and 1122 deletions
--- a/cookbook/copilot-sdk/nodejs/ralph-loop.md
+++ b/cookbook/copilot-sdk/nodejs/ralph-loop.md
@@ -1,6 +1,6 @@
-# RALPH-loop: Iterative Self-Referential AI Loops
+# Ralph Loop: Autonomous AI Task Loops

-Implement self-referential feedback loops where an AI agent iteratively improves work by reading its own previous output.
+Build autonomous coding loops where an AI agent picks tasks, implements them, validates against backpressure (tests, builds), commits, and repeats — each iteration in a fresh context window.

 > **Runnable example:** [recipe/ralph-loop.ts](recipe/ralph-loop.ts)
 >
@@ -9,200 +9,217 @@ Implement self-referential feedback loops where an AI agent iteratively improves
 > npx tsx ralph-loop.ts
 > ```

-## What is RALPH-loop?
+## What is a Ralph Loop?

-RALPH-loop is a development methodology for iterative AI-powered task completion. Named after the Ralph Wiggum technique, it embodies the philosophy of persistent iteration:
+A [Ralph loop](https://ghuntley.com/ralph/) is an autonomous development workflow where an AI agent iterates through tasks in isolated context windows. The key insight: **state lives on disk, not in the model's context**. Each iteration starts fresh, reads the current state from files, does one task, writes results back to disk, and exits.

-- **One prompt, multiple iterations**: The same prompt is processed repeatedly
-- **Self-referential feedback**: The AI reads its own previous work (file changes, git history)
-- **Completion detection**: Loop exits when a completion promise is detected in output
-- **Safety limits**: Always include a maximum iteration count to prevent infinite loops
-
-## Example Scenario
-
-You need to iteratively improve code until all tests pass. Instead of asking Copilot to "write perfect code," you use RALPH-loop to:
-
-1. Send the initial prompt with clear success criteria
-2. Copilot writes code and tests
-3. Copilot runs tests and sees failures
-4. Loop automatically re-sends the prompt
-5. Copilot reads test output and previous code, fixes issues
-6. Repeat until all tests pass and completion promise is output
-
-## Basic Implementation
-
-```typescript
-import { CopilotClient } from "@github/copilot-sdk";
-
-class RalphLoop {
-    private client: CopilotClient;
-    private iteration: number = 0;
-    private maxIterations: number;
-    private completionPromise: string;
-    private lastResponse: string | null = null;
-
-    constructor(maxIterations: number = 10, completionPromise: string = "COMPLETE") {
-        this.client = new CopilotClient();
-        this.maxIterations = maxIterations;
-        this.completionPromise = completionPromise;
-    }
-
-    async run(initialPrompt: string): Promise<string> {
-        await this.client.start();
-        const session = await this.client.createSession({ model: "gpt-5.1-codex-mini" });
-
-        try {
-            while (this.iteration < this.maxIterations) {
-                this.iteration++;
-                console.log(`\n--- Iteration ${this.iteration}/${this.maxIterations} ---`);
-
-                // Build prompt including previous response as context
-                const prompt = this.iteration === 1
-                    ? initialPrompt
-                    : `${initialPrompt}\n\nPrevious attempt:\n${this.lastResponse}\n\nContinue improving...`;
-
-                const response = await session.sendAndWait({ prompt });
-                this.lastResponse = response?.data.content || "";
-
-                console.log(`Response (${this.lastResponse.length} chars)`);
-
-                // Check for completion promise
-                if (this.lastResponse.includes(this.completionPromise)) {
-                    console.log(`✓ Completion promise detected: ${this.completionPromise}`);
-                    return this.lastResponse;
-                }
-
-                console.log(`Continuing to iteration ${this.iteration + 1}...`);
-            }
-
-            throw new Error(
-                `Max iterations (${this.maxIterations}) reached without completion promise`
-            );
-        } finally {
-            await session.destroy();
-            await this.client.stop();
-        }
-    }
-}
-
-// Usage
-const loop = new RalphLoop(5, "COMPLETE");
-const result = await loop.run("Your task here");
-console.log(result);
+```
+┌─────────────────────────────────────────────────┐
+│                   loop.sh                       │
+│  while true:                                    │
+│    ┌─────────────────────────────────────────┐  │
+│    │  Fresh session (isolated context)       │  │
+│    │                                         │  │
+│    │  1. Read PROMPT.md + AGENTS.md          │  │
+│    │  2. Study specs/* and code              │  │
+│    │  3. Pick next task from plan            │  │
+│    │  4. Implement + run tests               │  │
+│    │  5. Update plan, commit, exit           │  │
+│    └─────────────────────────────────────────┘  │
+│    ↻ next iteration (fresh context)             │
+└─────────────────────────────────────────────────┘
 ```

-## With File Persistence
+**Core principles:**

-For tasks involving code generation, persist state to files so the AI can see changes:
+- **Fresh context per iteration**: Each iteration creates a new session, so context never accumulates and the model stays in the "smart zone" (the part of the context window where it reasons most reliably)
+- **Disk as shared state**: `IMPLEMENTATION_PLAN.md` persists between iterations and acts as the coordination mechanism
+- **Backpressure steers quality**: Tests, builds, and lints reject bad work — the agent must fix issues before committing
+- **Two modes**: PLANNING (gap analysis → generate plan) and BUILDING (implement from plan)
+
+## Simple Version
+
+The minimal Ralph loop — the SDK equivalent of `while :; do cat PROMPT.md | claude ; done`:

 ```typescript
-import fs from "fs/promises";
-import path from "path";
+import { readFile } from "fs/promises";
 import { CopilotClient } from "@github/copilot-sdk";

-class PersistentRalphLoop {
-    private client: CopilotClient;
-    private workDir: string;
-    private iteration: number = 0;
-    private maxIterations: number;
+async function ralphLoop(promptFile: string, maxIterations: number = 50) {
+    const client = new CopilotClient();
+    await client.start();

-    constructor(workDir: string, maxIterations: number = 10) {
-        this.client = new CopilotClient();
-        this.workDir = workDir;
-        this.maxIterations = maxIterations;
-    }
+    try {
+        const prompt = await readFile(promptFile, "utf-8");

-    async run(initialPrompt: string): Promise<string> {
-        await fs.mkdir(this.workDir, { recursive: true });
-        await this.client.start();
-        const session = await this.client.createSession({ model: "gpt-5.1-codex-mini" });
+        for (let i = 1; i <= maxIterations; i++) {
+            console.log(`\n=== Iteration ${i}/${maxIterations} ===`);

-        try {
-            // Store initial prompt
-            await fs.writeFile(path.join(this.workDir, "prompt.md"), initialPrompt);
-
-            while (this.iteration < this.maxIterations) {
-                this.iteration++;
-                console.log(`\n--- Iteration ${this.iteration} ---`);
-
-                // Build context from previous outputs
-                let context = initialPrompt;
-                const prevOutputFile = path.join(this.workDir, `output-${this.iteration - 1}.txt`);
-                try {
-                    const prevOutput = await fs.readFile(prevOutputFile, "utf-8");
-                    context += `\n\nPrevious iteration:\n${prevOutput}`;
-                } catch {
-                    // No previous output yet
-                }
-
-                const response = await session.sendAndWait({ prompt: context });
-                const output = response?.data.content || "";
-
-                // Persist output
-                await fs.writeFile(
-                    path.join(this.workDir, `output-${this.iteration}.txt`),
-                    output
-                );
-
-                if (output.includes("COMPLETE")) {
-                    return output;
-                }
+            // Fresh session each iteration — context isolation is the point
+            const session = await client.createSession({ model: "claude-sonnet-4.5" });
+            try {
+                await session.sendAndWait({ prompt }, 600_000);
+            } finally {
+                await session.destroy();
            }

-            throw new Error("Max iterations reached");
-        } finally {
-            await session.destroy();
-            await this.client.stop();
+            console.log(`Iteration ${i} complete.`);
        }
+    } finally {
+        await client.stop();
    }
 }
+
+// Usage: point at your PROMPT.md
+ralphLoop("PROMPT.md", 20).catch((err) => { console.error(err); process.exitCode = 1; });
+```
+
+This is all you need to get started. The prompt file tells the agent what to do; the agent reads project files, does work, commits, and exits. The loop restarts with a clean slate.
+
+## Ideal Version
+
+The full Ralph pattern with planning and building modes, matching the [Ralph Playbook](https://github.com/ClaytonFarr/ralph-playbook) architecture:
+
+```typescript
+import { readFile } from "fs/promises";
+import { execSync } from "child_process";
+import { CopilotClient } from "@github/copilot-sdk";
+
+type Mode = "plan" | "build";
+
+async function ralphLoop(mode: Mode, maxIterations: number = 50) {
+    const promptFile = mode === "plan" ? "PROMPT_plan.md" : "PROMPT_build.md";
+    const client = new CopilotClient();
+    await client.start();
+
+    const branch = execSync("git branch --show-current", { encoding: "utf-8" }).trim();
+    console.log(`Mode: ${mode} | Prompt: ${promptFile} | Branch: ${branch}`);
+
+    try {
+        const prompt = await readFile(promptFile, "utf-8");
+
+        for (let i = 1; i <= maxIterations; i++) {
+            console.log(`\n=== Iteration ${i}/${maxIterations} ===`);
+
+            // Fresh session — each task gets full context budget
+            const session = await client.createSession({ model: "claude-sonnet-4.5" });
+            try {
+                await session.sendAndWait({ prompt }, 600_000);
+            } finally {
+                await session.destroy();
+            }
+
+            // Push changes after each iteration
+            try {
+                execSync(`git push origin ${branch}`, { stdio: "inherit" });
+            } catch {
+                execSync(`git push -u origin ${branch}`, { stdio: "inherit" });
+            }
+
+            console.log(`Iteration ${i} complete.`);
+        }
+    } finally {
+        await client.stop();
+    }
+}
+
+// Parse CLI args: npx tsx ralph-loop.ts [plan] [max_iterations]
+const args = process.argv.slice(2);
+const mode: Mode = args.includes("plan") ? "plan" : "build";
+const maxArg = args.find(a => /^\d+$/.test(a));
+const maxIterations = maxArg ? parseInt(maxArg, 10) : 50;
+
+ralphLoop(mode, maxIterations).catch((err) => { console.error(err); process.exitCode = 1; });
+```
+
+### Required Project Files
+
+The ideal version expects this file structure in your project:
+
+```
+project-root/
+├── PROMPT_plan.md              # Planning mode instructions
+├── PROMPT_build.md             # Building mode instructions
+├── AGENTS.md                   # Operational guide (build/test commands)
+├── IMPLEMENTATION_PLAN.md      # Task list (generated by planning mode)
+├── specs/                      # Requirement specs (one per topic)
+│   ├── auth.md
+│   └── data-pipeline.md
+└── src/                        # Your source code
+```
+
+### Example `PROMPT_plan.md`
+
+```markdown
+0a. Study `specs/*` to learn the application specifications.
+0b. Study IMPLEMENTATION_PLAN.md (if present) to understand the plan so far.
+0c. Study `src/` to understand existing code and shared utilities.
+
+1. Compare specs against code (gap analysis). Create or update
+   IMPLEMENTATION_PLAN.md as a prioritized bullet-point list of tasks
+   yet to be implemented. Do NOT implement anything.
+
+IMPORTANT: Do NOT assume functionality is missing — search the
+codebase first to confirm. Prefer updating existing utilities over
+creating ad-hoc copies.
+```
+
+### Example `PROMPT_build.md`
+
+```markdown
+0a. Study `specs/*` to learn the application specifications.
+0b. Study IMPLEMENTATION_PLAN.md.
+0c. Study `src/` for reference.
+
+1. Choose the most important item from IMPLEMENTATION_PLAN.md. Before
+   making changes, search the codebase (don't assume not implemented).
+2. After implementing, run the tests. If functionality is missing, add it.
+3. When you discover issues, update IMPLEMENTATION_PLAN.md immediately.
+4. When tests pass, update IMPLEMENTATION_PLAN.md, then `git add -A`
+   then `git commit` with a descriptive message.
+
+99999. When authoring documentation, capture the why.
+999999. Implement completely. No placeholders or stubs.
+9999999. Keep IMPLEMENTATION_PLAN.md current — future iterations depend on it.
+```
+
+### Example `AGENTS.md`
+
+Keep this brief (~60 lines). It's loaded every iteration, so bloat wastes context.
+
+```markdown
+## Build & Run
+
+npm run build
+
+## Validation
+
+- Tests: `npm test`
+- Typecheck: `npx tsc --noEmit`
+- Lint: `npm run lint`
 ```

 ## Best Practices

-1. **Write clear completion criteria**: Include exactly what "done" looks like
-2. **Use output markers**: Include `<promise>COMPLETE</promise>` or similar in completion condition
-3. **Always set max iterations**: Prevents infinite loops on impossible tasks
-4. **Persist state**: Save files so AI can see what changed between iterations
-5. **Include context**: Feed previous iteration output back as context
-6. **Monitor progress**: Log each iteration to track what's happening
+1. **Fresh context per iteration**: Never accumulate context across iterations — that's the whole point
+2. **Disk is your database**: `IMPLEMENTATION_PLAN.md` is shared state between isolated sessions
+3. **Backpressure is essential**: Tests, builds, lints in `AGENTS.md` — the agent must pass them before committing
+4. **Start with PLANNING mode**: Generate the plan first, then switch to BUILDING
+5. **Observe and tune**: Watch early iterations, add guardrails to prompts when the agent fails in specific ways
+6. **The plan is disposable**: If the agent goes off track, delete `IMPLEMENTATION_PLAN.md` and re-plan
+7. **Keep `AGENTS.md` brief**: It's loaded every iteration — operational info only, no progress notes
+8. **Use a sandbox**: The agent runs autonomously with full tool access — isolate it

-## Example: Iterative Code Generation
-
-```typescript
-const prompt = `Write a function that:
-1. Parses CSV data
-2. Validates required fields
-3. Returns parsed records or error
-4. Has unit tests
-5. Output <promise>COMPLETE</promise> when done`;
-
-const loop = new RalphLoop(10, "COMPLETE");
-const result = await loop.run(prompt);
-```
-
-## Handling Failures
-
-```typescript
-try {
-    const result = await loop.run(prompt);
-    console.log("Task completed successfully!");
-} catch (error) {
-    console.error("Task failed:", error.message);
-    // Analyze what was attempted and suggest alternatives
-}
-```
-
-## When to Use RALPH-loop
+## When to Use a Ralph Loop

 **Good for:**
-- Code generation with automatic verification (tests, linters)
-- Tasks with clear success criteria
-- Iterative refinement where each attempt learns from previous failures
-- Unattended long-running improvements
+- Implementing features from specs with test-driven validation
+- Large refactors broken into many small tasks
+- Unattended, long-running development with clear requirements
+- Any work where backpressure (tests/builds) can verify correctness

 **Not good for:**
-- Tasks requiring human judgment or design input
-- One-shot operations
-- Tasks with vague success criteria
-- Real-time interactive debugging
+- Tasks requiring human judgment mid-loop
+- One-shot operations that don't benefit from iteration
+- Vague requirements without testable acceptance criteria
+- Exploratory prototyping where direction isn't clear
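
One possible follow-up to the Ideal Version: because `IMPLEMENTATION_PLAN.md` is the shared state between sessions, the outer loop could read it after each iteration and stop early once nothing is left to do, instead of always running to `maxIterations`. The sketch below is not part of this commit. It assumes the plan uses GitHub-style checkboxes (`- [ ]` / `- [x]`), which the recipe does not require (the planning prompt only asks for a prioritized bullet list), and `remainingTasks` is a hypothetical helper name.

```typescript
// Hypothetical helper: returns the text of every unchecked task in a plan.
// Assumption: tasks are written as "- [ ] task" (open) or "- [x] task" (done).
// Adapt the pattern if your plan uses a different convention.
function remainingTasks(planText: string): string[] {
    const unchecked = /^\s*[-*]\s+\[ \]\s+(.+)$/;
    return planText
        .split(/\r?\n/)
        .map((line) => unchecked.exec(line))
        .filter((m): m is RegExpExecArray => m !== null)
        .map((m) => m[1].trim());
}

// Illustrative plan content (made up for this example).
const samplePlan = [
    "# Implementation Plan",
    "- [x] Set up project scaffolding",
    "- [ ] Add login endpoint",
    "  - [ ] Hash passwords",
].join("\n");

// Lists the two open tasks; the checked item is skipped.
console.log(remainingTasks(samplePlan));
```

Inside the build loop, this could be called on the contents of `IMPLEMENTATION_PLAN.md` right after the session is destroyed, breaking out of the `for` loop when the returned array is empty. Keeping the check in the outer loop rather than in the prompt means the decision to stop is deterministic and does not depend on the agent emitting a completion marker.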