diff --git a/cookbook/copilot-sdk/README.md b/cookbook/copilot-sdk/README.md index 6e364457..55981302 100644 --- a/cookbook/copilot-sdk/README.md +++ b/cookbook/copilot-sdk/README.md @@ -6,7 +6,7 @@ This cookbook collects small, focused recipes showing how to accomplish common t ### .NET (C#) -- [RALPH-loop](dotnet/ralph-loop.md): Implement iterative self-referential AI loops for task completion with automatic retries. +- [Ralph Loop](dotnet/ralph-loop.md): Build autonomous AI coding loops with fresh context per iteration, planning/building modes, and backpressure. - [Error Handling](dotnet/error-handling.md): Handle errors gracefully including connection failures, timeouts, and cleanup. - [Multiple Sessions](dotnet/multiple-sessions.md): Manage multiple independent conversations simultaneously. - [Managing Local Files](dotnet/managing-local-files.md): Organize files by metadata using AI-powered grouping strategies. @@ -15,7 +15,7 @@ This cookbook collects small, focused recipes showing how to accomplish common t ### Node.js / TypeScript -- [RALPH-loop](nodejs/ralph-loop.md): Implement iterative self-referential AI loops for task completion with automatic retries. +- [Ralph Loop](nodejs/ralph-loop.md): Build autonomous AI coding loops with fresh context per iteration, planning/building modes, and backpressure. - [Error Handling](nodejs/error-handling.md): Handle errors gracefully including connection failures, timeouts, and cleanup. - [Multiple Sessions](nodejs/multiple-sessions.md): Manage multiple independent conversations simultaneously. - [Managing Local Files](nodejs/managing-local-files.md): Organize files by metadata using AI-powered grouping strategies. @@ -24,7 +24,7 @@ This cookbook collects small, focused recipes showing how to accomplish common t ### Python -- [RALPH-loop](python/ralph-loop.md): Implement iterative self-referential AI loops for task completion with automatic retries. 
+- [Ralph Loop](python/ralph-loop.md): Build autonomous AI coding loops with fresh context per iteration, planning/building modes, and backpressure. - [Error Handling](python/error-handling.md): Handle errors gracefully including connection failures, timeouts, and cleanup. - [Multiple Sessions](python/multiple-sessions.md): Manage multiple independent conversations simultaneously. - [Managing Local Files](python/managing-local-files.md): Organize files by metadata using AI-powered grouping strategies. @@ -33,7 +33,7 @@ This cookbook collects small, focused recipes showing how to accomplish common t ### Go -- [RALPH-loop](go/ralph-loop.md): Implement iterative self-referential AI loops for task completion with automatic retries. +- [Ralph Loop](go/ralph-loop.md): Build autonomous AI coding loops with fresh context per iteration, planning/building modes, and backpressure. - [Error Handling](go/error-handling.md): Handle errors gracefully including connection failures, timeouts, and cleanup. - [Multiple Sessions](go/multiple-sessions.md): Manage multiple independent conversations simultaneously. - [Managing Local Files](go/managing-local-files.md): Organize files by metadata using AI-powered grouping strategies. diff --git a/cookbook/copilot-sdk/dotnet/ralph-loop.md b/cookbook/copilot-sdk/dotnet/ralph-loop.md index 12aa0138..1cc98762 100644 --- a/cookbook/copilot-sdk/dotnet/ralph-loop.md +++ b/cookbook/copilot-sdk/dotnet/ralph-loop.md @@ -1,6 +1,6 @@ -# RALPH-loop: Iterative Self-Referential AI Loops +# Ralph Loop: Autonomous AI Task Loops -Implement self-referential feedback loops where an AI agent iteratively improves work by reading its own previous output. +Build autonomous coding loops where an AI agent picks tasks, implements them, validates against backpressure (tests, builds), commits, and repeats — each iteration in a fresh context window. 
> **Runnable example:** [recipe/ralph-loop.cs](recipe/ralph-loop.cs) > @@ -9,252 +9,250 @@ Implement self-referential feedback loops where an AI agent iteratively improves > dotnet run recipe/ralph-loop.cs > ``` -## What is RALPH-loop? +## What is a Ralph Loop? -RALPH-loop is a development methodology for iterative AI-powered task completion. Named after the Ralph Wiggum technique, it embodies the philosophy of persistent iteration: +A [Ralph loop](https://ghuntley.com/ralph/) is an autonomous development workflow where an AI agent iterates through tasks in isolated context windows. The key insight: **state lives on disk, not in the model's context**. Each iteration starts fresh, reads the current state from files, does one task, writes results back to disk, and exits. -- **One prompt, multiple iterations**: The same prompt is processed repeatedly -- **Self-referential feedback**: The AI reads its own previous work (file changes, git history) -- **Completion detection**: Loop exits when a completion promise is detected in output -- **Safety limits**: Always include a maximum iteration count to prevent infinite loops +``` +┌─────────────────────────────────────────────────┐ +│ loop.sh │ +│ while true: │ +│ ┌─────────────────────────────────────────┐ │ +│ │ Fresh session (isolated context) │ │ +│ │ │ │ +│ │ 1. Read PROMPT.md + AGENTS.md │ │ +│ │ 2. Study specs/* and code │ │ +│ │ 3. Pick next task from plan │ │ +│ │ 4. Implement + run tests │ │ +│ │ 5. Update plan, commit, exit │ │ +│ └─────────────────────────────────────────┘ │ +│ ↻ next iteration (fresh context) │ +└─────────────────────────────────────────────────┘ +``` -## Example Scenario +**Core principles:** -You need to iteratively improve code until all tests pass. 
Instead of asking the model to "write perfect code," you use RALPH-loop to: +- **Fresh context per iteration**: Each loop creates a new session — no context accumulation, always in the "smart zone" +- **Disk as shared state**: `IMPLEMENTATION_PLAN.md` persists between iterations and acts as the coordination mechanism +- **Backpressure steers quality**: Tests, builds, and lints reject bad work — the agent must fix issues before committing +- **Two modes**: PLANNING (gap analysis → generate plan) and BUILDING (implement from plan) -1. Send the initial prompt with clear success criteria -2. The model writes code and tests -3. The model runs tests and sees failures -4. Loop automatically re-sends the prompt -5. The model reads test output and previous code, fixes issues -6. Repeat until all tests pass and completion promise is output +## Simple Version -## Basic Implementation +The minimal Ralph loop — the SDK equivalent of `while :; do cat PROMPT.md | claude ; done`: ```csharp using GitHub.Copilot.SDK; -public class RalphLoop +var client = new CopilotClient(); +await client.StartAsync(); + +try { - private readonly CopilotClient _client; - private int _iteration = 0; - private readonly int _maxIterations; - private readonly string _completionPromise; - private string? 
_lastResponse; + var prompt = await File.ReadAllTextAsync("PROMPT.md"); + var maxIterations = 50; - public RalphLoop(int maxIterations = 10, string completionPromise = "COMPLETE") + for (var i = 1; i <= maxIterations; i++) { - _client = new CopilotClient(); - _maxIterations = maxIterations; - _completionPromise = completionPromise; - } - - public async Task<string> RunAsync(string prompt) - { - await _client.StartAsync(); + Console.WriteLine($"\n=== Iteration {i}/{maxIterations} ==="); + // Fresh session each iteration — context isolation is the point + var session = await client.CreateSessionAsync( + new SessionConfig { Model = "claude-sonnet-4.5" }); try { - var session = await _client.CreateSessionAsync( - new SessionConfig { Model = "gpt-5.1-codex-mini" }); - - try + var done = new TaskCompletionSource<string>(); + session.On(evt => { - var done = new TaskCompletionSource<string>(); - session.On(evt => - { - if (evt is AssistantMessageEvent msg) - { - _lastResponse = msg.Data.Content; - done.TrySetResult(msg.Data.Content); - } - }); + if (evt is AssistantMessageEvent msg) + done.TrySetResult(msg.Data.Content); + }); - while (_iteration < _maxIterations) - { - _iteration++; - Console.WriteLine($"\n--- Iteration {_iteration} ---"); - - done = new TaskCompletionSource<string>(); - - // Send prompt (on first iteration) or continuation - var messagePrompt = _iteration == 1 - ? prompt - : $"{prompt}\n\nPrevious attempt:\n{_lastResponse}\n\nContinue iterating..."; - - await session.SendAsync(new MessageOptions { Prompt = messagePrompt }); - var response = await done.Task; - - // Check for completion promise - if (response.Contains(_completionPromise)) - { - Console.WriteLine($"✓ Completion promise detected: {_completionPromise}"); - return response; - } - - Console.WriteLine($"Iteration {_iteration} complete. 
Continuing..."); - } - - throw new InvalidOperationException( - $"Max iterations ({_maxIterations}) reached without completion promise"); - } - finally - { - await session.DisposeAsync(); - } + await session.SendAsync(new MessageOptions { Prompt = prompt }); + await done.Task; } finally { - await _client.StopAsync(); + await session.DisposeAsync(); } + + Console.WriteLine($"Iteration {i} complete."); } } - -// Usage -var loop = new RalphLoop(maxIterations: 5, completionPromise: "COMPLETE"); -var result = await loop.RunAsync("Your task here"); -Console.WriteLine(result); +finally +{ + await client.StopAsync(); +} ``` -## With File Persistence +This is all you need to get started. The prompt file tells the agent what to do; the agent reads project files, does work, commits, and exits. The loop restarts with a clean slate. -For tasks involving code generation, persist state to files so the AI can see changes: +## Ideal Version + +The full Ralph pattern with planning and building modes, matching the [Ralph Playbook](https://github.com/ClaytonFarr/ralph-playbook) architecture: ```csharp -public class PersistentRalphLoop +using System.Diagnostics; +using GitHub.Copilot.SDK; + +// Parse args: dotnet run [plan] [max_iterations] +var mode = args.Contains("plan") ? "plan" : "build"; +var maxArg = args.FirstOrDefault(a => int.TryParse(a, out _)); +var maxIterations = maxArg != null ? int.Parse(maxArg) : 50; +var promptFile = mode == "plan" ? 
"PROMPT_plan.md" : "PROMPT_build.md"; + +var client = new CopilotClient(); +await client.StartAsync(); + +var branchInfo = new ProcessStartInfo("git", "branch --show-current") + { RedirectStandardOutput = true }; +var branch = Process.Start(branchInfo)!; +var branchName = (await branch.StandardOutput.ReadToEndAsync()).Trim(); +await branch.WaitForExitAsync(); + +Console.WriteLine(new string('━', 40)); +Console.WriteLine($"Mode: {mode}"); +Console.WriteLine($"Prompt: {promptFile}"); +Console.WriteLine($"Branch: {branchName}"); +Console.WriteLine($"Max: {maxIterations} iterations"); +Console.WriteLine(new string('━', 40)); + +try { - private readonly string _workDir; - private readonly CopilotClient _client; - private readonly int _maxIterations; - private int _iteration = 0; + var prompt = await File.ReadAllTextAsync(promptFile); - public PersistentRalphLoop(string workDir, int maxIterations = 10) + for (var i = 1; i <= maxIterations; i++) { - _workDir = workDir; - _maxIterations = maxIterations; - Directory.CreateDirectory(_workDir); - _client = new CopilotClient(); - } - - public async Task<string> RunAsync(string prompt) - { - await _client.StartAsync(); + Console.WriteLine($"\n=== Iteration {i}/{maxIterations} ==="); + // Fresh session — each task gets full context budget + var session = await client.CreateSessionAsync( + new SessionConfig { Model = "claude-sonnet-4.5" }); try { - var session = await _client.CreateSessionAsync( - new SessionConfig { Model = "gpt-5.1-codex-mini" }); - - try + var done = new TaskCompletionSource<string>(); + session.On(evt => { - // Store initial prompt - var promptFile = Path.Combine(_workDir, "prompt.md"); - await File.WriteAllTextAsync(promptFile, prompt); + if (evt is AssistantMessageEvent msg) + done.TrySetResult(msg.Data.Content); + }); - var done = new TaskCompletionSource<string>(); - string response = ""; - session.On(evt => - { - if (evt is AssistantMessageEvent msg) - { - response = msg.Data.Content; - done.TrySetResult(msg.Data.Content); - } 
- }); - - while (_iteration < _maxIterations) - { - _iteration++; - Console.WriteLine($"\n--- Iteration {_iteration} ---"); - - done = new TaskCompletionSource<string>(); - - // Build context including previous work - var contextBuilder = new StringBuilder(prompt); - var previousOutput = Path.Combine(_workDir, $"output-{_iteration - 1}.txt"); - if (File.Exists(previousOutput)) - { - contextBuilder.AppendLine($"\nPrevious iteration output:\n{await File.ReadAllTextAsync(previousOutput)}"); - } - - await session.SendAsync(new MessageOptions { Prompt = contextBuilder.ToString() }); - await done.Task; - - // Persist output - await File.WriteAllTextAsync( - Path.Combine(_workDir, $"output-{_iteration}.txt"), - response); - - if (response.Contains("COMPLETE")) - { - return response; - } - } - - throw new InvalidOperationException("Max iterations reached"); - } - finally - { - await session.DisposeAsync(); - } + await session.SendAsync(new MessageOptions { Prompt = prompt }); + await done.Task; } finally { - await _client.StopAsync(); + await session.DisposeAsync(); } + + // Push changes after each iteration + try + { + Process.Start("git", $"push origin {branchName}")!.WaitForExit(); + } + catch + { + Process.Start("git", $"push -u origin {branchName}")!.WaitForExit(); + } + + Console.WriteLine($"\nIteration {i} complete."); } + + Console.WriteLine($"\nReached max iterations: {maxIterations}"); } +finally +{ + await client.StopAsync(); +} +``` + +### Required Project Files + +The ideal version expects this file structure in your project: + +``` +project-root/ +├── PROMPT_plan.md # Planning mode instructions +├── PROMPT_build.md # Building mode instructions +├── AGENTS.md # Operational guide (build/test commands) +├── IMPLEMENTATION_PLAN.md # Task list (generated by planning mode) +├── specs/ # Requirement specs (one per topic) +│ ├── auth.md +│ └── data-pipeline.md +└── src/ # Your source code +``` + +### Example `PROMPT_plan.md` + +```markdown +0a. 
Study `specs/*` to learn the application specifications. +0b. Study IMPLEMENTATION_PLAN.md (if present) to understand the plan so far. +0c. Study `src/` to understand existing code and shared utilities. + +1. Compare specs against code (gap analysis). Create or update + IMPLEMENTATION_PLAN.md as a prioritized bullet-point list of tasks + yet to be implemented. Do NOT implement anything. + +IMPORTANT: Do NOT assume functionality is missing — search the +codebase first to confirm. Prefer updating existing utilities over +creating ad-hoc copies. +``` + +### Example `PROMPT_build.md` + +```markdown +0a. Study `specs/*` to learn the application specifications. +0b. Study IMPLEMENTATION_PLAN.md. +0c. Study `src/` for reference. + +1. Choose the most important item from IMPLEMENTATION_PLAN.md. Before + making changes, search the codebase (don't assume not implemented). +2. After implementing, run the tests. If functionality is missing, add it. +3. When you discover issues, update IMPLEMENTATION_PLAN.md immediately. +4. When tests pass, update IMPLEMENTATION_PLAN.md, then `git add -A` + then `git commit` with a descriptive message. + +99999. When authoring documentation, capture the why. +999999. Implement completely. No placeholders or stubs. +9999999. Keep IMPLEMENTATION_PLAN.md current — future iterations depend on it. +``` + +### Example `AGENTS.md` + +Keep this brief (~60 lines). It's loaded every iteration, so bloat wastes context. + +```markdown +## Build & Run + +dotnet build + +## Validation + +- Tests: `dotnet test` +- Build: `dotnet build --no-restore` ``` ## Best Practices -1. **Write clear completion criteria**: Include exactly what "done" looks like -2. **Use output markers**: Include `COMPLETE` or similar in completion condition -3. **Always set max iterations**: Prevents infinite loops on impossible tasks -4. **Persist state**: Save files so AI can see what changed between iterations -5. 
**Include context**: Feed previous iteration output back as context -6. **Monitor progress**: Log each iteration to track what's happening +1. **Fresh context per iteration**: Never accumulate context across iterations — that's the whole point +2. **Disk is your database**: `IMPLEMENTATION_PLAN.md` is shared state between isolated sessions +3. **Backpressure is essential**: Tests, builds, lints in `AGENTS.md` — the agent must pass them before committing +4. **Start with PLANNING mode**: Generate the plan first, then switch to BUILDING +5. **Observe and tune**: Watch early iterations, add guardrails to prompts when the agent fails in specific ways +6. **The plan is disposable**: If the agent goes off track, delete `IMPLEMENTATION_PLAN.md` and re-plan +7. **Keep `AGENTS.md` brief**: It's loaded every iteration — operational info only, no progress notes +8. **Use a sandbox**: The agent runs autonomously with full tool access — isolate it -## Example: Iterative Code Generation - -```csharp -var prompt = @"Write a function that: -1. Parses CSV data -2. Validates required fields -3. Returns parsed records or error -4. Has unit tests -5. 
Output COMPLETE when done"; - -var loop = new RalphLoop(maxIterations: 10, completionPromise: "COMPLETE"); -var result = await loop.RunAsync(prompt); -``` - -## Handling Failures - -```csharp -try -{ - var result = await loop.RunAsync(prompt); - Console.WriteLine("Task completed successfully!"); -} -catch (InvalidOperationException ex) when (ex.Message.Contains("Max iterations")) -{ - Console.WriteLine("Task did not complete within iteration limit."); - Console.WriteLine($"Last response: {loop.LastResponse}"); - // Document what was attempted and suggest alternatives -} -``` - -## When to Use RALPH-loop +## When to Use a Ralph Loop **Good for:** -- Code generation with automatic verification (tests, linters) -- Tasks with clear success criteria -- Iterative refinement where each attempt learns from previous failures -- Unattended long-running improvements +- Implementing features from specs with test-driven validation +- Large refactors broken into many small tasks +- Unattended, long-running development with clear requirements +- Any work where backpressure (tests/builds) can verify correctness **Not good for:** -- Tasks requiring human judgment or design input -- One-shot operations -- Tasks with vague success criteria -- Real-time interactive debugging +- Tasks requiring human judgment mid-loop +- One-shot operations that don't benefit from iteration +- Vague requirements without testable acceptance criteria +- Exploratory prototyping where direction isn't clear diff --git a/cookbook/copilot-sdk/dotnet/recipe/ralph-loop.cs b/cookbook/copilot-sdk/dotnet/recipe/ralph-loop.cs index 0e81b5f8..c198c727 100644 --- a/cookbook/copilot-sdk/dotnet/recipe/ralph-loop.cs +++ b/cookbook/copilot-sdk/dotnet/recipe/ralph-loop.cs @@ -1,141 +1,90 @@ #:package GitHub.Copilot.SDK@* #:property PublishAot=false +using System.Diagnostics; using GitHub.Copilot.SDK; -using System.Text; -// RALPH-loop: Iterative self-referential AI loops. 
-// The same prompt is sent repeatedly, with AI reading its own previous output. -// Loop continues until completion promise is detected in the response. +// Ralph loop: autonomous AI task loop with fresh context per iteration. +// +// Two modes: +// - "plan": reads PROMPT_plan.md, generates/updates IMPLEMENTATION_PLAN.md +// - "build": reads PROMPT_build.md, implements tasks, runs tests, commits +// +// Each iteration creates a fresh session so the agent always operates in +// the "smart zone" of its context window. State is shared between +// iterations via files on disk (IMPLEMENTATION_PLAN.md, AGENTS.md, specs/*). +// +// Usage: +// dotnet run # build mode, 50 iterations +// dotnet run plan # planning mode +// dotnet run 20 # build mode, 20 iterations +// dotnet run plan 5 # planning mode, 5 iterations -var prompt = @"You are iteratively building a small library. Follow these phases IN ORDER. -Do NOT skip ahead — only do the current phase, then stop and wait for the next iteration. +var mode = args.Contains("plan") ? "plan" : "build"; +var maxArg = args.FirstOrDefault(a => int.TryParse(a, out _)); +var maxIterations = maxArg != null ? int.Parse(maxArg) : 50; +var promptFile = mode == "plan" ? "PROMPT_plan.md" : "PROMPT_build.md"; -Phase 1: Design a DataValidator class that validates records against a schema. - - Schema defines field names, types (string, int, float, bool), and whether required. - - Return a list of validation errors per record. - - Show the class code only. Do NOT output COMPLETE. +var client = new CopilotClient(); +await client.StartAsync(); -Phase 2: Write at least 4 unit tests covering: missing required field, wrong type, - valid record, and empty input. Show test code only. Do NOT output COMPLETE. 
+var branchProc = Process.Start(new ProcessStartInfo("git", "branch --show-current") + { RedirectStandardOutput = true })!; +var branch = (await branchProc.StandardOutput.ReadToEndAsync()).Trim(); +await branchProc.WaitForExitAsync(); -Phase 3: Review the code from phases 1 and 2. Fix any bugs, add docstrings, and add - an extra edge-case test. Show the final consolidated code with all fixes. - When this phase is fully done, output the exact text: COMPLETE"; - -var loop = new RalphLoop(maxIterations: 5, completionPromise: "COMPLETE"); +Console.WriteLine(new string('━', 40)); +Console.WriteLine($"Mode: {mode}"); +Console.WriteLine($"Prompt: {promptFile}"); +Console.WriteLine($"Branch: {branch}"); +Console.WriteLine($"Max: {maxIterations} iterations"); +Console.WriteLine(new string('━', 40)); try { - var result = await loop.RunAsync(prompt); - Console.WriteLine("\n=== FINAL RESULT ==="); - Console.WriteLine(result); -} -catch (InvalidOperationException ex) -{ - Console.WriteLine($"\nTask did not complete: {ex.Message}"); - if (loop.LastResponse != null) + var prompt = await File.ReadAllTextAsync(promptFile); + + for (var i = 1; i <= maxIterations; i++) { - Console.WriteLine($"\nLast attempt:\n{loop.LastResponse}"); - } -} + Console.WriteLine($"\n=== Iteration {i}/{maxIterations} ==="); -// --- RalphLoop class definition --- - -public class RalphLoop -{ - private readonly CopilotClient _client; - private int _iteration = 0; - private readonly int _maxIterations; - private readonly string _completionPromise; - private string? _lastResponse; - - public RalphLoop(int maxIterations = 10, string completionPromise = "COMPLETE") - { - _client = new CopilotClient(); - _maxIterations = maxIterations; - _completionPromise = completionPromise; - } - - public string? 
LastResponse => _lastResponse; - - public async Task<string> RunAsync(string initialPrompt) - { - await _client.StartAsync(); + // Fresh session — each task gets full context budget + var session = await client.CreateSessionAsync( + new SessionConfig { Model = "claude-sonnet-4.5" }); try { - var session = await _client.CreateSessionAsync(new SessionConfig - { - Model = "gpt-5.1-codex-mini" + var done = new TaskCompletionSource<string>(); + session.On(evt => + { + if (evt is AssistantMessageEvent msg) + done.TrySetResult(msg.Data.Content); }); - try - { - var done = new TaskCompletionSource<string>(); - session.On(evt => - { - if (evt is AssistantMessageEvent msg) - { - _lastResponse = msg.Data.Content; - done.TrySetResult(msg.Data.Content); - } - }); - - while (_iteration < _maxIterations) - { - _iteration++; - Console.WriteLine($"\n=== Iteration {_iteration}/{_maxIterations} ==="); - - done = new TaskCompletionSource<string>(); - - var currentPrompt = BuildIterationPrompt(initialPrompt); - Console.WriteLine($"Sending prompt (length: {currentPrompt.Length})..."); - - await session.SendAsync(new MessageOptions { Prompt = currentPrompt }); - var response = await done.Task; - - var summary = response.Length > 200 - ? response.Substring(0, 200) + "..." - : response; - Console.WriteLine($"Response: {summary}"); - - if (response.Contains(_completionPromise)) - { - Console.WriteLine($"\n✓ Completion promise detected: '{_completionPromise}'"); - return response; - } - - Console.WriteLine($"Iteration {_iteration} complete. 
Continuing..."); - } - - throw new InvalidOperationException( - $"Max iterations ({_maxIterations}) reached without completion promise: '{_completionPromise}'"); - } - finally - { - await session.DisposeAsync(); - } + await session.SendAsync(new MessageOptions { Prompt = prompt }); + await done.Task; } finally { - await _client.StopAsync(); + await session.DisposeAsync(); } + + // Push changes after each iteration + try + { + Process.Start("git", $"push origin {branch}")!.WaitForExit(); + } + catch + { + Process.Start("git", $"push -u origin {branch}")!.WaitForExit(); + } + + Console.WriteLine($"\nIteration {i} complete."); } - private string BuildIterationPrompt(string initialPrompt) - { - if (_iteration == 1) - return initialPrompt; - - var sb = new StringBuilder(); - sb.AppendLine(initialPrompt); - sb.AppendLine(); - sb.AppendLine("=== CONTEXT FROM PREVIOUS ITERATION ==="); - sb.AppendLine(_lastResponse); - sb.AppendLine("=== END CONTEXT ==="); - sb.AppendLine(); - sb.AppendLine("Continue working on this task. Review the previous attempt and improve upon it."); - return sb.ToString(); - } + Console.WriteLine($"\nReached max iterations: {maxIterations}"); +} +finally +{ + await client.StopAsync(); } diff --git a/cookbook/copilot-sdk/go/ralph-loop.md b/cookbook/copilot-sdk/go/ralph-loop.md index 2469c181..44d0d1ca 100644 --- a/cookbook/copilot-sdk/go/ralph-loop.md +++ b/cookbook/copilot-sdk/go/ralph-loop.md @@ -1,6 +1,6 @@ -# RALPH-loop: Iterative Self-Referential AI Loops +# Ralph Loop: Autonomous AI Task Loops -Implement self-referential feedback loops where an AI agent iteratively improves work by reading its own previous output. +Build autonomous coding loops where an AI agent picks tasks, implements them, validates against backpressure (tests, builds), commits, and repeats — each iteration in a fresh context window. 
> **Runnable example:** [recipe/ralph-loop.go](recipe/ralph-loop.go) > @@ -9,27 +9,37 @@ Implement self-referential feedback loops where an AI agent iteratively improves > go run recipe/ralph-loop.go > ``` -## What is RALPH-loop? +## What is a Ralph Loop? -RALPH-loop is a development methodology for iterative AI-powered task completion. Named after the Ralph Wiggum technique, it embodies the philosophy of persistent iteration: +A [Ralph loop](https://ghuntley.com/ralph/) is an autonomous development workflow where an AI agent iterates through tasks in isolated context windows. The key insight: **state lives on disk, not in the model's context**. Each iteration starts fresh, reads the current state from files, does one task, writes results back to disk, and exits. -- **One prompt, multiple iterations**: The same prompt is processed repeatedly -- **Self-referential feedback**: The AI reads its own previous work (file changes, git history) -- **Completion detection**: Loop exits when a completion promise is detected in output -- **Safety limits**: Always include a maximum iteration count to prevent infinite loops +``` +┌─────────────────────────────────────────────────┐ +│ loop.sh │ +│ while true: │ +│ ┌─────────────────────────────────────────┐ │ +│ │ Fresh session (isolated context) │ │ +│ │ │ │ +│ │ 1. Read PROMPT.md + AGENTS.md │ │ +│ │ 2. Study specs/* and code │ │ +│ │ 3. Pick next task from plan │ │ +│ │ 4. Implement + run tests │ │ +│ │ 5. Update plan, commit, exit │ │ +│ └─────────────────────────────────────────┘ │ +│ ↻ next iteration (fresh context) │ +└─────────────────────────────────────────────────┘ +``` -## Example Scenario +**Core principles:** -You need to iteratively improve code until all tests pass. 
Instead of asking Copilot to "write perfect code," you use RALPH-loop to: +- **Fresh context per iteration**: Each loop creates a new session — no context accumulation, always in the "smart zone" +- **Disk as shared state**: `IMPLEMENTATION_PLAN.md` persists between iterations and acts as the coordination mechanism +- **Backpressure steers quality**: Tests, builds, and lints reject bad work — the agent must fix issues before committing +- **Two modes**: PLANNING (gap analysis → generate plan) and BUILDING (implement from plan) -1. Send the initial prompt with clear success criteria -2. Copilot writes code and tests -3. Copilot runs tests and sees failures -4. Loop automatically re-sends the prompt -5. Copilot reads test output and previous code, fixes issues -6. Repeat until all tests pass and completion promise is output +## Simple Version -## Basic Implementation +The minimal Ralph loop — the SDK equivalent of `while :; do cat PROMPT.md | claude ; done`: ```go package main @@ -38,81 +48,59 @@ import ( "context" "fmt" "log" - "strings" + "os" copilot "github.com/github/copilot-sdk/go" ) -type RalphLoop struct { - client *copilot.Client - iteration int - maxIterations int - completionPromise string - LastResponse string -} - -func NewRalphLoop(maxIterations int, completionPromise string) *RalphLoop { - return &RalphLoop{ - client: copilot.NewClient(nil), - maxIterations: maxIterations, - completionPromise: completionPromise, +func ralphLoop(ctx context.Context, promptFile string, maxIterations int) error { + client := copilot.NewClient(nil) + if err := client.Start(ctx); err != nil { + return err } -} + defer client.Stop() -func (r *RalphLoop) Run(ctx context.Context, initialPrompt string) (string, error) { - if err := r.client.Start(ctx); err != nil { - return "", err - } - defer r.client.Stop() - - session, err := r.client.CreateSession(ctx, &copilot.SessionConfig{ - Model: "gpt-5.1-codex-mini", - }) + prompt, err := os.ReadFile(promptFile) if err != nil { - 
return "", err + return err } - defer session.Destroy() - for r.iteration < r.maxIterations { - r.iteration++ - fmt.Printf("\n--- Iteration %d/%d ---\n", r.iteration, r.maxIterations) + for i := 1; i <= maxIterations; i++ { + fmt.Printf("\n=== Iteration %d/%d ===\n", i, maxIterations) - prompt := r.buildIterationPrompt(initialPrompt) - - result, err := session.SendAndWait(ctx, copilot.MessageOptions{Prompt: prompt}) + // Fresh session each iteration — context isolation is the point + session, err := client.CreateSession(ctx, &copilot.SessionConfig{ + Model: "claude-sonnet-4.5", + }) if err != nil { - return "", err + return err } - if result != nil && result.Data.Content != nil { - r.LastResponse = *result.Data.Content + _, err = session.SendAndWait(ctx, copilot.MessageOptions{ + Prompt: string(prompt), + }) + session.Destroy() + if err != nil { + return err } - if strings.Contains(r.LastResponse, r.completionPromise) { - fmt.Printf("✓ Completion promise detected: %s\n", r.completionPromise) - return r.LastResponse, nil - } + fmt.Printf("Iteration %d complete.\n", i) } - - return "", fmt.Errorf("max iterations (%d) reached without completion promise", - r.maxIterations) + return nil } -// Usage func main() { - ctx := context.Background() - loop := NewRalphLoop(5, "COMPLETE") - result, err := loop.Run(ctx, "Your task here") - if err != nil { + if err := ralphLoop(context.Background(), "PROMPT.md", 20); err != nil { log.Fatal(err) } - fmt.Println(result) } ``` -## With File Persistence +This is all you need to get started. The prompt file tells the agent what to do; the agent reads project files, does work, commits, and exits. The loop restarts with a clean slate. 
-For tasks involving code generation, persist state to files so the AI can see changes: +## Ideal Version + +The full Ralph pattern with planning and building modes, matching the [Ralph Playbook](https://github.com/ClaytonFarr/ralph-playbook) architecture: ```go package main @@ -120,121 +108,178 @@ package main import ( "context" "fmt" + "log" "os" - "path/filepath" + "os/exec" + "strconv" "strings" copilot "github.com/github/copilot-sdk/go" ) -type PersistentRalphLoop struct { - client *copilot.Client - workDir string - iteration int - maxIterations int -} - -func NewPersistentRalphLoop(workDir string, maxIterations int) *PersistentRalphLoop { - os.MkdirAll(workDir, 0755) - return &PersistentRalphLoop{ - client: copilot.NewClient(nil), - workDir: workDir, - maxIterations: maxIterations, +func ralphLoop(ctx context.Context, mode string, maxIterations int) error { + promptFile := "PROMPT_build.md" + if mode == "plan" { + promptFile = "PROMPT_plan.md" } -} -func (p *PersistentRalphLoop) Run(ctx context.Context, initialPrompt string) (string, error) { - if err := p.client.Start(ctx); err != nil { - return "", err + client := copilot.NewClient(nil) + if err := client.Start(ctx); err != nil { + return err } - defer p.client.Stop() + defer client.Stop() - os.WriteFile(filepath.Join(p.workDir, "prompt.md"), []byte(initialPrompt), 0644) + branchOut, _ := exec.Command("git", "branch", "--show-current").Output() + branch := strings.TrimSpace(string(branchOut)) - session, err := p.client.CreateSession(ctx, &copilot.SessionConfig{ - Model: "gpt-5.1-codex-mini", - }) + fmt.Println(strings.Repeat("━", 40)) + fmt.Printf("Mode: %s\n", mode) + fmt.Printf("Prompt: %s\n", promptFile) + fmt.Printf("Branch: %s\n", branch) + fmt.Printf("Max: %d iterations\n", maxIterations) + fmt.Println(strings.Repeat("━", 40)) + + prompt, err := os.ReadFile(promptFile) if err != nil { - return "", err + return err } - defer session.Destroy() - for p.iteration < p.maxIterations { - p.iteration++ + for 
i := 1; i <= maxIterations; i++ { + fmt.Printf("\n=== Iteration %d/%d ===\n", i, maxIterations) - prompt := initialPrompt - prevFile := filepath.Join(p.workDir, fmt.Sprintf("output-%d.txt", p.iteration-1)) - if data, err := os.ReadFile(prevFile); err == nil { - prompt = fmt.Sprintf("%s\n\nPrevious iteration:\n%s", initialPrompt, string(data)) - } - - result, err := session.SendAndWait(ctx, copilot.MessageOptions{Prompt: prompt}) + // Fresh session — each task gets full context budget + session, err := client.CreateSession(ctx, &copilot.SessionConfig{ + Model: "claude-sonnet-4.5", + }) if err != nil { - return "", err + return err } - response := "" - if result != nil && result.Data.Content != nil { - response = *result.Data.Content + _, err = session.SendAndWait(ctx, copilot.MessageOptions{ + Prompt: string(prompt), + }) + session.Destroy() + if err != nil { + return err } - os.WriteFile(filepath.Join(p.workDir, fmt.Sprintf("output-%d.txt", p.iteration)), - []byte(response), 0644) + // Push changes after each iteration + if err := exec.Command("git", "push", "origin", branch).Run(); err != nil { + exec.Command("git", "push", "-u", "origin", branch).Run() + } - if strings.Contains(response, "COMPLETE") { - return response, nil + fmt.Printf("\nIteration %d complete.\n", i) + } + + fmt.Printf("\nReached max iterations: %d\n", maxIterations) + return nil +} + +func main() { + mode := "build" + maxIterations := 50 + + for _, arg := range os.Args[1:] { + if arg == "plan" { + mode = "plan" + } else if n, err := strconv.Atoi(arg); err == nil { + maxIterations = n } } - return "", fmt.Errorf("max iterations reached") + if err := ralphLoop(context.Background(), mode, maxIterations); err != nil { + log.Fatal(err) + } } ``` +### Required Project Files + +The ideal version expects this file structure in your project: + +``` +project-root/ +├── PROMPT_plan.md # Planning mode instructions +├── PROMPT_build.md # Building mode instructions +├── AGENTS.md # Operational guide 
(build/test commands) +├── IMPLEMENTATION_PLAN.md # Task list (generated by planning mode) +├── specs/ # Requirement specs (one per topic) +│ ├── auth.md +│ └── data-pipeline.md +└── src/ # Your source code +``` + +### Example `PROMPT_plan.md` + +```markdown +0a. Study `specs/*` to learn the application specifications. +0b. Study IMPLEMENTATION_PLAN.md (if present) to understand the plan so far. +0c. Study `src/` to understand existing code and shared utilities. + +1. Compare specs against code (gap analysis). Create or update + IMPLEMENTATION_PLAN.md as a prioritized bullet-point list of tasks + yet to be implemented. Do NOT implement anything. + +IMPORTANT: Do NOT assume functionality is missing — search the +codebase first to confirm. Prefer updating existing utilities over +creating ad-hoc copies. +``` + +### Example `PROMPT_build.md` + +```markdown +0a. Study `specs/*` to learn the application specifications. +0b. Study IMPLEMENTATION_PLAN.md. +0c. Study `src/` for reference. + +1. Choose the most important item from IMPLEMENTATION_PLAN.md. Before + making changes, search the codebase (don't assume not implemented). +2. After implementing, run the tests. If functionality is missing, add it. +3. When you discover issues, update IMPLEMENTATION_PLAN.md immediately. +4. When tests pass, update IMPLEMENTATION_PLAN.md, then `git add -A` + then `git commit` with a descriptive message. + +99999. When authoring documentation, capture the why. +999999. Implement completely. No placeholders or stubs. +9999999. Keep IMPLEMENTATION_PLAN.md current — future iterations depend on it. +``` + +### Example `AGENTS.md` + +Keep this brief (~60 lines). It's loaded every iteration, so bloat wastes context. + +```markdown +## Build & Run + +go build ./... + +## Validation + +- Tests: `go test ./...` +- Vet: `go vet ./...` +``` + ## Best Practices -1. **Write clear completion criteria**: Include exactly what "done" looks like -2. 
**Use output markers**: Include `COMPLETE` or similar in completion condition -3. **Always set max iterations**: Prevents infinite loops on impossible tasks -4. **Persist state**: Save files so AI can see what changed between iterations -5. **Include context**: Feed previous iteration output back as context -6. **Monitor progress**: Log each iteration to track what's happening +1. **Fresh context per iteration**: Never accumulate context across iterations — that's the whole point +2. **Disk is your database**: `IMPLEMENTATION_PLAN.md` is shared state between isolated sessions +3. **Backpressure is essential**: Tests, builds, lints in `AGENTS.md` — the agent must pass them before committing +4. **Start with PLANNING mode**: Generate the plan first, then switch to BUILDING +5. **Observe and tune**: Watch early iterations, add guardrails to prompts when the agent fails in specific ways +6. **The plan is disposable**: If the agent goes off track, delete `IMPLEMENTATION_PLAN.md` and re-plan +7. **Keep `AGENTS.md` brief**: It's loaded every iteration — operational info only, no progress notes +8. **Use a sandbox**: The agent runs autonomously with full tool access — isolate it -## Example: Iterative Code Generation - -```go -prompt := `Write a function that: -1. Parses CSV data -2. Validates required fields -3. Returns parsed records or error -4. Has unit tests -5. 
Output COMPLETE when done` - -loop := NewRalphLoop(10, "COMPLETE") -result, err := loop.Run(context.Background(), prompt) -``` - -## Handling Failures - -```go -ctx := context.Background() -loop := NewRalphLoop(5, "COMPLETE") -result, err := loop.Run(ctx, prompt) -if err != nil { - log.Printf("Task failed: %v", err) - log.Printf("Last attempt: %s", loop.LastResponse) -} -``` - -## When to Use RALPH-loop +## When to Use a Ralph Loop **Good for:** -- Code generation with automatic verification (tests, linters) -- Tasks with clear success criteria -- Iterative refinement where each attempt learns from previous failures -- Unattended long-running improvements +- Implementing features from specs with test-driven validation +- Large refactors broken into many small tasks +- Unattended, long-running development with clear requirements +- Any work where backpressure (tests/builds) can verify correctness **Not good for:** -- Tasks requiring human judgment or design input -- One-shot operations -- Tasks with vague success criteria -- Real-time interactive debugging +- Tasks requiring human judgment mid-loop +- One-shot operations that don't benefit from iteration +- Vague requirements without testable acceptance criteria +- Exploratory prototyping where direction isn't clear diff --git a/cookbook/copilot-sdk/go/recipe/ralph-loop.go b/cookbook/copilot-sdk/go/recipe/ralph-loop.go index b99fe54d..1d317842 100644 --- a/cookbook/copilot-sdk/go/recipe/ralph-loop.go +++ b/cookbook/copilot-sdk/go/recipe/ralph-loop.go @@ -4,127 +4,101 @@ import ( "context" "fmt" "log" + "os" + "os/exec" + "strconv" "strings" copilot "github.com/github/copilot-sdk/go" ) -// RalphLoop implements iterative self-referential feedback loops. -// The same prompt is sent repeatedly, with AI reading its own previous output. -// Loop continues until completion promise is detected in the response. 
-type RalphLoop struct { - client *copilot.Client - iteration int - maxIterations int - completionPromise string - LastResponse string -} +// Ralph loop: autonomous AI task loop with fresh context per iteration. +// +// Two modes: +// - "plan": reads PROMPT_plan.md, generates/updates IMPLEMENTATION_PLAN.md +// - "build": reads PROMPT_build.md, implements tasks, runs tests, commits +// +// Each iteration creates a fresh session so the agent always operates in +// the "smart zone" of its context window. State is shared between +// iterations via files on disk (IMPLEMENTATION_PLAN.md, AGENTS.md, specs/*). +// +// Usage: +// go run ralph-loop.go # build mode, 50 iterations +// go run ralph-loop.go plan # planning mode +// go run ralph-loop.go 20 # build mode, 20 iterations +// go run ralph-loop.go plan 5 # planning mode, 5 iterations -// NewRalphLoop creates a new RALPH-loop instance. -func NewRalphLoop(maxIterations int, completionPromise string) *RalphLoop { - return &RalphLoop{ - client: copilot.NewClient(nil), - maxIterations: maxIterations, - completionPromise: completionPromise, +func ralphLoop(ctx context.Context, mode string, maxIterations int) error { + promptFile := "PROMPT_build.md" + if mode == "plan" { + promptFile = "PROMPT_plan.md" } -} -// Run executes the RALPH-loop until completion promise is detected or max iterations reached. 
-func (r *RalphLoop) Run(ctx context.Context, initialPrompt string) (string, error) { - if err := r.client.Start(ctx); err != nil { - return "", fmt.Errorf("failed to start client: %w", err) + client := copilot.NewClient(nil) + if err := client.Start(ctx); err != nil { + return fmt.Errorf("failed to start client: %w", err) } - defer r.client.Stop() + defer client.Stop() - session, err := r.client.CreateSession(ctx, &copilot.SessionConfig{ - Model: "gpt-5.1-codex-mini", - }) + branchOut, _ := exec.Command("git", "branch", "--show-current").Output() + branch := strings.TrimSpace(string(branchOut)) + + fmt.Println(strings.Repeat("━", 40)) + fmt.Printf("Mode: %s\n", mode) + fmt.Printf("Prompt: %s\n", promptFile) + fmt.Printf("Branch: %s\n", branch) + fmt.Printf("Max: %d iterations\n", maxIterations) + fmt.Println(strings.Repeat("━", 40)) + + prompt, err := os.ReadFile(promptFile) if err != nil { - return "", fmt.Errorf("failed to create session: %w", err) + return fmt.Errorf("failed to read %s: %w", promptFile, err) } - defer session.Destroy() - for r.iteration < r.maxIterations { - r.iteration++ - fmt.Printf("\n=== Iteration %d/%d ===\n", r.iteration, r.maxIterations) + for i := 1; i <= maxIterations; i++ { + fmt.Printf("\n=== Iteration %d/%d ===\n", i, maxIterations) - currentPrompt := r.buildIterationPrompt(initialPrompt) - fmt.Printf("Sending prompt (length: %d)...\n", len(currentPrompt)) - - result, err := session.SendAndWait(ctx, copilot.MessageOptions{ - Prompt: currentPrompt, + // Fresh session — each task gets full context budget + session, err := client.CreateSession(ctx, &copilot.SessionConfig{ + Model: "claude-sonnet-4.5", }) if err != nil { - return "", fmt.Errorf("send failed on iteration %d: %w", r.iteration, err) + return fmt.Errorf("failed to create session: %w", err) } - if result != nil && result.Data.Content != nil { - r.LastResponse = *result.Data.Content - } else { - r.LastResponse = "" + _, err = session.SendAndWait(ctx, copilot.MessageOptions{ + 
Prompt: string(prompt), + }) + session.Destroy() + if err != nil { + return fmt.Errorf("send failed on iteration %d: %w", i, err) } - // Display response summary - summary := r.LastResponse - if len(summary) > 200 { - summary = summary[:200] + "..." - } - fmt.Printf("Response: %s\n", summary) - - // Check for completion promise - if strings.Contains(r.LastResponse, r.completionPromise) { - fmt.Printf("\n✓ Success! Completion promise detected: '%s'\n", r.completionPromise) - return r.LastResponse, nil + // Push changes after each iteration + if err := exec.Command("git", "push", "origin", branch).Run(); err != nil { + exec.Command("git", "push", "-u", "origin", branch).Run() } - fmt.Printf("Iteration %d complete. Continuing...\n", r.iteration) + fmt.Printf("\nIteration %d complete.\n", i) } - return "", fmt.Errorf("maximum iterations (%d) reached without detecting completion promise: '%s'", - r.maxIterations, r.completionPromise) -} - -func (r *RalphLoop) buildIterationPrompt(initialPrompt string) string { - if r.iteration == 1 { - return initialPrompt - } - - return fmt.Sprintf(`%s - -=== CONTEXT FROM PREVIOUS ITERATION === -%s -=== END CONTEXT === - -Continue working on this task. Review the previous attempt and improve upon it.`, - initialPrompt, r.LastResponse) + fmt.Printf("\nReached max iterations: %d\n", maxIterations) + return nil } func main() { - prompt := `You are iteratively building a small library. Follow these phases IN ORDER. -Do NOT skip ahead — only do the current phase, then stop and wait for the next iteration. + mode := "build" + maxIterations := 50 -Phase 1: Design a DataValidator struct that validates records against a schema. - - Schema defines field names, types (string, int, float, bool), and whether required. - - Return a slice of validation errors per record. - - Show the struct and method code only. Do NOT output COMPLETE. 
- -Phase 2: Write at least 4 unit tests covering: missing required field, wrong type, - valid record, and empty input. Show test code only. Do NOT output COMPLETE. - -Phase 3: Review the code from phases 1 and 2. Fix any bugs, add doc comments, and add - an extra edge-case test. Show the final consolidated code with all fixes. - When this phase is fully done, output the exact text: COMPLETE` - - ctx := context.Background() - loop := NewRalphLoop(5, "COMPLETE") - - result, err := loop.Run(ctx, prompt) - if err != nil { - log.Printf("Task did not complete: %v", err) - log.Printf("Last attempt: %s", loop.LastResponse) - return + for _, arg := range os.Args[1:] { + if arg == "plan" { + mode = "plan" + } else if n, err := strconv.Atoi(arg); err == nil { + maxIterations = n + } } - fmt.Println("\n=== FINAL RESULT ===") - fmt.Println(result) + if err := ralphLoop(context.Background(), mode, maxIterations); err != nil { + log.Fatal(err) + } } diff --git a/cookbook/copilot-sdk/nodejs/ralph-loop.md b/cookbook/copilot-sdk/nodejs/ralph-loop.md index deafa628..41ad3c71 100644 --- a/cookbook/copilot-sdk/nodejs/ralph-loop.md +++ b/cookbook/copilot-sdk/nodejs/ralph-loop.md @@ -1,6 +1,6 @@ -# RALPH-loop: Iterative Self-Referential AI Loops +# Ralph Loop: Autonomous AI Task Loops -Implement self-referential feedback loops where an AI agent iteratively improves work by reading its own previous output. +Build autonomous coding loops where an AI agent picks tasks, implements them, validates against backpressure (tests, builds), commits, and repeats — each iteration in a fresh context window. > **Runnable example:** [recipe/ralph-loop.ts](recipe/ralph-loop.ts) > @@ -9,200 +9,217 @@ Implement self-referential feedback loops where an AI agent iteratively improves > npx tsx ralph-loop.ts > ``` -## What is RALPH-loop? +## What is a Ralph Loop? -RALPH-loop is a development methodology for iterative AI-powered task completion. 
Named after the Ralph Wiggum technique, it embodies the philosophy of persistent iteration: +A [Ralph loop](https://ghuntley.com/ralph/) is an autonomous development workflow where an AI agent iterates through tasks in isolated context windows. The key insight: **state lives on disk, not in the model's context**. Each iteration starts fresh, reads the current state from files, does one task, writes results back to disk, and exits. -- **One prompt, multiple iterations**: The same prompt is processed repeatedly -- **Self-referential feedback**: The AI reads its own previous work (file changes, git history) -- **Completion detection**: Loop exits when a completion promise is detected in output -- **Safety limits**: Always include a maximum iteration count to prevent infinite loops - -## Example Scenario - -You need to iteratively improve code until all tests pass. Instead of asking Copilot to "write perfect code," you use RALPH-loop to: - -1. Send the initial prompt with clear success criteria -2. Copilot writes code and tests -3. Copilot runs tests and sees failures -4. Loop automatically re-sends the prompt -5. Copilot reads test output and previous code, fixes issues -6. 
Repeat until all tests pass and completion promise is output - -## Basic Implementation - -```typescript -import { CopilotClient } from "@github/copilot-sdk"; - -class RalphLoop { - private client: CopilotClient; - private iteration: number = 0; - private maxIterations: number; - private completionPromise: string; - private lastResponse: string | null = null; - - constructor(maxIterations: number = 10, completionPromise: string = "COMPLETE") { - this.client = new CopilotClient(); - this.maxIterations = maxIterations; - this.completionPromise = completionPromise; - } - - async run(initialPrompt: string): Promise<string> { - await this.client.start(); - const session = await this.client.createSession({ model: "gpt-5.1-codex-mini" }); - - try { - while (this.iteration < this.maxIterations) { - this.iteration++; - console.log(`\n--- Iteration ${this.iteration}/${this.maxIterations} ---`); - - // Build prompt including previous response as context - const prompt = this.iteration === 1 - ? initialPrompt - : `${initialPrompt}\n\nPrevious attempt:\n${this.lastResponse}\n\nContinue improving...`; - - const response = await session.sendAndWait({ prompt }); - this.lastResponse = response?.data.content || ""; - - console.log(`Response (${this.lastResponse.length} chars)`); - - // Check for completion promise - if (this.lastResponse.includes(this.completionPromise)) { - console.log(`✓ Completion promise detected: ${this.completionPromise}`); - return this.lastResponse; - } - - console.log(`Continuing to iteration ${this.iteration + 1}...`); - } - - throw new Error( - `Max iterations (${this.maxIterations}) reached without completion promise` - ); - } finally { - await session.destroy(); - await this.client.stop(); - } - } -} - -// Usage -const loop = new RalphLoop(5, "COMPLETE"); -const result = await loop.run("Your task here"); -console.log(result); +``` +┌─────────────────────────────────────────────────┐ +│ loop.sh │ +│ while true: │ +│ ┌─────────────────────────────────────────┐ │ 
+│ │ Fresh session (isolated context) │ │ +│ │ │ │ +│ │ 1. Read PROMPT.md + AGENTS.md │ │ +│ │ 2. Study specs/* and code │ │ +│ │ 3. Pick next task from plan │ │ +│ │ 4. Implement + run tests │ │ +│ │ 5. Update plan, commit, exit │ │ +│ └─────────────────────────────────────────┘ │ +│ ↻ next iteration (fresh context) │ +└─────────────────────────────────────────────────┘ ``` -## With File Persistence +**Core principles:** -For tasks involving code generation, persist state to files so the AI can see changes: + +- **Fresh context per iteration**: Each loop creates a new session — no context accumulation, always in the "smart zone" +- **Disk as shared state**: `IMPLEMENTATION_PLAN.md` persists between iterations and acts as the coordination mechanism +- **Backpressure steers quality**: Tests, builds, and lints reject bad work — the agent must fix issues before committing +- **Two modes**: PLANNING (gap analysis → generate plan) and BUILDING (implement from plan) + +## Simple Version + +The minimal Ralph loop — the SDK equivalent of `while :; do cat PROMPT.md | claude ; done`: ```typescript -import fs from "fs/promises"; -import path from "path"; +import { readFile } from "fs/promises"; import { CopilotClient } from "@github/copilot-sdk"; -class PersistentRalphLoop { - private client: CopilotClient; - private workDir: string; - private iteration: number = 0; - private maxIterations: number; +async function ralphLoop(promptFile: string, maxIterations: number = 50) { + const client = new CopilotClient(); + await client.start(); - constructor(workDir: string, maxIterations: number = 10) { - this.client = new CopilotClient(); - this.workDir = workDir; - this.maxIterations = maxIterations; - } + try { + const prompt = await readFile(promptFile, "utf-8"); - async run(initialPrompt: string): Promise<string> { - await fs.mkdir(this.workDir, { recursive: true }); - await this.client.start(); - const session = await this.client.createSession({ model: "gpt-5.1-codex-mini" }); + for (let i 
= 1; i <= maxIterations; i++) { + console.log(`\n=== Iteration ${i}/${maxIterations} ===`); - try { - // Store initial prompt - await fs.writeFile(path.join(this.workDir, "prompt.md"), initialPrompt); - - while (this.iteration < this.maxIterations) { - this.iteration++; - console.log(`\n--- Iteration ${this.iteration} ---`); - - // Build context from previous outputs - let context = initialPrompt; - const prevOutputFile = path.join(this.workDir, `output-${this.iteration - 1}.txt`); - try { - const prevOutput = await fs.readFile(prevOutputFile, "utf-8"); - context += `\n\nPrevious iteration:\n${prevOutput}`; - } catch { - // No previous output yet - } - - const response = await session.sendAndWait({ prompt: context }); - const output = response?.data.content || ""; - - // Persist output - await fs.writeFile( - path.join(this.workDir, `output-${this.iteration}.txt`), - output - ); - - if (output.includes("COMPLETE")) { - return output; - } + // Fresh session each iteration — context isolation is the point + const session = await client.createSession({ model: "claude-sonnet-4.5" }); + try { + await session.sendAndWait({ prompt }, 600_000); + } finally { + await session.destroy(); } - throw new Error("Max iterations reached"); - } finally { - await session.destroy(); - await this.client.stop(); + console.log(`Iteration ${i} complete.`); } + } finally { + await client.stop(); } } + +// Usage: point at your PROMPT.md +ralphLoop("PROMPT.md", 20); +``` + +This is all you need to get started. The prompt file tells the agent what to do; the agent reads project files, does work, commits, and exits. The loop restarts with a clean slate. 
+ +## Ideal Version + +The full Ralph pattern with planning and building modes, matching the [Ralph Playbook](https://github.com/ClaytonFarr/ralph-playbook) architecture: + +```typescript +import { readFile } from "fs/promises"; +import { execSync } from "child_process"; +import { CopilotClient } from "@github/copilot-sdk"; + +type Mode = "plan" | "build"; + +async function ralphLoop(mode: Mode, maxIterations: number = 50) { + const promptFile = mode === "plan" ? "PROMPT_plan.md" : "PROMPT_build.md"; + const client = new CopilotClient(); + await client.start(); + + const branch = execSync("git branch --show-current", { encoding: "utf-8" }).trim(); + console.log(`Mode: ${mode} | Prompt: ${promptFile} | Branch: ${branch}`); + + try { + const prompt = await readFile(promptFile, "utf-8"); + + for (let i = 1; i <= maxIterations; i++) { + console.log(`\n=== Iteration ${i}/${maxIterations} ===`); + + // Fresh session — each task gets full context budget + const session = await client.createSession({ model: "claude-sonnet-4.5" }); + try { + await session.sendAndWait({ prompt }, 600_000); + } finally { + await session.destroy(); + } + + // Push changes after each iteration + try { + execSync(`git push origin ${branch}`, { stdio: "inherit" }); + } catch { + execSync(`git push -u origin ${branch}`, { stdio: "inherit" }); + } + + console.log(`Iteration ${i} complete.`); + } + } finally { + await client.stop(); + } +} + +// Parse CLI args: npx tsx ralph-loop.ts [plan] [max_iterations] +const args = process.argv.slice(2); +const mode: Mode = args.includes("plan") ? "plan" : "build"; +const maxArg = args.find(a => /^\d+$/.test(a)); +const maxIterations = maxArg ? 
parseInt(maxArg) : 50; + +ralphLoop(mode, maxIterations); +``` + +### Required Project Files + +The ideal version expects this file structure in your project: + +``` +project-root/ +├── PROMPT_plan.md # Planning mode instructions +├── PROMPT_build.md # Building mode instructions +├── AGENTS.md # Operational guide (build/test commands) +├── IMPLEMENTATION_PLAN.md # Task list (generated by planning mode) +├── specs/ # Requirement specs (one per topic) +│ ├── auth.md +│ └── data-pipeline.md +└── src/ # Your source code +``` + +### Example `PROMPT_plan.md` + +```markdown +0a. Study `specs/*` to learn the application specifications. +0b. Study IMPLEMENTATION_PLAN.md (if present) to understand the plan so far. +0c. Study `src/` to understand existing code and shared utilities. + +1. Compare specs against code (gap analysis). Create or update + IMPLEMENTATION_PLAN.md as a prioritized bullet-point list of tasks + yet to be implemented. Do NOT implement anything. + +IMPORTANT: Do NOT assume functionality is missing — search the +codebase first to confirm. Prefer updating existing utilities over +creating ad-hoc copies. +``` + +### Example `PROMPT_build.md` + +```markdown +0a. Study `specs/*` to learn the application specifications. +0b. Study IMPLEMENTATION_PLAN.md. +0c. Study `src/` for reference. + +1. Choose the most important item from IMPLEMENTATION_PLAN.md. Before + making changes, search the codebase (don't assume not implemented). +2. After implementing, run the tests. If functionality is missing, add it. +3. When you discover issues, update IMPLEMENTATION_PLAN.md immediately. +4. When tests pass, update IMPLEMENTATION_PLAN.md, then `git add -A` + then `git commit` with a descriptive message. + +99999. When authoring documentation, capture the why. +999999. Implement completely. No placeholders or stubs. +9999999. Keep IMPLEMENTATION_PLAN.md current — future iterations depend on it. +``` + +### Example `AGENTS.md` + +Keep this brief (~60 lines). 
It's loaded every iteration, so bloat wastes context. + +```markdown +## Build & Run + +npm run build + +## Validation + +- Tests: `npm test` +- Typecheck: `npx tsc --noEmit` +- Lint: `npm run lint` ``` ## Best Practices -1. **Write clear completion criteria**: Include exactly what "done" looks like -2. **Use output markers**: Include `COMPLETE` or similar in completion condition -3. **Always set max iterations**: Prevents infinite loops on impossible tasks -4. **Persist state**: Save files so AI can see what changed between iterations -5. **Include context**: Feed previous iteration output back as context -6. **Monitor progress**: Log each iteration to track what's happening +1. **Fresh context per iteration**: Never accumulate context across iterations — that's the whole point +2. **Disk is your database**: `IMPLEMENTATION_PLAN.md` is shared state between isolated sessions +3. **Backpressure is essential**: Tests, builds, lints in `AGENTS.md` — the agent must pass them before committing +4. **Start with PLANNING mode**: Generate the plan first, then switch to BUILDING +5. **Observe and tune**: Watch early iterations, add guardrails to prompts when the agent fails in specific ways +6. **The plan is disposable**: If the agent goes off track, delete `IMPLEMENTATION_PLAN.md` and re-plan +7. **Keep `AGENTS.md` brief**: It's loaded every iteration — operational info only, no progress notes +8. **Use a sandbox**: The agent runs autonomously with full tool access — isolate it -## Example: Iterative Code Generation - -```typescript -const prompt = `Write a function that: -1. Parses CSV data -2. Validates required fields -3. Returns parsed records or error -4. Has unit tests -5. 
Output COMPLETE when done`; - -const loop = new RalphLoop(10, "COMPLETE"); -const result = await loop.run(prompt); -``` - -## Handling Failures - -```typescript -try { - const result = await loop.run(prompt); - console.log("Task completed successfully!"); -} catch (error) { - console.error("Task failed:", error.message); - // Analyze what was attempted and suggest alternatives -} -``` - -## When to Use RALPH-loop +## When to Use a Ralph Loop **Good for:** -- Code generation with automatic verification (tests, linters) -- Tasks with clear success criteria -- Iterative refinement where each attempt learns from previous failures -- Unattended long-running improvements +- Implementing features from specs with test-driven validation +- Large refactors broken into many small tasks +- Unattended, long-running development with clear requirements +- Any work where backpressure (tests/builds) can verify correctness **Not good for:** -- Tasks requiring human judgment or design input -- One-shot operations -- Tasks with vague success criteria -- Real-time interactive debugging +- Tasks requiring human judgment mid-loop +- One-shot operations that don't benefit from iteration +- Vague requirements without testable acceptance criteria +- Exploratory prototyping where direction isn't clear diff --git a/cookbook/copilot-sdk/nodejs/recipe/ralph-loop.ts b/cookbook/copilot-sdk/nodejs/recipe/ralph-loop.ts index 93a7ebb2..018b8074 100644 --- a/cookbook/copilot-sdk/nodejs/recipe/ralph-loop.ts +++ b/cookbook/copilot-sdk/nodejs/recipe/ralph-loop.ts @@ -1,128 +1,79 @@ +import { readFile } from "fs/promises"; +import { execSync } from "child_process"; import { CopilotClient } from "@github/copilot-sdk"; /** - * RALPH-loop implementation: Iterative self-referential AI loops. - * The same prompt is sent repeatedly, with AI reading its own previous output. - * Loop continues until completion promise is detected in the response. 
+ * Ralph loop: autonomous AI task loop with fresh context per iteration. + * + * Two modes: + * - "plan": reads PROMPT_plan.md, generates/updates IMPLEMENTATION_PLAN.md + * - "build": reads PROMPT_build.md, implements tasks, runs tests, commits + * + * Each iteration creates a fresh session so the agent always operates in + * the "smart zone" of its context window. State is shared between + * iterations via files on disk (IMPLEMENTATION_PLAN.md, AGENTS.md, specs/*). + * + * Usage: + * npx tsx ralph-loop.ts # build mode, 50 iterations + * npx tsx ralph-loop.ts plan # planning mode + * npx tsx ralph-loop.ts 20 # build mode, 20 iterations + * npx tsx ralph-loop.ts plan 5 # planning mode, 5 iterations */ -class RalphLoop { - private client: CopilotClient; - private iteration: number = 0; - private readonly maxIterations: number; - private readonly completionPromise: string; - public lastResponse: string | null = null; - constructor(maxIterations: number = 10, completionPromise: string = "COMPLETE") { - this.client = new CopilotClient(); - this.maxIterations = maxIterations; - this.completionPromise = completionPromise; - } +type Mode = "plan" | "build"; - /** - * Run the RALPH-loop until completion promise is detected or max iterations reached. - */ - async run(initialPrompt: string): Promise<string> { - let session: Awaited<ReturnType<CopilotClient["createSession"]>> | null = null; +async function ralphLoop(mode: Mode, maxIterations: number) { + const promptFile = mode === "plan" ? 
"PROMPT_plan.md" : "PROMPT_build.md"; - await this.client.start(); - try { - session = await this.client.createSession({ - model: "gpt-5.1-codex-mini" + const client = new CopilotClient(); + await client.start(); + + const branch = execSync("git branch --show-current", { encoding: "utf-8" }).trim(); + + console.log("━".repeat(40)); + console.log(`Mode: ${mode}`); + console.log(`Prompt: ${promptFile}`); + console.log(`Branch: ${branch}`); + console.log(`Max: ${maxIterations} iterations`); + console.log("━".repeat(40)); + + try { + const prompt = await readFile(promptFile, "utf-8"); + + for (let i = 1; i <= maxIterations; i++) { + console.log(`\n=== Iteration ${i}/${maxIterations} ===`); + + // Fresh session — each task gets full context budget + const session = await client.createSession({ + model: "claude-sonnet-4.5", }); try { - while (this.iteration < this.maxIterations) { - this.iteration++; - console.log(`\n=== Iteration ${this.iteration}/${this.maxIterations} ===`); - - // Build the prompt for this iteration - const currentPrompt = this.buildIterationPrompt(initialPrompt); - console.log(`Sending prompt (length: ${currentPrompt.length})...`); - - const response = await session.sendAndWait({ prompt: currentPrompt }, 300_000); - this.lastResponse = response?.data.content || ""; - - // Display response summary - const summary = this.lastResponse.length > 200 - ? this.lastResponse.substring(0, 200) + "..." - : this.lastResponse; - console.log(`Response: ${summary}`); - - // Check for completion promise - if (this.lastResponse.includes(this.completionPromise)) { - console.log(`\n✓ Success! Completion promise detected: '${this.completionPromise}'`); - return this.lastResponse; - } - - console.log(`Iteration ${this.iteration} complete. 
Checking for next iteration...`); - } - - // Max iterations reached without completion - throw new Error( - `Maximum iterations (${this.maxIterations}) reached without detecting completion promise: '${this.completionPromise}'` - ); - } catch (error) { - console.error(`\nError during RALPH-loop: ${error instanceof Error ? error.message : String(error)}`); - throw error; + await session.sendAndWait({ prompt }, 600_000); } finally { - if (session) { - await session.destroy(); - } + await session.destroy(); } - } finally { - await this.client.stop(); - } - } - /** - * Build the prompt for the current iteration, including previous output as context. - */ - private buildIterationPrompt(initialPrompt: string): string { - if (this.iteration === 1) { - // First iteration: just the initial prompt - return initialPrompt; + // Push changes after each iteration + try { + execSync(`git push origin ${branch}`, { stdio: "inherit" }); + } catch { + execSync(`git push -u origin ${branch}`, { stdio: "inherit" }); + } + + console.log(`\nIteration ${i} complete.`); } - // Subsequent iterations: include previous output as context - return `${initialPrompt} - -=== CONTEXT FROM PREVIOUS ITERATION === -${this.lastResponse} -=== END CONTEXT === - -Continue working on this task. Review the previous attempt and improve upon it.`; + console.log(`\nReached max iterations: ${maxIterations}`); + } finally { + await client.stop(); } } -// Example usage demonstrating RALPH-loop -async function main() { - const prompt = `You are iteratively building a small library. Follow these phases IN ORDER. -Do NOT skip ahead — only do the current phase, then stop and wait for the next iteration. +// Parse CLI args +const args = process.argv.slice(2); +const mode: Mode = args.includes("plan") ? "plan" : "build"; +const maxArg = args.find((a) => /^\d+$/.test(a)); +const maxIterations = maxArg ? parseInt(maxArg) : 50; -Phase 1: Design a DataValidator class that validates records against a schema. 
- - Schema defines field names, types (str, int, float, bool), and whether required. - - Return a list of validation errors per record. - - Show the class code only. Do NOT output COMPLETE. - -Phase 2: Write at least 4 unit tests covering: missing required field, wrong type, - valid record, and empty input. Show test code only. Do NOT output COMPLETE. - -Phase 3: Review the code from phases 1 and 2. Fix any bugs, add docstrings, and add - an extra edge-case test. Show the final consolidated code with all fixes. - When this phase is fully done, output the exact text: COMPLETE`; - - const loop = new RalphLoop(5, "COMPLETE"); - - try { - const result = await loop.run(prompt); - console.log("\n=== FINAL RESULT ==="); - console.log(result); - } catch (error) { - console.error(`\nTask did not complete: ${error instanceof Error ? error.message : String(error)}`); - if (loop.lastResponse) { - console.log(`\nLast attempt:\n${loop.lastResponse}`); - } - } -} - -main().catch(console.error); +ralphLoop(mode, maxIterations).catch(console.error); diff --git a/cookbook/copilot-sdk/python/ralph-loop.md b/cookbook/copilot-sdk/python/ralph-loop.md index 9a969ce6..fb8e3ced 100644 --- a/cookbook/copilot-sdk/python/ralph-loop.md +++ b/cookbook/copilot-sdk/python/ralph-loop.md @@ -1,6 +1,6 @@ -# RALPH-loop: Iterative Self-Referential AI Loops +# Ralph Loop: Autonomous AI Task Loops -Implement self-referential feedback loops where an AI agent iteratively improves work by reading its own previous output. +Build autonomous coding loops where an AI agent picks tasks, implements them, validates against backpressure (tests, builds), commits, and repeats — each iteration in a fresh context window. > **Runnable example:** [recipe/ralph_loop.py](recipe/ralph_loop.py) > @@ -8,196 +8,235 @@ Implement self-referential feedback loops where an AI agent iteratively improves > cd recipe && pip install -r requirements.txt > python ralph_loop.py > ``` -## What is RALPH-loop? 
-RALPH-loop is a development methodology for iterative AI-powered task completion. Named after the Ralph Wiggum technique, it embodies the philosophy of persistent iteration: +## What is a Ralph Loop? -- **One prompt, multiple iterations**: The same prompt is processed repeatedly -- **Self-referential feedback**: The AI reads its own previous work (file changes, git history) -- **Completion detection**: Loop exits when a completion promise is detected in output -- **Safety limits**: Always include a maximum iteration count to prevent infinite loops +A [Ralph loop](https://ghuntley.com/ralph/) is an autonomous development workflow where an AI agent iterates through tasks in isolated context windows. The key insight: **state lives on disk, not in the model's context**. Each iteration starts fresh, reads the current state from files, does one task, writes results back to disk, and exits. -## Example Scenario - -You need to iteratively improve code until all tests pass. Instead of asking Copilot to "write perfect code," you use RALPH-loop to: - -1. Send the initial prompt with clear success criteria -2. Copilot writes code and tests -3. Copilot runs tests and sees failures -4. Loop automatically re-sends the prompt -5. Copilot reads test output and previous code, fixes issues -6. 
Repeat until all tests pass and completion promise is output - -## Basic Implementation - -```python -import asyncio -from copilot import CopilotClient, MessageOptions, SessionConfig - -class RalphLoop: - """Iterative self-referential feedback loop using Copilot.""" - - def __init__(self, max_iterations=10, completion_promise="COMPLETE"): - self.client = CopilotClient() - self.iteration = 0 - self.max_iterations = max_iterations - self.completion_promise = completion_promise - self.last_response = None - - async def run(self, initial_prompt): - """Run the RALPH-loop until completion promise detected or max iterations reached.""" - await self.client.start() - session = await self.client.create_session( - SessionConfig(model="gpt-5.1-codex-mini") - ) - - try: - while self.iteration < self.max_iterations: - self.iteration += 1 - print(f"\n--- Iteration {self.iteration}/{self.max_iterations} ---") - - # Build prompt including previous response as context - if self.iteration == 1: - prompt = initial_prompt - else: - prompt = f"{initial_prompt}\n\nPrevious attempt:\n{self.last_response}\n\nContinue improving..." 
- - result = await session.send_and_wait( - MessageOptions(prompt=prompt), timeout=300 - ) - - self.last_response = result.data.content if result else "" - print(f"Response ({len(self.last_response)} chars)") - - # Check for completion promise - if self.completion_promise in self.last_response: - print(f"✓ Completion promise detected: {self.completion_promise}") - return self.last_response - - print(f"Continuing to iteration {self.iteration + 1}...") - - raise RuntimeError( - f"Max iterations ({self.max_iterations}) reached without completion promise" - ) - finally: - await session.destroy() - await self.client.stop() - -# Usage -async def main(): - loop = RalphLoop(5, "COMPLETE") - result = await loop.run("Your task here") - print(result) - -asyncio.run(main()) +``` +┌─────────────────────────────────────────────────┐ +│ loop.sh │ +│ while true: │ +│ ┌─────────────────────────────────────────┐ │ +│ │ Fresh session (isolated context) │ │ +│ │ │ │ +│ │ 1. Read PROMPT.md + AGENTS.md │ │ +│ │ 2. Study specs/* and code │ │ +│ │ 3. Pick next task from plan │ │ +│ │ 4. Implement + run tests │ │ +│ │ 5. 
Update plan, commit, exit │ │ +│ └─────────────────────────────────────────┘ │ +│ ↻ next iteration (fresh context) │ +└─────────────────────────────────────────────────┘ ``` -## With File Persistence +**Core principles:** -For tasks involving code generation, persist state to files so the AI can see changes: +- **Fresh context per iteration**: Each loop creates a new session — no context accumulation, always in the "smart zone" +- **Disk as shared state**: `IMPLEMENTATION_PLAN.md` persists between iterations and acts as the coordination mechanism +- **Backpressure steers quality**: Tests, builds, and lints reject bad work — the agent must fix issues before committing +- **Two modes**: PLANNING (gap analysis → generate plan) and BUILDING (implement from plan) + +## Simple Version + +The minimal Ralph loop — the SDK equivalent of `while :; do cat PROMPT.md | claude ; done`: ```python import asyncio from pathlib import Path from copilot import CopilotClient, MessageOptions, SessionConfig -class PersistentRalphLoop: - """RALPH-loop with file-based state persistence.""" - - def __init__(self, work_dir, max_iterations=10): - self.client = CopilotClient() - self.work_dir = Path(work_dir) - self.work_dir.mkdir(parents=True, exist_ok=True) - self.iteration = 0 - self.max_iterations = max_iterations - async def run(self, initial_prompt): - """Run the loop with persistent state.""" - await self.client.start() - session = await self.client.create_session( - SessionConfig(model="gpt-5.1-codex-mini") - ) +async def ralph_loop(prompt_file: str, max_iterations: int = 50): + client = CopilotClient() + await client.start() - try: - # Store initial prompt - (self.work_dir / "prompt.md").write_text(initial_prompt) + try: + prompt = Path(prompt_file).read_text() - while self.iteration < self.max_iterations: - self.iteration += 1 - print(f"\n--- Iteration {self.iteration} ---") + for i in range(1, max_iterations + 1): + print(f"\n=== Iteration {i}/{max_iterations} ===") - # Build 
context from previous outputs - context = initial_prompt - prev_output = self.work_dir / f"output-{self.iteration - 1}.txt" - if prev_output.exists(): - context += f"\n\nPrevious iteration:\n{prev_output.read_text()}" - - result = await session.send_and_wait( - MessageOptions(prompt=context), timeout=300 + # Fresh session each iteration — context isolation is the point + session = await client.create_session( + SessionConfig(model="claude-sonnet-4.5") + ) + try: + await session.send_and_wait( + MessageOptions(prompt=prompt), timeout=600 ) - response = result.data.content if result else "" + finally: + await session.destroy() - # Persist output - output_file = self.work_dir / f"output-{self.iteration}.txt" - output_file.write_text(response) + print(f"Iteration {i} complete.") + finally: + await client.stop() - if "COMPLETE" in response: - return response - raise RuntimeError("Max iterations reached") - finally: - await session.destroy() - await self.client.stop() +# Usage: point at your PROMPT.md +asyncio.run(ralph_loop("PROMPT.md", 20)) +``` + +This is all you need to get started. The prompt file tells the agent what to do; the agent reads project files, does work, commits, and exits. The loop restarts with a clean slate. 
+ +## Ideal Version + +The full Ralph pattern with planning and building modes, matching the [Ralph Playbook](https://github.com/ClaytonFarr/ralph-playbook) architecture: + +```python +import asyncio +import subprocess +import sys +from pathlib import Path + +from copilot import CopilotClient, MessageOptions, SessionConfig + + +async def ralph_loop(mode: str = "build", max_iterations: int = 50): + prompt_file = "PROMPT_plan.md" if mode == "plan" else "PROMPT_build.md" + client = CopilotClient() + await client.start() + + branch = subprocess.check_output( + ["git", "branch", "--show-current"], text=True + ).strip() + + print("━" * 40) + print(f"Mode: {mode}") + print(f"Prompt: {prompt_file}") + print(f"Branch: {branch}") + print(f"Max: {max_iterations} iterations") + print("━" * 40) + + try: + prompt = Path(prompt_file).read_text() + + for i in range(1, max_iterations + 1): + print(f"\n=== Iteration {i}/{max_iterations} ===") + + # Fresh session — each task gets full context budget + session = await client.create_session( + SessionConfig(model="claude-sonnet-4.5") + ) + try: + await session.send_and_wait( + MessageOptions(prompt=prompt), timeout=600 + ) + finally: + await session.destroy() + + # Push changes after each iteration + try: + subprocess.run( + ["git", "push", "origin", branch], check=True + ) + except subprocess.CalledProcessError: + subprocess.run( + ["git", "push", "-u", "origin", branch], check=True + ) + + print(f"\nIteration {i} complete.") + + print(f"\nReached max iterations: {max_iterations}") + finally: + await client.stop() + + +if __name__ == "__main__": + args = sys.argv[1:] + mode = "plan" if "plan" in args else "build" + max_iter = next((int(a) for a in args if a.isdigit()), 50) + asyncio.run(ralph_loop(mode, max_iter)) +``` + +### Required Project Files + +The ideal version expects this file structure in your project: + +``` +project-root/ +├── PROMPT_plan.md # Planning mode instructions +├── PROMPT_build.md # Building mode instructions 
+├── AGENTS.md # Operational guide (build/test commands) +├── IMPLEMENTATION_PLAN.md # Task list (generated by planning mode) +├── specs/ # Requirement specs (one per topic) +│ ├── auth.md +│ └── data-pipeline.md +└── src/ # Your source code +``` + +### Example `PROMPT_plan.md` + +```markdown +0a. Study `specs/*` to learn the application specifications. +0b. Study IMPLEMENTATION_PLAN.md (if present) to understand the plan so far. +0c. Study `src/` to understand existing code and shared utilities. + +1. Compare specs against code (gap analysis). Create or update + IMPLEMENTATION_PLAN.md as a prioritized bullet-point list of tasks + yet to be implemented. Do NOT implement anything. + +IMPORTANT: Do NOT assume functionality is missing — search the +codebase first to confirm. Prefer updating existing utilities over +creating ad-hoc copies. +``` + +### Example `PROMPT_build.md` + +```markdown +0a. Study `specs/*` to learn the application specifications. +0b. Study IMPLEMENTATION_PLAN.md. +0c. Study `src/` for reference. + +1. Choose the most important item from IMPLEMENTATION_PLAN.md. Before + making changes, search the codebase (don't assume not implemented). +2. After implementing, run the tests. If functionality is missing, add it. +3. When you discover issues, update IMPLEMENTATION_PLAN.md immediately. +4. When tests pass, update IMPLEMENTATION_PLAN.md, then `git add -A` + then `git commit` with a descriptive message. + +99999. When authoring documentation, capture the why. +999999. Implement completely. No placeholders or stubs. +9999999. Keep IMPLEMENTATION_PLAN.md current — future iterations depend on it. +``` + +### Example `AGENTS.md` + +Keep this brief (~60 lines). It's loaded every iteration, so bloat wastes context. + +```markdown +## Build & Run + +python -m pytest + +## Validation + +- Tests: `pytest` +- Typecheck: `mypy src/` +- Lint: `ruff check src/` ``` ## Best Practices -1. 
**Write clear completion criteria**: Include exactly what "done" looks like
-2. **Use output markers**: Include `COMPLETE` or similar in completion condition
-3. **Always set max iterations**: Prevents infinite loops on impossible tasks
-4. **Persist state**: Save files so AI can see what changed between iterations
-5. **Include context**: Feed previous iteration output back as context
-6. **Monitor progress**: Log each iteration to track what's happening
+1. **Fresh context per iteration**: Never accumulate context across iterations — that's the whole point
+2. **Disk is your database**: `IMPLEMENTATION_PLAN.md` is shared state between isolated sessions
+3. **Backpressure is essential**: Tests, builds, lints in `AGENTS.md` — the agent must pass them before committing
+4. **Start with PLANNING mode**: Generate the plan first, then switch to BUILDING
+5. **Observe and tune**: Watch early iterations, add guardrails to prompts when the agent fails in specific ways
+6. **The plan is disposable**: If the agent goes off track, delete `IMPLEMENTATION_PLAN.md` and re-plan
+7. **Keep `AGENTS.md` brief**: It's loaded every iteration — operational info only, no progress notes
+8. **Use a sandbox**: The agent runs autonomously with full tool access — isolate it

-## Example: Iterative Code Generation
-
-```python
-prompt = """Write a function that:
-1. Parses CSV data
-2. Validates required fields
-3. Returns parsed records or error
-4. Has unit tests
-5. 
Output COMPLETE when done""" - -async def main(): - loop = RalphLoop(10, "COMPLETE") - result = await loop.run(prompt) - -asyncio.run(main()) -``` - -## Handling Failures - -```python -try: - result = await loop.run(prompt) - print("Task completed successfully!") -except RuntimeError as e: - print(f"Task failed: {e}") - if loop.last_response: - print(f"\nLast attempt:\n{loop.last_response}") -``` - -## When to Use RALPH-loop +## When to Use a Ralph Loop **Good for:** -- Code generation with automatic verification (tests, linters) -- Tasks with clear success criteria -- Iterative refinement where each attempt learns from previous failures -- Unattended long-running improvements +- Implementing features from specs with test-driven validation +- Large refactors broken into many small tasks +- Unattended, long-running development with clear requirements +- Any work where backpressure (tests/builds) can verify correctness **Not good for:** -- Tasks requiring human judgment or design input -- One-shot operations -- Tasks with vague success criteria -- Real-time interactive debugging +- Tasks requiring human judgment mid-loop +- One-shot operations that don't benefit from iteration +- Vague requirements without testable acceptance criteria +- Exploratory prototyping where direction isn't clear diff --git a/cookbook/copilot-sdk/python/recipe/ralph_loop.py b/cookbook/copilot-sdk/python/recipe/ralph_loop.py index 00ecadcc..5e9d9422 100644 --- a/cookbook/copilot-sdk/python/recipe/ralph_loop.py +++ b/cookbook/copilot-sdk/python/recipe/ralph_loop.py @@ -1,127 +1,84 @@ #!/usr/bin/env python3 +""" +Ralph loop: autonomous AI task loop with fresh context per iteration. + +Two modes: + - "plan": reads PROMPT_plan.md, generates/updates IMPLEMENTATION_PLAN.md + - "build": reads PROMPT_build.md, implements tasks, runs tests, commits + +Each iteration creates a fresh session so the agent always operates in +the "smart zone" of its context window. 
State is shared between +iterations via files on disk (IMPLEMENTATION_PLAN.md, AGENTS.md, specs/*). + +Usage: + python ralph_loop.py # build mode, 50 iterations + python ralph_loop.py plan # planning mode + python ralph_loop.py 20 # build mode, 20 iterations + python ralph_loop.py plan 5 # planning mode, 5 iterations +""" + import asyncio +import subprocess +import sys +from pathlib import Path from copilot import CopilotClient, MessageOptions, SessionConfig -class RalphLoop: - """ - RALPH-loop implementation: Iterative self-referential AI loops. +async def ralph_loop(mode: str = "build", max_iterations: int = 50): + prompt_file = "PROMPT_plan.md" if mode == "plan" else "PROMPT_build.md" - The same prompt is sent repeatedly, with AI reading its own previous output. - Loop continues until completion promise is detected in the response. - """ + client = CopilotClient() + await client.start() - def __init__(self, max_iterations=10, completion_promise="COMPLETE"): - """Initialize RALPH-loop with iteration limits and completion detection.""" - self.client = CopilotClient() - self.iteration = 0 - self.max_iterations = max_iterations - self.completion_promise = completion_promise - self.last_response = None + branch = subprocess.check_output( + ["git", "branch", "--show-current"], text=True + ).strip() - async def run(self, initial_prompt): - """ - Run the RALPH-loop until completion promise is detected or max iterations reached. 
- """ - session = None - await self.client.start() - try: - session = await self.client.create_session( - SessionConfig(model="gpt-5.1-codex-mini") - ) - - try: - while self.iteration < self.max_iterations: - self.iteration += 1 - print(f"\n=== Iteration {self.iteration}/{self.max_iterations} ===") - - current_prompt = self._build_iteration_prompt(initial_prompt) - print(f"Sending prompt (length: {len(current_prompt)})...") - - result = await session.send_and_wait( - MessageOptions(prompt=current_prompt), - timeout=300, - ) - - self.last_response = result.data.content if result else "" - - # Display response summary - summary = ( - self.last_response[:200] + "..." - if len(self.last_response) > 200 - else self.last_response - ) - print(f"Response: {summary}") - - # Check for completion promise - if self.completion_promise in self.last_response: - print( - f"\n✓ Success! Completion promise detected: '{self.completion_promise}'" - ) - return self.last_response - - print( - f"Iteration {self.iteration} complete. Checking for next iteration..." - ) - - raise RuntimeError( - f"Maximum iterations ({self.max_iterations}) reached without " - f"detecting completion promise: '{self.completion_promise}'" - ) - - except Exception as e: - print(f"\nError during RALPH-loop: {e}") - raise - finally: - if session is not None: - await session.destroy() - finally: - await self.client.stop() - - def _build_iteration_prompt(self, initial_prompt): - """Build the prompt for the current iteration, including previous output as context.""" - if self.iteration == 1: - return initial_prompt - - return f"""{initial_prompt} - -=== CONTEXT FROM PREVIOUS ITERATION === -{self.last_response} -=== END CONTEXT === - -Continue working on this task. Review the previous attempt and improve upon it.""" - - -async def main(): - """Example usage demonstrating RALPH-loop.""" - prompt = """You are iteratively building a small library. Follow these phases IN ORDER. 
-Do NOT skip ahead — only do the current phase, then stop and wait for the next iteration. - -Phase 1: Design a DataValidator class that validates records against a schema. - - Schema defines field names, types (str, int, float, bool), and whether required. - - Return a list of validation errors per record. - - Show the class code only. Do NOT output COMPLETE. - -Phase 2: Write at least 4 unit tests covering: missing required field, wrong type, - valid record, and empty input. Show test code only. Do NOT output COMPLETE. - -Phase 3: Review the code from phases 1 and 2. Fix any bugs, add docstrings, and add - an extra edge-case test. Show the final consolidated code with all fixes. - When this phase is fully done, output the exact text: COMPLETE""" - - loop = RalphLoop(max_iterations=5, completion_promise="COMPLETE") + print("━" * 40) + print(f"Mode: {mode}") + print(f"Prompt: {prompt_file}") + print(f"Branch: {branch}") + print(f"Max: {max_iterations} iterations") + print("━" * 40) try: - result = await loop.run(prompt) - print("\n=== FINAL RESULT ===") - print(result) - except RuntimeError as e: - print(f"\nTask did not complete: {e}") - if loop.last_response: - print(f"\nLast attempt:\n{loop.last_response}") + prompt = Path(prompt_file).read_text() + + for i in range(1, max_iterations + 1): + print(f"\n=== Iteration {i}/{max_iterations} ===") + + # Fresh session — each task gets full context budget + session = await client.create_session( + SessionConfig(model="claude-sonnet-4.5") + ) + try: + await session.send_and_wait( + MessageOptions(prompt=prompt), timeout=600 + ) + finally: + await session.destroy() + + # Push changes after each iteration + try: + subprocess.run( + ["git", "push", "origin", branch], check=True + ) + except subprocess.CalledProcessError: + subprocess.run( + ["git", "push", "-u", "origin", branch], check=True + ) + + print(f"\nIteration {i} complete.") + + print(f"\nReached max iterations: {max_iterations}") + finally: + await client.stop() 
 if __name__ == "__main__":
-    asyncio.run(main())
+    args = sys.argv[1:]
+    mode = "plan" if "plan" in args else "build"
+    max_iter = next((int(a) for a in args if a.isdigit()), 50)
+    asyncio.run(ralph_loop(mode, max_iter))
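
The argument handling in the `__main__` block could be pulled into a small function so it can be unit-tested without starting a client. The following is only a sketch: `parse_cli_args` and `default_max` are hypothetical names that do not appear in this PR. It also swaps `str.isdigit()` for `str.isdecimal()`, because `isdigit()` accepts characters such as `²` that `int()` rejects.

```python
def parse_cli_args(argv: list[str], default_max: int = 50) -> tuple[str, int]:
    """Return (mode, max_iterations) from arguments such as ["plan", "5"].

    Hypothetical helper mirroring the recipe's inline parsing; not part of the PR.
    """
    # Any occurrence of the word "plan" selects planning mode; otherwise build.
    mode = "plan" if "plan" in argv else "build"
    # isdecimal() only matches characters that int() can parse, so the
    # conversion below cannot raise ValueError.
    max_iterations = next((int(a) for a in argv if a.isdecimal()), default_max)
    return mode, max_iterations
```

With this helper, the `__main__` block would reduce to `mode, max_iter = parse_cli_args(sys.argv[1:])` followed by the existing `asyncio.run(ralph_loop(mode, max_iter))` call.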