Rewrite Ralph loop recipes: split into simple vs ideal versions

Align all 4 language recipes (Node.js, Python, .NET, Go) with the
Ralph Playbook architecture:

- Simple version: minimal outer loop with fresh session per iteration
- Ideal version: planning/building modes, backpressure, git integration
- Fresh context isolation instead of in-session context accumulation
- Disk-based shared state via IMPLEMENTATION_PLAN.md
- Example prompt templates (PROMPT_plan.md, PROMPT_build.md, AGENTS.md)
- Updated cookbook README descriptions
Anthony Shaw
2026-02-11 11:28:41 -08:00
parent ab82accc08
commit 952372c1ec
9 changed files with 1052 additions and 1122 deletions


@@ -6,7 +6,7 @@ This cookbook collects small, focused recipes showing how to accomplish common t
 ### .NET (C#)
-- [RALPH-loop](dotnet/ralph-loop.md): Implement iterative self-referential AI loops for task completion with automatic retries.
+- [Ralph Loop](dotnet/ralph-loop.md): Build autonomous AI coding loops with fresh context per iteration, planning/building modes, and backpressure.
 - [Error Handling](dotnet/error-handling.md): Handle errors gracefully including connection failures, timeouts, and cleanup.
 - [Multiple Sessions](dotnet/multiple-sessions.md): Manage multiple independent conversations simultaneously.
 - [Managing Local Files](dotnet/managing-local-files.md): Organize files by metadata using AI-powered grouping strategies.
@@ -15,7 +15,7 @@ This cookbook collects small, focused recipes showing how to accomplish common t
 ### Node.js / TypeScript
-- [RALPH-loop](nodejs/ralph-loop.md): Implement iterative self-referential AI loops for task completion with automatic retries.
+- [Ralph Loop](nodejs/ralph-loop.md): Build autonomous AI coding loops with fresh context per iteration, planning/building modes, and backpressure.
 - [Error Handling](nodejs/error-handling.md): Handle errors gracefully including connection failures, timeouts, and cleanup.
 - [Multiple Sessions](nodejs/multiple-sessions.md): Manage multiple independent conversations simultaneously.
 - [Managing Local Files](nodejs/managing-local-files.md): Organize files by metadata using AI-powered grouping strategies.
@@ -24,7 +24,7 @@ This cookbook collects small, focused recipes showing how to accomplish common t
 ### Python
-- [RALPH-loop](python/ralph-loop.md): Implement iterative self-referential AI loops for task completion with automatic retries.
+- [Ralph Loop](python/ralph-loop.md): Build autonomous AI coding loops with fresh context per iteration, planning/building modes, and backpressure.
 - [Error Handling](python/error-handling.md): Handle errors gracefully including connection failures, timeouts, and cleanup.
 - [Multiple Sessions](python/multiple-sessions.md): Manage multiple independent conversations simultaneously.
 - [Managing Local Files](python/managing-local-files.md): Organize files by metadata using AI-powered grouping strategies.
@@ -33,7 +33,7 @@ This cookbook collects small, focused recipes showing how to accomplish common t
 ### Go
-- [RALPH-loop](go/ralph-loop.md): Implement iterative self-referential AI loops for task completion with automatic retries.
+- [Ralph Loop](go/ralph-loop.md): Build autonomous AI coding loops with fresh context per iteration, planning/building modes, and backpressure.
 - [Error Handling](go/error-handling.md): Handle errors gracefully including connection failures, timeouts, and cleanup.
 - [Multiple Sessions](go/multiple-sessions.md): Manage multiple independent conversations simultaneously.
 - [Managing Local Files](go/managing-local-files.md): Organize files by metadata using AI-powered grouping strategies.


@@ -1,6 +1,6 @@
# RALPH-loop: Iterative Self-Referential AI Loops # Ralph Loop: Autonomous AI Task Loops
Implement self-referential feedback loops where an AI agent iteratively improves work by reading its own previous output. Build autonomous coding loops where an AI agent picks tasks, implements them, validates against backpressure (tests, builds), commits, and repeats — each iteration in a fresh context window.
> **Runnable example:** [recipe/ralph-loop.cs](recipe/ralph-loop.cs) > **Runnable example:** [recipe/ralph-loop.cs](recipe/ralph-loop.cs)
> >
@@ -9,252 +9,250 @@ Implement self-referential feedback loops where an AI agent iteratively improves
> dotnet run recipe/ralph-loop.cs > dotnet run recipe/ralph-loop.cs
> ``` > ```
## What is RALPH-loop? ## What is a Ralph Loop?
RALPH-loop is a development methodology for iterative AI-powered task completion. Named after the Ralph Wiggum technique, it embodies the philosophy of persistent iteration: A [Ralph loop](https://ghuntley.com/ralph/) is an autonomous development workflow where an AI agent iterates through tasks in isolated context windows. The key insight: **state lives on disk, not in the model's context**. Each iteration starts fresh, reads the current state from files, does one task, writes results back to disk, and exits.
- **One prompt, multiple iterations**: The same prompt is processed repeatedly ```
- **Self-referential feedback**: The AI reads its own previous work (file changes, git history) ┌─────────────────────────────────────────────────┐
- **Completion detection**: Loop exits when a completion promise is detected in output │ loop.sh │
- **Safety limits**: Always include a maximum iteration count to prevent infinite loops │ while true: │
│ ┌─────────────────────────────────────────┐ │
│ │ Fresh session (isolated context) │ │
│ │ │ │
│ │ 1. Read PROMPT.md + AGENTS.md │ │
│ │ 2. Study specs/* and code │ │
│ │ 3. Pick next task from plan │ │
│ │ 4. Implement + run tests │ │
│ │ 5. Update plan, commit, exit │ │
│ └─────────────────────────────────────────┘ │
│ ↻ next iteration (fresh context) │
└─────────────────────────────────────────────────┘
```
## Example Scenario **Core principles:**
You need to iteratively improve code until all tests pass. Instead of asking the model to "write perfect code," you use RALPH-loop to: - **Fresh context per iteration**: Each loop creates a new session — no context accumulation, always in the "smart zone"
- **Disk as shared state**: `IMPLEMENTATION_PLAN.md` persists between iterations and acts as the coordination mechanism
- **Backpressure steers quality**: Tests, builds, and lints reject bad work — the agent must fix issues before committing
- **Two modes**: PLANNING (gap analysis → generate plan) and BUILDING (implement from plan)
1. Send the initial prompt with clear success criteria ## Simple Version
2. The model writes code and tests
3. The model runs tests and sees failures
4. Loop automatically re-sends the prompt
5. The model reads test output and previous code, fixes issues
6. Repeat until all tests pass and completion promise is output
## Basic Implementation The minimal Ralph loop — the SDK equivalent of `while :; do cat PROMPT.md | claude ; done`:
```csharp ```csharp
using GitHub.Copilot.SDK; using GitHub.Copilot.SDK;
public class RalphLoop var client = new CopilotClient();
await client.StartAsync();
try
{ {
private readonly CopilotClient _client; var prompt = await File.ReadAllTextAsync("PROMPT.md");
private int _iteration = 0; var maxIterations = 50;
private readonly int _maxIterations;
private readonly string _completionPromise;
private string? _lastResponse;
public RalphLoop(int maxIterations = 10, string completionPromise = "COMPLETE") for (var i = 1; i <= maxIterations; i++)
{ {
_client = new CopilotClient(); Console.WriteLine($"\n=== Iteration {i}/{maxIterations} ===");
_maxIterations = maxIterations;
_completionPromise = completionPromise;
}
public async Task<string> RunAsync(string prompt)
{
await _client.StartAsync();
// Fresh session each iteration — context isolation is the point
var session = await client.CreateSessionAsync(
new SessionConfig { Model = "claude-sonnet-4.5" });
try try
{ {
var session = await _client.CreateSessionAsync( var done = new TaskCompletionSource<string>();
new SessionConfig { Model = "gpt-5.1-codex-mini" }); session.On(evt =>
try
{ {
var done = new TaskCompletionSource<string>(); if (evt is AssistantMessageEvent msg)
session.On(evt => done.TrySetResult(msg.Data.Content);
{ });
if (evt is AssistantMessageEvent msg)
{
_lastResponse = msg.Data.Content;
done.TrySetResult(msg.Data.Content);
}
});
while (_iteration < _maxIterations) await session.SendAsync(new MessageOptions { Prompt = prompt });
{ await done.Task;
_iteration++;
Console.WriteLine($"\n--- Iteration {_iteration} ---");
done = new TaskCompletionSource<string>();
// Send prompt (on first iteration) or continuation
var messagePrompt = _iteration == 1
? prompt
: $"{prompt}\n\nPrevious attempt:\n{_lastResponse}\n\nContinue iterating...";
await session.SendAsync(new MessageOptions { Prompt = messagePrompt });
var response = await done.Task;
// Check for completion promise
if (response.Contains(_completionPromise))
{
Console.WriteLine($"✓ Completion promise detected: {_completionPromise}");
return response;
}
Console.WriteLine($"Iteration {_iteration} complete. Continuing...");
}
throw new InvalidOperationException(
$"Max iterations ({_maxIterations}) reached without completion promise");
}
finally
{
await session.DisposeAsync();
}
} }
finally finally
{ {
await _client.StopAsync(); await session.DisposeAsync();
} }
Console.WriteLine($"Iteration {i} complete.");
} }
} }
finally
// Usage {
var loop = new RalphLoop(maxIterations: 5, completionPromise: "COMPLETE"); await client.StopAsync();
var result = await loop.RunAsync("Your task here"); }
Console.WriteLine(result);
``` ```
## With File Persistence This is all you need to get started. The prompt file tells the agent what to do; the agent reads project files, does work, commits, and exits. The loop restarts with a clean slate.
For tasks involving code generation, persist state to files so the AI can see changes: ## Ideal Version
The full Ralph pattern with planning and building modes, matching the [Ralph Playbook](https://github.com/ClaytonFarr/ralph-playbook) architecture:
```csharp ```csharp
public class PersistentRalphLoop using System.Diagnostics;
using GitHub.Copilot.SDK;
// Parse args: dotnet run [plan] [max_iterations]
var mode = args.Contains("plan") ? "plan" : "build";
var maxArg = args.FirstOrDefault(a => int.TryParse(a, out _));
var maxIterations = maxArg != null ? int.Parse(maxArg) : 50;
var promptFile = mode == "plan" ? "PROMPT_plan.md" : "PROMPT_build.md";
var client = new CopilotClient();
await client.StartAsync();
var branchInfo = new ProcessStartInfo("git", "branch --show-current")
{ RedirectStandardOutput = true };
var branch = Process.Start(branchInfo)!;
var branchName = (await branch.StandardOutput.ReadToEndAsync()).Trim();
await branch.WaitForExitAsync();
Console.WriteLine(new string('━', 40));
Console.WriteLine($"Mode: {mode}");
Console.WriteLine($"Prompt: {promptFile}");
Console.WriteLine($"Branch: {branchName}");
Console.WriteLine($"Max: {maxIterations} iterations");
Console.WriteLine(new string('━', 40));
try
{ {
private readonly string _workDir; var prompt = await File.ReadAllTextAsync(promptFile);
private readonly CopilotClient _client;
private readonly int _maxIterations;
private int _iteration = 0;
public PersistentRalphLoop(string workDir, int maxIterations = 10) for (var i = 1; i <= maxIterations; i++)
{ {
_workDir = workDir; Console.WriteLine($"\n=== Iteration {i}/{maxIterations} ===");
_maxIterations = maxIterations;
Directory.CreateDirectory(_workDir);
_client = new CopilotClient();
}
public async Task<string> RunAsync(string prompt)
{
await _client.StartAsync();
// Fresh session — each task gets full context budget
var session = await client.CreateSessionAsync(
new SessionConfig { Model = "claude-sonnet-4.5" });
try try
{ {
var session = await _client.CreateSessionAsync( var done = new TaskCompletionSource<string>();
new SessionConfig { Model = "gpt-5.1-codex-mini" }); session.On(evt =>
try
{ {
// Store initial prompt if (evt is AssistantMessageEvent msg)
var promptFile = Path.Combine(_workDir, "prompt.md"); done.TrySetResult(msg.Data.Content);
await File.WriteAllTextAsync(promptFile, prompt); });
var done = new TaskCompletionSource<string>(); await session.SendAsync(new MessageOptions { Prompt = prompt });
string response = ""; await done.Task;
session.On(evt =>
{
if (evt is AssistantMessageEvent msg)
{
response = msg.Data.Content;
done.TrySetResult(msg.Data.Content);
}
});
while (_iteration < _maxIterations)
{
_iteration++;
Console.WriteLine($"\n--- Iteration {_iteration} ---");
done = new TaskCompletionSource<string>();
// Build context including previous work
var contextBuilder = new StringBuilder(prompt);
var previousOutput = Path.Combine(_workDir, $"output-{_iteration - 1}.txt");
if (File.Exists(previousOutput))
{
contextBuilder.AppendLine($"\nPrevious iteration output:\n{await File.ReadAllTextAsync(previousOutput)}");
}
await session.SendAsync(new MessageOptions { Prompt = contextBuilder.ToString() });
await done.Task;
// Persist output
await File.WriteAllTextAsync(
Path.Combine(_workDir, $"output-{_iteration}.txt"),
response);
if (response.Contains("COMPLETE"))
{
return response;
}
}
throw new InvalidOperationException("Max iterations reached");
}
finally
{
await session.DisposeAsync();
}
} }
finally finally
{ {
await _client.StopAsync(); await session.DisposeAsync();
} }
// Push changes after each iteration
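// Process.Start does not throw when git exits non-zero, so check the exit code
var push = Process.Start("git", $"push origin {branchName}")!;
push.WaitForExit();
if (push.ExitCode != 0)
{
// Plain push failed (for example, no upstream yet): retry and set the upstream
Process.Start("git", $"push -u origin {branchName}")!.WaitForExit();
}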
Console.WriteLine($"\nIteration {i} complete.");
} }
Console.WriteLine($"\nReached max iterations: {maxIterations}");
} }
finally
{
await client.StopAsync();
}
```
### Required Project Files
The ideal version expects this file structure in your project:
```
project-root/
├── PROMPT_plan.md # Planning mode instructions
├── PROMPT_build.md # Building mode instructions
├── AGENTS.md # Operational guide (build/test commands)
├── IMPLEMENTATION_PLAN.md # Task list (generated by planning mode)
├── specs/ # Requirement specs (one per topic)
│ ├── auth.md
│ └── data-pipeline.md
└── src/ # Your source code
```
### Example `PROMPT_plan.md`
```markdown
0a. Study `specs/*` to learn the application specifications.
0b. Study IMPLEMENTATION_PLAN.md (if present) to understand the plan so far.
0c. Study `src/` to understand existing code and shared utilities.
1. Compare specs against code (gap analysis). Create or update
IMPLEMENTATION_PLAN.md as a prioritized bullet-point list of tasks
yet to be implemented. Do NOT implement anything.
IMPORTANT: Do NOT assume functionality is missing — search the
codebase first to confirm. Prefer updating existing utilities over
creating ad-hoc copies.
```
### Example `PROMPT_build.md`
```markdown
0a. Study `specs/*` to learn the application specifications.
0b. Study IMPLEMENTATION_PLAN.md.
0c. Study `src/` for reference.
1. Choose the most important item from IMPLEMENTATION_PLAN.md. Before
making changes, search the codebase (don't assume not implemented).
2. After implementing, run the tests. If functionality is missing, add it.
3. When you discover issues, update IMPLEMENTATION_PLAN.md immediately.
4. When tests pass, update IMPLEMENTATION_PLAN.md, then `git add -A`
then `git commit` with a descriptive message.
99999. When authoring documentation, capture the why.
999999. Implement completely. No placeholders or stubs.
9999999. Keep IMPLEMENTATION_PLAN.md current — future iterations depend on it.
```
### Example `AGENTS.md`
Keep this brief (~60 lines). It's loaded every iteration, so bloat wastes context.
```markdown
## Build & Run
dotnet build
## Validation
- Tests: `dotnet test`
- Build: `dotnet build --no-restore`
``` ```
## Best Practices ## Best Practices
1. **Write clear completion criteria**: Include exactly what "done" looks like 1. **Fresh context per iteration**: Never accumulate context across iterations — that's the whole point
2. **Use output markers**: Include `<promise>COMPLETE</promise>` or similar in completion condition 2. **Disk is your database**: `IMPLEMENTATION_PLAN.md` is shared state between isolated sessions
3. **Always set max iterations**: Prevents infinite loops on impossible tasks 3. **Backpressure is essential**: Tests, builds, lints in `AGENTS.md` — the agent must pass them before committing
4. **Persist state**: Save files so AI can see what changed between iterations 4. **Start with PLANNING mode**: Generate the plan first, then switch to BUILDING
5. **Include context**: Feed previous iteration output back as context 5. **Observe and tune**: Watch early iterations, add guardrails to prompts when the agent fails in specific ways
6. **Monitor progress**: Log each iteration to track what's happening 6. **The plan is disposable**: If the agent goes off track, delete `IMPLEMENTATION_PLAN.md` and re-plan
7. **Keep `AGENTS.md` brief**: It's loaded every iteration — operational info only, no progress notes
8. **Use a sandbox**: The agent runs autonomously with full tool access — isolate it
## Example: Iterative Code Generation ## When to Use a Ralph Loop
```csharp
var prompt = @"Write a function that:
1. Parses CSV data
2. Validates required fields
3. Returns parsed records or error
4. Has unit tests
5. Output <promise>COMPLETE</promise> when done";
var loop = new RalphLoop(maxIterations: 10, completionPromise: "COMPLETE");
var result = await loop.RunAsync(prompt);
```
## Handling Failures
```csharp
try
{
var result = await loop.RunAsync(prompt);
Console.WriteLine("Task completed successfully!");
}
catch (InvalidOperationException ex) when (ex.Message.Contains("Max iterations"))
{
Console.WriteLine("Task did not complete within iteration limit.");
Console.WriteLine($"Last response: {loop.LastResponse}");
// Document what was attempted and suggest alternatives
}
```
## When to Use RALPH-loop
**Good for:** **Good for:**
- Code generation with automatic verification (tests, linters) - Implementing features from specs with test-driven validation
- Tasks with clear success criteria - Large refactors broken into many small tasks
- Iterative refinement where each attempt learns from previous failures - Unattended, long-running development with clear requirements
- Unattended long-running improvements - Any work where backpressure (tests/builds) can verify correctness
**Not good for:** **Not good for:**
- Tasks requiring human judgment or design input - Tasks requiring human judgment mid-loop
- One-shot operations - One-shot operations that don't benefit from iteration
- Tasks with vague success criteria - Vague requirements without testable acceptance criteria
- Real-time interactive debugging - Exploratory prototyping where direction isn't clear


@@ -1,141 +1,90 @@
#:package GitHub.Copilot.SDK@* #:package GitHub.Copilot.SDK@*
#:property PublishAot=false #:property PublishAot=false
using System.Diagnostics;
using GitHub.Copilot.SDK; using GitHub.Copilot.SDK;
using System.Text;
// RALPH-loop: Iterative self-referential AI loops. // Ralph loop: autonomous AI task loop with fresh context per iteration.
// The same prompt is sent repeatedly, with AI reading its own previous output. //
// Loop continues until completion promise is detected in the response. // Two modes:
// - "plan": reads PROMPT_plan.md, generates/updates IMPLEMENTATION_PLAN.md
// - "build": reads PROMPT_build.md, implements tasks, runs tests, commits
//
// Each iteration creates a fresh session so the agent always operates in
// the "smart zone" of its context window. State is shared between
// iterations via files on disk (IMPLEMENTATION_PLAN.md, AGENTS.md, specs/*).
//
// Usage:
// dotnet run # build mode, 50 iterations
// dotnet run plan # planning mode
// dotnet run 20 # build mode, 20 iterations
// dotnet run plan 5 # planning mode, 5 iterations
var prompt = @"You are iteratively building a small library. Follow these phases IN ORDER. var mode = args.Contains("plan") ? "plan" : "build";
Do NOT skip ahead — only do the current phase, then stop and wait for the next iteration. var maxArg = args.FirstOrDefault(a => int.TryParse(a, out _));
var maxIterations = maxArg != null ? int.Parse(maxArg) : 50;
var promptFile = mode == "plan" ? "PROMPT_plan.md" : "PROMPT_build.md";
Phase 1: Design a DataValidator class that validates records against a schema. var client = new CopilotClient();
- Schema defines field names, types (string, int, float, bool), and whether required. await client.StartAsync();
- Return a list of validation errors per record.
- Show the class code only. Do NOT output COMPLETE.
Phase 2: Write at least 4 unit tests covering: missing required field, wrong type, var branchProc = Process.Start(new ProcessStartInfo("git", "branch --show-current")
valid record, and empty input. Show test code only. Do NOT output COMPLETE. { RedirectStandardOutput = true })!;
var branch = (await branchProc.StandardOutput.ReadToEndAsync()).Trim();
await branchProc.WaitForExitAsync();
Phase 3: Review the code from phases 1 and 2. Fix any bugs, add docstrings, and add Console.WriteLine(new string('━', 40));
an extra edge-case test. Show the final consolidated code with all fixes. Console.WriteLine($"Mode: {mode}");
When this phase is fully done, output the exact text: COMPLETE"; Console.WriteLine($"Prompt: {promptFile}");
Console.WriteLine($"Branch: {branch}");
var loop = new RalphLoop(maxIterations: 5, completionPromise: "COMPLETE"); Console.WriteLine($"Max: {maxIterations} iterations");
Console.WriteLine(new string('━', 40));
try try
{ {
var result = await loop.RunAsync(prompt); var prompt = await File.ReadAllTextAsync(promptFile);
Console.WriteLine("\n=== FINAL RESULT ===");
Console.WriteLine(result); for (var i = 1; i <= maxIterations; i++)
}
catch (InvalidOperationException ex)
{
Console.WriteLine($"\nTask did not complete: {ex.Message}");
if (loop.LastResponse != null)
{ {
Console.WriteLine($"\nLast attempt:\n{loop.LastResponse}"); Console.WriteLine($"\n=== Iteration {i}/{maxIterations} ===");
}
}
// --- RalphLoop class definition --- // Fresh session — each task gets full context budget
var session = await client.CreateSessionAsync(
public class RalphLoop new SessionConfig { Model = "claude-sonnet-4.5" });
{
private readonly CopilotClient _client;
private int _iteration = 0;
private readonly int _maxIterations;
private readonly string _completionPromise;
private string? _lastResponse;
public RalphLoop(int maxIterations = 10, string completionPromise = "COMPLETE")
{
_client = new CopilotClient();
_maxIterations = maxIterations;
_completionPromise = completionPromise;
}
public string? LastResponse => _lastResponse;
public async Task<string> RunAsync(string initialPrompt)
{
await _client.StartAsync();
try try
{ {
var session = await _client.CreateSessionAsync(new SessionConfig var done = new TaskCompletionSource<string>();
session.On(evt =>
{ {
Model = "gpt-5.1-codex-mini" if (evt is AssistantMessageEvent msg)
done.TrySetResult(msg.Data.Content);
}); });
try await session.SendAsync(new MessageOptions { Prompt = prompt });
{ await done.Task;
var done = new TaskCompletionSource<string>();
session.On(evt =>
{
if (evt is AssistantMessageEvent msg)
{
_lastResponse = msg.Data.Content;
done.TrySetResult(msg.Data.Content);
}
});
while (_iteration < _maxIterations)
{
_iteration++;
Console.WriteLine($"\n=== Iteration {_iteration}/{_maxIterations} ===");
done = new TaskCompletionSource<string>();
var currentPrompt = BuildIterationPrompt(initialPrompt);
Console.WriteLine($"Sending prompt (length: {currentPrompt.Length})...");
await session.SendAsync(new MessageOptions { Prompt = currentPrompt });
var response = await done.Task;
var summary = response.Length > 200
? response.Substring(0, 200) + "..."
: response;
Console.WriteLine($"Response: {summary}");
if (response.Contains(_completionPromise))
{
Console.WriteLine($"\n✓ Completion promise detected: '{_completionPromise}'");
return response;
}
Console.WriteLine($"Iteration {_iteration} complete. Continuing...");
}
throw new InvalidOperationException(
$"Max iterations ({_maxIterations}) reached without completion promise: '{_completionPromise}'");
}
finally
{
await session.DisposeAsync();
}
} }
finally finally
{ {
await _client.StopAsync(); await session.DisposeAsync();
} }
// Push changes after each iteration
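// Process.Start does not throw when git exits non-zero, so check the exit code
var push = Process.Start("git", $"push origin {branch}")!;
push.WaitForExit();
if (push.ExitCode != 0)
{
// Plain push failed (for example, no upstream yet): retry and set the upstream
Process.Start("git", $"push -u origin {branch}")!.WaitForExit();
}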
Console.WriteLine($"\nIteration {i} complete.");
} }
private string BuildIterationPrompt(string initialPrompt) Console.WriteLine($"\nReached max iterations: {maxIterations}");
{ }
if (_iteration == 1) finally
return initialPrompt; {
await client.StopAsync();
var sb = new StringBuilder();
sb.AppendLine(initialPrompt);
sb.AppendLine();
sb.AppendLine("=== CONTEXT FROM PREVIOUS ITERATION ===");
sb.AppendLine(_lastResponse);
sb.AppendLine("=== END CONTEXT ===");
sb.AppendLine();
sb.AppendLine("Continue working on this task. Review the previous attempt and improve upon it.");
return sb.ToString();
}
} }


@@ -1,6 +1,6 @@
# RALPH-loop: Iterative Self-Referential AI Loops # Ralph Loop: Autonomous AI Task Loops
Implement self-referential feedback loops where an AI agent iteratively improves work by reading its own previous output. Build autonomous coding loops where an AI agent picks tasks, implements them, validates against backpressure (tests, builds), commits, and repeats — each iteration in a fresh context window.
> **Runnable example:** [recipe/ralph-loop.go](recipe/ralph-loop.go) > **Runnable example:** [recipe/ralph-loop.go](recipe/ralph-loop.go)
> >
@@ -9,27 +9,37 @@ Implement self-referential feedback loops where an AI agent iteratively improves
> go run recipe/ralph-loop.go > go run recipe/ralph-loop.go
> ``` > ```
## What is RALPH-loop? ## What is a Ralph Loop?
RALPH-loop is a development methodology for iterative AI-powered task completion. Named after the Ralph Wiggum technique, it embodies the philosophy of persistent iteration: A [Ralph loop](https://ghuntley.com/ralph/) is an autonomous development workflow where an AI agent iterates through tasks in isolated context windows. The key insight: **state lives on disk, not in the model's context**. Each iteration starts fresh, reads the current state from files, does one task, writes results back to disk, and exits.
- **One prompt, multiple iterations**: The same prompt is processed repeatedly ```
- **Self-referential feedback**: The AI reads its own previous work (file changes, git history) ┌─────────────────────────────────────────────────┐
- **Completion detection**: Loop exits when a completion promise is detected in output │ loop.sh │
- **Safety limits**: Always include a maximum iteration count to prevent infinite loops │ while true: │
│ ┌─────────────────────────────────────────┐ │
│ │ Fresh session (isolated context) │ │
│ │ │ │
│ │ 1. Read PROMPT.md + AGENTS.md │ │
│ │ 2. Study specs/* and code │ │
│ │ 3. Pick next task from plan │ │
│ │ 4. Implement + run tests │ │
│ │ 5. Update plan, commit, exit │ │
│ └─────────────────────────────────────────┘ │
│ ↻ next iteration (fresh context) │
└─────────────────────────────────────────────────┘
```
## Example Scenario **Core principles:**
You need to iteratively improve code until all tests pass. Instead of asking Copilot to "write perfect code," you use RALPH-loop to: - **Fresh context per iteration**: Each loop creates a new session — no context accumulation, always in the "smart zone"
- **Disk as shared state**: `IMPLEMENTATION_PLAN.md` persists between iterations and acts as the coordination mechanism
- **Backpressure steers quality**: Tests, builds, and lints reject bad work — the agent must fix issues before committing
- **Two modes**: PLANNING (gap analysis → generate plan) and BUILDING (implement from plan)
1. Send the initial prompt with clear success criteria ## Simple Version
2. Copilot writes code and tests
3. Copilot runs tests and sees failures
4. Loop automatically re-sends the prompt
5. Copilot reads test output and previous code, fixes issues
6. Repeat until all tests pass and completion promise is output
## Basic Implementation The minimal Ralph loop — the SDK equivalent of `while :; do cat PROMPT.md | claude ; done`:
```go ```go
package main package main
@@ -38,81 +48,59 @@ import (
"context" "context"
"fmt" "fmt"
"log" "log"
"strings" "os"
copilot "github.com/github/copilot-sdk/go" copilot "github.com/github/copilot-sdk/go"
) )
type RalphLoop struct { func ralphLoop(ctx context.Context, promptFile string, maxIterations int) error {
client *copilot.Client client := copilot.NewClient(nil)
iteration int if err := client.Start(ctx); err != nil {
maxIterations int return err
completionPromise string
LastResponse string
}
func NewRalphLoop(maxIterations int, completionPromise string) *RalphLoop {
return &RalphLoop{
client: copilot.NewClient(nil),
maxIterations: maxIterations,
completionPromise: completionPromise,
} }
} defer client.Stop()
func (r *RalphLoop) Run(ctx context.Context, initialPrompt string) (string, error) { prompt, err := os.ReadFile(promptFile)
if err := r.client.Start(ctx); err != nil {
return "", err
}
defer r.client.Stop()
session, err := r.client.CreateSession(ctx, &copilot.SessionConfig{
Model: "gpt-5.1-codex-mini",
})
if err != nil { if err != nil {
return "", err return err
} }
defer session.Destroy()
for r.iteration < r.maxIterations { for i := 1; i <= maxIterations; i++ {
r.iteration++ fmt.Printf("\n=== Iteration %d/%d ===\n", i, maxIterations)
fmt.Printf("\n--- Iteration %d/%d ---\n", r.iteration, r.maxIterations)
prompt := r.buildIterationPrompt(initialPrompt) // Fresh session each iteration — context isolation is the point
session, err := client.CreateSession(ctx, &copilot.SessionConfig{
result, err := session.SendAndWait(ctx, copilot.MessageOptions{Prompt: prompt}) Model: "claude-sonnet-4.5",
})
if err != nil { if err != nil {
return "", err return err
} }
if result != nil && result.Data.Content != nil { _, err = session.SendAndWait(ctx, copilot.MessageOptions{
r.LastResponse = *result.Data.Content Prompt: string(prompt),
})
session.Destroy()
if err != nil {
return err
} }
if strings.Contains(r.LastResponse, r.completionPromise) { fmt.Printf("Iteration %d complete.\n", i)
fmt.Printf("✓ Completion promise detected: %s\n", r.completionPromise)
return r.LastResponse, nil
}
} }
return nil
return "", fmt.Errorf("max iterations (%d) reached without completion promise",
r.maxIterations)
} }
// Usage
func main() { func main() {
ctx := context.Background() if err := ralphLoop(context.Background(), "PROMPT.md", 20); err != nil {
loop := NewRalphLoop(5, "COMPLETE")
result, err := loop.Run(ctx, "Your task here")
if err != nil {
log.Fatal(err) log.Fatal(err)
} }
fmt.Println(result)
} }
``` ```
## With File Persistence This is all you need to get started. The prompt file tells the agent what to do; the agent reads project files, does work, commits, and exits. The loop restarts with a clean slate.
For tasks involving code generation, persist state to files so the AI can see changes: ## Ideal Version
The full Ralph pattern with planning and building modes, matching the [Ralph Playbook](https://github.com/ClaytonFarr/ralph-playbook) architecture:
```go ```go
package main package main
@@ -120,121 +108,178 @@ package main
import ( import (
"context" "context"
"fmt" "fmt"
"log"
"os" "os"
"path/filepath" "os/exec"
"strconv"
"strings" "strings"
copilot "github.com/github/copilot-sdk/go" copilot "github.com/github/copilot-sdk/go"
) )
type PersistentRalphLoop struct { func ralphLoop(ctx context.Context, mode string, maxIterations int) error {
client *copilot.Client promptFile := "PROMPT_build.md"
workDir string if mode == "plan" {
iteration int promptFile = "PROMPT_plan.md"
maxIterations int
}
func NewPersistentRalphLoop(workDir string, maxIterations int) *PersistentRalphLoop {
os.MkdirAll(workDir, 0755)
return &PersistentRalphLoop{
client: copilot.NewClient(nil),
workDir: workDir,
maxIterations: maxIterations,
} }
}
func (p *PersistentRalphLoop) Run(ctx context.Context, initialPrompt string) (string, error) { client := copilot.NewClient(nil)
if err := p.client.Start(ctx); err != nil { if err := client.Start(ctx); err != nil {
return "", err return err
} }
defer p.client.Stop() defer client.Stop()
os.WriteFile(filepath.Join(p.workDir, "prompt.md"), []byte(initialPrompt), 0644) branchOut, _ := exec.Command("git", "branch", "--show-current").Output()
branch := strings.TrimSpace(string(branchOut))
session, err := p.client.CreateSession(ctx, &copilot.SessionConfig{ fmt.Println(strings.Repeat("━", 40))
Model: "gpt-5.1-codex-mini", fmt.Printf("Mode: %s\n", mode)
}) fmt.Printf("Prompt: %s\n", promptFile)
fmt.Printf("Branch: %s\n", branch)
fmt.Printf("Max: %d iterations\n", maxIterations)
fmt.Println(strings.Repeat("━", 40))
prompt, err := os.ReadFile(promptFile)
if err != nil { if err != nil {
return "", err return err
} }
defer session.Destroy()
for p.iteration < p.maxIterations { for i := 1; i <= maxIterations; i++ {
p.iteration++ fmt.Printf("\n=== Iteration %d/%d ===\n", i, maxIterations)
prompt := initialPrompt // Fresh session — each task gets full context budget
prevFile := filepath.Join(p.workDir, fmt.Sprintf("output-%d.txt", p.iteration-1)) session, err := client.CreateSession(ctx, &copilot.SessionConfig{
if data, err := os.ReadFile(prevFile); err == nil { Model: "claude-sonnet-4.5",
prompt = fmt.Sprintf("%s\n\nPrevious iteration:\n%s", initialPrompt, string(data)) })
}
result, err := session.SendAndWait(ctx, copilot.MessageOptions{Prompt: prompt})
if err != nil { if err != nil {
return "", err return err
} }
response := "" _, err = session.SendAndWait(ctx, copilot.MessageOptions{
if result != nil && result.Data.Content != nil { Prompt: string(prompt),
response = *result.Data.Content })
session.Destroy()
if err != nil {
return err
} }
os.WriteFile(filepath.Join(p.workDir, fmt.Sprintf("output-%d.txt", p.iteration)), // Push changes after each iteration
[]byte(response), 0644) if err := exec.Command("git", "push", "origin", branch).Run(); err != nil {
exec.Command("git", "push", "-u", "origin", branch).Run()
}
if strings.Contains(response, "COMPLETE") { fmt.Printf("\nIteration %d complete.\n", i)
return response, nil }
fmt.Printf("\nReached max iterations: %d\n", maxIterations)
return nil
}
func main() {
mode := "build"
maxIterations := 50
for _, arg := range os.Args[1:] {
if arg == "plan" {
mode = "plan"
} else if n, err := strconv.Atoi(arg); err == nil {
maxIterations = n
} }
} }
return "", fmt.Errorf("max iterations reached") if err := ralphLoop(context.Background(), mode, maxIterations); err != nil {
log.Fatal(err)
}
} }
``` ```
### Required Project Files
The ideal version expects this file structure in your project:
```
project-root/
├── PROMPT_plan.md # Planning mode instructions
├── PROMPT_build.md # Building mode instructions
├── AGENTS.md # Operational guide (build/test commands)
├── IMPLEMENTATION_PLAN.md # Task list (generated by planning mode)
├── specs/ # Requirement specs (one per topic)
│ ├── auth.md
│ └── data-pipeline.md
└── src/ # Your source code
```
### Example `PROMPT_plan.md`
```markdown
0a. Study `specs/*` to learn the application specifications.
0b. Study IMPLEMENTATION_PLAN.md (if present) to understand the plan so far.
0c. Study `src/` to understand existing code and shared utilities.
1. Compare specs against code (gap analysis). Create or update
IMPLEMENTATION_PLAN.md as a prioritized bullet-point list of tasks
yet to be implemented. Do NOT implement anything.
IMPORTANT: Do NOT assume functionality is missing — search the
codebase first to confirm. Prefer updating existing utilities over
creating ad-hoc copies.
```
### Example `PROMPT_build.md`
```markdown
0a. Study `specs/*` to learn the application specifications.
0b. Study IMPLEMENTATION_PLAN.md.
0c. Study `src/` for reference.
1. Choose the most important item from IMPLEMENTATION_PLAN.md. Before
making changes, search the codebase (don't assume not implemented).
2. After implementing, run the tests. If functionality is missing, add it.
3. When you discover issues, update IMPLEMENTATION_PLAN.md immediately.
4. When tests pass, update IMPLEMENTATION_PLAN.md, then `git add -A`
then `git commit` with a descriptive message.
99999. When authoring documentation, capture the why.
999999. Implement completely. No placeholders or stubs.
9999999. Keep IMPLEMENTATION_PLAN.md current — future iterations depend on it.
```
### Example `AGENTS.md`
Keep this brief (~60 lines). It's loaded every iteration, so bloat wastes context.
```markdown
## Build & Run
go build ./...
## Validation
- Tests: `go test ./...`
- Vet: `go vet ./...`
```
## Best Practices ## Best Practices
1. **Write clear completion criteria**: Include exactly what "done" looks like 1. **Fresh context per iteration**: Never accumulate context across iterations — that's the whole point
2. **Use output markers**: Include `<promise>COMPLETE</promise>` or similar in completion condition 2. **Disk is your database**: `IMPLEMENTATION_PLAN.md` is shared state between isolated sessions
3. **Always set max iterations**: Prevents infinite loops on impossible tasks 3. **Backpressure is essential**: Tests, builds, lints in `AGENTS.md` — the agent must pass them before committing
4. **Persist state**: Save files so AI can see what changed between iterations 4. **Start with PLANNING mode**: Generate the plan first, then switch to BUILDING
5. **Include context**: Feed previous iteration output back as context 5. **Observe and tune**: Watch early iterations, add guardrails to prompts when the agent fails in specific ways
6. **Monitor progress**: Log each iteration to track what's happening 6. **The plan is disposable**: If the agent goes off track, delete `IMPLEMENTATION_PLAN.md` and re-plan
7. **Keep `AGENTS.md` brief**: It's loaded every iteration — operational info only, no progress notes
8. **Use a sandbox**: The agent runs autonomously with full tool access — isolate it
## Example: Iterative Code Generation ## When to Use a Ralph Loop
```go
prompt := `Write a function that:
1. Parses CSV data
2. Validates required fields
3. Returns parsed records or error
4. Has unit tests
5. Output <promise>COMPLETE</promise> when done`
loop := NewRalphLoop(10, "COMPLETE")
result, err := loop.Run(context.Background(), prompt)
```
## Handling Failures
```go
ctx := context.Background()
loop := NewRalphLoop(5, "COMPLETE")
result, err := loop.Run(ctx, prompt)
if err != nil {
log.Printf("Task failed: %v", err)
log.Printf("Last attempt: %s", loop.LastResponse)
}
```
## When to Use RALPH-loop
**Good for:** **Good for:**
- Code generation with automatic verification (tests, linters) - Implementing features from specs with test-driven validation
- Tasks with clear success criteria - Large refactors broken into many small tasks
- Iterative refinement where each attempt learns from previous failures - Unattended, long-running development with clear requirements
- Unattended long-running improvements - Any work where backpressure (tests/builds) can verify correctness
**Not good for:** **Not good for:**
- Tasks requiring human judgment or design input - Tasks requiring human judgment mid-loop
- One-shot operations - One-shot operations that don't benefit from iteration
- Tasks with vague success criteria - Vague requirements without testable acceptance criteria
- Real-time interactive debugging - Exploratory prototyping where direction isn't clear


@@ -4,127 +4,101 @@ import (
"context" "context"
"fmt" "fmt"
"log" "log"
"os"
"os/exec"
"strconv"
"strings" "strings"
copilot "github.com/github/copilot-sdk/go" copilot "github.com/github/copilot-sdk/go"
) )
// RalphLoop implements iterative self-referential feedback loops. // Ralph loop: autonomous AI task loop with fresh context per iteration.
// The same prompt is sent repeatedly, with AI reading its own previous output. //
// Loop continues until completion promise is detected in the response. // Two modes:
type RalphLoop struct { // - "plan": reads PROMPT_plan.md, generates/updates IMPLEMENTATION_PLAN.md
client *copilot.Client // - "build": reads PROMPT_build.md, implements tasks, runs tests, commits
iteration int //
maxIterations int // Each iteration creates a fresh session so the agent always operates in
completionPromise string // the "smart zone" of its context window. State is shared between
LastResponse string // iterations via files on disk (IMPLEMENTATION_PLAN.md, AGENTS.md, specs/*).
} //
// Usage:
// go run ralph-loop.go # build mode, 50 iterations
// go run ralph-loop.go plan # planning mode
// go run ralph-loop.go 20 # build mode, 20 iterations
// go run ralph-loop.go plan 5 # planning mode, 5 iterations
// NewRalphLoop creates a new RALPH-loop instance. func ralphLoop(ctx context.Context, mode string, maxIterations int) error {
func NewRalphLoop(maxIterations int, completionPromise string) *RalphLoop { promptFile := "PROMPT_build.md"
return &RalphLoop{ if mode == "plan" {
client: copilot.NewClient(nil), promptFile = "PROMPT_plan.md"
maxIterations: maxIterations,
completionPromise: completionPromise,
} }
}
// Run executes the RALPH-loop until completion promise is detected or max iterations reached. client := copilot.NewClient(nil)
func (r *RalphLoop) Run(ctx context.Context, initialPrompt string) (string, error) { if err := client.Start(ctx); err != nil {
if err := r.client.Start(ctx); err != nil { return fmt.Errorf("failed to start client: %w", err)
return "", fmt.Errorf("failed to start client: %w", err)
} }
defer r.client.Stop() defer client.Stop()
session, err := r.client.CreateSession(ctx, &copilot.SessionConfig{ branchOut, _ := exec.Command("git", "branch", "--show-current").Output()
Model: "gpt-5.1-codex-mini", branch := strings.TrimSpace(string(branchOut))
})
fmt.Println(strings.Repeat("━", 40))
fmt.Printf("Mode: %s\n", mode)
fmt.Printf("Prompt: %s\n", promptFile)
fmt.Printf("Branch: %s\n", branch)
fmt.Printf("Max: %d iterations\n", maxIterations)
fmt.Println(strings.Repeat("━", 40))
prompt, err := os.ReadFile(promptFile)
if err != nil { if err != nil {
return "", fmt.Errorf("failed to create session: %w", err) return fmt.Errorf("failed to read %s: %w", promptFile, err)
} }
defer session.Destroy()
for r.iteration < r.maxIterations { for i := 1; i <= maxIterations; i++ {
r.iteration++ fmt.Printf("\n=== Iteration %d/%d ===\n", i, maxIterations)
fmt.Printf("\n=== Iteration %d/%d ===\n", r.iteration, r.maxIterations)
currentPrompt := r.buildIterationPrompt(initialPrompt) // Fresh session — each task gets full context budget
fmt.Printf("Sending prompt (length: %d)...\n", len(currentPrompt)) session, err := client.CreateSession(ctx, &copilot.SessionConfig{
Model: "claude-sonnet-4.5",
result, err := session.SendAndWait(ctx, copilot.MessageOptions{
Prompt: currentPrompt,
}) })
if err != nil { if err != nil {
return "", fmt.Errorf("send failed on iteration %d: %w", r.iteration, err) return fmt.Errorf("failed to create session: %w", err)
} }
if result != nil && result.Data.Content != nil { _, err = session.SendAndWait(ctx, copilot.MessageOptions{
r.LastResponse = *result.Data.Content Prompt: string(prompt),
} else { })
r.LastResponse = "" session.Destroy()
if err != nil {
return fmt.Errorf("send failed on iteration %d: %w", i, err)
} }
// Display response summary // Push changes after each iteration
summary := r.LastResponse if err := exec.Command("git", "push", "origin", branch).Run(); err != nil {
if len(summary) > 200 { exec.Command("git", "push", "-u", "origin", branch).Run()
summary = summary[:200] + "..."
}
fmt.Printf("Response: %s\n", summary)
// Check for completion promise
if strings.Contains(r.LastResponse, r.completionPromise) {
fmt.Printf("\n✓ Success! Completion promise detected: '%s'\n", r.completionPromise)
return r.LastResponse, nil
} }
fmt.Printf("Iteration %d complete. Continuing...\n", r.iteration) fmt.Printf("\nIteration %d complete.\n", i)
} }
return "", fmt.Errorf("maximum iterations (%d) reached without detecting completion promise: '%s'", fmt.Printf("\nReached max iterations: %d\n", maxIterations)
r.maxIterations, r.completionPromise) return nil
}
func (r *RalphLoop) buildIterationPrompt(initialPrompt string) string {
if r.iteration == 1 {
return initialPrompt
}
return fmt.Sprintf(`%s
=== CONTEXT FROM PREVIOUS ITERATION ===
%s
=== END CONTEXT ===
Continue working on this task. Review the previous attempt and improve upon it.`,
initialPrompt, r.LastResponse)
} }
func main() { func main() {
prompt := `You are iteratively building a small library. Follow these phases IN ORDER. mode := "build"
Do NOT skip ahead — only do the current phase, then stop and wait for the next iteration. maxIterations := 50
Phase 1: Design a DataValidator struct that validates records against a schema. for _, arg := range os.Args[1:] {
- Schema defines field names, types (string, int, float, bool), and whether required. if arg == "plan" {
- Return a slice of validation errors per record. mode = "plan"
- Show the struct and method code only. Do NOT output COMPLETE. } else if n, err := strconv.Atoi(arg); err == nil {
maxIterations = n
Phase 2: Write at least 4 unit tests covering: missing required field, wrong type, }
valid record, and empty input. Show test code only. Do NOT output COMPLETE.
Phase 3: Review the code from phases 1 and 2. Fix any bugs, add doc comments, and add
an extra edge-case test. Show the final consolidated code with all fixes.
When this phase is fully done, output the exact text: COMPLETE`
ctx := context.Background()
loop := NewRalphLoop(5, "COMPLETE")
result, err := loop.Run(ctx, prompt)
if err != nil {
log.Printf("Task did not complete: %v", err)
log.Printf("Last attempt: %s", loop.LastResponse)
return
} }
fmt.Println("\n=== FINAL RESULT ===") if err := ralphLoop(context.Background(), mode, maxIterations); err != nil {
fmt.Println(result) log.Fatal(err)
}
} }


@@ -1,6 +1,6 @@
# RALPH-loop: Iterative Self-Referential AI Loops # Ralph Loop: Autonomous AI Task Loops
Implement self-referential feedback loops where an AI agent iteratively improves work by reading its own previous output. Build autonomous coding loops where an AI agent picks tasks, implements them, validates against backpressure (tests, builds), commits, and repeats — each iteration in a fresh context window.
> **Runnable example:** [recipe/ralph-loop.ts](recipe/ralph-loop.ts) > **Runnable example:** [recipe/ralph-loop.ts](recipe/ralph-loop.ts)
> >
@@ -9,200 +9,217 @@ Implement self-referential feedback loops where an AI agent iteratively improves
> npx tsx ralph-loop.ts > npx tsx ralph-loop.ts
> ``` > ```
## What is RALPH-loop? ## What is a Ralph Loop?
RALPH-loop is a development methodology for iterative AI-powered task completion. Named after the Ralph Wiggum technique, it embodies the philosophy of persistent iteration: A [Ralph loop](https://ghuntley.com/ralph/) is an autonomous development workflow where an AI agent iterates through tasks in isolated context windows. The key insight: **state lives on disk, not in the model's context**. Each iteration starts fresh, reads the current state from files, does one task, writes results back to disk, and exits.
- **One prompt, multiple iterations**: The same prompt is processed repeatedly ```
- **Self-referential feedback**: The AI reads its own previous work (file changes, git history) ┌─────────────────────────────────────────────────┐
- **Completion detection**: Loop exits when a completion promise is detected in output │ loop.sh │
- **Safety limits**: Always include a maximum iteration count to prevent infinite loops │ while true: │
│ ┌─────────────────────────────────────────┐ │
## Example Scenario │ │ Fresh session (isolated context) │ │
│ │ │ │
You need to iteratively improve code until all tests pass. Instead of asking Copilot to "write perfect code," you use RALPH-loop to: │ │ 1. Read PROMPT.md + AGENTS.md │ │
│ │ 2. Study specs/* and code │ │
1. Send the initial prompt with clear success criteria │ │ 3. Pick next task from plan │ │
2. Copilot writes code and tests │ │ 4. Implement + run tests │ │
3. Copilot runs tests and sees failures │ │ 5. Update plan, commit, exit │ │
4. Loop automatically re-sends the prompt │ └─────────────────────────────────────────┘ │
5. Copilot reads test output and previous code, fixes issues │ ↻ next iteration (fresh context) │
6. Repeat until all tests pass and completion promise is output └─────────────────────────────────────────────────┘
## Basic Implementation
```typescript
import { CopilotClient } from "@github/copilot-sdk";
class RalphLoop {
private client: CopilotClient;
private iteration: number = 0;
private maxIterations: number;
private completionPromise: string;
private lastResponse: string | null = null;
constructor(maxIterations: number = 10, completionPromise: string = "COMPLETE") {
this.client = new CopilotClient();
this.maxIterations = maxIterations;
this.completionPromise = completionPromise;
}
async run(initialPrompt: string): Promise<string> {
await this.client.start();
const session = await this.client.createSession({ model: "gpt-5.1-codex-mini" });
try {
while (this.iteration < this.maxIterations) {
this.iteration++;
console.log(`\n--- Iteration ${this.iteration}/${this.maxIterations} ---`);
// Build prompt including previous response as context
const prompt = this.iteration === 1
? initialPrompt
: `${initialPrompt}\n\nPrevious attempt:\n${this.lastResponse}\n\nContinue improving...`;
const response = await session.sendAndWait({ prompt });
this.lastResponse = response?.data.content || "";
console.log(`Response (${this.lastResponse.length} chars)`);
// Check for completion promise
if (this.lastResponse.includes(this.completionPromise)) {
console.log(`✓ Completion promise detected: ${this.completionPromise}`);
return this.lastResponse;
}
console.log(`Continuing to iteration ${this.iteration + 1}...`);
}
throw new Error(
`Max iterations (${this.maxIterations}) reached without completion promise`
);
} finally {
await session.destroy();
await this.client.stop();
}
}
}
// Usage
const loop = new RalphLoop(5, "COMPLETE");
const result = await loop.run("Your task here");
console.log(result);
``` ```
## With File Persistence **Core principles:**
For tasks involving code generation, persist state to files so the AI can see changes: - **Fresh context per iteration**: Each loop creates a new session — no context accumulation, always in the "smart zone"
- **Disk as shared state**: `IMPLEMENTATION_PLAN.md` persists between iterations and acts as the coordination mechanism
- **Backpressure steers quality**: Tests, builds, and lints reject bad work — the agent must fix issues before committing
- **Two modes**: PLANNING (gap analysis → generate plan) and BUILDING (implement from plan)
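
The backpressure principle above can be sketched as a small gate that runs validation commands and reports which ones passed. In the recipes here the agent runs tests itself, as instructed by the prompt files; a loop-side check like this is an optional extra and an assumption, not part of the Copilot SDK. The `Check` and `CheckResult` types and both function names are hypothetical.

```typescript
import { spawnSync } from "child_process";

// Hypothetical shape for one validation step; not part of the Copilot SDK.
interface Check {
  name: string;
  command: string;
  args: string[];
}

interface CheckResult {
  name: string;
  passed: boolean;
}

// Run every check in order and report which ones passed. A check passes
// only when the process could be started and exited with status 0.
function runChecks(checks: Check[], cwd: string = process.cwd()): CheckResult[] {
  return checks.map(({ name, command, args }) => {
    const result = spawnSync(command, args, { cwd, stdio: "ignore" });
    return { name, passed: result.error === undefined && result.status === 0 };
  });
}

// True when every check passed (also true for an empty list).
function allPassed(results: CheckResult[]): boolean {
  return results.every((r) => r.passed);
}
```

A caller might pass something like `{ name: "tests", command: "npm", args: ["test"] }` and skip the push step when `allPassed` returns `false`. The commands themselves would normally come from whatever `AGENTS.md` lists for the project.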
## Simple Version
The minimal Ralph loop — the SDK equivalent of `while :; do cat PROMPT.md | claude ; done`:
```typescript ```typescript
import fs from "fs/promises"; import { readFile } from "fs/promises";
import path from "path";
import { CopilotClient } from "@github/copilot-sdk"; import { CopilotClient } from "@github/copilot-sdk";
class PersistentRalphLoop { async function ralphLoop(promptFile: string, maxIterations: number = 50) {
private client: CopilotClient; const client = new CopilotClient();
private workDir: string; await client.start();
private iteration: number = 0;
private maxIterations: number;
constructor(workDir: string, maxIterations: number = 10) { try {
this.client = new CopilotClient(); const prompt = await readFile(promptFile, "utf-8");
this.workDir = workDir;
this.maxIterations = maxIterations;
}
async run(initialPrompt: string): Promise<string> { for (let i = 1; i <= maxIterations; i++) {
await fs.mkdir(this.workDir, { recursive: true }); console.log(`\n=== Iteration ${i}/${maxIterations} ===`);
await this.client.start();
const session = await this.client.createSession({ model: "gpt-5.1-codex-mini" });
try { // Fresh session each iteration — context isolation is the point
// Store initial prompt const session = await client.createSession({ model: "claude-sonnet-4.5" });
await fs.writeFile(path.join(this.workDir, "prompt.md"), initialPrompt); try {
await session.sendAndWait({ prompt }, 600_000);
while (this.iteration < this.maxIterations) { } finally {
this.iteration++; await session.destroy();
console.log(`\n--- Iteration ${this.iteration} ---`);
// Build context from previous outputs
let context = initialPrompt;
const prevOutputFile = path.join(this.workDir, `output-${this.iteration - 1}.txt`);
try {
const prevOutput = await fs.readFile(prevOutputFile, "utf-8");
context += `\n\nPrevious iteration:\n${prevOutput}`;
} catch {
// No previous output yet
}
const response = await session.sendAndWait({ prompt: context });
const output = response?.data.content || "";
// Persist output
await fs.writeFile(
path.join(this.workDir, `output-${this.iteration}.txt`),
output
);
if (output.includes("COMPLETE")) {
return output;
}
} }
throw new Error("Max iterations reached"); console.log(`Iteration ${i} complete.`);
} finally {
await session.destroy();
await this.client.stop();
} }
} finally {
await client.stop();
} }
} }
// Usage: point at your PROMPT.md
ralphLoop("PROMPT.md", 20);
```
This is all you need to get started. The prompt file tells the agent what to do; the agent reads project files, does work, commits, and exits. The loop restarts with a clean slate.
## Ideal Version
The full Ralph pattern with planning and building modes, matching the [Ralph Playbook](https://github.com/ClaytonFarr/ralph-playbook) architecture:
```typescript
import { readFile } from "fs/promises";
import { execSync } from "child_process";
import { CopilotClient } from "@github/copilot-sdk";
type Mode = "plan" | "build";
async function ralphLoop(mode: Mode, maxIterations: number = 50) {
const promptFile = mode === "plan" ? "PROMPT_plan.md" : "PROMPT_build.md";
const client = new CopilotClient();
await client.start();
const branch = execSync("git branch --show-current", { encoding: "utf-8" }).trim();
console.log(`Mode: ${mode} | Prompt: ${promptFile} | Branch: ${branch}`);
try {
const prompt = await readFile(promptFile, "utf-8");
for (let i = 1; i <= maxIterations; i++) {
console.log(`\n=== Iteration ${i}/${maxIterations} ===`);
// Fresh session — each task gets full context budget
const session = await client.createSession({ model: "claude-sonnet-4.5" });
try {
await session.sendAndWait({ prompt }, 600_000);
} finally {
await session.destroy();
}
// Push changes after each iteration
try {
execSync(`git push origin ${branch}`, { stdio: "inherit" });
} catch {
execSync(`git push -u origin ${branch}`, { stdio: "inherit" });
}
console.log(`Iteration ${i} complete.`);
}
} finally {
await client.stop();
}
}
// Parse CLI args: npx tsx ralph-loop.ts [plan] [max_iterations]
const args = process.argv.slice(2);
const mode: Mode = args.includes("plan") ? "plan" : "build";
const maxArg = args.find(a => /^\d+$/.test(a));
const maxIterations = maxArg ? parseInt(maxArg, 10) : 50;
ralphLoop(mode, maxIterations);
```
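
The push step in the listing above interpolates the branch name into a shell command string. A minimal alternative sketch, assuming the same push-then-retry-with-`-u` behavior is wanted, passes the arguments as an array through `execFileSync` so the branch name is never interpreted by a shell. The helper names `gitPushArgs` and `pushBranch` are hypothetical.

```typescript
import { execFileSync } from "child_process";

// Build the argument list for `git push`, optionally setting the upstream.
function gitPushArgs(branch: string, setUpstream: boolean): string[] {
  return setUpstream
    ? ["push", "-u", "origin", branch]
    : ["push", "origin", branch];
}

// Push the branch; if the plain push fails, retry once with `-u`.
// If the retry also fails, the error propagates to the caller.
function pushBranch(branch: string): void {
  try {
    execFileSync("git", gitPushArgs(branch, false), { stdio: "inherit" });
  } catch {
    execFileSync("git", gitPushArgs(branch, true), { stdio: "inherit" });
  }
}
```

Keeping the argument builder separate from the process call makes the retry logic easy to test without a real remote.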
### Required Project Files
The ideal version expects this file structure in your project:
```
project-root/
├── PROMPT_plan.md # Planning mode instructions
├── PROMPT_build.md # Building mode instructions
├── AGENTS.md # Operational guide (build/test commands)
├── IMPLEMENTATION_PLAN.md # Task list (generated by planning mode)
├── specs/ # Requirement specs (one per topic)
│ ├── auth.md
│ └── data-pipeline.md
└── src/ # Your source code
```
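
A loop that starts without its prompt files only fails once it tries to read one, so a preflight check can be useful. The sketch below assumes the layout above and treats `IMPLEMENTATION_PLAN.md` as optional, since planning mode generates it. `REQUIRED_FILES` and `missingProjectFiles` are hypothetical names; the `exists` parameter is there only so the check can be exercised without touching the filesystem.

```typescript
import { existsSync } from "fs";
import { join } from "path";

// Files the loop and prompts rely on, taken from the layout above.
// IMPLEMENTATION_PLAN.md is omitted because planning mode creates it.
const REQUIRED_FILES = ["PROMPT_plan.md", "PROMPT_build.md", "AGENTS.md"];

// Return the names of required files that are absent under `root`.
function missingProjectFiles(
  root: string,
  exists: (path: string) => boolean = existsSync,
): string[] {
  return REQUIRED_FILES.filter((name) => !exists(join(root, name)));
}
```

Calling `missingProjectFiles(process.cwd())` before starting the client and exiting when the result is non-empty would surface a misconfigured project early.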
### Example `PROMPT_plan.md`
```markdown
0a. Study `specs/*` to learn the application specifications.
0b. Study IMPLEMENTATION_PLAN.md (if present) to understand the plan so far.
0c. Study `src/` to understand existing code and shared utilities.
1. Compare specs against code (gap analysis). Create or update
IMPLEMENTATION_PLAN.md as a prioritized bullet-point list of tasks
yet to be implemented. Do NOT implement anything.
IMPORTANT: Do NOT assume functionality is missing — search the
codebase first to confirm. Prefer updating existing utilities over
creating ad-hoc copies.
```
### Example `PROMPT_build.md`
```markdown
0a. Study `specs/*` to learn the application specifications.
0b. Study IMPLEMENTATION_PLAN.md.
0c. Study `src/` for reference.
1. Choose the most important item from IMPLEMENTATION_PLAN.md. Before
making changes, search the codebase (don't assume not implemented).
2. After implementing, run the tests. If functionality is missing, add it.
3. When you discover issues, update IMPLEMENTATION_PLAN.md immediately.
4. When tests pass, update IMPLEMENTATION_PLAN.md, then `git add -A`
then `git commit` with a descriptive message.
99999. When authoring documentation, capture the why.
999999. Implement completely. No placeholders or stubs.
9999999. Keep IMPLEMENTATION_PLAN.md current — future iterations depend on it.
```
### Example `AGENTS.md`
Keep this brief (~60 lines). It's loaded every iteration, so bloat wastes context.
```markdown
## Build & Run
npm run build
## Validation
- Tests: `npm test`
- Typecheck: `npx tsc --noEmit`
- Lint: `npm run lint`
``` ```
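
The "~60 lines" guidance for `AGENTS.md` can be checked mechanically. This is a minimal sketch, assuming a plain line count is a good enough proxy for context cost; the function names are hypothetical and the default of 60 is a rule of thumb taken from the note above, not a hard limit.

```typescript
// Count the lines in a text, ignoring a single trailing newline so that
// "a\nb\n" counts as 2 lines rather than 3.
function agentsFileLineCount(text: string): number {
  if (text.length === 0) return 0;
  const trimmed = text.endsWith("\n") ? text.slice(0, -1) : text;
  return trimmed.split("\n").length;
}

// True when the file fits the line budget.
function isWithinBudget(text: string, maxLines: number = 60): boolean {
  return agentsFileLineCount(text) <= maxLines;
}
```

A loop could read `AGENTS.md` once at startup and print a warning when `isWithinBudget` returns `false`.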
## Best Practices ## Best Practices
1. **Write clear completion criteria**: Include exactly what "done" looks like 1. **Fresh context per iteration**: Never accumulate context across iterations — that's the whole point
2. **Use output markers**: Include `<promise>COMPLETE</promise>` or similar in completion condition 2. **Disk is your database**: `IMPLEMENTATION_PLAN.md` is shared state between isolated sessions
3. **Always set max iterations**: Prevents infinite loops on impossible tasks 3. **Backpressure is essential**: Tests, builds, lints in `AGENTS.md` — the agent must pass them before committing
4. **Persist state**: Save files so AI can see what changed between iterations 4. **Start with PLANNING mode**: Generate the plan first, then switch to BUILDING
5. **Include context**: Feed previous iteration output back as context 5. **Observe and tune**: Watch early iterations, add guardrails to prompts when the agent fails in specific ways
6. **Monitor progress**: Log each iteration to track what's happening 6. **The plan is disposable**: If the agent goes off track, delete `IMPLEMENTATION_PLAN.md` and re-plan
7. **Keep `AGENTS.md` brief**: It's loaded every iteration — operational info only, no progress notes
8. **Use a sandbox**: The agent runs autonomously with full tool access — isolate it
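
Because the plan file is the shared state, the outer loop can read it between iterations to log progress. The sketch below assumes `IMPLEMENTATION_PLAN.md` is a Markdown bullet list in which each remaining task is a top-level line starting with `- ` or `* `. The real format depends on what the planning prompt produces, so treat both the format and the name `countPlanItems` as assumptions.

```typescript
// Count top-level bullet items in a plan written as a Markdown list.
// Indented sub-bullets, headings, and empty bullets are not counted.
function countPlanItems(plan: string): number {
  return plan
    .split(/\r?\n/)
    .filter((line) => /^[-*] \S/.test(line))
    .length;
}
```

Logging this count after each iteration gives a rough signal of whether the agent is working through the plan or adding to it.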
## Example: Iterative Code Generation ## When to Use a Ralph Loop
```typescript
const prompt = `Write a function that:
1. Parses CSV data
2. Validates required fields
3. Returns parsed records or error
4. Has unit tests
5. Output <promise>COMPLETE</promise> when done`;
const loop = new RalphLoop(10, "COMPLETE");
const result = await loop.run(prompt);
```
## Handling Failures
```typescript
try {
const result = await loop.run(prompt);
console.log("Task completed successfully!");
} catch (error) {
console.error("Task failed:", error.message);
// Analyze what was attempted and suggest alternatives
}
```
## When to Use RALPH-loop
**Good for:** **Good for:**
- Code generation with automatic verification (tests, linters) - Implementing features from specs with test-driven validation
- Tasks with clear success criteria - Large refactors broken into many small tasks
- Iterative refinement where each attempt learns from previous failures - Unattended, long-running development with clear requirements
- Unattended long-running improvements - Any work where backpressure (tests/builds) can verify correctness
**Not good for:** **Not good for:**
- Tasks requiring human judgment or design input - Tasks requiring human judgment mid-loop
- One-shot operations - One-shot operations that don't benefit from iteration
- Tasks with vague success criteria - Vague requirements without testable acceptance criteria
- Real-time interactive debugging - Exploratory prototyping where direction isn't clear


@@ -1,128 +1,79 @@
import { readFile } from "fs/promises";
import { execSync } from "child_process";
import { CopilotClient } from "@github/copilot-sdk"; import { CopilotClient } from "@github/copilot-sdk";
/** /**
* RALPH-loop implementation: Iterative self-referential AI loops. * Ralph loop: autonomous AI task loop with fresh context per iteration.
* The same prompt is sent repeatedly, with AI reading its own previous output. *
* Loop continues until completion promise is detected in the response. * Two modes:
* - "plan": reads PROMPT_plan.md, generates/updates IMPLEMENTATION_PLAN.md
* - "build": reads PROMPT_build.md, implements tasks, runs tests, commits
*
* Each iteration creates a fresh session so the agent always operates in
* the "smart zone" of its context window. State is shared between
* iterations via files on disk (IMPLEMENTATION_PLAN.md, AGENTS.md, specs/*).
*
* Usage:
* npx tsx ralph-loop.ts # build mode, 50 iterations
* npx tsx ralph-loop.ts plan # planning mode
* npx tsx ralph-loop.ts 20 # build mode, 20 iterations
* npx tsx ralph-loop.ts plan 5 # planning mode, 5 iterations
*/ */
class RalphLoop {
private client: CopilotClient;
private iteration: number = 0;
private readonly maxIterations: number;
private readonly completionPromise: string;
public lastResponse: string | null = null;
constructor(maxIterations: number = 10, completionPromise: string = "COMPLETE") { type Mode = "plan" | "build";
this.client = new CopilotClient();
this.maxIterations = maxIterations;
this.completionPromise = completionPromise;
}
/** async function ralphLoop(mode: Mode, maxIterations: number) {
* Run the RALPH-loop until completion promise is detected or max iterations reached. const promptFile = mode === "plan" ? "PROMPT_plan.md" : "PROMPT_build.md";
*/
async run(initialPrompt: string): Promise<string> {
let session: Awaited<ReturnType<CopilotClient["createSession"]>> | null = null;
await this.client.start(); const client = new CopilotClient();
try { await client.start();
session = await this.client.createSession({
model: "gpt-5.1-codex-mini" const branch = execSync("git branch --show-current", { encoding: "utf-8" }).trim();
console.log("━".repeat(40));
console.log(`Mode: ${mode}`);
console.log(`Prompt: ${promptFile}`);
console.log(`Branch: ${branch}`);
console.log(`Max: ${maxIterations} iterations`);
console.log("━".repeat(40));
try {
const prompt = await readFile(promptFile, "utf-8");
for (let i = 1; i <= maxIterations; i++) {
console.log(`\n=== Iteration ${i}/${maxIterations} ===`);
// Fresh session — each task gets full context budget
const session = await client.createSession({
model: "claude-sonnet-4.5",
}); });
try { try {
while (this.iteration < this.maxIterations) { await session.sendAndWait({ prompt }, 600_000);
this.iteration++;
console.log(`\n=== Iteration ${this.iteration}/${this.maxIterations} ===`);
// Build the prompt for this iteration
const currentPrompt = this.buildIterationPrompt(initialPrompt);
console.log(`Sending prompt (length: ${currentPrompt.length})...`);
const response = await session.sendAndWait({ prompt: currentPrompt }, 300_000);
this.lastResponse = response?.data.content || "";
// Display response summary
const summary = this.lastResponse.length > 200
? this.lastResponse.substring(0, 200) + "..."
: this.lastResponse;
console.log(`Response: ${summary}`);
// Check for completion promise
if (this.lastResponse.includes(this.completionPromise)) {
console.log(`\n✓ Success! Completion promise detected: '${this.completionPromise}'`);
return this.lastResponse;
}
console.log(`Iteration ${this.iteration} complete. Checking for next iteration...`);
}
// Max iterations reached without completion
throw new Error(
`Maximum iterations (${this.maxIterations}) reached without detecting completion promise: '${this.completionPromise}'`
);
} catch (error) {
console.error(`\nError during RALPH-loop: ${error instanceof Error ? error.message : String(error)}`);
throw error;
} finally { } finally {
if (session) { await session.destroy();
await session.destroy();
}
} }
} finally {
await this.client.stop();
}
}
/** // Push changes after each iteration
* Build the prompt for the current iteration, including previous output as context. try {
*/ execSync(`git push origin ${branch}`, { stdio: "inherit" });
private buildIterationPrompt(initialPrompt: string): string { } catch {
if (this.iteration === 1) { execSync(`git push -u origin ${branch}`, { stdio: "inherit" });
// First iteration: just the initial prompt }
return initialPrompt;
console.log(`\nIteration ${i} complete.`);
} }
// Subsequent iterations: include previous output as context console.log(`\nReached max iterations: ${maxIterations}`);
return `${initialPrompt} } finally {
await client.stop();
=== CONTEXT FROM PREVIOUS ITERATION ===
${this.lastResponse}
=== END CONTEXT ===
Continue working on this task. Review the previous attempt and improve upon it.`;
} }
} }
// Example usage demonstrating RALPH-loop // Parse CLI args
async function main() { const args = process.argv.slice(2);
const prompt = `You are iteratively building a small library. Follow these phases IN ORDER. const mode: Mode = args.includes("plan") ? "plan" : "build";
Do NOT skip ahead — only do the current phase, then stop and wait for the next iteration. const maxArg = args.find((a) => /^\d+$/.test(a));
const maxIterations = maxArg ? parseInt(maxArg) : 50;
Phase 1: Design a DataValidator class that validates records against a schema. ralphLoop(mode, maxIterations).catch(console.error);
- Schema defines field names, types (str, int, float, bool), and whether required.
- Return a list of validation errors per record.
- Show the class code only. Do NOT output COMPLETE.
Phase 2: Write at least 4 unit tests covering: missing required field, wrong type,
valid record, and empty input. Show test code only. Do NOT output COMPLETE.
Phase 3: Review the code from phases 1 and 2. Fix any bugs, add docstrings, and add
an extra edge-case test. Show the final consolidated code with all fixes.
When this phase is fully done, output the exact text: COMPLETE`;
const loop = new RalphLoop(5, "COMPLETE");
try {
const result = await loop.run(prompt);
console.log("\n=== FINAL RESULT ===");
console.log(result);
} catch (error) {
console.error(`\nTask did not complete: ${error instanceof Error ? error.message : String(error)}`);
if (loop.lastResponse) {
console.log(`\nLast attempt:\n${loop.lastResponse}`);
}
}
}
main().catch(console.error);


@@ -1,6 +1,6 @@
-# RALPH-loop: Iterative Self-Referential AI Loops
+# Ralph Loop: Autonomous AI Task Loops
-Implement self-referential feedback loops where an AI agent iteratively improves work by reading its own previous output.
+Build autonomous coding loops where an AI agent picks tasks, implements them, validates against backpressure (tests, builds), commits, and repeats, with each iteration in a fresh context window.
> **Runnable example:** [recipe/ralph_loop.py](recipe/ralph_loop.py)
>
@@ -8,196 +8,235 @@ Implement self-referential feedback loops where an AI agent iteratively improves
> cd recipe && pip install -r requirements.txt
> python ralph_loop.py
> ```
-## What is RALPH-loop?
-RALPH-loop is a development methodology for iterative AI-powered task completion. Named after the Ralph Wiggum technique, it embodies the philosophy of persistent iteration:
-- **One prompt, multiple iterations**: The same prompt is processed repeatedly
-- **Self-referential feedback**: The AI reads its own previous work (file changes, git history)
-- **Completion detection**: Loop exits when a completion promise is detected in output
-- **Safety limits**: Always include a maximum iteration count to prevent infinite loops
+## What is a Ralph Loop?
+A [Ralph loop](https://ghuntley.com/ralph/) is an autonomous development workflow where an AI agent iterates through tasks in isolated context windows. The key insight: **state lives on disk, not in the model's context**. Each iteration starts fresh, reads the current state from files, does one task, writes results back to disk, and exits.
-## Example Scenario
-You need to iteratively improve code until all tests pass. Instead of asking Copilot to "write perfect code," you use RALPH-loop to:
-1. Send the initial prompt with clear success criteria
-2. Copilot writes code and tests
-3. Copilot runs tests and sees failures
-4. Loop automatically re-sends the prompt
-5. Copilot reads test output and previous code, fixes issues
-6. Repeat until all tests pass and completion promise is output
-## Basic Implementation
-```python
-import asyncio
-from copilot import CopilotClient, MessageOptions, SessionConfig
-class RalphLoop:
-    """Iterative self-referential feedback loop using Copilot."""
-    def __init__(self, max_iterations=10, completion_promise="COMPLETE"):
-        self.client = CopilotClient()
-        self.iteration = 0
-        self.max_iterations = max_iterations
-        self.completion_promise = completion_promise
-        self.last_response = None
-    async def run(self, initial_prompt):
-        """Run the RALPH-loop until completion promise detected or max iterations reached."""
-        await self.client.start()
-        session = await self.client.create_session(
-            SessionConfig(model="gpt-5.1-codex-mini")
-        )
-        try:
-            while self.iteration < self.max_iterations:
-                self.iteration += 1
-                print(f"\n--- Iteration {self.iteration}/{self.max_iterations} ---")
-                # Build prompt including previous response as context
-                if self.iteration == 1:
-                    prompt = initial_prompt
-                else:
-                    prompt = f"{initial_prompt}\n\nPrevious attempt:\n{self.last_response}\n\nContinue improving..."
-                result = await session.send_and_wait(
-                    MessageOptions(prompt=prompt), timeout=300
-                )
-                self.last_response = result.data.content if result else ""
-                print(f"Response ({len(self.last_response)} chars)")
-                # Check for completion promise
-                if self.completion_promise in self.last_response:
-                    print(f"✓ Completion promise detected: {self.completion_promise}")
-                    return self.last_response
-                print(f"Continuing to iteration {self.iteration + 1}...")
-            raise RuntimeError(
-                f"Max iterations ({self.max_iterations}) reached without completion promise"
-            )
-        finally:
-            await session.destroy()
-            await self.client.stop()
-# Usage
-async def main():
-    loop = RalphLoop(5, "COMPLETE")
-    result = await loop.run("Your task here")
-    print(result)
-asyncio.run(main())
+```
+┌─────────────────────────────────────────────────┐
+│                   loop.sh                       │
+│  while true:                                    │
+│    ┌─────────────────────────────────────────┐  │
+│    │  Fresh session (isolated context)       │  │
+│    │                                         │  │
+│    │  1. Read PROMPT.md + AGENTS.md          │  │
+│    │  2. Study specs/* and code              │  │
+│    │  3. Pick next task from plan            │  │
+│    │  4. Implement + run tests               │  │
+│    │  5. Update plan, commit, exit           │  │
+│    └─────────────────────────────────────────┘  │
+│         ↻ next iteration (fresh context)        │
+└─────────────────────────────────────────────────┘
```
-## With File Persistence
-For tasks involving code generation, persist state to files so the AI can see changes:
+**Core principles:**
+- **Fresh context per iteration**: Each loop creates a new session, so there is no context accumulation and the agent always stays in the "smart zone"
+- **Disk as shared state**: `IMPLEMENTATION_PLAN.md` persists between iterations and acts as the coordination mechanism
+- **Backpressure steers quality**: Tests, builds, and lints reject bad work; the agent must fix issues before committing
+- **Two modes**: PLANNING (gap analysis → generate plan) and BUILDING (implement from plan)
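As a rough illustration of the "disk as shared state" principle, the sketch below simulates isolated iterations that coordinate only through a plan file. The checkbox format (`- [ ]` / `- [x]`) and the helper name `complete_next_task` are assumptions made for this example; the recipe itself does not prescribe a plan layout beyond a prioritized task list.

```python
import tempfile
from pathlib import Path
from typing import Optional

TODO = "- [ ] "
DONE = "- [x] "


def complete_next_task(plan_path: Path) -> Optional[str]:
    """Mark the first unchecked item as done and return its text.

    Assumes a hypothetical checkbox format; a real plan uses whatever
    layout the planning prompt produces.
    """
    lines = plan_path.read_text(encoding="utf-8").splitlines()
    for idx, line in enumerate(lines):
        if line.startswith(TODO):
            task = line[len(TODO):]
            lines[idx] = DONE + task
            plan_path.write_text("\n".join(lines) + "\n", encoding="utf-8")
            return task
    return None  # nothing left to do


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        plan = Path(tmp) / "IMPLEMENTATION_PLAN.md"
        plan.write_text("- [ ] add login\n- [ ] add logout\n", encoding="utf-8")
        # Each call stands in for one fresh iteration: nothing is carried
        # over in memory, only what was written to the file.
        for _ in range(3):
            print(complete_next_task(plan))
```

The third call finds no unchecked item and returns `None`, which is the kind of signal an outer loop could use to stop early.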
## Simple Version
The minimal Ralph loop — the SDK equivalent of `while :; do cat PROMPT.md | claude ; done`:
```python
import asyncio
from pathlib import Path
from copilot import CopilotClient, MessageOptions, SessionConfig
-class PersistentRalphLoop:
-    """RALPH-loop with file-based state persistence."""
-    def __init__(self, work_dir, max_iterations=10):
-        self.client = CopilotClient()
-        self.work_dir = Path(work_dir)
-        self.work_dir.mkdir(parents=True, exist_ok=True)
-        self.iteration = 0
-        self.max_iterations = max_iterations
-    async def run(self, initial_prompt):
-        """Run the loop with persistent state."""
-        await self.client.start()
-        session = await self.client.create_session(
-            SessionConfig(model="gpt-5.1-codex-mini")
-        )
-        try:
-            # Store initial prompt
-            (self.work_dir / "prompt.md").write_text(initial_prompt)
-            while self.iteration < self.max_iterations:
-                self.iteration += 1
-                print(f"\n--- Iteration {self.iteration} ---")
-                # Build context from previous outputs
-                context = initial_prompt
-                prev_output = self.work_dir / f"output-{self.iteration - 1}.txt"
-                if prev_output.exists():
-                    context += f"\n\nPrevious iteration:\n{prev_output.read_text()}"
-                result = await session.send_and_wait(
-                    MessageOptions(prompt=context), timeout=300
-                )
-                response = result.data.content if result else ""
-                # Persist output
-                output_file = self.work_dir / f"output-{self.iteration}.txt"
-                output_file.write_text(response)
-                if "COMPLETE" in response:
-                    return response
-            raise RuntimeError("Max iterations reached")
-        finally:
-            await session.destroy()
-            await self.client.stop()
+async def ralph_loop(prompt_file: str, max_iterations: int = 50):
+    client = CopilotClient()
+    await client.start()
+    try:
+        prompt = Path(prompt_file).read_text()
+        for i in range(1, max_iterations + 1):
+            print(f"\n=== Iteration {i}/{max_iterations} ===")
+            # Fresh session each iteration: context isolation is the point
+            session = await client.create_session(
+                SessionConfig(model="claude-sonnet-4.5")
+            )
+            try:
+                await session.send_and_wait(
+                    MessageOptions(prompt=prompt), timeout=600
+                )
+            finally:
+                await session.destroy()
+            print(f"Iteration {i} complete.")
+    finally:
+        await client.stop()
+# Usage: point at your PROMPT.md
+asyncio.run(ralph_loop("PROMPT.md", 20))
+```
This is all you need to get started. The prompt file tells the agent what to do; the agent reads project files, does work, commits, and exits. The loop restarts with a clean slate.
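To see the session lifecycle without installing the SDK, the same loop shape can be exercised against stand-in objects. `FakeClient`, `FakeSession`, and `run_loop` are hypothetical names invented for this sketch and are not part of the Copilot SDK; only the create, send, destroy ordering mirrors the recipe.

```python
import asyncio


class FakeSession:
    """Stand-in for an SDK session; records prompts and destruction."""

    def __init__(self):
        self.destroyed = False
        self.prompts = []

    async def send_and_wait(self, prompt):
        self.prompts.append(prompt)

    async def destroy(self):
        self.destroyed = True


class FakeClient:
    """Stand-in for the SDK client; hands out a new session on every call."""

    def __init__(self):
        self.sessions = []

    async def create_session(self):
        session = FakeSession()
        self.sessions.append(session)
        return session


async def run_loop(client, prompt, max_iterations):
    for _ in range(max_iterations):
        session = await client.create_session()  # fresh context every time
        try:
            await session.send_and_wait(prompt)
        finally:
            await session.destroy()  # released even if the send fails


if __name__ == "__main__":
    client = FakeClient()
    asyncio.run(run_loop(client, "do one task", 3))
    print(len(client.sessions), all(s.destroyed for s in client.sessions))
```

Each session sees the prompt exactly once, so nothing from an earlier iteration can leak into a later one except through files on disk.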
## Ideal Version
The full Ralph pattern with planning and building modes, matching the [Ralph Playbook](https://github.com/ClaytonFarr/ralph-playbook) architecture:
```python
import asyncio
import subprocess
import sys
from pathlib import Path

from copilot import CopilotClient, MessageOptions, SessionConfig


async def ralph_loop(mode: str = "build", max_iterations: int = 50):
    prompt_file = "PROMPT_plan.md" if mode == "plan" else "PROMPT_build.md"
    client = CopilotClient()
    await client.start()

    branch = subprocess.check_output(
        ["git", "branch", "--show-current"], text=True
    ).strip()

    print("━" * 40)
    print(f"Mode: {mode}")
    print(f"Prompt: {prompt_file}")
    print(f"Branch: {branch}")
    print(f"Max: {max_iterations} iterations")
    print("━" * 40)

    try:
        prompt = Path(prompt_file).read_text()

        for i in range(1, max_iterations + 1):
            print(f"\n=== Iteration {i}/{max_iterations} ===")

            # Fresh session: each task gets the full context budget
            session = await client.create_session(
                SessionConfig(model="claude-sonnet-4.5")
            )
            try:
                await session.send_and_wait(
                    MessageOptions(prompt=prompt), timeout=600
                )
            finally:
                await session.destroy()

            # Push changes after each iteration
            try:
                subprocess.run(
                    ["git", "push", "origin", branch], check=True
                )
            except subprocess.CalledProcessError:
                subprocess.run(
                    ["git", "push", "-u", "origin", branch], check=True
                )

            print(f"\nIteration {i} complete.")

        print(f"\nReached max iterations: {max_iterations}")
    finally:
        await client.stop()


if __name__ == "__main__":
    args = sys.argv[1:]
    mode = "plan" if "plan" in args else "build"
    max_iter = next((int(a) for a in args if a.isdigit()), 50)
    asyncio.run(ralph_loop(mode, max_iter))
```
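The argument handling at the bottom of the script can be pulled into a small function, which makes the invocations listed in the script's usage notes easy to check. `parse_cli` is a hypothetical name, not part of the recipe, and swapping `isdigit()` for `isdecimal()` is a suggested tweak rather than what the recipe does.

```python
from typing import List, Tuple


def parse_cli(args: List[str], default_max: int = 50) -> Tuple[str, int]:
    """Return (mode, max_iterations) using the same rules as the script."""
    mode = "plan" if "plan" in args else "build"
    # isdecimal() is slightly stricter than isdigit(): it rejects characters
    # such as "²" that isdigit() accepts but int() cannot parse.
    max_iter = next((int(a) for a in args if a.isdecimal()), default_max)
    return mode, max_iter


if __name__ == "__main__":
    for argv in ([], ["plan"], ["20"], ["plan", "5"]):
        print(argv, "->", parse_cli(argv))
```

Argument order does not matter here: the mode is detected by membership and the iteration count by the first all-digit argument.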
### Required Project Files
The ideal version expects this file structure in your project:
```
project-root/
├── PROMPT_plan.md # Planning mode instructions
├── PROMPT_build.md # Building mode instructions
├── AGENTS.md # Operational guide (build/test commands)
├── IMPLEMENTATION_PLAN.md # Task list (generated by planning mode)
├── specs/ # Requirement specs (one per topic)
│ ├── auth.md
│ └── data-pipeline.md
└── src/ # Your source code
```
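A preflight check along these lines could catch a missing prompt or spec directory before any session is started. This is a sketch only: `missing_project_files` is a hypothetical helper, and treating `IMPLEMENTATION_PLAN.md` as required only in build mode is an assumption based on the note that planning mode generates it.

```python
import tempfile
from pathlib import Path
from typing import List

# Paths the loop and prompts rely on, mirroring the layout shown above.
ALWAYS_REQUIRED = ["PROMPT_plan.md", "PROMPT_build.md", "AGENTS.md", "specs"]


def missing_project_files(root: Path, mode: str = "build") -> List[str]:
    """Return the required paths that are absent under `root`."""
    required = list(ALWAYS_REQUIRED)
    if mode == "build":
        # Assumption: building needs a plan produced by an earlier plan run.
        required.append("IMPLEMENTATION_PLAN.md")
    return [name for name in required if not (root / name).exists()]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "specs").mkdir()
        (root / "AGENTS.md").write_text("## Build & Run\n", encoding="utf-8")
        print(missing_project_files(root, mode="plan"))
        print(missing_project_files(root, mode="build"))
```

Calling this before `client.start()` would let the script exit with a clear message instead of failing on the first `read_text()`.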
### Example `PROMPT_plan.md`
```markdown
0a. Study `specs/*` to learn the application specifications.
0b. Study IMPLEMENTATION_PLAN.md (if present) to understand the plan so far.
0c. Study `src/` to understand existing code and shared utilities.
1. Compare specs against code (gap analysis). Create or update
IMPLEMENTATION_PLAN.md as a prioritized bullet-point list of tasks
yet to be implemented. Do NOT implement anything.
IMPORTANT: Do NOT assume functionality is missing — search the
codebase first to confirm. Prefer updating existing utilities over
creating ad-hoc copies.
```
### Example `PROMPT_build.md`
```markdown
0a. Study `specs/*` to learn the application specifications.
0b. Study IMPLEMENTATION_PLAN.md.
0c. Study `src/` for reference.
1. Choose the most important item from IMPLEMENTATION_PLAN.md. Before
making changes, search the codebase (don't assume not implemented).
2. After implementing, run the tests. If functionality is missing, add it.
3. When you discover issues, update IMPLEMENTATION_PLAN.md immediately.
4. When tests pass, update IMPLEMENTATION_PLAN.md, then `git add -A`
then `git commit` with a descriptive message.
99999. When authoring documentation, capture the why.
999999. Implement completely. No placeholders or stubs.
9999999. Keep IMPLEMENTATION_PLAN.md current — future iterations depend on it.
```
### Example `AGENTS.md`
Keep this brief (~60 lines). It's loaded every iteration, so bloat wastes context.
```markdown
## Build & Run
python -m pytest
## Validation
- Tests: `pytest`
- Typecheck: `mypy src/`
- Lint: `ruff check src/`
```
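In the recipe the agent runs these validation commands itself, guided by `AGENTS.md`. If an extra gate in the outer loop is wanted, something like the following could run the same checks before pushing. This is an assumption layered on top of the recipe, not part of it, and `run_backpressure` is a hypothetical name.

```python
import subprocess
import sys
from typing import List, Sequence


def run_backpressure(commands: Sequence[List[str]]) -> bool:
    """Run each validation command in order; stop at the first failure."""
    for cmd in commands:
        result = subprocess.run(cmd)
        if result.returncode != 0:
            print(f"FAILED: {' '.join(cmd)}")
            return False
    return True


if __name__ == "__main__":
    # Stand-in commands so the sketch runs anywhere. A real project would
    # list its own checks, e.g. ["pytest"], ["mypy", "src/"],
    # ["ruff", "check", "src/"].
    ok = [sys.executable, "-c", "import sys; sys.exit(0)"]
    bad = [sys.executable, "-c", "import sys; sys.exit(1)"]
    print(run_backpressure([ok, ok]))
    print(run_backpressure([ok, bad, ok]))
```

Stopping at the first failure keeps the feedback short, which matters when the output is fed back into a limited context window.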
## Best Practices
-1. **Write clear completion criteria**: Include exactly what "done" looks like
-2. **Use output markers**: Include `<promise>COMPLETE</promise>` or similar in completion condition
-3. **Always set max iterations**: Prevents infinite loops on impossible tasks
-4. **Persist state**: Save files so AI can see what changed between iterations
-5. **Include context**: Feed previous iteration output back as context
-6. **Monitor progress**: Log each iteration to track what's happening
+1. **Fresh context per iteration**: Never accumulate context across iterations; that's the whole point
+2. **Disk is your database**: `IMPLEMENTATION_PLAN.md` is shared state between isolated sessions
+3. **Backpressure is essential**: Tests, builds, lints in `AGENTS.md`; the agent must pass them before committing
+4. **Start with PLANNING mode**: Generate the plan first, then switch to BUILDING
+5. **Observe and tune**: Watch early iterations, add guardrails to prompts when the agent fails in specific ways
+6. **The plan is disposable**: If the agent goes off track, delete `IMPLEMENTATION_PLAN.md` and re-plan
+7. **Keep `AGENTS.md` brief**: It's loaded every iteration, so keep it to operational info only, no progress notes
+8. **Use a sandbox**: The agent runs autonomously with full tool access, so isolate it
-## Example: Iterative Code Generation
-```python
-prompt = """Write a function that:
-1. Parses CSV data
-2. Validates required fields
-3. Returns parsed records or error
-4. Has unit tests
-5. Output <promise>COMPLETE</promise> when done"""
-async def main():
-    loop = RalphLoop(10, "COMPLETE")
-    result = await loop.run(prompt)
-asyncio.run(main())
-```
-## Handling Failures
-```python
-try:
-    result = await loop.run(prompt)
-    print("Task completed successfully!")
-except RuntimeError as e:
-    print(f"Task failed: {e}")
-    if loop.last_response:
-        print(f"\nLast attempt:\n{loop.last_response}")
-```
-## When to Use RALPH-loop
+## When to Use a Ralph Loop
**Good for:**
-- Code generation with automatic verification (tests, linters)
-- Tasks with clear success criteria
-- Iterative refinement where each attempt learns from previous failures
-- Unattended long-running improvements
+- Implementing features from specs with test-driven validation
+- Large refactors broken into many small tasks
+- Unattended, long-running development with clear requirements
+- Any work where backpressure (tests/builds) can verify correctness
**Not good for:**
-- Tasks requiring human judgment or design input
-- One-shot operations
-- Tasks with vague success criteria
-- Real-time interactive debugging
+- Tasks requiring human judgment mid-loop
+- One-shot operations that don't benefit from iteration
+- Vague requirements without testable acceptance criteria
+- Exploratory prototyping where direction isn't clear


@@ -1,127 +1,84 @@
#!/usr/bin/env python3
"""
Ralph loop: autonomous AI task loop with fresh context per iteration.
Two modes:
- "plan": reads PROMPT_plan.md, generates/updates IMPLEMENTATION_PLAN.md
- "build": reads PROMPT_build.md, implements tasks, runs tests, commits
Each iteration creates a fresh session so the agent always operates in
the "smart zone" of its context window. State is shared between
iterations via files on disk (IMPLEMENTATION_PLAN.md, AGENTS.md, specs/*).
Usage:
python ralph_loop.py # build mode, 50 iterations
python ralph_loop.py plan # planning mode
python ralph_loop.py 20 # build mode, 20 iterations
python ralph_loop.py plan 5 # planning mode, 5 iterations
"""
import asyncio
import subprocess
import sys
from pathlib import Path
from copilot import CopilotClient, MessageOptions, SessionConfig
class RalphLoop: async def ralph_loop(mode: str = "build", max_iterations: int = 50):
""" prompt_file = "PROMPT_plan.md" if mode == "plan" else "PROMPT_build.md"
RALPH-loop implementation: Iterative self-referential AI loops.
The same prompt is sent repeatedly, with AI reading its own previous output. client = CopilotClient()
Loop continues until completion promise is detected in the response. await client.start()
"""
def __init__(self, max_iterations=10, completion_promise="COMPLETE"): branch = subprocess.check_output(
"""Initialize RALPH-loop with iteration limits and completion detection.""" ["git", "branch", "--show-current"], text=True
self.client = CopilotClient() ).strip()
self.iteration = 0
self.max_iterations = max_iterations
self.completion_promise = completion_promise
self.last_response = None
async def run(self, initial_prompt): print("━" * 40)
""" print(f"Mode: {mode}")
Run the RALPH-loop until completion promise is detected or max iterations reached. print(f"Prompt: {prompt_file}")
""" print(f"Branch: {branch}")
session = None print(f"Max: {max_iterations} iterations")
await self.client.start() print("━" * 40)
try:
session = await self.client.create_session(
SessionConfig(model="gpt-5.1-codex-mini")
)
try:
while self.iteration < self.max_iterations:
self.iteration += 1
print(f"\n=== Iteration {self.iteration}/{self.max_iterations} ===")
current_prompt = self._build_iteration_prompt(initial_prompt)
print(f"Sending prompt (length: {len(current_prompt)})...")
result = await session.send_and_wait(
MessageOptions(prompt=current_prompt),
timeout=300,
)
self.last_response = result.data.content if result else ""
# Display response summary
summary = (
self.last_response[:200] + "..."
if len(self.last_response) > 200
else self.last_response
)
print(f"Response: {summary}")
# Check for completion promise
if self.completion_promise in self.last_response:
print(
f"\n✓ Success! Completion promise detected: '{self.completion_promise}'"
)
return self.last_response
print(
f"Iteration {self.iteration} complete. Checking for next iteration..."
)
raise RuntimeError(
f"Maximum iterations ({self.max_iterations}) reached without "
f"detecting completion promise: '{self.completion_promise}'"
)
except Exception as e:
print(f"\nError during RALPH-loop: {e}")
raise
finally:
if session is not None:
await session.destroy()
finally:
await self.client.stop()
def _build_iteration_prompt(self, initial_prompt):
"""Build the prompt for the current iteration, including previous output as context."""
if self.iteration == 1:
return initial_prompt
return f"""{initial_prompt}
=== CONTEXT FROM PREVIOUS ITERATION ===
{self.last_response}
=== END CONTEXT ===
Continue working on this task. Review the previous attempt and improve upon it."""
async def main():
"""Example usage demonstrating RALPH-loop."""
prompt = """You are iteratively building a small library. Follow these phases IN ORDER.
Do NOT skip ahead — only do the current phase, then stop and wait for the next iteration.
Phase 1: Design a DataValidator class that validates records against a schema.
- Schema defines field names, types (str, int, float, bool), and whether required.
- Return a list of validation errors per record.
- Show the class code only. Do NOT output COMPLETE.
Phase 2: Write at least 4 unit tests covering: missing required field, wrong type,
valid record, and empty input. Show test code only. Do NOT output COMPLETE.
Phase 3: Review the code from phases 1 and 2. Fix any bugs, add docstrings, and add
an extra edge-case test. Show the final consolidated code with all fixes.
When this phase is fully done, output the exact text: COMPLETE"""
loop = RalphLoop(max_iterations=5, completion_promise="COMPLETE")
try: try:
result = await loop.run(prompt) prompt = Path(prompt_file).read_text()
print("\n=== FINAL RESULT ===")
print(result) for i in range(1, max_iterations + 1):
except RuntimeError as e: print(f"\n=== Iteration {i}/{max_iterations} ===")
print(f"\nTask did not complete: {e}")
if loop.last_response: # Fresh session — each task gets full context budget
print(f"\nLast attempt:\n{loop.last_response}") session = await client.create_session(
SessionConfig(model="claude-sonnet-4.5")
)
try:
await session.send_and_wait(
MessageOptions(prompt=prompt), timeout=600
)
finally:
await session.destroy()
# Push changes after each iteration
try:
subprocess.run(
["git", "push", "origin", branch], check=True
)
except subprocess.CalledProcessError:
subprocess.run(
["git", "push", "-u", "origin", branch], check=True
)
print(f"\nIteration {i} complete.")
print(f"\nReached max iterations: {max_iterations}")
finally:
await client.stop()
if __name__ == "__main__":
asyncio.run(main()) args = sys.argv[1:]
mode = "plan" if "plan" in args else "build"
max_iter = next((int(a) for a in args if a.isdigit()), 50)
asyncio.run(ralph_loop(mode, max_iter))