mirror of
https://github.com/github/awesome-copilot.git
synced 2026-02-23 11:55:12 +00:00
Rewrite Ralph loop recipes: split into simple vs ideal versions
Align all 4 language recipes (Node.js, Python, .NET, Go) with the Ralph Playbook architecture:
- Simple version: minimal outer loop with fresh session per iteration
- Ideal version: planning/building modes, backpressure, git integration
- Fresh context isolation instead of in-session context accumulation
- Disk-based shared state via IMPLEMENTATION_PLAN.md
- Example prompt templates (PROMPT_plan.md, PROMPT_build.md, AGENTS.md)
- Updated cookbook README descriptions
@@ -6,7 +6,7 @@ This cookbook collects small, focused recipes showing how to accomplish common t
|
|||||||
|
|
||||||
### .NET (C#)
|
### .NET (C#)
|
||||||
|
|
||||||
- [RALPH-loop](dotnet/ralph-loop.md): Implement iterative self-referential AI loops for task completion with automatic retries.
|
- [Ralph Loop](dotnet/ralph-loop.md): Build autonomous AI coding loops with fresh context per iteration, planning/building modes, and backpressure.
|
||||||
- [Error Handling](dotnet/error-handling.md): Handle errors gracefully including connection failures, timeouts, and cleanup.
|
- [Error Handling](dotnet/error-handling.md): Handle errors gracefully including connection failures, timeouts, and cleanup.
|
||||||
- [Multiple Sessions](dotnet/multiple-sessions.md): Manage multiple independent conversations simultaneously.
|
- [Multiple Sessions](dotnet/multiple-sessions.md): Manage multiple independent conversations simultaneously.
|
||||||
- [Managing Local Files](dotnet/managing-local-files.md): Organize files by metadata using AI-powered grouping strategies.
|
- [Managing Local Files](dotnet/managing-local-files.md): Organize files by metadata using AI-powered grouping strategies.
|
||||||
@@ -15,7 +15,7 @@ This cookbook collects small, focused recipes showing how to accomplish common t
|
|||||||
|
|
||||||
### Node.js / TypeScript
|
### Node.js / TypeScript
|
||||||
|
|
||||||
- [RALPH-loop](nodejs/ralph-loop.md): Implement iterative self-referential AI loops for task completion with automatic retries.
|
- [Ralph Loop](nodejs/ralph-loop.md): Build autonomous AI coding loops with fresh context per iteration, planning/building modes, and backpressure.
|
||||||
- [Error Handling](nodejs/error-handling.md): Handle errors gracefully including connection failures, timeouts, and cleanup.
|
- [Error Handling](nodejs/error-handling.md): Handle errors gracefully including connection failures, timeouts, and cleanup.
|
||||||
- [Multiple Sessions](nodejs/multiple-sessions.md): Manage multiple independent conversations simultaneously.
|
- [Multiple Sessions](nodejs/multiple-sessions.md): Manage multiple independent conversations simultaneously.
|
||||||
- [Managing Local Files](nodejs/managing-local-files.md): Organize files by metadata using AI-powered grouping strategies.
|
- [Managing Local Files](nodejs/managing-local-files.md): Organize files by metadata using AI-powered grouping strategies.
|
||||||
@@ -24,7 +24,7 @@ This cookbook collects small, focused recipes showing how to accomplish common t
|
|||||||
|
|
||||||
### Python
|
### Python
|
||||||
|
|
||||||
- [RALPH-loop](python/ralph-loop.md): Implement iterative self-referential AI loops for task completion with automatic retries.
|
- [Ralph Loop](python/ralph-loop.md): Build autonomous AI coding loops with fresh context per iteration, planning/building modes, and backpressure.
|
||||||
- [Error Handling](python/error-handling.md): Handle errors gracefully including connection failures, timeouts, and cleanup.
|
- [Error Handling](python/error-handling.md): Handle errors gracefully including connection failures, timeouts, and cleanup.
|
||||||
- [Multiple Sessions](python/multiple-sessions.md): Manage multiple independent conversations simultaneously.
|
- [Multiple Sessions](python/multiple-sessions.md): Manage multiple independent conversations simultaneously.
|
||||||
- [Managing Local Files](python/managing-local-files.md): Organize files by metadata using AI-powered grouping strategies.
|
- [Managing Local Files](python/managing-local-files.md): Organize files by metadata using AI-powered grouping strategies.
|
||||||
@@ -33,7 +33,7 @@ This cookbook collects small, focused recipes showing how to accomplish common t
|
|||||||
|
|
||||||
### Go
|
### Go
|
||||||
|
|
||||||
- [RALPH-loop](go/ralph-loop.md): Implement iterative self-referential AI loops for task completion with automatic retries.
|
- [Ralph Loop](go/ralph-loop.md): Build autonomous AI coding loops with fresh context per iteration, planning/building modes, and backpressure.
|
||||||
- [Error Handling](go/error-handling.md): Handle errors gracefully including connection failures, timeouts, and cleanup.
|
- [Error Handling](go/error-handling.md): Handle errors gracefully including connection failures, timeouts, and cleanup.
|
||||||
- [Multiple Sessions](go/multiple-sessions.md): Manage multiple independent conversations simultaneously.
|
- [Multiple Sessions](go/multiple-sessions.md): Manage multiple independent conversations simultaneously.
|
||||||
- [Managing Local Files](go/managing-local-files.md): Organize files by metadata using AI-powered grouping strategies.
|
- [Managing Local Files](go/managing-local-files.md): Organize files by metadata using AI-powered grouping strategies.
|
||||||
|
|||||||
@@ -1,6 +1,6 @@
|
|||||||
# RALPH-loop: Iterative Self-Referential AI Loops
|
# Ralph Loop: Autonomous AI Task Loops
|
||||||
|
|
||||||
Implement self-referential feedback loops where an AI agent iteratively improves work by reading its own previous output.
|
Build autonomous coding loops where an AI agent picks tasks, implements them, validates against backpressure (tests, builds), commits, and repeats — each iteration in a fresh context window.
|
||||||
|
|
||||||
> **Runnable example:** [recipe/ralph-loop.cs](recipe/ralph-loop.cs)
|
> **Runnable example:** [recipe/ralph-loop.cs](recipe/ralph-loop.cs)
|
||||||
>
|
>
|
||||||
@@ -9,252 +9,250 @@ Implement self-referential feedback loops where an AI agent iteratively improves
|
|||||||
> dotnet run recipe/ralph-loop.cs
|
> dotnet run recipe/ralph-loop.cs
|
||||||
> ```
|
> ```
|
||||||
|
|
||||||
## What is RALPH-loop?
|
## What is a Ralph Loop?
|
||||||
|
|
||||||
RALPH-loop is a development methodology for iterative AI-powered task completion. Named after the Ralph Wiggum technique, it embodies the philosophy of persistent iteration:
|
A [Ralph loop](https://ghuntley.com/ralph/) is an autonomous development workflow where an AI agent iterates through tasks in isolated context windows. The key insight: **state lives on disk, not in the model's context**. Each iteration starts fresh, reads the current state from files, does one task, writes results back to disk, and exits.
|
||||||
|
|
||||||
- **One prompt, multiple iterations**: The same prompt is processed repeatedly
|
```
|
||||||
- **Self-referential feedback**: The AI reads its own previous work (file changes, git history)
|
┌─────────────────────────────────────────────────┐
|
||||||
- **Completion detection**: Loop exits when a completion promise is detected in output
|
│ loop.sh │
|
||||||
- **Safety limits**: Always include a maximum iteration count to prevent infinite loops
|
│ while true: │
|
||||||
|
│ ┌─────────────────────────────────────────┐ │
|
||||||
|
│ │ Fresh session (isolated context) │ │
|
||||||
|
│ │ │ │
|
||||||
|
│ │ 1. Read PROMPT.md + AGENTS.md │ │
|
||||||
|
│ │ 2. Study specs/* and code │ │
|
||||||
|
│ │ 3. Pick next task from plan │ │
|
||||||
|
│ │ 4. Implement + run tests │ │
|
||||||
|
│ │ 5. Update plan, commit, exit │ │
|
||||||
|
│ └─────────────────────────────────────────┘ │
|
||||||
|
│ ↻ next iteration (fresh context) │
|
||||||
|
└─────────────────────────────────────────────────┘
|
||||||
|
```
|
||||||
|
|
||||||
## Example Scenario
|
**Core principles:**
|
||||||
|
|
||||||
You need to iteratively improve code until all tests pass. Instead of asking the model to "write perfect code," you use RALPH-loop to:
|
- **Fresh context per iteration**: Each loop creates a new session — no context accumulation, always in the "smart zone"
|
||||||
|
- **Disk as shared state**: `IMPLEMENTATION_PLAN.md` persists between iterations and acts as the coordination mechanism
|
||||||
|
- **Backpressure steers quality**: Tests, builds, and lints reject bad work — the agent must fix issues before committing
|
||||||
|
- **Two modes**: PLANNING (gap analysis → generate plan) and BUILDING (implement from plan)
|
||||||
|
|
||||||
1. Send the initial prompt with clear success criteria
|
## Simple Version
|
||||||
2. The model writes code and tests
|
|
||||||
3. The model runs tests and sees failures
|
|
||||||
4. Loop automatically re-sends the prompt
|
|
||||||
5. The model reads test output and previous code, fixes issues
|
|
||||||
6. Repeat until all tests pass and completion promise is output
|
|
||||||
|
|
||||||
## Basic Implementation
|
The minimal Ralph loop — the SDK equivalent of `while :; do cat PROMPT.md | claude ; done`:
|
||||||
|
|
||||||
```csharp
|
```csharp
|
||||||
using GitHub.Copilot.SDK;
|
using GitHub.Copilot.SDK;
|
||||||
|
|
||||||
public class RalphLoop
|
var client = new CopilotClient();
|
||||||
|
await client.StartAsync();
|
||||||
|
|
||||||
|
try
|
||||||
{
|
{
|
||||||
private readonly CopilotClient _client;
|
var prompt = await File.ReadAllTextAsync("PROMPT.md");
|
||||||
private int _iteration = 0;
|
var maxIterations = 50;
|
||||||
private readonly int _maxIterations;
|
|
||||||
private readonly string _completionPromise;
|
|
||||||
private string? _lastResponse;
|
|
||||||
|
|
||||||
public RalphLoop(int maxIterations = 10, string completionPromise = "COMPLETE")
|
for (var i = 1; i <= maxIterations; i++)
|
||||||
{
|
{
|
||||||
_client = new CopilotClient();
|
Console.WriteLine($"\n=== Iteration {i}/{maxIterations} ===");
|
||||||
_maxIterations = maxIterations;
|
|
||||||
_completionPromise = completionPromise;
|
|
||||||
}
|
|
||||||
|
|
||||||
public async Task<string> RunAsync(string prompt)
|
|
||||||
{
|
|
||||||
await _client.StartAsync();
|
|
||||||
|
|
||||||
|
// Fresh session each iteration — context isolation is the point
|
||||||
|
var session = await client.CreateSessionAsync(
|
||||||
|
new SessionConfig { Model = "claude-sonnet-4.5" });
|
||||||
try
|
try
|
||||||
{
|
{
|
||||||
var session = await _client.CreateSessionAsync(
|
var done = new TaskCompletionSource<string>();
|
||||||
new SessionConfig { Model = "gpt-5.1-codex-mini" });
|
session.On(evt =>
|
||||||
|
|
||||||
try
|
|
||||||
{
|
{
|
||||||
var done = new TaskCompletionSource<string>();
|
if (evt is AssistantMessageEvent msg)
|
||||||
session.On(evt =>
|
done.TrySetResult(msg.Data.Content);
|
||||||
{
|
});
|
||||||
if (evt is AssistantMessageEvent msg)
|
|
||||||
{
|
|
||||||
_lastResponse = msg.Data.Content;
|
|
||||||
done.TrySetResult(msg.Data.Content);
|
|
||||||
}
|
|
||||||
});
|
|
||||||
|
|
||||||
while (_iteration < _maxIterations)
|
await session.SendAsync(new MessageOptions { Prompt = prompt });
|
||||||
{
|
await done.Task;
|
||||||
_iteration++;
|
|
||||||
Console.WriteLine($"\n--- Iteration {_iteration} ---");
|
|
||||||
|
|
||||||
done = new TaskCompletionSource<string>();
|
|
||||||
|
|
||||||
// Send prompt (on first iteration) or continuation
|
|
||||||
var messagePrompt = _iteration == 1
|
|
||||||
? prompt
|
|
||||||
: $"{prompt}\n\nPrevious attempt:\n{_lastResponse}\n\nContinue iterating...";
|
|
||||||
|
|
||||||
await session.SendAsync(new MessageOptions { Prompt = messagePrompt });
|
|
||||||
var response = await done.Task;
|
|
||||||
|
|
||||||
// Check for completion promise
|
|
||||||
if (response.Contains(_completionPromise))
|
|
||||||
{
|
|
||||||
Console.WriteLine($"✓ Completion promise detected: {_completionPromise}");
|
|
||||||
return response;
|
|
||||||
}
|
|
||||||
|
|
||||||
Console.WriteLine($"Iteration {_iteration} complete. Continuing...");
|
|
||||||
}
|
|
||||||
|
|
||||||
throw new InvalidOperationException(
|
|
||||||
$"Max iterations ({_maxIterations}) reached without completion promise");
|
|
||||||
}
|
|
||||||
finally
|
|
||||||
{
|
|
||||||
await session.DisposeAsync();
|
|
||||||
}
|
|
||||||
}
|
}
|
||||||
finally
|
finally
|
||||||
{
|
{
|
||||||
await _client.StopAsync();
|
await session.DisposeAsync();
|
||||||
}
|
}
|
||||||
|
|
||||||
|
Console.WriteLine($"Iteration {i} complete.");
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
finally
|
||||||
// Usage
|
{
|
||||||
var loop = new RalphLoop(maxIterations: 5, completionPromise: "COMPLETE");
|
await client.StopAsync();
|
||||||
var result = await loop.RunAsync("Your task here");
|
}
|
||||||
Console.WriteLine(result);
|
|
||||||
```
|
```
|
||||||
|
|
||||||
## With File Persistence
|
This is all you need to get started. The prompt file tells the agent what to do; the agent reads project files, does work, commits, and exits. The loop restarts with a clean slate.
|
||||||
|
|
||||||
For tasks involving code generation, persist state to files so the AI can see changes:
|
## Ideal Version
|
||||||
|
|
||||||
|
The full Ralph pattern with planning and building modes, matching the [Ralph Playbook](https://github.com/ClaytonFarr/ralph-playbook) architecture:
|
||||||
|
|
||||||
```csharp
|
```csharp
|
||||||
public class PersistentRalphLoop
|
using System.Diagnostics;
|
||||||
|
using GitHub.Copilot.SDK;
|
||||||
|
|
||||||
|
// Parse args: dotnet run [plan] [max_iterations]
|
||||||
|
var mode = args.Contains("plan") ? "plan" : "build";
|
||||||
|
var maxArg = args.FirstOrDefault(a => int.TryParse(a, out _));
|
||||||
|
var maxIterations = maxArg != null ? int.Parse(maxArg) : 50;
|
||||||
|
var promptFile = mode == "plan" ? "PROMPT_plan.md" : "PROMPT_build.md";
|
||||||
|
|
||||||
|
var client = new CopilotClient();
|
||||||
|
await client.StartAsync();
|
||||||
|
|
||||||
|
var branchInfo = new ProcessStartInfo("git", "branch --show-current")
|
||||||
|
{ RedirectStandardOutput = true };
|
||||||
|
var branch = Process.Start(branchInfo)!;
|
||||||
|
var branchName = (await branch.StandardOutput.ReadToEndAsync()).Trim();
|
||||||
|
await branch.WaitForExitAsync();
|
||||||
|
|
||||||
|
Console.WriteLine(new string('━', 40));
|
||||||
|
Console.WriteLine($"Mode: {mode}");
|
||||||
|
Console.WriteLine($"Prompt: {promptFile}");
|
||||||
|
Console.WriteLine($"Branch: {branchName}");
|
||||||
|
Console.WriteLine($"Max: {maxIterations} iterations");
|
||||||
|
Console.WriteLine(new string('━', 40));
|
||||||
|
|
||||||
|
try
|
||||||
{
|
{
|
||||||
private readonly string _workDir;
|
var prompt = await File.ReadAllTextAsync(promptFile);
|
||||||
private readonly CopilotClient _client;
|
|
||||||
private readonly int _maxIterations;
|
|
||||||
private int _iteration = 0;
|
|
||||||
|
|
||||||
public PersistentRalphLoop(string workDir, int maxIterations = 10)
|
for (var i = 1; i <= maxIterations; i++)
|
||||||
{
|
{
|
||||||
_workDir = workDir;
|
Console.WriteLine($"\n=== Iteration {i}/{maxIterations} ===");
|
||||||
_maxIterations = maxIterations;
|
|
||||||
Directory.CreateDirectory(_workDir);
|
|
||||||
_client = new CopilotClient();
|
|
||||||
}
|
|
||||||
|
|
||||||
public async Task<string> RunAsync(string prompt)
|
|
||||||
{
|
|
||||||
await _client.StartAsync();
|
|
||||||
|
|
||||||
|
// Fresh session — each task gets full context budget
|
||||||
|
var session = await client.CreateSessionAsync(
|
||||||
|
new SessionConfig { Model = "claude-sonnet-4.5" });
|
||||||
try
|
try
|
||||||
{
|
{
|
||||||
var session = await _client.CreateSessionAsync(
|
var done = new TaskCompletionSource<string>();
|
||||||
new SessionConfig { Model = "gpt-5.1-codex-mini" });
|
session.On(evt =>
|
||||||
|
|
||||||
try
|
|
||||||
{
|
{
|
||||||
// Store initial prompt
|
if (evt is AssistantMessageEvent msg)
|
||||||
var promptFile = Path.Combine(_workDir, "prompt.md");
|
done.TrySetResult(msg.Data.Content);
|
||||||
await File.WriteAllTextAsync(promptFile, prompt);
|
});
|
||||||
|
|
||||||
var done = new TaskCompletionSource<string>();
|
await session.SendAsync(new MessageOptions { Prompt = prompt });
|
||||||
string response = "";
|
await done.Task;
|
||||||
session.On(evt =>
|
|
||||||
{
|
|
||||||
if (evt is AssistantMessageEvent msg)
|
|
||||||
{
|
|
||||||
response = msg.Data.Content;
|
|
||||||
done.TrySetResult(msg.Data.Content);
|
|
||||||
}
|
|
||||||
});
|
|
||||||
|
|
||||||
while (_iteration < _maxIterations)
|
|
||||||
{
|
|
||||||
_iteration++;
|
|
||||||
Console.WriteLine($"\n--- Iteration {_iteration} ---");
|
|
||||||
|
|
||||||
done = new TaskCompletionSource<string>();
|
|
||||||
|
|
||||||
// Build context including previous work
|
|
||||||
var contextBuilder = new StringBuilder(prompt);
|
|
||||||
var previousOutput = Path.Combine(_workDir, $"output-{_iteration - 1}.txt");
|
|
||||||
if (File.Exists(previousOutput))
|
|
||||||
{
|
|
||||||
contextBuilder.AppendLine($"\nPrevious iteration output:\n{await File.ReadAllTextAsync(previousOutput)}");
|
|
||||||
}
|
|
||||||
|
|
||||||
await session.SendAsync(new MessageOptions { Prompt = contextBuilder.ToString() });
|
|
||||||
await done.Task;
|
|
||||||
|
|
||||||
// Persist output
|
|
||||||
await File.WriteAllTextAsync(
|
|
||||||
Path.Combine(_workDir, $"output-{_iteration}.txt"),
|
|
||||||
response);
|
|
||||||
|
|
||||||
if (response.Contains("COMPLETE"))
|
|
||||||
{
|
|
||||||
return response;
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
throw new InvalidOperationException("Max iterations reached");
|
|
||||||
}
|
|
||||||
finally
|
|
||||||
{
|
|
||||||
await session.DisposeAsync();
|
|
||||||
}
|
|
||||||
}
|
}
|
||||||
finally
|
finally
|
||||||
{
|
{
|
||||||
await _client.StopAsync();
|
await session.DisposeAsync();
|
||||||
}
|
}
|
||||||
|
|
||||||
|
// Push changes after each iteration
|
||||||
|
// Process.Start does not throw on a non-zero exit code, so check it explicitly
|
||||||
|
var push = Process.Start("git", $"push origin {branchName}")!;
|
||||||
|
push.WaitForExit();
|
||||||
|
if (push.ExitCode != 0)
|
||||||
|
{
|
||||||
|
Process.Start("git", $"push -u origin {branchName}")!.WaitForExit();
|
||||||
|
}
|
||||||
|
|
||||||
|
Console.WriteLine($"\nIteration {i} complete.");
|
||||||
}
|
}
|
||||||
|
|
||||||
|
Console.WriteLine($"\nReached max iterations: {maxIterations}");
|
||||||
}
|
}
|
||||||
|
finally
|
||||||
|
{
|
||||||
|
await client.StopAsync();
|
||||||
|
}
|
||||||
|
```
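The argument handling at the top of the listing above (`plan` selects planning mode, the first numeric argument caps the iterations, and everything else falls back to build mode with 50 iterations) can be sketched in Go as follows. The `config` type and `parseArgs` name are illustrative, not taken from the Go recipe.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// config holds the three values the loop derives from its arguments.
type config struct {
	Mode          string
	PromptFile    string
	MaxIterations int
}

// parseArgs mirrors the C# logic: "plan" anywhere switches mode, and the
// first argument that parses as an integer becomes the iteration cap.
func parseArgs(args []string) config {
	cfg := config{Mode: "build", PromptFile: "PROMPT_build.md", MaxIterations: 50}
	seenNumber := false
	for _, a := range args {
		if a == "plan" {
			cfg.Mode, cfg.PromptFile = "plan", "PROMPT_plan.md"
		} else if n, err := strconv.Atoi(a); err == nil && !seenNumber {
			cfg.MaxIterations = n
			seenNumber = true
		}
	}
	return cfg
}

func main() {
	cfg := parseArgs(os.Args[1:])
	fmt.Printf("mode=%s prompt=%s max=%d\n", cfg.Mode, cfg.PromptFile, cfg.MaxIterations)
}
```

Like the original, this accepts the arguments in any order and does not reject zero or negative caps; a stricter version might validate that the cap is positive.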
|
||||||
|
|
||||||
|
### Required Project Files
|
||||||
|
|
||||||
|
The ideal version expects this file structure in your project:
|
||||||
|
|
||||||
|
```
|
||||||
|
project-root/
|
||||||
|
├── PROMPT_plan.md # Planning mode instructions
|
||||||
|
├── PROMPT_build.md # Building mode instructions
|
||||||
|
├── AGENTS.md # Operational guide (build/test commands)
|
||||||
|
├── IMPLEMENTATION_PLAN.md # Task list (generated by planning mode)
|
||||||
|
├── specs/ # Requirement specs (one per topic)
|
||||||
|
│ ├── auth.md
|
||||||
|
│ └── data-pipeline.md
|
||||||
|
└── src/ # Your source code
|
||||||
|
```
|
||||||
|
|
||||||
|
### Example `PROMPT_plan.md`
|
||||||
|
|
||||||
|
```markdown
|
||||||
|
0a. Study `specs/*` to learn the application specifications.
|
||||||
|
0b. Study IMPLEMENTATION_PLAN.md (if present) to understand the plan so far.
|
||||||
|
0c. Study `src/` to understand existing code and shared utilities.
|
||||||
|
|
||||||
|
1. Compare specs against code (gap analysis). Create or update
|
||||||
|
IMPLEMENTATION_PLAN.md as a prioritized bullet-point list of tasks
|
||||||
|
yet to be implemented. Do NOT implement anything.
|
||||||
|
|
||||||
|
IMPORTANT: Do NOT assume functionality is missing — search the
|
||||||
|
codebase first to confirm. Prefer updating existing utilities over
|
||||||
|
creating ad-hoc copies.
|
||||||
|
```
|
||||||
|
|
||||||
|
### Example `PROMPT_build.md`
|
||||||
|
|
||||||
|
```markdown
|
||||||
|
0a. Study `specs/*` to learn the application specifications.
|
||||||
|
0b. Study IMPLEMENTATION_PLAN.md.
|
||||||
|
0c. Study `src/` for reference.
|
||||||
|
|
||||||
|
1. Choose the most important item from IMPLEMENTATION_PLAN.md. Before
|
||||||
|
making changes, search the codebase (don't assume not implemented).
|
||||||
|
2. After implementing, run the tests. If functionality is missing, add it.
|
||||||
|
3. When you discover issues, update IMPLEMENTATION_PLAN.md immediately.
|
||||||
|
4. When tests pass, update IMPLEMENTATION_PLAN.md, then `git add -A`
|
||||||
|
then `git commit` with a descriptive message.
|
||||||
|
|
||||||
|
99999. When authoring documentation, capture the why.
|
||||||
|
999999. Implement completely. No placeholders or stubs.
|
||||||
|
9999999. Keep IMPLEMENTATION_PLAN.md current — future iterations depend on it.
|
||||||
|
```
|
||||||
|
|
||||||
|
### Example `AGENTS.md`
|
||||||
|
|
||||||
|
Keep this brief (~60 lines). It's loaded every iteration, so bloat wastes context.
|
||||||
|
|
||||||
|
```markdown
|
||||||
|
## Build & Run
|
||||||
|
|
||||||
|
dotnet build
|
||||||
|
|
||||||
|
## Validation
|
||||||
|
|
||||||
|
- Tests: `dotnet test`
|
||||||
|
- Build: `dotnet build --no-restore`
|
||||||
```
|
```
|
||||||
|
|
||||||
## Best Practices
|
## Best Practices
|
||||||
|
|
||||||
1. **Write clear completion criteria**: Include exactly what "done" looks like
|
1. **Fresh context per iteration**: Never accumulate context across iterations — that's the whole point
|
||||||
2. **Use output markers**: Include `<promise>COMPLETE</promise>` or similar in completion condition
|
2. **Disk is your database**: `IMPLEMENTATION_PLAN.md` is shared state between isolated sessions
|
||||||
3. **Always set max iterations**: Prevents infinite loops on impossible tasks
|
3. **Backpressure is essential**: Tests, builds, lints in `AGENTS.md` — the agent must pass them before committing
|
||||||
4. **Persist state**: Save files so AI can see what changed between iterations
|
4. **Start with PLANNING mode**: Generate the plan first, then switch to BUILDING
|
||||||
5. **Include context**: Feed previous iteration output back as context
|
5. **Observe and tune**: Watch early iterations, add guardrails to prompts when the agent fails in specific ways
|
||||||
6. **Monitor progress**: Log each iteration to track what's happening
|
6. **The plan is disposable**: If the agent goes off track, delete `IMPLEMENTATION_PLAN.md` and re-plan
|
||||||
|
7. **Keep `AGENTS.md` brief**: It's loaded every iteration — operational info only, no progress notes
|
||||||
|
8. **Use a sandbox**: The agent runs autonomously with full tool access — isolate it
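In the recipe, backpressure is enforced by the agent itself, following the commands in `AGENTS.md`. As an optional extra, the outer loop could run the same checks before trusting an iteration. The Go sketch below shows one way to do that under that assumption; `runGate` is a hypothetical helper, not part of the recipe.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// runGate runs each validation command in order and stops at the first
// failure. A non-zero exit status or a missing executable both count as
// failures.
func runGate(commands [][]string) error {
	for _, c := range commands {
		if len(c) == 0 {
			continue // skip empty entries rather than panic on c[0]
		}
		cmd := exec.Command(c[0], c[1:]...)
		cmd.Stdout, cmd.Stderr = os.Stdout, os.Stderr
		if err := cmd.Run(); err != nil {
			return fmt.Errorf("%v failed: %w", c, err)
		}
	}
	return nil
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: gate <command> [args...]")
		return
	}
	// Treat the command line as a single validation command.
	if err := runGate([][]string{os.Args[1:]}); err != nil {
		fmt.Fprintln(os.Stderr, "rejecting iteration:", err)
		os.Exit(1)
	}
	fmt.Println("checks passed")
}
```

For the example `AGENTS.md` above, the gate would be called with `{"dotnet", "build", "--no-restore"}` followed by `{"dotnet", "test"}`. Stopping at the first failure keeps the feedback short, which matters when the output is fed back to an agent.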
|
||||||
|
|
||||||
## Example: Iterative Code Generation
|
## When to Use a Ralph Loop
|
||||||
|
|
||||||
```csharp
|
|
||||||
var prompt = @"Write a function that:
|
|
||||||
1. Parses CSV data
|
|
||||||
2. Validates required fields
|
|
||||||
3. Returns parsed records or error
|
|
||||||
4. Has unit tests
|
|
||||||
5. Output <promise>COMPLETE</promise> when done";
|
|
||||||
|
|
||||||
var loop = new RalphLoop(maxIterations: 10, completionPromise: "COMPLETE");
|
|
||||||
var result = await loop.RunAsync(prompt);
|
|
||||||
```
|
|
||||||
|
|
||||||
## Handling Failures
|
|
||||||
|
|
||||||
```csharp
|
|
||||||
try
|
|
||||||
{
|
|
||||||
var result = await loop.RunAsync(prompt);
|
|
||||||
Console.WriteLine("Task completed successfully!");
|
|
||||||
}
|
|
||||||
catch (InvalidOperationException ex) when (ex.Message.Contains("Max iterations"))
|
|
||||||
{
|
|
||||||
Console.WriteLine("Task did not complete within iteration limit.");
|
|
||||||
Console.WriteLine($"Last response: {loop.LastResponse}");
|
|
||||||
// Document what was attempted and suggest alternatives
|
|
||||||
}
|
|
||||||
```
|
|
||||||
|
|
||||||
## When to Use RALPH-loop
|
|
||||||
|
|
||||||
**Good for:**
|
**Good for:**
|
||||||
- Code generation with automatic verification (tests, linters)
|
- Implementing features from specs with test-driven validation
|
||||||
- Tasks with clear success criteria
|
- Large refactors broken into many small tasks
|
||||||
- Iterative refinement where each attempt learns from previous failures
|
- Unattended, long-running development with clear requirements
|
||||||
- Unattended long-running improvements
|
- Any work where backpressure (tests/builds) can verify correctness
|
||||||
|
|
||||||
**Not good for:**
|
**Not good for:**
|
||||||
- Tasks requiring human judgment or design input
|
- Tasks requiring human judgment mid-loop
|
||||||
- One-shot operations
|
- One-shot operations that don't benefit from iteration
|
||||||
- Tasks with vague success criteria
|
- Vague requirements without testable acceptance criteria
|
||||||
- Real-time interactive debugging
|
- Exploratory prototyping where direction isn't clear
|
||||||
|
|||||||
@@ -1,141 +1,90 @@
|
|||||||
#:package GitHub.Copilot.SDK@*
|
#:package GitHub.Copilot.SDK@*
|
||||||
#:property PublishAot=false
|
#:property PublishAot=false
|
||||||
|
|
||||||
|
using System.Diagnostics;
|
||||||
using GitHub.Copilot.SDK;
|
using GitHub.Copilot.SDK;
|
||||||
using System.Text;
|
|
||||||
|
|
||||||
// RALPH-loop: Iterative self-referential AI loops.
|
// Ralph loop: autonomous AI task loop with fresh context per iteration.
|
||||||
// The same prompt is sent repeatedly, with AI reading its own previous output.
|
//
|
||||||
// Loop continues until completion promise is detected in the response.
|
// Two modes:
|
||||||
|
// - "plan": reads PROMPT_plan.md, generates/updates IMPLEMENTATION_PLAN.md
|
||||||
|
// - "build": reads PROMPT_build.md, implements tasks, runs tests, commits
|
||||||
|
//
|
||||||
|
// Each iteration creates a fresh session so the agent always operates in
|
||||||
|
// the "smart zone" of its context window. State is shared between
|
||||||
|
// iterations via files on disk (IMPLEMENTATION_PLAN.md, AGENTS.md, specs/*).
|
||||||
|
//
|
||||||
|
// Usage:
|
||||||
|
// dotnet run # build mode, 50 iterations
|
||||||
|
// dotnet run plan # planning mode
|
||||||
|
// dotnet run 20 # build mode, 20 iterations
|
||||||
|
// dotnet run plan 5 # planning mode, 5 iterations
|
||||||
|
|
||||||
var prompt = @"You are iteratively building a small library. Follow these phases IN ORDER.
|
var mode = args.Contains("plan") ? "plan" : "build";
|
||||||
Do NOT skip ahead — only do the current phase, then stop and wait for the next iteration.
|
var maxArg = args.FirstOrDefault(a => int.TryParse(a, out _));
|
||||||
|
var maxIterations = maxArg != null ? int.Parse(maxArg) : 50;
|
||||||
|
var promptFile = mode == "plan" ? "PROMPT_plan.md" : "PROMPT_build.md";
|
||||||
|
|
||||||
Phase 1: Design a DataValidator class that validates records against a schema.
|
var client = new CopilotClient();
|
||||||
- Schema defines field names, types (string, int, float, bool), and whether required.
|
await client.StartAsync();
|
||||||
- Return a list of validation errors per record.
|
|
||||||
- Show the class code only. Do NOT output COMPLETE.
|
|
||||||
|
|
||||||
Phase 2: Write at least 4 unit tests covering: missing required field, wrong type,
|
var branchProc = Process.Start(new ProcessStartInfo("git", "branch --show-current")
|
||||||
valid record, and empty input. Show test code only. Do NOT output COMPLETE.
|
{ RedirectStandardOutput = true })!;
|
||||||
|
var branch = (await branchProc.StandardOutput.ReadToEndAsync()).Trim();
|
||||||
|
await branchProc.WaitForExitAsync();
|
||||||
|
|
||||||
Phase 3: Review the code from phases 1 and 2. Fix any bugs, add docstrings, and add
|
Console.WriteLine(new string('━', 40));
|
||||||
an extra edge-case test. Show the final consolidated code with all fixes.
|
Console.WriteLine($"Mode: {mode}");
|
||||||
When this phase is fully done, output the exact text: COMPLETE";
|
Console.WriteLine($"Prompt: {promptFile}");
|
||||||
|
Console.WriteLine($"Branch: {branch}");
|
||||||
var loop = new RalphLoop(maxIterations: 5, completionPromise: "COMPLETE");
|
Console.WriteLine($"Max: {maxIterations} iterations");
|
||||||
|
Console.WriteLine(new string('━', 40));
|
||||||
|
|
||||||
try
|
try
|
||||||
{
|
{
|
||||||
var result = await loop.RunAsync(prompt);
|
var prompt = await File.ReadAllTextAsync(promptFile);
|
||||||
Console.WriteLine("\n=== FINAL RESULT ===");
|
|
||||||
Console.WriteLine(result);
|
for (var i = 1; i <= maxIterations; i++)
|
||||||
}
|
|
||||||
catch (InvalidOperationException ex)
|
|
||||||
{
|
|
||||||
Console.WriteLine($"\nTask did not complete: {ex.Message}");
|
|
||||||
if (loop.LastResponse != null)
|
|
||||||
{
|
{
|
||||||
Console.WriteLine($"\nLast attempt:\n{loop.LastResponse}");
|
Console.WriteLine($"\n=== Iteration {i}/{maxIterations} ===");
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
// --- RalphLoop class definition ---
|
// Fresh session — each task gets full context budget
|
||||||
|
var session = await client.CreateSessionAsync(
|
||||||
public class RalphLoop
|
new SessionConfig { Model = "claude-sonnet-4.5" });
|
||||||
{
|
|
||||||
private readonly CopilotClient _client;
|
|
||||||
private int _iteration = 0;
|
|
||||||
private readonly int _maxIterations;
|
|
||||||
private readonly string _completionPromise;
|
|
||||||
private string? _lastResponse;
|
|
||||||
|
|
||||||
public RalphLoop(int maxIterations = 10, string completionPromise = "COMPLETE")
|
|
||||||
{
|
|
||||||
_client = new CopilotClient();
|
|
||||||
_maxIterations = maxIterations;
|
|
||||||
_completionPromise = completionPromise;
|
|
||||||
}
|
|
||||||
|
|
||||||
public string? LastResponse => _lastResponse;
|
|
||||||
|
|
||||||
public async Task<string> RunAsync(string initialPrompt)
|
|
||||||
{
|
|
||||||
await _client.StartAsync();
|
|
||||||
|
|
||||||
try
|
try
|
||||||
{
|
{
|
||||||
var session = await _client.CreateSessionAsync(new SessionConfig
|
var done = new TaskCompletionSource<string>();
|
||||||
{
|
session.On(evt =>
|
||||||
Model = "gpt-5.1-codex-mini"
|
{
|
||||||
|
if (evt is AssistantMessageEvent msg)
|
||||||
|
done.TrySetResult(msg.Data.Content);
|
||||||
});
|
});
|
||||||
|
|
||||||
try
|
await session.SendAsync(new MessageOptions { Prompt = prompt });
|
||||||
{
|
await done.Task;
|
||||||
var done = new TaskCompletionSource<string>();
|
|
||||||
session.On(evt =>
|
|
||||||
{
|
|
||||||
if (evt is AssistantMessageEvent msg)
|
|
||||||
{
|
|
||||||
_lastResponse = msg.Data.Content;
|
|
||||||
done.TrySetResult(msg.Data.Content);
|
|
||||||
}
|
|
||||||
});
|
|
||||||
|
|
||||||
while (_iteration < _maxIterations)
|
|
||||||
{
|
|
||||||
_iteration++;
|
|
||||||
Console.WriteLine($"\n=== Iteration {_iteration}/{_maxIterations} ===");
|
|
||||||
|
|
||||||
done = new TaskCompletionSource<string>();
|
|
||||||
|
|
||||||
var currentPrompt = BuildIterationPrompt(initialPrompt);
|
|
||||||
Console.WriteLine($"Sending prompt (length: {currentPrompt.Length})...");
|
|
||||||
|
|
||||||
await session.SendAsync(new MessageOptions { Prompt = currentPrompt });
|
|
||||||
var response = await done.Task;
|
|
||||||
|
|
||||||
var summary = response.Length > 200
|
|
||||||
? response.Substring(0, 200) + "..."
|
|
||||||
: response;
|
|
||||||
Console.WriteLine($"Response: {summary}");
|
|
||||||
|
|
||||||
if (response.Contains(_completionPromise))
|
|
||||||
{
|
|
||||||
Console.WriteLine($"\n✓ Completion promise detected: '{_completionPromise}'");
|
|
||||||
return response;
|
|
||||||
}
|
|
||||||
|
|
||||||
Console.WriteLine($"Iteration {_iteration} complete. Continuing...");
|
|
||||||
}
|
|
||||||
|
|
||||||
throw new InvalidOperationException(
|
|
||||||
$"Max iterations ({_maxIterations}) reached without completion promise: '{_completionPromise}'");
|
|
||||||
}
|
|
||||||
finally
|
|
||||||
{
|
|
||||||
await session.DisposeAsync();
|
|
||||||
}
|
|
||||||
}
|
}
|
||||||
finally
|
finally
|
||||||
{
|
{
|
||||||
await _client.StopAsync();
|
await session.DisposeAsync();
|
||||||
}
|
}
|
||||||
|
|
||||||
|
// Push changes after each iteration
|
||||||
|
try
|
||||||
|
{
|
||||||
|
Process.Start("git", $"push origin {branch}")!.WaitForExit();
|
||||||
|
}
|
||||||
|
catch
|
||||||
|
{
|
||||||
|
Process.Start("git", $"push -u origin {branch}")!.WaitForExit();
|
||||||
|
}
|
||||||
|
|
||||||
|
Console.WriteLine($"\nIteration {i} complete.");
|
||||||
}
|
}
|
||||||
|
|
||||||
private string BuildIterationPrompt(string initialPrompt)
|
Console.WriteLine($"\nReached max iterations: {maxIterations}");
|
||||||
{
|
}
|
||||||
if (_iteration == 1)
|
finally
|
||||||
return initialPrompt;
|
{
|
||||||
|
await client.StopAsync();
|
||||||
var sb = new StringBuilder();
|
|
||||||
sb.AppendLine(initialPrompt);
|
|
||||||
sb.AppendLine();
|
|
||||||
sb.AppendLine("=== CONTEXT FROM PREVIOUS ITERATION ===");
|
|
||||||
sb.AppendLine(_lastResponse);
|
|
||||||
sb.AppendLine("=== END CONTEXT ===");
|
|
||||||
sb.AppendLine();
|
|
||||||
sb.AppendLine("Continue working on this task. Review the previous attempt and improve upon it.");
|
|
||||||
return sb.ToString();
|
|
||||||
}
|
|
||||||
}
|
}
|
||||||
|
|||||||
@@ -1,6 +1,6 @@
|
|||||||
# RALPH-loop: Iterative Self-Referential AI Loops
|
# Ralph Loop: Autonomous AI Task Loops
|
||||||
|
|
||||||
Implement self-referential feedback loops where an AI agent iteratively improves work by reading its own previous output.
|
Build autonomous coding loops where an AI agent picks tasks, implements them, validates against backpressure (tests, builds), commits, and repeats — each iteration in a fresh context window.
|
||||||
|
|
||||||
> **Runnable example:** [recipe/ralph-loop.go](recipe/ralph-loop.go)
|
> **Runnable example:** [recipe/ralph-loop.go](recipe/ralph-loop.go)
|
||||||
>
|
>
|
||||||
@@ -9,27 +9,37 @@ Implement self-referential feedback loops where an AI agent iteratively improves
|
|||||||
> go run recipe/ralph-loop.go
|
> go run recipe/ralph-loop.go
|
||||||
> ```
|
> ```
|
||||||
|
|
||||||
## What is RALPH-loop?
|
## What is a Ralph Loop?
|
||||||
|
|
||||||
RALPH-loop is a development methodology for iterative AI-powered task completion. Named after the Ralph Wiggum technique, it embodies the philosophy of persistent iteration:
|
A [Ralph loop](https://ghuntley.com/ralph/) is an autonomous development workflow where an AI agent iterates through tasks in isolated context windows. The key insight: **state lives on disk, not in the model's context**. Each iteration starts fresh, reads the current state from files, does one task, writes results back to disk, and exits.
|
||||||
|
|
||||||
- **One prompt, multiple iterations**: The same prompt is processed repeatedly
|
```
|
||||||
- **Self-referential feedback**: The AI reads its own previous work (file changes, git history)
|
┌─────────────────────────────────────────────────┐
|
||||||
- **Completion detection**: Loop exits when a completion promise is detected in output
|
│ loop.sh │
|
||||||
- **Safety limits**: Always include a maximum iteration count to prevent infinite loops
|
│ while true: │
|
||||||
|
│ ┌─────────────────────────────────────────┐ │
|
||||||
|
│ │ Fresh session (isolated context) │ │
|
||||||
|
│ │ │ │
|
||||||
|
│ │ 1. Read PROMPT.md + AGENTS.md │ │
|
||||||
|
│ │ 2. Study specs/* and code │ │
|
||||||
|
│ │ 3. Pick next task from plan │ │
|
||||||
|
│ │ 4. Implement + run tests │ │
|
||||||
|
│ │ 5. Update plan, commit, exit │ │
|
||||||
|
│ └─────────────────────────────────────────┘ │
|
||||||
|
│ ↻ next iteration (fresh context) │
|
||||||
|
└─────────────────────────────────────────────────┘
|
||||||
|
```
|
||||||
|
|
||||||
## Example Scenario
|
**Core principles:**
|
||||||
|
|
||||||
You need to iteratively improve code until all tests pass. Instead of asking Copilot to "write perfect code," you use RALPH-loop to:
|
- **Fresh context per iteration**: Each loop creates a new session — no context accumulation, always in the "smart zone"
|
||||||
|
- **Disk as shared state**: `IMPLEMENTATION_PLAN.md` persists between iterations and acts as the coordination mechanism
|
||||||
|
- **Backpressure steers quality**: Tests, builds, and lints reject bad work — the agent must fix issues before committing
|
||||||
|
- **Two modes**: PLANNING (gap analysis → generate plan) and BUILDING (implement from plan)
|
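The "disk as shared state" principle can be sketched without the SDK. The helper below is hypothetical and not part of the recipe: it assumes every line in `IMPLEMENTATION_PLAN.md` that starts with `- ` or `* ` is one open task, which matches the "prioritized bullet-point list" the planning prompt asks for but is otherwise an assumption about the plan's format.

```go
package main

import (
	"fmt"
	"strings"
)

// remainingTasks returns the bullet items found in a plan document.
// Assumption (not from the recipe): each line beginning with "- " or "* "
// is exactly one open task; all other lines are ignored.
func remainingTasks(plan string) []string {
	var tasks []string
	for _, line := range strings.Split(plan, "\n") {
		trimmed := strings.TrimSpace(line)
		if strings.HasPrefix(trimmed, "- ") || strings.HasPrefix(trimmed, "* ") {
			tasks = append(tasks, strings.TrimSpace(trimmed[2:]))
		}
	}
	return tasks
}

func main() {
	// An in-memory plan keeps the example self-contained; a real loop
	// would read IMPLEMENTATION_PLAN.md with os.ReadFile instead.
	plan := "# Plan\n\n- Add login endpoint\n- Validate CSV headers\n\nNotes: none\n"
	tasks := remainingTasks(plan)
	fmt.Printf("%d tasks remaining\n", len(tasks))
	for i, t := range tasks {
		fmt.Printf("%d. %s\n", i+1, t)
	}
}
```

An outer loop could call a helper like this between iterations and stop early once the plan is empty, rather than always running to the iteration cap.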
||||||
|
|
||||||
1. Send the initial prompt with clear success criteria
|
## Simple Version
|
||||||
2. Copilot writes code and tests
|
|
||||||
3. Copilot runs tests and sees failures
|
|
||||||
4. Loop automatically re-sends the prompt
|
|
||||||
5. Copilot reads test output and previous code, fixes issues
|
|
||||||
6. Repeat until all tests pass and completion promise is output
|
|
||||||
|
|
||||||
## Basic Implementation
|
The minimal Ralph loop — the SDK equivalent of `while :; do cat PROMPT.md | claude ; done`:
|
||||||
|
|
||||||
```go
|
```go
|
||||||
package main
|
package main
|
||||||
@@ -38,81 +48,59 @@ import (
|
|||||||
"context"
|
"context"
|
||||||
"fmt"
|
"fmt"
|
||||||
"log"
|
"log"
|
||||||
"strings"
|
"os"
|
||||||
|
|
||||||
copilot "github.com/github/copilot-sdk/go"
|
copilot "github.com/github/copilot-sdk/go"
|
||||||
)
|
)
|
||||||
|
|
||||||
type RalphLoop struct {
|
func ralphLoop(ctx context.Context, promptFile string, maxIterations int) error {
|
||||||
client *copilot.Client
|
client := copilot.NewClient(nil)
|
||||||
iteration int
|
if err := client.Start(ctx); err != nil {
|
||||||
maxIterations int
|
return err
|
||||||
completionPromise string
|
|
||||||
LastResponse string
|
|
||||||
}
|
|
||||||
|
|
||||||
func NewRalphLoop(maxIterations int, completionPromise string) *RalphLoop {
|
|
||||||
return &RalphLoop{
|
|
||||||
client: copilot.NewClient(nil),
|
|
||||||
maxIterations: maxIterations,
|
|
||||||
completionPromise: completionPromise,
|
|
||||||
}
|
}
|
||||||
}
|
defer client.Stop()
|
||||||
|
|
||||||
func (r *RalphLoop) Run(ctx context.Context, initialPrompt string) (string, error) {
|
prompt, err := os.ReadFile(promptFile)
|
||||||
if err := r.client.Start(ctx); err != nil {
|
|
||||||
return "", err
|
|
||||||
}
|
|
||||||
defer r.client.Stop()
|
|
||||||
|
|
||||||
session, err := r.client.CreateSession(ctx, &copilot.SessionConfig{
|
|
||||||
Model: "gpt-5.1-codex-mini",
|
|
||||||
})
|
|
||||||
if err != nil {
|
if err != nil {
|
||||||
return "", err
|
return err
|
||||||
}
|
}
|
||||||
defer session.Destroy()
|
|
||||||
|
|
||||||
for r.iteration < r.maxIterations {
|
for i := 1; i <= maxIterations; i++ {
|
||||||
r.iteration++
|
fmt.Printf("\n=== Iteration %d/%d ===\n", i, maxIterations)
|
||||||
fmt.Printf("\n--- Iteration %d/%d ---\n", r.iteration, r.maxIterations)
|
|
||||||
|
|
||||||
prompt := r.buildIterationPrompt(initialPrompt)
|
// Fresh session each iteration — context isolation is the point
|
||||||
|
session, err := client.CreateSession(ctx, &copilot.SessionConfig{
|
||||||
result, err := session.SendAndWait(ctx, copilot.MessageOptions{Prompt: prompt})
|
Model: "claude-sonnet-4.5",
|
||||||
|
})
|
||||||
if err != nil {
|
if err != nil {
|
||||||
return "", err
|
return err
|
||||||
}
|
}
|
||||||
|
|
||||||
if result != nil && result.Data.Content != nil {
|
_, err = session.SendAndWait(ctx, copilot.MessageOptions{
|
||||||
r.LastResponse = *result.Data.Content
|
Prompt: string(prompt),
|
||||||
|
})
|
||||||
|
session.Destroy()
|
||||||
|
if err != nil {
|
||||||
|
return err
|
||||||
}
|
}
|
||||||
|
|
||||||
if strings.Contains(r.LastResponse, r.completionPromise) {
|
fmt.Printf("Iteration %d complete.\n", i)
|
||||||
fmt.Printf("✓ Completion promise detected: %s\n", r.completionPromise)
|
|
||||||
return r.LastResponse, nil
|
|
||||||
}
|
|
||||||
}
|
}
|
||||||
|
return nil
|
||||||
return "", fmt.Errorf("max iterations (%d) reached without completion promise",
|
|
||||||
r.maxIterations)
|
|
||||||
}
|
}
|
||||||
|
|
||||||
// Usage
|
|
||||||
func main() {
|
func main() {
|
||||||
ctx := context.Background()
|
if err := ralphLoop(context.Background(), "PROMPT.md", 20); err != nil {
|
||||||
loop := NewRalphLoop(5, "COMPLETE")
|
|
||||||
result, err := loop.Run(ctx, "Your task here")
|
|
||||||
if err != nil {
|
|
||||||
log.Fatal(err)
|
log.Fatal(err)
|
||||||
}
|
}
|
||||||
fmt.Println(result)
|
|
||||||
}
|
}
|
||||||
```
|
```
|
||||||
|
|
||||||
## With File Persistence
|
This is all you need to get started. The prompt file tells the agent what to do; the agent reads project files, does work, commits, and exits. The loop restarts with a clean slate.
|
||||||
|
|
||||||
For tasks involving code generation, persist state to files so the AI can see changes:
|
## Ideal Version
|
||||||
|
|
||||||
|
The full Ralph pattern with planning and building modes, matching the [Ralph Playbook](https://github.com/ClaytonFarr/ralph-playbook) architecture:
|
||||||
|
|
||||||
```go
|
```go
|
||||||
package main
|
package main
|
||||||
@@ -120,121 +108,178 @@ package main
|
|||||||
import (
|
import (
|
||||||
"context"
|
"context"
|
||||||
"fmt"
|
"fmt"
|
||||||
|
"log"
|
||||||
"os"
|
"os"
|
||||||
"path/filepath"
|
"os/exec"
|
||||||
|
"strconv"
|
||||||
"strings"
|
"strings"
|
||||||
|
|
||||||
copilot "github.com/github/copilot-sdk/go"
|
copilot "github.com/github/copilot-sdk/go"
|
||||||
)
|
)
|
||||||
|
|
||||||
type PersistentRalphLoop struct {
|
func ralphLoop(ctx context.Context, mode string, maxIterations int) error {
|
||||||
client *copilot.Client
|
promptFile := "PROMPT_build.md"
|
||||||
workDir string
|
if mode == "plan" {
|
||||||
iteration int
|
promptFile = "PROMPT_plan.md"
|
||||||
maxIterations int
|
|
||||||
}
|
|
||||||
|
|
||||||
func NewPersistentRalphLoop(workDir string, maxIterations int) *PersistentRalphLoop {
|
|
||||||
os.MkdirAll(workDir, 0755)
|
|
||||||
return &PersistentRalphLoop{
|
|
||||||
client: copilot.NewClient(nil),
|
|
||||||
workDir: workDir,
|
|
||||||
maxIterations: maxIterations,
|
|
||||||
}
|
}
|
||||||
}
|
|
||||||
|
|
||||||
func (p *PersistentRalphLoop) Run(ctx context.Context, initialPrompt string) (string, error) {
|
client := copilot.NewClient(nil)
|
||||||
if err := p.client.Start(ctx); err != nil {
|
if err := client.Start(ctx); err != nil {
|
||||||
return "", err
|
return err
|
||||||
}
|
}
|
||||||
defer p.client.Stop()
|
defer client.Stop()
|
||||||
|
|
||||||
os.WriteFile(filepath.Join(p.workDir, "prompt.md"), []byte(initialPrompt), 0644)
|
branchOut, _ := exec.Command("git", "branch", "--show-current").Output()
|
||||||
|
branch := strings.TrimSpace(string(branchOut))
|
||||||
|
|
||||||
session, err := p.client.CreateSession(ctx, &copilot.SessionConfig{
|
fmt.Println(strings.Repeat("━", 40))
|
||||||
Model: "gpt-5.1-codex-mini",
|
fmt.Printf("Mode: %s\n", mode)
|
||||||
})
|
fmt.Printf("Prompt: %s\n", promptFile)
|
||||||
|
fmt.Printf("Branch: %s\n", branch)
|
||||||
|
fmt.Printf("Max: %d iterations\n", maxIterations)
|
||||||
|
fmt.Println(strings.Repeat("━", 40))
|
||||||
|
|
||||||
|
prompt, err := os.ReadFile(promptFile)
|
||||||
if err != nil {
|
if err != nil {
|
||||||
return "", err
|
return err
|
||||||
}
|
}
|
||||||
defer session.Destroy()
|
|
||||||
|
|
||||||
for p.iteration < p.maxIterations {
|
for i := 1; i <= maxIterations; i++ {
|
||||||
p.iteration++
|
fmt.Printf("\n=== Iteration %d/%d ===\n", i, maxIterations)
|
||||||
|
|
||||||
prompt := initialPrompt
|
// Fresh session — each task gets full context budget
|
||||||
prevFile := filepath.Join(p.workDir, fmt.Sprintf("output-%d.txt", p.iteration-1))
|
session, err := client.CreateSession(ctx, &copilot.SessionConfig{
|
||||||
if data, err := os.ReadFile(prevFile); err == nil {
|
Model: "claude-sonnet-4.5",
|
||||||
prompt = fmt.Sprintf("%s\n\nPrevious iteration:\n%s", initialPrompt, string(data))
|
})
|
||||||
}
|
|
||||||
|
|
||||||
result, err := session.SendAndWait(ctx, copilot.MessageOptions{Prompt: prompt})
|
|
||||||
if err != nil {
|
if err != nil {
|
||||||
return "", err
|
return err
|
||||||
}
|
}
|
||||||
|
|
||||||
response := ""
|
_, err = session.SendAndWait(ctx, copilot.MessageOptions{
|
||||||
if result != nil && result.Data.Content != nil {
|
Prompt: string(prompt),
|
||||||
response = *result.Data.Content
|
})
|
||||||
|
session.Destroy()
|
||||||
|
if err != nil {
|
||||||
|
return err
|
||||||
}
|
}
|
||||||
|
|
||||||
os.WriteFile(filepath.Join(p.workDir, fmt.Sprintf("output-%d.txt", p.iteration)),
|
// Push changes after each iteration
|
||||||
[]byte(response), 0644)
|
if err := exec.Command("git", "push", "origin", branch).Run(); err != nil {
|
||||||
|
exec.Command("git", "push", "-u", "origin", branch).Run()
|
||||||
|
}
|
||||||
|
|
||||||
if strings.Contains(response, "COMPLETE") {
|
fmt.Printf("\nIteration %d complete.\n", i)
|
||||||
return response, nil
|
}
|
||||||
|
|
||||||
|
fmt.Printf("\nReached max iterations: %d\n", maxIterations)
|
||||||
|
return nil
|
||||||
|
}
|
||||||
|
|
||||||
|
func main() {
|
||||||
|
mode := "build"
|
||||||
|
maxIterations := 50
|
||||||
|
|
||||||
|
for _, arg := range os.Args[1:] {
|
||||||
|
if arg == "plan" {
|
||||||
|
mode = "plan"
|
||||||
|
} else if n, err := strconv.Atoi(arg); err == nil {
|
||||||
|
maxIterations = n
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
return "", fmt.Errorf("max iterations reached")
|
if err := ralphLoop(context.Background(), mode, maxIterations); err != nil {
|
||||||
|
log.Fatal(err)
|
||||||
|
}
|
||||||
}
|
}
|
||||||
```
|
```
|
||||||
|
|
||||||
|
### Required Project Files
|
||||||
|
|
||||||
|
The ideal version expects this file structure in your project:
|
||||||
|
|
||||||
|
```
|
||||||
|
project-root/
|
||||||
|
├── PROMPT_plan.md # Planning mode instructions
|
||||||
|
├── PROMPT_build.md # Building mode instructions
|
||||||
|
├── AGENTS.md # Operational guide (build/test commands)
|
||||||
|
├── IMPLEMENTATION_PLAN.md # Task list (generated by planning mode)
|
||||||
|
├── specs/ # Requirement specs (one per topic)
|
||||||
|
│ ├── auth.md
|
||||||
|
│ └── data-pipeline.md
|
||||||
|
└── src/ # Your source code
|
||||||
|
```
|
||||||
|
|
||||||
|
### Example `PROMPT_plan.md`
|
||||||
|
|
||||||
|
```markdown
|
||||||
|
0a. Study `specs/*` to learn the application specifications.
|
||||||
|
0b. Study IMPLEMENTATION_PLAN.md (if present) to understand the plan so far.
|
||||||
|
0c. Study `src/` to understand existing code and shared utilities.
|
||||||
|
|
||||||
|
1. Compare specs against code (gap analysis). Create or update
|
||||||
|
IMPLEMENTATION_PLAN.md as a prioritized bullet-point list of tasks
|
||||||
|
yet to be implemented. Do NOT implement anything.
|
||||||
|
|
||||||
|
IMPORTANT: Do NOT assume functionality is missing — search the
|
||||||
|
codebase first to confirm. Prefer updating existing utilities over
|
||||||
|
creating ad-hoc copies.
|
||||||
|
```
|
||||||
|
|
||||||
|
### Example `PROMPT_build.md`
|
||||||
|
|
||||||
|
```markdown
|
||||||
|
0a. Study `specs/*` to learn the application specifications.
|
||||||
|
0b. Study IMPLEMENTATION_PLAN.md.
|
||||||
|
0c. Study `src/` for reference.
|
||||||
|
|
||||||
|
1. Choose the most important item from IMPLEMENTATION_PLAN.md. Before
|
||||||
|
making changes, search the codebase (don't assume not implemented).
|
||||||
|
2. After implementing, run the tests. If functionality is missing, add it.
|
||||||
|
3. When you discover issues, update IMPLEMENTATION_PLAN.md immediately.
|
||||||
|
4. When tests pass, update IMPLEMENTATION_PLAN.md, then `git add -A`
|
||||||
|
then `git commit` with a descriptive message.
|
||||||
|
|
||||||
|
99999. When authoring documentation, capture the why.
|
||||||
|
999999. Implement completely. No placeholders or stubs.
|
||||||
|
9999999. Keep IMPLEMENTATION_PLAN.md current — future iterations depend on it.
|
||||||
|
```
|
||||||
|
|
||||||
|
### Example `AGENTS.md`
|
||||||
|
|
||||||
|
Keep this brief (~60 lines). It's loaded every iteration, so bloat wastes context.
|
||||||
|
|
||||||
|
```markdown
|
||||||
|
## Build & Run
|
||||||
|
|
||||||
|
go build ./...
|
||||||
|
|
||||||
|
## Validation
|
||||||
|
|
||||||
|
- Tests: `go test ./...`
|
||||||
|
- Vet: `go vet ./...`
|
||||||
|
```
|
||||||
|
|
||||||
## Best Practices
|
## Best Practices
|
||||||
|
|
||||||
1. **Write clear completion criteria**: Include exactly what "done" looks like
|
1. **Fresh context per iteration**: Never accumulate context across iterations — that's the whole point
|
||||||
2. **Use output markers**: Include `<promise>COMPLETE</promise>` or similar in completion condition
|
2. **Disk is your database**: `IMPLEMENTATION_PLAN.md` is shared state between isolated sessions
|
||||||
3. **Always set max iterations**: Prevents infinite loops on impossible tasks
|
3. **Backpressure is essential**: Tests, builds, lints in `AGENTS.md` — the agent must pass them before committing
|
||||||
4. **Persist state**: Save files so AI can see what changed between iterations
|
4. **Start with PLANNING mode**: Generate the plan first, then switch to BUILDING
|
||||||
5. **Include context**: Feed previous iteration output back as context
|
5. **Observe and tune**: Watch early iterations, add guardrails to prompts when the agent fails in specific ways
|
||||||
6. **Monitor progress**: Log each iteration to track what's happening
|
6. **The plan is disposable**: If the agent goes off track, delete `IMPLEMENTATION_PLAN.md` and re-plan
|
||||||
|
7. **Keep `AGENTS.md` brief**: It's loaded every iteration — operational info only, no progress notes
|
||||||
|
8. **Use a sandbox**: The agent runs autonomously with full tool access — isolate it
|
||||||
|
|
||||||
## Example: Iterative Code Generation
|
## When to Use a Ralph Loop
|
||||||
|
|
||||||
```go
|
|
||||||
prompt := `Write a function that:
|
|
||||||
1. Parses CSV data
|
|
||||||
2. Validates required fields
|
|
||||||
3. Returns parsed records or error
|
|
||||||
4. Has unit tests
|
|
||||||
5. Output <promise>COMPLETE</promise> when done`
|
|
||||||
|
|
||||||
loop := NewRalphLoop(10, "COMPLETE")
|
|
||||||
result, err := loop.Run(context.Background(), prompt)
|
|
||||||
```
|
|
||||||
|
|
||||||
## Handling Failures
|
|
||||||
|
|
||||||
```go
|
|
||||||
ctx := context.Background()
|
|
||||||
loop := NewRalphLoop(5, "COMPLETE")
|
|
||||||
result, err := loop.Run(ctx, prompt)
|
|
||||||
if err != nil {
|
|
||||||
log.Printf("Task failed: %v", err)
|
|
||||||
log.Printf("Last attempt: %s", loop.LastResponse)
|
|
||||||
}
|
|
||||||
```
|
|
||||||
|
|
||||||
## When to Use RALPH-loop
|
|
||||||
|
|
||||||
**Good for:**
|
**Good for:**
|
||||||
- Code generation with automatic verification (tests, linters)
|
- Implementing features from specs with test-driven validation
|
||||||
- Tasks with clear success criteria
|
- Large refactors broken into many small tasks
|
||||||
- Iterative refinement where each attempt learns from previous failures
|
- Unattended, long-running development with clear requirements
|
||||||
- Unattended long-running improvements
|
- Any work where backpressure (tests/builds) can verify correctness
|
||||||
|
|
||||||
**Not good for:**
|
**Not good for:**
|
||||||
- Tasks requiring human judgment or design input
|
- Tasks requiring human judgment mid-loop
|
||||||
- One-shot operations
|
- One-shot operations that don't benefit from iteration
|
||||||
- Tasks with vague success criteria
|
- Vague requirements without testable acceptance criteria
|
||||||
- Real-time interactive debugging
|
- Exploratory prototyping where direction isn't clear
|
||||||
|
|||||||
@@ -4,127 +4,101 @@ import (
|
|||||||
"context"
|
"context"
|
||||||
"fmt"
|
"fmt"
|
||||||
"log"
|
"log"
|
||||||
|
"os"
|
||||||
|
"os/exec"
|
||||||
|
"strconv"
|
||||||
"strings"
|
"strings"
|
||||||
|
|
||||||
copilot "github.com/github/copilot-sdk/go"
|
copilot "github.com/github/copilot-sdk/go"
|
||||||
)
|
)
|
||||||
|
|
||||||
// RalphLoop implements iterative self-referential feedback loops.
|
// Ralph loop: autonomous AI task loop with fresh context per iteration.
|
||||||
// The same prompt is sent repeatedly, with AI reading its own previous output.
|
//
|
||||||
// Loop continues until completion promise is detected in the response.
|
// Two modes:
|
||||||
type RalphLoop struct {
|
// - "plan": reads PROMPT_plan.md, generates/updates IMPLEMENTATION_PLAN.md
|
||||||
client *copilot.Client
|
// - "build": reads PROMPT_build.md, implements tasks, runs tests, commits
|
||||||
iteration int
|
//
|
||||||
maxIterations int
|
// Each iteration creates a fresh session so the agent always operates in
|
||||||
completionPromise string
|
// the "smart zone" of its context window. State is shared between
|
||||||
LastResponse string
|
// iterations via files on disk (IMPLEMENTATION_PLAN.md, AGENTS.md, specs/*).
|
||||||
}
|
//
|
||||||
|
// Usage:
|
||||||
|
// go run ralph-loop.go # build mode, 50 iterations
|
||||||
|
// go run ralph-loop.go plan # planning mode
|
||||||
|
// go run ralph-loop.go 20 # build mode, 20 iterations
|
||||||
|
// go run ralph-loop.go plan 5 # planning mode, 5 iterations
|
||||||
|
|
||||||
// NewRalphLoop creates a new RALPH-loop instance.
|
func ralphLoop(ctx context.Context, mode string, maxIterations int) error {
|
||||||
func NewRalphLoop(maxIterations int, completionPromise string) *RalphLoop {
|
promptFile := "PROMPT_build.md"
|
||||||
return &RalphLoop{
|
if mode == "plan" {
|
||||||
client: copilot.NewClient(nil),
|
promptFile = "PROMPT_plan.md"
|
||||||
maxIterations: maxIterations,
|
|
||||||
completionPromise: completionPromise,
|
|
||||||
}
|
}
|
||||||
}
|
|
||||||
|
|
||||||
// Run executes the RALPH-loop until completion promise is detected or max iterations reached.
|
client := copilot.NewClient(nil)
|
||||||
func (r *RalphLoop) Run(ctx context.Context, initialPrompt string) (string, error) {
|
if err := client.Start(ctx); err != nil {
|
||||||
if err := r.client.Start(ctx); err != nil {
|
return fmt.Errorf("failed to start client: %w", err)
|
||||||
return "", fmt.Errorf("failed to start client: %w", err)
|
|
||||||
}
|
}
|
||||||
defer r.client.Stop()
|
defer client.Stop()
|
||||||
|
|
||||||
session, err := r.client.CreateSession(ctx, &copilot.SessionConfig{
|
branchOut, _ := exec.Command("git", "branch", "--show-current").Output()
|
||||||
Model: "gpt-5.1-codex-mini",
|
branch := strings.TrimSpace(string(branchOut))
|
||||||
})
|
|
||||||
|
fmt.Println(strings.Repeat("━", 40))
|
||||||
|
fmt.Printf("Mode: %s\n", mode)
|
||||||
|
fmt.Printf("Prompt: %s\n", promptFile)
|
||||||
|
fmt.Printf("Branch: %s\n", branch)
|
||||||
|
fmt.Printf("Max: %d iterations\n", maxIterations)
|
||||||
|
fmt.Println(strings.Repeat("━", 40))
|
||||||
|
|
||||||
|
prompt, err := os.ReadFile(promptFile)
|
||||||
if err != nil {
|
if err != nil {
|
||||||
return "", fmt.Errorf("failed to create session: %w", err)
|
return fmt.Errorf("failed to read %s: %w", promptFile, err)
|
||||||
}
|
}
|
||||||
defer session.Destroy()
|
|
||||||
|
|
||||||
for r.iteration < r.maxIterations {
|
for i := 1; i <= maxIterations; i++ {
|
||||||
r.iteration++
|
fmt.Printf("\n=== Iteration %d/%d ===\n", i, maxIterations)
|
||||||
fmt.Printf("\n=== Iteration %d/%d ===\n", r.iteration, r.maxIterations)
|
|
||||||
|
|
||||||
currentPrompt := r.buildIterationPrompt(initialPrompt)
|
// Fresh session — each task gets full context budget
|
||||||
fmt.Printf("Sending prompt (length: %d)...\n", len(currentPrompt))
|
session, err := client.CreateSession(ctx, &copilot.SessionConfig{
|
||||||
|
Model: "claude-sonnet-4.5",
|
||||||
result, err := session.SendAndWait(ctx, copilot.MessageOptions{
|
|
||||||
Prompt: currentPrompt,
|
|
||||||
})
|
})
|
||||||
if err != nil {
|
if err != nil {
|
||||||
return "", fmt.Errorf("send failed on iteration %d: %w", r.iteration, err)
|
return fmt.Errorf("failed to create session: %w", err)
|
||||||
}
|
}
|
||||||
|
|
||||||
if result != nil && result.Data.Content != nil {
|
_, err = session.SendAndWait(ctx, copilot.MessageOptions{
|
||||||
r.LastResponse = *result.Data.Content
|
Prompt: string(prompt),
|
||||||
} else {
|
})
|
||||||
r.LastResponse = ""
|
session.Destroy()
|
||||||
|
if err != nil {
|
||||||
|
return fmt.Errorf("send failed on iteration %d: %w", i, err)
|
||||||
}
|
}
|
||||||
|
|
||||||
// Display response summary
|
// Push changes after each iteration
|
||||||
summary := r.LastResponse
|
if err := exec.Command("git", "push", "origin", branch).Run(); err != nil {
|
||||||
if len(summary) > 200 {
|
exec.Command("git", "push", "-u", "origin", branch).Run()
|
||||||
summary = summary[:200] + "..."
|
|
||||||
}
|
|
||||||
fmt.Printf("Response: %s\n", summary)
|
|
||||||
|
|
||||||
// Check for completion promise
|
|
||||||
if strings.Contains(r.LastResponse, r.completionPromise) {
|
|
||||||
fmt.Printf("\n✓ Success! Completion promise detected: '%s'\n", r.completionPromise)
|
|
||||||
return r.LastResponse, nil
|
|
||||||
}
|
}
|
||||||
|
|
||||||
fmt.Printf("Iteration %d complete. Continuing...\n", r.iteration)
|
fmt.Printf("\nIteration %d complete.\n", i)
|
||||||
}
|
}
|
||||||
|
|
||||||
return "", fmt.Errorf("maximum iterations (%d) reached without detecting completion promise: '%s'",
|
fmt.Printf("\nReached max iterations: %d\n", maxIterations)
|
||||||
r.maxIterations, r.completionPromise)
|
return nil
|
||||||
}
|
|
||||||
|
|
||||||
func (r *RalphLoop) buildIterationPrompt(initialPrompt string) string {
|
|
||||||
if r.iteration == 1 {
|
|
||||||
return initialPrompt
|
|
||||||
}
|
|
||||||
|
|
||||||
return fmt.Sprintf(`%s
|
|
||||||
|
|
||||||
=== CONTEXT FROM PREVIOUS ITERATION ===
|
|
||||||
%s
|
|
||||||
=== END CONTEXT ===
|
|
||||||
|
|
||||||
Continue working on this task. Review the previous attempt and improve upon it.`,
|
|
||||||
initialPrompt, r.LastResponse)
|
|
||||||
}
|
}
|
||||||
|
|
||||||
func main() {
|
func main() {
|
||||||
prompt := `You are iteratively building a small library. Follow these phases IN ORDER.
|
mode := "build"
|
||||||
Do NOT skip ahead — only do the current phase, then stop and wait for the next iteration.
|
maxIterations := 50
|
||||||
|
|
||||||
Phase 1: Design a DataValidator struct that validates records against a schema.
|
for _, arg := range os.Args[1:] {
|
||||||
- Schema defines field names, types (string, int, float, bool), and whether required.
|
if arg == "plan" {
|
||||||
- Return a slice of validation errors per record.
|
mode = "plan"
|
||||||
- Show the struct and method code only. Do NOT output COMPLETE.
|
} else if n, err := strconv.Atoi(arg); err == nil {
|
||||||
|
maxIterations = n
|
||||||
Phase 2: Write at least 4 unit tests covering: missing required field, wrong type,
|
}
|
||||||
valid record, and empty input. Show test code only. Do NOT output COMPLETE.
|
|
||||||
|
|
||||||
Phase 3: Review the code from phases 1 and 2. Fix any bugs, add doc comments, and add
|
|
||||||
an extra edge-case test. Show the final consolidated code with all fixes.
|
|
||||||
When this phase is fully done, output the exact text: COMPLETE`
|
|
||||||
|
|
||||||
ctx := context.Background()
|
|
||||||
loop := NewRalphLoop(5, "COMPLETE")
|
|
||||||
|
|
||||||
result, err := loop.Run(ctx, prompt)
|
|
||||||
if err != nil {
|
|
||||||
log.Printf("Task did not complete: %v", err)
|
|
||||||
log.Printf("Last attempt: %s", loop.LastResponse)
|
|
||||||
return
|
|
||||||
}
|
}
|
||||||
|
|
||||||
fmt.Println("\n=== FINAL RESULT ===")
|
if err := ralphLoop(context.Background(), mode, maxIterations); err != nil {
|
||||||
fmt.Println(result)
|
log.Fatal(err)
|
||||||
|
}
|
||||||
}
|
}
|
||||||
|
|||||||
@@ -1,6 +1,6 @@
|
|||||||
# RALPH-loop: Iterative Self-Referential AI Loops
|
# Ralph Loop: Autonomous AI Task Loops
|
||||||
|
|
||||||
Implement self-referential feedback loops where an AI agent iteratively improves work by reading its own previous output.
|
Build autonomous coding loops where an AI agent picks tasks, implements them, validates against backpressure (tests, builds), commits, and repeats — each iteration in a fresh context window.
|
||||||
|
|
||||||
> **Runnable example:** [recipe/ralph-loop.ts](recipe/ralph-loop.ts)
|
> **Runnable example:** [recipe/ralph-loop.ts](recipe/ralph-loop.ts)
|
||||||
>
|
>
|
||||||
@@ -9,200 +9,217 @@ Implement self-referential feedback loops where an AI agent iteratively improves
|
|||||||
> npx tsx ralph-loop.ts
|
> npx tsx ralph-loop.ts
|
||||||
> ```
|
> ```
|
||||||
|
|
||||||
## What is RALPH-loop?
|
## What is a Ralph Loop?
|
||||||
|
|
||||||
RALPH-loop is a development methodology for iterative AI-powered task completion. Named after the Ralph Wiggum technique, it embodies the philosophy of persistent iteration:
|
A [Ralph loop](https://ghuntley.com/ralph/) is an autonomous development workflow where an AI agent iterates through tasks in isolated context windows. The key insight: **state lives on disk, not in the model's context**. Each iteration starts fresh, reads the current state from files, does one task, writes results back to disk, and exits.
|
||||||
|
|
||||||
- **One prompt, multiple iterations**: The same prompt is processed repeatedly
|
```
|
||||||
- **Self-referential feedback**: The AI reads its own previous work (file changes, git history)
|
┌─────────────────────────────────────────────────┐
|
||||||
- **Completion detection**: Loop exits when a completion promise is detected in output
|
│ loop.sh │
|
||||||
- **Safety limits**: Always include a maximum iteration count to prevent infinite loops
|
│ while true: │
|
||||||
|
│ ┌─────────────────────────────────────────┐ │
|
||||||
## Example Scenario
|
│ │ Fresh session (isolated context) │ │
|
||||||
|
│ │ │ │
|
||||||
You need to iteratively improve code until all tests pass. Instead of asking Copilot to "write perfect code," you use RALPH-loop to:
|
│ │ 1. Read PROMPT.md + AGENTS.md │ │
|
||||||
|
│ │ 2. Study specs/* and code │ │
|
||||||
1. Send the initial prompt with clear success criteria
|
│ │ 3. Pick next task from plan │ │
|
||||||
2. Copilot writes code and tests
|
│ │ 4. Implement + run tests │ │
|
||||||
3. Copilot runs tests and sees failures
|
│ │ 5. Update plan, commit, exit │ │
|
||||||
4. Loop automatically re-sends the prompt
|
│ └─────────────────────────────────────────┘ │
|
||||||
5. Copilot reads test output and previous code, fixes issues
|
│ ↻ next iteration (fresh context) │
|
||||||
6. Repeat until all tests pass and completion promise is output
|
└─────────────────────────────────────────────────┘
|
||||||
|
|
||||||
## Basic Implementation
|
|
||||||
|
|
||||||
```typescript
|
|
||||||
import { CopilotClient } from "@github/copilot-sdk";
|
|
||||||
|
|
||||||
class RalphLoop {
|
|
||||||
private client: CopilotClient;
|
|
||||||
private iteration: number = 0;
|
|
||||||
private maxIterations: number;
|
|
||||||
private completionPromise: string;
|
|
||||||
private lastResponse: string | null = null;
|
|
||||||
|
|
||||||
constructor(maxIterations: number = 10, completionPromise: string = "COMPLETE") {
|
|
||||||
this.client = new CopilotClient();
|
|
||||||
this.maxIterations = maxIterations;
|
|
||||||
this.completionPromise = completionPromise;
|
|
||||||
}
|
|
||||||
|
|
||||||
async run(initialPrompt: string): Promise<string> {
|
|
||||||
await this.client.start();
|
|
||||||
const session = await this.client.createSession({ model: "gpt-5.1-codex-mini" });
|
|
||||||
|
|
||||||
try {
|
|
||||||
while (this.iteration < this.maxIterations) {
|
|
||||||
this.iteration++;
|
|
||||||
console.log(`\n--- Iteration ${this.iteration}/${this.maxIterations} ---`);
|
|
||||||
|
|
||||||
// Build prompt including previous response as context
|
|
||||||
const prompt = this.iteration === 1
|
|
||||||
? initialPrompt
|
|
||||||
: `${initialPrompt}\n\nPrevious attempt:\n${this.lastResponse}\n\nContinue improving...`;
|
|
||||||
|
|
||||||
const response = await session.sendAndWait({ prompt });
|
|
||||||
this.lastResponse = response?.data.content || "";
|
|
||||||
|
|
||||||
console.log(`Response (${this.lastResponse.length} chars)`);
|
|
||||||
|
|
||||||
// Check for completion promise
|
|
||||||
if (this.lastResponse.includes(this.completionPromise)) {
|
|
||||||
console.log(`✓ Completion promise detected: ${this.completionPromise}`);
|
|
||||||
return this.lastResponse;
|
|
||||||
}
|
|
||||||
|
|
||||||
console.log(`Continuing to iteration ${this.iteration + 1}...`);
|
|
||||||
}
|
|
||||||
|
|
||||||
throw new Error(
|
|
||||||
`Max iterations (${this.maxIterations}) reached without completion promise`
|
|
||||||
);
|
|
||||||
} finally {
|
|
||||||
await session.destroy();
|
|
||||||
await this.client.stop();
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
// Usage
|
|
||||||
const loop = new RalphLoop(5, "COMPLETE");
|
|
||||||
const result = await loop.run("Your task here");
|
|
||||||
console.log(result);
|
|
||||||
```
|
```
|
||||||
|
|
||||||
## With File Persistence
|
**Core principles:**
|
||||||
|
|
||||||
For tasks involving code generation, persist state to files so the AI can see changes:
|
- **Fresh context per iteration**: Each loop creates a new session, so context never accumulates and the agent always works in the "smart zone" of its context window
|
||||||
|
- **Disk as shared state**: `IMPLEMENTATION_PLAN.md` persists between iterations and acts as the coordination mechanism
|
||||||
|
- **Backpressure steers quality**: Tests, builds, and lints reject bad work — the agent must fix issues before committing
|
||||||
|
- **Two modes**: PLANNING (gap analysis → generate plan) and BUILDING (implement from plan)
|
||||||
|
|
||||||
|
## Simple Version
|
||||||
|
|
||||||
|
The minimal Ralph loop — the SDK equivalent of `while :; do cat PROMPT.md | claude ; done`:
|
||||||
|
|
||||||
```typescript
|
```typescript
|
||||||
import fs from "fs/promises";
|
import { readFile } from "fs/promises";
|
||||||
import path from "path";
|
|
||||||
import { CopilotClient } from "@github/copilot-sdk";
|
import { CopilotClient } from "@github/copilot-sdk";
|
||||||
|
|
||||||
class PersistentRalphLoop {
|
async function ralphLoop(promptFile: string, maxIterations: number = 50) {
|
||||||
private client: CopilotClient;
|
const client = new CopilotClient();
|
||||||
private workDir: string;
|
await client.start();
|
||||||
private iteration: number = 0;
|
|
||||||
private maxIterations: number;
|
|
||||||
|
|
||||||
constructor(workDir: string, maxIterations: number = 10) {
|
try {
|
||||||
this.client = new CopilotClient();
|
const prompt = await readFile(promptFile, "utf-8");
|
||||||
this.workDir = workDir;
|
|
||||||
this.maxIterations = maxIterations;
|
|
||||||
}
|
|
||||||
|
|
||||||
async run(initialPrompt: string): Promise<string> {
|
for (let i = 1; i <= maxIterations; i++) {
|
||||||
await fs.mkdir(this.workDir, { recursive: true });
|
console.log(`\n=== Iteration ${i}/${maxIterations} ===`);
|
||||||
await this.client.start();
|
|
||||||
const session = await this.client.createSession({ model: "gpt-5.1-codex-mini" });
|
|
||||||
|
|
||||||
try {
|
// Fresh session each iteration — context isolation is the point
|
||||||
// Store initial prompt
|
const session = await client.createSession({ model: "claude-sonnet-4.5" });
|
||||||
await fs.writeFile(path.join(this.workDir, "prompt.md"), initialPrompt);
|
try {
|
||||||
|
await session.sendAndWait({ prompt }, 600_000);
|
||||||
while (this.iteration < this.maxIterations) {
|
} finally {
|
||||||
this.iteration++;
|
await session.destroy();
|
||||||
console.log(`\n--- Iteration ${this.iteration} ---`);
|
|
||||||
|
|
||||||
// Build context from previous outputs
|
|
||||||
let context = initialPrompt;
|
|
||||||
const prevOutputFile = path.join(this.workDir, `output-${this.iteration - 1}.txt`);
|
|
||||||
try {
|
|
||||||
const prevOutput = await fs.readFile(prevOutputFile, "utf-8");
|
|
||||||
context += `\n\nPrevious iteration:\n${prevOutput}`;
|
|
||||||
} catch {
|
|
||||||
// No previous output yet
|
|
||||||
}
|
|
||||||
|
|
||||||
const response = await session.sendAndWait({ prompt: context });
|
|
||||||
const output = response?.data.content || "";
|
|
||||||
|
|
||||||
// Persist output
|
|
||||||
await fs.writeFile(
|
|
||||||
path.join(this.workDir, `output-${this.iteration}.txt`),
|
|
||||||
output
|
|
||||||
);
|
|
||||||
|
|
||||||
if (output.includes("COMPLETE")) {
|
|
||||||
return output;
|
|
||||||
}
|
|
||||||
}
|
}
|
||||||
|
|
||||||
throw new Error("Max iterations reached");
|
console.log(`Iteration ${i} complete.`);
|
||||||
} finally {
|
|
||||||
await session.destroy();
|
|
||||||
await this.client.stop();
|
|
||||||
}
|
}
|
||||||
|
} finally {
|
||||||
|
await client.stop();
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
|
// Usage: point at your PROMPT.md
|
||||||
|
ralphLoop("PROMPT.md", 20).catch(console.error);
|
||||||
|
```
|
||||||
|
|
||||||
|
This is all you need to get started. The prompt file tells the agent what to do; the agent reads project files, does work, commits, and exits. The loop restarts with a clean slate.
|
||||||
|
|
||||||
|
## Ideal Version
|
||||||
|
|
||||||
|
The full Ralph pattern with planning and building modes, matching the [Ralph Playbook](https://github.com/ClaytonFarr/ralph-playbook) architecture:
|
||||||
|
|
||||||
|
```typescript
|
||||||
|
import { readFile } from "fs/promises";
|
||||||
|
import { execSync } from "child_process";
|
||||||
|
import { CopilotClient } from "@github/copilot-sdk";
|
||||||
|
|
||||||
|
type Mode = "plan" | "build";
|
||||||
|
|
||||||
|
async function ralphLoop(mode: Mode, maxIterations: number = 50) {
|
||||||
|
const promptFile = mode === "plan" ? "PROMPT_plan.md" : "PROMPT_build.md";
|
||||||
|
const client = new CopilotClient();
|
||||||
|
await client.start();
|
||||||
|
|
||||||
|
const branch = execSync("git branch --show-current", { encoding: "utf-8" }).trim();
|
||||||
|
console.log(`Mode: ${mode} | Prompt: ${promptFile} | Branch: ${branch}`);
|
||||||
|
|
||||||
|
try {
|
||||||
|
const prompt = await readFile(promptFile, "utf-8");
|
||||||
|
|
||||||
|
for (let i = 1; i <= maxIterations; i++) {
|
||||||
|
console.log(`\n=== Iteration ${i}/${maxIterations} ===`);
|
||||||
|
|
||||||
|
// Fresh session — each task gets full context budget
|
||||||
|
const session = await client.createSession({ model: "claude-sonnet-4.5" });
|
||||||
|
try {
|
||||||
|
await session.sendAndWait({ prompt }, 600_000);
|
||||||
|
} finally {
|
||||||
|
await session.destroy();
|
||||||
|
}
|
||||||
|
|
||||||
|
// Push changes after each iteration
|
||||||
|
try {
|
||||||
|
execSync(`git push origin ${branch}`, { stdio: "inherit" });
|
||||||
|
} catch {
|
||||||
|
execSync(`git push -u origin ${branch}`, { stdio: "inherit" });
|
||||||
|
}
|
||||||
|
|
||||||
|
console.log(`Iteration ${i} complete.`);
|
||||||
|
}
|
||||||
|
} finally {
|
||||||
|
await client.stop();
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
// Parse CLI args: npx tsx ralph-loop.ts [plan] [max_iterations]
|
||||||
|
const args = process.argv.slice(2);
|
||||||
|
const mode: Mode = args.includes("plan") ? "plan" : "build";
|
||||||
|
const maxArg = args.find(a => /^\d+$/.test(a));
|
||||||
|
const maxIterations = maxArg ? parseInt(maxArg, 10) : 50;
|
||||||
|
|
||||||
|
ralphLoop(mode, maxIterations).catch(console.error);
|
||||||
|
```
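
The argument handling at the bottom of the script can also be written as a small pure function, which makes it easy to unit-test and to reject bad input. The sketch below is one way to do that. `parseArgs` and `CliOptions` are hypothetical names that are not part of the recipe or the SDK, and falling back to the default when the cap is `0` is an added assumption.

```typescript
type Mode = "plan" | "build";

interface CliOptions {
  mode: Mode;
  maxIterations: number;
}

// Hypothetical helper mirroring the recipe's CLI rules:
// "plan" anywhere in argv selects planning mode, and the first
// all-digit argument sets the iteration cap.
function parseArgs(argv: string[], defaultMax: number = 50): CliOptions {
  const mode: Mode = argv.includes("plan") ? "plan" : "build";
  const maxArg = argv.find((a) => /^\d+$/.test(a));
  const parsed = maxArg !== undefined ? Number.parseInt(maxArg, 10) : defaultMax;
  // Assumption: a cap of 0 is treated as a mistake and replaced by the default.
  const maxIterations = parsed > 0 ? parsed : defaultMax;
  return { mode, maxIterations };
}
```

With this in place, the script's entry point could call `parseArgs(process.argv.slice(2))` and pass the result to `ralphLoop`.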
|
||||||
|
|
||||||
|
### Required Project Files
|
||||||
|
|
||||||
|
The ideal version expects this file structure in your project:
|
||||||
|
|
||||||
|
```
|
||||||
|
project-root/
|
||||||
|
├── PROMPT_plan.md # Planning mode instructions
|
||||||
|
├── PROMPT_build.md # Building mode instructions
|
||||||
|
├── AGENTS.md # Operational guide (build/test commands)
|
||||||
|
├── IMPLEMENTATION_PLAN.md # Task list (generated by planning mode)
|
||||||
|
├── specs/ # Requirement specs (one per topic)
|
||||||
|
│ ├── auth.md
|
||||||
|
│ └── data-pipeline.md
|
||||||
|
└── src/ # Your source code
|
||||||
|
```
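
Because every iteration depends on these files, it can help to check for them before starting the loop rather than spending an iteration on a missing prompt. The following is a minimal sketch of such a check. `missingProjectFiles` and its injectable `exists` parameter are hypothetical, and requiring `IMPLEMENTATION_PLAN.md` only in build mode is an assumption based on the plan being generated by planning mode.

```typescript
import { existsSync } from "fs";
import { join } from "path";

// Hypothetical pre-flight check: returns the entries from the layout above
// that the chosen mode needs but that are missing under `root`.
// `exists` is injectable so the check can be tested without touching disk.
function missingProjectFiles(
  root: string,
  mode: "plan" | "build",
  exists: (path: string) => boolean = existsSync,
): string[] {
  const required = [
    mode === "plan" ? "PROMPT_plan.md" : "PROMPT_build.md",
    "AGENTS.md",
    "specs",
  ];
  // Assumption: building without a plan is an error, planning without one is not.
  if (mode === "build") required.push("IMPLEMENTATION_PLAN.md");
  return required.filter((name) => !exists(join(root, name)));
}
```

A caller might log the returned names and exit early when the list is non-empty.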
|
||||||
|
|
||||||
|
### Example `PROMPT_plan.md`
|
||||||
|
|
||||||
|
```markdown
|
||||||
|
0a. Study `specs/*` to learn the application specifications.
|
||||||
|
0b. Study IMPLEMENTATION_PLAN.md (if present) to understand the plan so far.
|
||||||
|
0c. Study `src/` to understand existing code and shared utilities.
|
||||||
|
|
||||||
|
1. Compare specs against code (gap analysis). Create or update
|
||||||
|
IMPLEMENTATION_PLAN.md as a prioritized bullet-point list of tasks
|
||||||
|
yet to be implemented. Do NOT implement anything.
|
||||||
|
|
||||||
|
IMPORTANT: Do NOT assume functionality is missing — search the
|
||||||
|
codebase first to confirm. Prefer updating existing utilities over
|
||||||
|
creating ad-hoc copies.
|
||||||
|
```
|
||||||
|
|
||||||
|
### Example `PROMPT_build.md`
|
||||||
|
|
||||||
|
```markdown
|
||||||
|
0a. Study `specs/*` to learn the application specifications.
|
||||||
|
0b. Study IMPLEMENTATION_PLAN.md.
|
||||||
|
0c. Study `src/` for reference.
|
||||||
|
|
||||||
|
1. Choose the most important item from IMPLEMENTATION_PLAN.md. Before
|
||||||
|
making changes, search the codebase (don't assume not implemented).
|
||||||
|
2. After implementing, run the tests. If functionality is missing, add it.
|
||||||
|
3. When you discover issues, update IMPLEMENTATION_PLAN.md immediately.
|
||||||
|
4. When tests pass, update IMPLEMENTATION_PLAN.md, then `git add -A`
|
||||||
|
then `git commit` with a descriptive message.
|
||||||
|
|
||||||
|
99999. When authoring documentation, capture the why.
|
||||||
|
999999. Implement completely. No placeholders or stubs.
|
||||||
|
9999999. Keep IMPLEMENTATION_PLAN.md current — future iterations depend on it.
|
||||||
|
```
|
||||||
|
|
||||||
|
### Example `AGENTS.md`
|
||||||
|
|
||||||
|
Keep this brief (~60 lines). It's loaded every iteration, so bloat wastes context.
|
||||||
|
|
||||||
|
```markdown
|
||||||
|
## Build & Run
|
||||||
|
|
||||||
|
npm run build
|
||||||
|
|
||||||
|
## Validation
|
||||||
|
|
||||||
|
- Tests: `npm test`
|
||||||
|
- Typecheck: `npx tsc --noEmit`
|
||||||
|
- Lint: `npm run lint`
|
||||||
```
|
```
|
||||||
|
|
||||||
## Best Practices
|
## Best Practices
|
||||||
|
|
||||||
1. **Write clear completion criteria**: Include exactly what "done" looks like
|
1. **Fresh context per iteration**: Never accumulate context across iterations — that's the whole point
|
||||||
2. **Use output markers**: Include `<promise>COMPLETE</promise>` or similar in completion condition
|
2. **Disk is your database**: `IMPLEMENTATION_PLAN.md` is shared state between isolated sessions
|
||||||
3. **Always set max iterations**: Prevents infinite loops on impossible tasks
|
3. **Backpressure is essential**: Tests, builds, lints in `AGENTS.md` — the agent must pass them before committing
|
||||||
4. **Persist state**: Save files so AI can see what changed between iterations
|
4. **Start with PLANNING mode**: Generate the plan first, then switch to BUILDING
|
||||||
5. **Include context**: Feed previous iteration output back as context
|
5. **Observe and tune**: Watch early iterations, add guardrails to prompts when the agent fails in specific ways
|
||||||
6. **Monitor progress**: Log each iteration to track what's happening
|
6. **The plan is disposable**: If the agent goes off track, delete `IMPLEMENTATION_PLAN.md` and re-plan
|
||||||
|
7. **Keep `AGENTS.md` brief**: It's loaded every iteration — operational info only, no progress notes
|
||||||
|
8. **Use a sandbox**: The agent runs autonomously with full tool access — isolate it
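
In this recipe, backpressure is applied by the agent itself, following the commands in `AGENTS.md`. If you also want the outer loop to verify the result, one option is a small gate that runs the same commands after each iteration. The sketch below is an assumption rather than part of the recipe: `runBackpressure`, `CheckResult`, and the injectable `run` parameter are hypothetical names.

```typescript
import { execSync } from "child_process";

// Outcome of one validation command.
interface CheckResult {
  command: string;
  ok: boolean;
}

// Hypothetical gate: runs each validation command in order and stops at the
// first failure. By default `run` shells out and throws on a non-zero exit;
// it is injectable so the gate can be tested without a real toolchain.
function runBackpressure(
  commands: string[],
  run: (command: string) => void = (c) => {
    execSync(c, { stdio: "inherit" });
  },
): CheckResult[] {
  const results: CheckResult[] = [];
  for (const command of commands) {
    try {
      run(command);
      results.push({ command, ok: true });
    } catch {
      results.push({ command, ok: false });
      break; // later checks are skipped once one fails
    }
  }
  return results;
}
```

For instance, `runBackpressure(["npm test", "npx tsc --noEmit", "npm run lint"]).every((r) => r.ok)` could decide whether the loop pushes after an iteration.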
|
||||||
|
|
||||||
## Example: Iterative Code Generation
|
## When to Use a Ralph Loop
|
||||||
|
|
||||||
```typescript
|
|
||||||
const prompt = `Write a function that:
|
|
||||||
1. Parses CSV data
|
|
||||||
2. Validates required fields
|
|
||||||
3. Returns parsed records or error
|
|
||||||
4. Has unit tests
|
|
||||||
5. Output <promise>COMPLETE</promise> when done`;
|
|
||||||
|
|
||||||
const loop = new RalphLoop(10, "COMPLETE");
|
|
||||||
const result = await loop.run(prompt);
|
|
||||||
```
|
|
||||||
|
|
||||||
## Handling Failures
|
|
||||||
|
|
||||||
```typescript
|
|
||||||
try {
|
|
||||||
const result = await loop.run(prompt);
|
|
||||||
console.log("Task completed successfully!");
|
|
||||||
} catch (error) {
|
|
||||||
console.error("Task failed:", error.message);
|
|
||||||
// Analyze what was attempted and suggest alternatives
|
|
||||||
}
|
|
||||||
```
|
|
||||||
|
|
||||||
## When to Use RALPH-loop
|
|
||||||
|
|
||||||
**Good for:**
|
**Good for:**
|
||||||
- Code generation with automatic verification (tests, linters)
|
- Implementing features from specs with test-driven validation
|
||||||
- Tasks with clear success criteria
|
- Large refactors broken into many small tasks
|
||||||
- Iterative refinement where each attempt learns from previous failures
|
- Unattended, long-running development with clear requirements
|
||||||
- Unattended long-running improvements
|
- Any work where backpressure (tests/builds) can verify correctness
|
||||||
|
|
||||||
**Not good for:**
|
**Not good for:**
|
||||||
- Tasks requiring human judgment or design input
|
- Tasks requiring human judgment mid-loop
|
||||||
- One-shot operations
|
- One-shot operations that don't benefit from iteration
|
||||||
- Tasks with vague success criteria
|
- Vague requirements without testable acceptance criteria
|
||||||
- Real-time interactive debugging
|
- Exploratory prototyping where direction isn't clear
|
||||||
|
|||||||
@@ -1,128 +1,79 @@
|
|||||||
|
import { readFile } from "fs/promises";
|
||||||
|
import { execSync } from "child_process";
|
||||||
import { CopilotClient } from "@github/copilot-sdk";
|
import { CopilotClient } from "@github/copilot-sdk";
|
||||||
|
|
||||||
/**
|
/**
|
||||||
* RALPH-loop implementation: Iterative self-referential AI loops.
|
* Ralph loop: autonomous AI task loop with fresh context per iteration.
|
||||||
* The same prompt is sent repeatedly, with AI reading its own previous output.
|
*
|
||||||
* Loop continues until completion promise is detected in the response.
|
* Two modes:
|
||||||
|
* - "plan": reads PROMPT_plan.md, generates/updates IMPLEMENTATION_PLAN.md
|
||||||
|
* - "build": reads PROMPT_build.md, implements tasks, runs tests, commits
|
||||||
|
*
|
||||||
|
* Each iteration creates a fresh session so the agent always operates in
|
||||||
|
* the "smart zone" of its context window. State is shared between
|
||||||
|
* iterations via files on disk (IMPLEMENTATION_PLAN.md, AGENTS.md, specs/*).
|
||||||
|
*
|
||||||
|
* Usage:
|
||||||
|
* npx tsx ralph-loop.ts # build mode, 50 iterations
|
||||||
|
* npx tsx ralph-loop.ts plan # planning mode
|
||||||
|
* npx tsx ralph-loop.ts 20 # build mode, 20 iterations
|
||||||
|
* npx tsx ralph-loop.ts plan 5 # planning mode, 5 iterations
|
||||||
*/
|
*/
|
||||||
class RalphLoop {
|
|
||||||
private client: CopilotClient;
|
|
||||||
private iteration: number = 0;
|
|
||||||
private readonly maxIterations: number;
|
|
||||||
private readonly completionPromise: string;
|
|
||||||
public lastResponse: string | null = null;
|
|
||||||
|
|
||||||
constructor(maxIterations: number = 10, completionPromise: string = "COMPLETE") {
|
type Mode = "plan" | "build";
|
||||||
this.client = new CopilotClient();
|
|
||||||
this.maxIterations = maxIterations;
|
|
||||||
this.completionPromise = completionPromise;
|
|
||||||
}
|
|
||||||
|
|
||||||
/**
|
async function ralphLoop(mode: Mode, maxIterations: number) {
|
||||||
* Run the RALPH-loop until completion promise is detected or max iterations reached.
|
const promptFile = mode === "plan" ? "PROMPT_plan.md" : "PROMPT_build.md";
|
||||||
*/
|
|
||||||
async run(initialPrompt: string): Promise<string> {
|
|
||||||
let session: Awaited<ReturnType<CopilotClient["createSession"]>> | null = null;
|
|
||||||
|
|
||||||
await this.client.start();
|
const client = new CopilotClient();
|
||||||
try {
|
await client.start();
|
||||||
session = await this.client.createSession({
|
|
||||||
model: "gpt-5.1-codex-mini"
|
const branch = execSync("git branch --show-current", { encoding: "utf-8" }).trim();
|
||||||
|
|
||||||
|
console.log("━".repeat(40));
|
||||||
|
console.log(`Mode: ${mode}`);
|
||||||
|
console.log(`Prompt: ${promptFile}`);
|
||||||
|
console.log(`Branch: ${branch}`);
|
||||||
|
console.log(`Max: ${maxIterations} iterations`);
|
||||||
|
console.log("━".repeat(40));
|
||||||
|
|
||||||
|
try {
|
||||||
|
const prompt = await readFile(promptFile, "utf-8");
|
||||||
|
|
||||||
|
for (let i = 1; i <= maxIterations; i++) {
|
||||||
|
console.log(`\n=== Iteration ${i}/${maxIterations} ===`);
|
||||||
|
|
||||||
|
// Fresh session — each task gets full context budget
|
||||||
|
const session = await client.createSession({
|
||||||
|
model: "claude-sonnet-4.5",
|
||||||
});
|
});
|
||||||
|
|
||||||
try {
|
try {
|
||||||
while (this.iteration < this.maxIterations) {
|
await session.sendAndWait({ prompt }, 600_000);
|
||||||
this.iteration++;
|
|
||||||
console.log(`\n=== Iteration ${this.iteration}/${this.maxIterations} ===`);
|
|
||||||
|
|
||||||
// Build the prompt for this iteration
|
|
||||||
const currentPrompt = this.buildIterationPrompt(initialPrompt);
|
|
||||||
console.log(`Sending prompt (length: ${currentPrompt.length})...`);
|
|
||||||
|
|
||||||
const response = await session.sendAndWait({ prompt: currentPrompt }, 300_000);
|
|
||||||
this.lastResponse = response?.data.content || "";
|
|
||||||
|
|
||||||
// Display response summary
|
|
||||||
const summary = this.lastResponse.length > 200
|
|
||||||
? this.lastResponse.substring(0, 200) + "..."
|
|
||||||
: this.lastResponse;
|
|
||||||
console.log(`Response: ${summary}`);
|
|
||||||
|
|
||||||
// Check for completion promise
|
|
||||||
if (this.lastResponse.includes(this.completionPromise)) {
|
|
||||||
console.log(`\n✓ Success! Completion promise detected: '${this.completionPromise}'`);
|
|
||||||
return this.lastResponse;
|
|
||||||
}
|
|
||||||
|
|
||||||
console.log(`Iteration ${this.iteration} complete. Checking for next iteration...`);
|
|
||||||
}
|
|
||||||
|
|
||||||
// Max iterations reached without completion
|
|
||||||
throw new Error(
|
|
||||||
`Maximum iterations (${this.maxIterations}) reached without detecting completion promise: '${this.completionPromise}'`
|
|
||||||
);
|
|
||||||
} catch (error) {
|
|
||||||
console.error(`\nError during RALPH-loop: ${error instanceof Error ? error.message : String(error)}`);
|
|
||||||
throw error;
|
|
||||||
} finally {
|
} finally {
|
||||||
if (session) {
|
await session.destroy();
|
||||||
await session.destroy();
|
|
||||||
}
|
|
||||||
}
|
}
|
||||||
} finally {
|
|
||||||
await this.client.stop();
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
/**
|
// Push changes after each iteration
|
||||||
* Build the prompt for the current iteration, including previous output as context.
|
try {
|
||||||
*/
|
execSync(`git push origin ${branch}`, { stdio: "inherit" });
|
||||||
private buildIterationPrompt(initialPrompt: string): string {
|
} catch {
|
||||||
if (this.iteration === 1) {
|
execSync(`git push -u origin ${branch}`, { stdio: "inherit" });
|
||||||
// First iteration: just the initial prompt
|
}
|
||||||
return initialPrompt;
|
|
||||||
|
console.log(`\nIteration ${i} complete.`);
|
||||||
}
|
}
|
||||||
|
|
||||||
// Subsequent iterations: include previous output as context
|
console.log(`\nReached max iterations: ${maxIterations}`);
|
||||||
return `${initialPrompt}
|
} finally {
|
||||||
|
await client.stop();
|
||||||
=== CONTEXT FROM PREVIOUS ITERATION ===
|
|
||||||
${this.lastResponse}
|
|
||||||
=== END CONTEXT ===
|
|
||||||
|
|
||||||
Continue working on this task. Review the previous attempt and improve upon it.`;
|
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
// Example usage demonstrating RALPH-loop
|
// Parse CLI args
|
||||||
async function main() {
|
const args = process.argv.slice(2);
|
||||||
const prompt = `You are iteratively building a small library. Follow these phases IN ORDER.
|
const mode: Mode = args.includes("plan") ? "plan" : "build";
|
||||||
Do NOT skip ahead — only do the current phase, then stop and wait for the next iteration.
|
const maxArg = args.find((a) => /^\d+$/.test(a));
|
||||||
|
const maxIterations = maxArg ? parseInt(maxArg, 10) : 50;
|
||||||
|
|
||||||
Phase 1: Design a DataValidator class that validates records against a schema.
|
ralphLoop(mode, maxIterations).catch(console.error);
|
||||||
- Schema defines field names, types (str, int, float, bool), and whether required.
|
|
||||||
- Return a list of validation errors per record.
|
|
||||||
- Show the class code only. Do NOT output COMPLETE.
|
|
||||||
|
|
||||||
Phase 2: Write at least 4 unit tests covering: missing required field, wrong type,
|
|
||||||
valid record, and empty input. Show test code only. Do NOT output COMPLETE.
|
|
||||||
|
|
||||||
Phase 3: Review the code from phases 1 and 2. Fix any bugs, add docstrings, and add
|
|
||||||
an extra edge-case test. Show the final consolidated code with all fixes.
|
|
||||||
When this phase is fully done, output the exact text: COMPLETE`;
|
|
||||||
|
|
||||||
const loop = new RalphLoop(5, "COMPLETE");
|
|
||||||
|
|
||||||
try {
|
|
||||||
const result = await loop.run(prompt);
|
|
||||||
console.log("\n=== FINAL RESULT ===");
|
|
||||||
console.log(result);
|
|
||||||
} catch (error) {
|
|
||||||
console.error(`\nTask did not complete: ${error instanceof Error ? error.message : String(error)}`);
|
|
||||||
if (loop.lastResponse) {
|
|
||||||
console.log(`\nLast attempt:\n${loop.lastResponse}`);
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
main().catch(console.error);
|
|
||||||
|
|||||||
@@ -1,6 +1,6 @@
|
|||||||
# RALPH-loop: Iterative Self-Referential AI Loops
|
# Ralph Loop: Autonomous AI Task Loops
|
||||||
|
|
||||||
Implement self-referential feedback loops where an AI agent iteratively improves work by reading its own previous output.
|
Build autonomous coding loops where an AI agent picks tasks, implements them, validates against backpressure (tests, builds), commits, and repeats — each iteration in a fresh context window.
|
||||||
|
|
||||||
> **Runnable example:** [recipe/ralph_loop.py](recipe/ralph_loop.py)
|
> **Runnable example:** [recipe/ralph_loop.py](recipe/ralph_loop.py)
|
||||||
>
|
>
|
||||||
@@ -8,196 +8,235 @@ Implement self-referential feedback loops where an AI agent iteratively improves
|
|||||||
> cd recipe && pip install -r requirements.txt
|
> cd recipe && pip install -r requirements.txt
|
||||||
> python ralph_loop.py
|
> python ralph_loop.py
|
||||||
> ```
|
> ```
|
||||||
## What is RALPH-loop?
|
|
||||||
|
|
||||||
RALPH-loop is a development methodology for iterative AI-powered task completion. Named after the Ralph Wiggum technique, it embodies the philosophy of persistent iteration:
|
## What is a Ralph Loop?
|
||||||
|
|
||||||
- **One prompt, multiple iterations**: The same prompt is processed repeatedly
|
A [Ralph loop](https://ghuntley.com/ralph/) is an autonomous development workflow where an AI agent iterates through tasks in isolated context windows. The key insight: **state lives on disk, not in the model's context**. Each iteration starts fresh, reads the current state from files, does one task, writes results back to disk, and exits.
|
||||||
- **Self-referential feedback**: The AI reads its own previous work (file changes, git history)
|
|
||||||
- **Completion detection**: Loop exits when a completion promise is detected in output
|
|
||||||
- **Safety limits**: Always include a maximum iteration count to prevent infinite loops
|
|
||||||
|
|
||||||
## Example Scenario
|
```
|
||||||
|
┌─────────────────────────────────────────────────┐
|
||||||
You need to iteratively improve code until all tests pass. Instead of asking Copilot to "write perfect code," you use RALPH-loop to:
|
│ loop.sh │
|
||||||
|
│ while true: │
|
||||||
1. Send the initial prompt with clear success criteria
|
│ ┌─────────────────────────────────────────┐ │
|
||||||
2. Copilot writes code and tests
|
│ │ Fresh session (isolated context) │ │
|
||||||
3. Copilot runs tests and sees failures
|
│ │ │ │
|
||||||
4. Loop automatically re-sends the prompt
|
│ │ 1. Read PROMPT.md + AGENTS.md │ │
|
||||||
5. Copilot reads test output and previous code, fixes issues
|
│ │ 2. Study specs/* and code │ │
|
||||||
6. Repeat until all tests pass and completion promise is output
|
│ │ 3. Pick next task from plan │ │
|
||||||
|
│ │ 4. Implement + run tests │ │
|
||||||
## Basic Implementation
|
│ │ 5. Update plan, commit, exit │ │
|
||||||
|
│ └─────────────────────────────────────────┘ │
|
||||||
```python
|
│ ↻ next iteration (fresh context) │
|
||||||
import asyncio
|
└─────────────────────────────────────────────────┘
|
||||||
from copilot import CopilotClient, MessageOptions, SessionConfig
|
|
||||||
|
|
||||||
class RalphLoop:
|
|
||||||
"""Iterative self-referential feedback loop using Copilot."""
|
|
||||||
|
|
||||||
def __init__(self, max_iterations=10, completion_promise="COMPLETE"):
|
|
||||||
self.client = CopilotClient()
|
|
||||||
self.iteration = 0
|
|
||||||
self.max_iterations = max_iterations
|
|
||||||
self.completion_promise = completion_promise
|
|
||||||
self.last_response = None
|
|
||||||
|
|
||||||
async def run(self, initial_prompt):
|
|
||||||
"""Run the RALPH-loop until completion promise detected or max iterations reached."""
|
|
||||||
await self.client.start()
|
|
||||||
session = await self.client.create_session(
|
|
||||||
SessionConfig(model="gpt-5.1-codex-mini")
|
|
||||||
)
|
|
||||||
|
|
||||||
try:
|
|
||||||
while self.iteration < self.max_iterations:
|
|
||||||
self.iteration += 1
|
|
||||||
print(f"\n--- Iteration {self.iteration}/{self.max_iterations} ---")
|
|
||||||
|
|
||||||
# Build prompt including previous response as context
|
|
||||||
if self.iteration == 1:
|
|
||||||
prompt = initial_prompt
|
|
||||||
else:
|
|
||||||
prompt = f"{initial_prompt}\n\nPrevious attempt:\n{self.last_response}\n\nContinue improving..."
|
|
||||||
|
|
||||||
result = await session.send_and_wait(
|
|
||||||
MessageOptions(prompt=prompt), timeout=300
|
|
||||||
)
|
|
||||||
|
|
||||||
self.last_response = result.data.content if result else ""
|
|
||||||
print(f"Response ({len(self.last_response)} chars)")
|
|
||||||
|
|
||||||
# Check for completion promise
|
|
||||||
if self.completion_promise in self.last_response:
|
|
||||||
print(f"✓ Completion promise detected: {self.completion_promise}")
|
|
||||||
return self.last_response
|
|
||||||
|
|
||||||
print(f"Continuing to iteration {self.iteration + 1}...")
|
|
||||||
|
|
||||||
raise RuntimeError(
|
|
||||||
f"Max iterations ({self.max_iterations}) reached without completion promise"
|
|
||||||
)
|
|
||||||
finally:
|
|
||||||
await session.destroy()
|
|
||||||
await self.client.stop()
|
|
||||||
|
|
||||||
# Usage
|
|
||||||
async def main():
|
|
||||||
loop = RalphLoop(5, "COMPLETE")
|
|
||||||
result = await loop.run("Your task here")
|
|
||||||
print(result)
|
|
||||||
|
|
||||||
asyncio.run(main())
|
|
||||||
```
|
```
|
||||||
|
|
||||||
## With File Persistence
|
**Core principles:**
|
||||||
|
|
||||||
For tasks involving code generation, persist state to files so the AI can see changes:
|
- **Fresh context per iteration**: Each loop creates a new session, so context never accumulates and the agent always works in the "smart zone" of its context window
|
||||||
|
- **Disk as shared state**: `IMPLEMENTATION_PLAN.md` persists between iterations and acts as the coordination mechanism
|
||||||
|
- **Backpressure steers quality**: Tests, builds, and lints reject bad work — the agent must fix issues before committing
|
||||||
|
- **Two modes**: PLANNING (gap analysis → generate plan) and BUILDING (implement from plan)
|
||||||
|
|
||||||
|
## Simple Version
|
||||||
|
|
||||||
|
The minimal Ralph loop — the SDK equivalent of `while :; do cat PROMPT.md | claude ; done`:
|
||||||
|
|
||||||
```python
|
```python
|
||||||
import asyncio
|
import asyncio
|
||||||
from pathlib import Path
|
from pathlib import Path
|
||||||
from copilot import CopilotClient, MessageOptions, SessionConfig
|
from copilot import CopilotClient, MessageOptions, SessionConfig
|
||||||
|
|
||||||
class PersistentRalphLoop:
|
|
||||||
"""RALPH-loop with file-based state persistence."""
|
|
||||||
|
|
||||||
def __init__(self, work_dir, max_iterations=10):
|
|
||||||
self.client = CopilotClient()
|
|
||||||
self.work_dir = Path(work_dir)
|
|
||||||
self.work_dir.mkdir(parents=True, exist_ok=True)
|
|
||||||
self.iteration = 0
|
|
||||||
self.max_iterations = max_iterations
|
|
||||||
|
|
||||||
async def run(self, initial_prompt):
|
async def ralph_loop(prompt_file: str, max_iterations: int = 50):
|
||||||
"""Run the loop with persistent state."""
|
client = CopilotClient()
|
||||||
await self.client.start()
|
await client.start()
|
||||||
session = await self.client.create_session(
|
|
||||||
SessionConfig(model="gpt-5.1-codex-mini")
|
|
||||||
)
|
|
||||||
|
|
||||||
try:
|
try:
|
||||||
# Store initial prompt
|
prompt = Path(prompt_file).read_text(encoding="utf-8")
|
||||||
(self.work_dir / "prompt.md").write_text(initial_prompt)
|
|
||||||
|
|
||||||
while self.iteration < self.max_iterations:
|
for i in range(1, max_iterations + 1):
|
||||||
self.iteration += 1
|
print(f"\n=== Iteration {i}/{max_iterations} ===")
|
||||||
print(f"\n--- Iteration {self.iteration} ---")
|
|
||||||
|
|
||||||
# Build context from previous outputs
|
# Fresh session each iteration — context isolation is the point
|
||||||
context = initial_prompt
|
session = await client.create_session(
|
||||||
prev_output = self.work_dir / f"output-{self.iteration - 1}.txt"
|
SessionConfig(model="claude-sonnet-4.5")
|
||||||
if prev_output.exists():
|
)
|
||||||
context += f"\n\nPrevious iteration:\n{prev_output.read_text()}"
|
try:
|
||||||
|
await session.send_and_wait(
|
||||||
result = await session.send_and_wait(
|
MessageOptions(prompt=prompt), timeout=600
|
||||||
MessageOptions(prompt=context), timeout=300
|
|
||||||
)
|
)
|
||||||
response = result.data.content if result else ""
|
finally:
|
||||||
|
await session.destroy()
|
||||||
|
|
||||||
# Persist output
|
print(f"Iteration {i} complete.")
|
||||||
output_file = self.work_dir / f"output-{self.iteration}.txt"
|
finally:
|
||||||
output_file.write_text(response)
|
await client.stop()
|
||||||
|
|
||||||
if "COMPLETE" in response:
|
|
||||||
return response
|
|
||||||
|
|
||||||
raise RuntimeError("Max iterations reached")
|
# Usage: point at your PROMPT.md
|
||||||
finally:
|
asyncio.run(ralph_loop("PROMPT.md", 20))
|
||||||
await session.destroy()
|
```
|
||||||
await self.client.stop()
|
|
||||||
|
This is all you need to get started. The prompt file tells the agent what to do; the agent reads project files, does work, commits, and exits. The loop restarts with a clean slate.
|
||||||
|
|
||||||
|
## Ideal Version
|
||||||
|
|
||||||
|
The full Ralph pattern with planning and building modes, matching the [Ralph Playbook](https://github.com/ClaytonFarr/ralph-playbook) architecture:
|
||||||
|
|
||||||
|
```python
|
||||||
|
import asyncio
|
||||||
|
import subprocess
|
||||||
|
import sys
|
||||||
|
from pathlib import Path
|
||||||
|
|
||||||
|
from copilot import CopilotClient, MessageOptions, SessionConfig
|
||||||
|
|
||||||
|
|
||||||
|
async def ralph_loop(mode: str = "build", max_iterations: int = 50):
|
||||||
|
prompt_file = "PROMPT_plan.md" if mode == "plan" else "PROMPT_build.md"
|
||||||
|
client = CopilotClient()
|
||||||
|
await client.start()
|
||||||
|
|
||||||
|
branch = subprocess.check_output(
|
||||||
|
["git", "branch", "--show-current"], text=True
|
||||||
|
).strip()
|
||||||
|
|
||||||
|
print("━" * 40)
|
||||||
|
print(f"Mode: {mode}")
|
||||||
|
print(f"Prompt: {prompt_file}")
|
||||||
|
print(f"Branch: {branch}")
|
||||||
|
print(f"Max: {max_iterations} iterations")
|
||||||
|
print("━" * 40)
|
||||||
|
|
||||||
|
try:
|
||||||
|
prompt = Path(prompt_file).read_text(encoding="utf-8")
|
||||||
|
|
||||||
|
for i in range(1, max_iterations + 1):
|
||||||
|
print(f"\n=== Iteration {i}/{max_iterations} ===")
|
||||||
|
|
||||||
|
# Fresh session — each task gets full context budget
|
||||||
|
session = await client.create_session(
|
||||||
|
SessionConfig(model="claude-sonnet-4.5")
|
||||||
|
)
|
||||||
|
try:
|
||||||
|
await session.send_and_wait(
|
||||||
|
MessageOptions(prompt=prompt), timeout=600
|
||||||
|
)
|
||||||
|
finally:
|
||||||
|
await session.destroy()
|
||||||
|
|
||||||
|
# Push changes after each iteration
|
||||||
|
try:
|
||||||
|
subprocess.run(
|
||||||
|
["git", "push", "origin", branch], check=True
|
||||||
|
)
|
||||||
|
except subprocess.CalledProcessError:
|
||||||
|
subprocess.run(
|
||||||
|
["git", "push", "-u", "origin", branch], check=True
|
||||||
|
)
|
||||||
|
|
||||||
|
print(f"\nIteration {i} complete.")
|
||||||
|
|
||||||
|
print(f"\nReached max iterations: {max_iterations}")
|
||||||
|
finally:
|
||||||
|
await client.stop()
|
||||||
|
|
||||||
|
|
||||||
|
if __name__ == "__main__":
|
||||||
|
args = sys.argv[1:]
|
||||||
|
mode = "plan" if "plan" in args else "build"
|
||||||
|
max_iter = next((int(a) for a in args if a.isdigit()), 50)
|
||||||
|
asyncio.run(ralph_loop(mode, max_iter))
|
||||||
|
```
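
The argument handling at the bottom of the script can be read on its own. The following is a minimal sketch, assuming a hypothetical helper named `parse_cli_args` (it is not part of the recipe) that wraps the same two expressions used under `if __name__ == "__main__":`.

```python
def parse_cli_args(args: list[str], default_iterations: int = 50) -> tuple[str, int]:
    """Hypothetical helper mirroring the recipe's inline argument handling."""
    # Any argument equal to "plan" selects planning mode; everything else means build.
    mode = "plan" if "plan" in args else "build"
    # The first all-digit argument becomes the iteration cap; otherwise use the default.
    max_iter = next((int(a) for a in args if a.isdigit()), default_iterations)
    return mode, max_iter


if __name__ == "__main__":
    import sys

    print(parse_cli_args(sys.argv[1:]))
```

Argument order does not matter with this approach, so `plan 5` and `5 plan` behave the same. Because `str.isdigit()` rejects a leading minus sign, an argument such as `-3` is ignored and the default cap applies.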
|
||||||
|
|
||||||
|
### Required Project Files
|
||||||
|
|
||||||
|
The ideal version expects this file structure in your project:
|
||||||
|
|
||||||
|
```
|
||||||
|
project-root/
|
||||||
|
├── PROMPT_plan.md # Planning mode instructions
|
||||||
|
├── PROMPT_build.md # Building mode instructions
|
||||||
|
├── AGENTS.md # Operational guide (build/test commands)
|
||||||
|
├── IMPLEMENTATION_PLAN.md # Task list (generated by planning mode)
|
||||||
|
├── specs/ # Requirement specs (one per topic)
|
||||||
|
│ ├── auth.md
|
||||||
|
│ └── data-pipeline.md
|
||||||
|
└── src/ # Your source code
|
||||||
|
```
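
A loop that starts without its prompt or plan files fails only once the first iteration is underway, so a preflight check may be worth adding. The sketch below is one possible approach, not part of the recipe: the `REQUIRED` mapping and the `missing_project_files` helper are hypothetical names, and the choice of which files each mode needs is an assumption drawn from the layout above.

```python
from pathlib import Path

# Assumed per-mode requirements, based on the layout above.
# IMPLEMENTATION_PLAN.md is omitted for "plan" because planning mode generates it.
REQUIRED = {
    "plan": ["PROMPT_plan.md", "AGENTS.md", "specs"],
    "build": ["PROMPT_build.md", "AGENTS.md", "specs", "IMPLEMENTATION_PLAN.md"],
}


def missing_project_files(root: Path, mode: str) -> list[str]:
    """Return the required entries (files or directories) absent under root."""
    return [name for name in REQUIRED[mode] if not (root / name).exists()]
```

Calling `missing_project_files(Path("."), mode)` before `client.start()` and exiting when the list is non-empty would surface a missing plan before any session is created.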
|
||||||
|
|
||||||
|
### Example `PROMPT_plan.md`
|
||||||
|
|
||||||
|
```markdown
|
||||||
|
0a. Study `specs/*` to learn the application specifications.
|
||||||
|
0b. Study IMPLEMENTATION_PLAN.md (if present) to understand the plan so far.
|
||||||
|
0c. Study `src/` to understand existing code and shared utilities.
|
||||||
|
|
||||||
|
1. Compare specs against code (gap analysis). Create or update
|
||||||
|
IMPLEMENTATION_PLAN.md as a prioritized bullet-point list of tasks
|
||||||
|
yet to be implemented. Do NOT implement anything.
|
||||||
|
|
||||||
|
IMPORTANT: Do NOT assume functionality is missing — search the
|
||||||
|
codebase first to confirm. Prefer updating existing utilities over
|
||||||
|
creating ad-hoc copies.
|
||||||
|
```
|
||||||
|
|
||||||
|
### Example `PROMPT_build.md`
|
||||||
|
|
||||||
|
```markdown
|
||||||
|
0a. Study `specs/*` to learn the application specifications.
|
||||||
|
0b. Study IMPLEMENTATION_PLAN.md.
|
||||||
|
0c. Study `src/` for reference.
|
||||||
|
|
||||||
|
1. Choose the most important item from IMPLEMENTATION_PLAN.md. Before
|
||||||
|
making changes, search the codebase (don't assume not implemented).
|
||||||
|
2. After implementing, run the tests. If functionality is missing, add it.
|
||||||
|
3. When you discover issues, update IMPLEMENTATION_PLAN.md immediately.
|
||||||
|
4. When tests pass, update IMPLEMENTATION_PLAN.md, then `git add -A`
|
||||||
|
then `git commit` with a descriptive message.
|
||||||
|
|
||||||
|
99999. When authoring documentation, capture the why.
|
||||||
|
999999. Implement completely. No placeholders or stubs.
|
||||||
|
9999999. Keep IMPLEMENTATION_PLAN.md current — future iterations depend on it.
|
||||||
|
```
|
||||||
|
|
||||||
|
### Example `AGENTS.md`
|
||||||
|
|
||||||
|
Keep this brief (~60 lines). It's loaded every iteration, so bloat wastes context.
|
||||||
|
|
||||||
|
```markdown
|
||||||
|
## Build & Run
|
||||||
|
|
||||||
|
python -m pytest
|
||||||
|
|
||||||
|
## Validation
|
||||||
|
|
||||||
|
- Tests: `pytest`
|
||||||
|
- Typecheck: `mypy src/`
|
||||||
|
- Lint: `ruff check src/`
|
||||||
```
|
```
|
||||||
|
|
||||||
## Best Practices
|
## Best Practices
|
||||||
|
|
||||||
1. **Write clear completion criteria**: Include exactly what "done" looks like
|
1. **Fresh context per iteration**: Never accumulate context across iterations — that's the whole point
|
||||||
2. **Use output markers**: Include `<promise>COMPLETE</promise>` or similar in completion condition
|
2. **Disk is your database**: `IMPLEMENTATION_PLAN.md` is shared state between isolated sessions
|
||||||
3. **Always set max iterations**: Prevents infinite loops on impossible tasks
|
3. **Backpressure is essential**: Tests, builds, lints in `AGENTS.md` — the agent must pass them before committing
|
||||||
4. **Persist state**: Save files so AI can see what changed between iterations
|
4. **Start with PLANNING mode**: Generate the plan first, then switch to BUILDING
|
||||||
5. **Include context**: Feed previous iteration output back as context
|
5. **Observe and tune**: Watch early iterations, add guardrails to prompts when the agent fails in specific ways
|
||||||
6. **Monitor progress**: Log each iteration to track what's happening
|
6. **The plan is disposable**: If the agent goes off track, delete `IMPLEMENTATION_PLAN.md` and re-plan
|
||||||
|
7. **Keep `AGENTS.md` brief**: It's loaded every iteration — operational info only, no progress notes
|
||||||
|
8. **Use a sandbox**: The agent runs autonomously with full tool access — isolate it
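
In the recipe, backpressure is applied by the agent itself, which runs the commands listed in `AGENTS.md` before committing. As an illustration of the same idea outside the agent, here is a minimal sketch of a check runner that the outer loop could call after each iteration. The `run_checks` helper is hypothetical, and the demo commands are stand-ins chosen so the sketch runs anywhere.

```python
import subprocess
import sys


def run_checks(commands: list[list[str]]) -> list[str]:
    """Run each command and return the ones that exited non-zero."""
    failed = []
    for cmd in commands:
        result = subprocess.run(cmd, capture_output=True, text=True)
        if result.returncode != 0:
            failed.append(" ".join(cmd))
    return failed


# Stand-in commands: one passes, one fails. A real project would list its
# own checks here, e.g. ["pytest"], ["mypy", "src/"], ["ruff", "check", "src/"].
demo = [
    [sys.executable, "-c", "pass"],
    [sys.executable, "-c", "raise SystemExit(1)"],
]

if __name__ == "__main__":
    print("Failed checks:", run_checks(demo))
```

A non-empty result could be used to skip the push for that iteration, so a failing build never leaves the machine. Whether to gate in the outer loop as well as in the prompt is a design choice; the recipe as written relies on the prompt alone.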
|
||||||
|
|
||||||
## Example: Iterative Code Generation
|
## When to Use a Ralph Loop
|
||||||
|
|
||||||
```python
|
|
||||||
prompt = """Write a function that:
|
|
||||||
1. Parses CSV data
|
|
||||||
2. Validates required fields
|
|
||||||
3. Returns parsed records or error
|
|
||||||
4. Has unit tests
|
|
||||||
5. Output <promise>COMPLETE</promise> when done"""
|
|
||||||
|
|
||||||
async def main():
|
|
||||||
loop = RalphLoop(10, "COMPLETE")
|
|
||||||
result = await loop.run(prompt)
|
|
||||||
|
|
||||||
asyncio.run(main())
|
|
||||||
```
|
|
||||||
|
|
||||||
## Handling Failures
|
|
||||||
|
|
||||||
```python
|
|
||||||
try:
|
|
||||||
result = await loop.run(prompt)
|
|
||||||
print("Task completed successfully!")
|
|
||||||
except RuntimeError as e:
|
|
||||||
print(f"Task failed: {e}")
|
|
||||||
if loop.last_response:
|
|
||||||
print(f"\nLast attempt:\n{loop.last_response}")
|
|
||||||
```
|
|
||||||
|
|
||||||
## When to Use RALPH-loop
|
|
||||||
|
|
||||||
**Good for:**
|
**Good for:**
|
||||||
- Code generation with automatic verification (tests, linters)
|
- Implementing features from specs with test-driven validation
|
||||||
- Tasks with clear success criteria
|
- Large refactors broken into many small tasks
|
||||||
- Iterative refinement where each attempt learns from previous failures
|
- Unattended, long-running development with clear requirements
|
||||||
- Unattended long-running improvements
|
- Any work where backpressure (tests/builds) can verify correctness
|
||||||
|
|
||||||
**Not good for:**
|
**Not good for:**
|
||||||
- Tasks requiring human judgment or design input
|
- Tasks requiring human judgment mid-loop
|
||||||
- One-shot operations
|
- One-shot operations that don't benefit from iteration
|
||||||
- Tasks with vague success criteria
|
- Vague requirements without testable acceptance criteria
|
||||||
- Real-time interactive debugging
|
- Exploratory prototyping where direction isn't clear
|
||||||
|
|||||||
@@ -1,127 +1,84 @@
|
|||||||
#!/usr/bin/env python3
|
#!/usr/bin/env python3
|
||||||
|
|
||||||
|
"""
|
||||||
|
Ralph loop: autonomous AI task loop with fresh context per iteration.
|
||||||
|
|
||||||
|
Two modes:
|
||||||
|
- "plan": reads PROMPT_plan.md, generates/updates IMPLEMENTATION_PLAN.md
|
||||||
|
- "build": reads PROMPT_build.md, implements tasks, runs tests, commits
|
||||||
|
|
||||||
|
Each iteration creates a fresh session so the agent always operates in
|
||||||
|
the "smart zone" of its context window. State is shared between
|
||||||
|
iterations via files on disk (IMPLEMENTATION_PLAN.md, AGENTS.md, specs/*).
|
||||||
|
|
||||||
|
Usage:
|
||||||
|
python ralph_loop.py # build mode, 50 iterations
|
||||||
|
python ralph_loop.py plan # planning mode
|
||||||
|
python ralph_loop.py 20 # build mode, 20 iterations
|
||||||
|
python ralph_loop.py plan 5 # planning mode, 5 iterations
|
||||||
|
"""
|
||||||
|
|
||||||
import asyncio
|
import asyncio
|
||||||
|
import subprocess
|
||||||
|
import sys
|
||||||
|
from pathlib import Path
|
||||||
|
|
||||||
from copilot import CopilotClient, MessageOptions, SessionConfig
|
from copilot import CopilotClient, MessageOptions, SessionConfig
|
||||||
|
|
||||||
|
|
||||||
class RalphLoop:
|
async def ralph_loop(mode: str = "build", max_iterations: int = 50):
|
||||||
"""
|
prompt_file = "PROMPT_plan.md" if mode == "plan" else "PROMPT_build.md"
|
||||||
RALPH-loop implementation: Iterative self-referential AI loops.
|
|
||||||
|
|
||||||
The same prompt is sent repeatedly, with AI reading its own previous output.
|
client = CopilotClient()
|
||||||
Loop continues until completion promise is detected in the response.
|
await client.start()
|
||||||
"""
|
|
||||||
|
|
||||||
def __init__(self, max_iterations=10, completion_promise="COMPLETE"):
|
branch = subprocess.check_output(
|
||||||
"""Initialize RALPH-loop with iteration limits and completion detection."""
|
["git", "branch", "--show-current"], text=True
|
||||||
self.client = CopilotClient()
|
).strip()
|
||||||
self.iteration = 0
|
|
||||||
self.max_iterations = max_iterations
|
|
||||||
self.completion_promise = completion_promise
|
|
||||||
self.last_response = None
|
|
||||||
|
|
||||||
async def run(self, initial_prompt):
|
print("━" * 40)
|
||||||
"""
|
print(f"Mode: {mode}")
|
||||||
Run the RALPH-loop until completion promise is detected or max iterations reached.
|
print(f"Prompt: {prompt_file}")
|
||||||
"""
|
print(f"Branch: {branch}")
|
||||||
session = None
|
print(f"Max: {max_iterations} iterations")
|
||||||
await self.client.start()
|
print("━" * 40)
|
||||||
try:
|
|
||||||
session = await self.client.create_session(
|
|
||||||
SessionConfig(model="gpt-5.1-codex-mini")
|
|
||||||
)
|
|
||||||
|
|
||||||
try:
|
|
||||||
while self.iteration < self.max_iterations:
|
|
||||||
self.iteration += 1
|
|
||||||
print(f"\n=== Iteration {self.iteration}/{self.max_iterations} ===")
|
|
||||||
|
|
||||||
current_prompt = self._build_iteration_prompt(initial_prompt)
|
|
||||||
print(f"Sending prompt (length: {len(current_prompt)})...")
|
|
||||||
|
|
||||||
result = await session.send_and_wait(
|
|
||||||
MessageOptions(prompt=current_prompt),
|
|
||||||
timeout=300,
|
|
||||||
)
|
|
||||||
|
|
||||||
self.last_response = result.data.content if result else ""
|
|
||||||
|
|
||||||
# Display response summary
|
|
||||||
summary = (
|
|
||||||
self.last_response[:200] + "..."
|
|
||||||
if len(self.last_response) > 200
|
|
||||||
else self.last_response
|
|
||||||
)
|
|
||||||
print(f"Response: {summary}")
|
|
||||||
|
|
||||||
# Check for completion promise
|
|
||||||
if self.completion_promise in self.last_response:
|
|
||||||
print(
|
|
||||||
f"\n✓ Success! Completion promise detected: '{self.completion_promise}'"
|
|
||||||
)
|
|
||||||
return self.last_response
|
|
||||||
|
|
||||||
print(
|
|
||||||
f"Iteration {self.iteration} complete. Checking for next iteration..."
|
|
||||||
)
|
|
||||||
|
|
||||||
raise RuntimeError(
|
|
||||||
f"Maximum iterations ({self.max_iterations}) reached without "
|
|
||||||
f"detecting completion promise: '{self.completion_promise}'"
|
|
||||||
)
|
|
||||||
|
|
||||||
except Exception as e:
|
|
||||||
print(f"\nError during RALPH-loop: {e}")
|
|
||||||
raise
|
|
||||||
finally:
|
|
||||||
if session is not None:
|
|
||||||
await session.destroy()
|
|
||||||
finally:
|
|
||||||
await self.client.stop()
|
|
||||||
|
|
||||||
def _build_iteration_prompt(self, initial_prompt):
|
|
||||||
"""Build the prompt for the current iteration, including previous output as context."""
|
|
||||||
if self.iteration == 1:
|
|
||||||
return initial_prompt
|
|
||||||
|
|
||||||
return f"""{initial_prompt}
|
|
||||||
|
|
||||||
=== CONTEXT FROM PREVIOUS ITERATION ===
|
|
||||||
{self.last_response}
|
|
||||||
=== END CONTEXT ===
|
|
||||||
|
|
||||||
Continue working on this task. Review the previous attempt and improve upon it."""
|
|
||||||
|
|
||||||
|
|
||||||
async def main():
|
|
||||||
"""Example usage demonstrating RALPH-loop."""
|
|
||||||
prompt = """You are iteratively building a small library. Follow these phases IN ORDER.
|
|
||||||
Do NOT skip ahead — only do the current phase, then stop and wait for the next iteration.
|
|
||||||
|
|
||||||
Phase 1: Design a DataValidator class that validates records against a schema.
|
|
||||||
- Schema defines field names, types (str, int, float, bool), and whether required.
|
|
||||||
- Return a list of validation errors per record.
|
|
||||||
- Show the class code only. Do NOT output COMPLETE.
|
|
||||||
|
|
||||||
Phase 2: Write at least 4 unit tests covering: missing required field, wrong type,
|
|
||||||
valid record, and empty input. Show test code only. Do NOT output COMPLETE.
|
|
||||||
|
|
||||||
Phase 3: Review the code from phases 1 and 2. Fix any bugs, add docstrings, and add
|
|
||||||
an extra edge-case test. Show the final consolidated code with all fixes.
|
|
||||||
When this phase is fully done, output the exact text: COMPLETE"""
|
|
||||||
|
|
||||||
loop = RalphLoop(max_iterations=5, completion_promise="COMPLETE")
|
|
||||||
|
|
||||||
try:
|
try:
|
||||||
result = await loop.run(prompt)
|
prompt = Path(prompt_file).read_text()
|
||||||
print("\n=== FINAL RESULT ===")
|
|
||||||
print(result)
|
for i in range(1, max_iterations + 1):
|
||||||
except RuntimeError as e:
|
print(f"\n=== Iteration {i}/{max_iterations} ===")
|
||||||
print(f"\nTask did not complete: {e}")
|
|
||||||
if loop.last_response:
|
# Fresh session — each task gets full context budget
|
||||||
print(f"\nLast attempt:\n{loop.last_response}")
|
session = await client.create_session(
|
||||||
|
SessionConfig(model="claude-sonnet-4.5")
|
||||||
|
)
|
||||||
|
try:
|
||||||
|
await session.send_and_wait(
|
||||||
|
MessageOptions(prompt=prompt), timeout=600
|
||||||
|
)
|
||||||
|
finally:
|
||||||
|
await session.destroy()
|
||||||
|
|
||||||
|
# Push changes after each iteration
|
||||||
|
try:
|
||||||
|
subprocess.run(
|
||||||
|
["git", "push", "origin", branch], check=True
|
||||||
|
)
|
||||||
|
except subprocess.CalledProcessError:
|
||||||
|
subprocess.run(
|
||||||
|
["git", "push", "-u", "origin", branch], check=True
|
||||||
|
)
|
||||||
|
|
||||||
|
print(f"\nIteration {i} complete.")
|
||||||
|
|
||||||
|
print(f"\nReached max iterations: {max_iterations}")
|
||||||
|
finally:
|
||||||
|
await client.stop()
|
||||||
|
|
||||||
|
|
||||||
if __name__ == "__main__":
|
if __name__ == "__main__":
|
||||||
asyncio.run(main())
|
args = sys.argv[1:]
|
||||||
|
mode = "plan" if "plan" in args else "build"
|
||||||
|
max_iter = next((int(a) for a in args if a.isdigit()), 50)
|
||||||
|
asyncio.run(ralph_loop(mode, max_iter))
|
||||||
|
|||||||