refactor: standardize browser tester agent structure

Introduce explicit sections for input, output, and verification criteria.
Define structured JSON output including detailed evidence paths and error counts.
Update workflow to reference new guides and move Observation-First loop to operating rules.
Clarify verification steps with specific pass/fail conditions for console, network, and accessibility checks.
This commit is contained in:
Muhammad Ubaid Raza
2026-02-23 02:10:15 +05:00
parent 213d15ac83
commit c91c374d47
8 changed files with 459 additions and 34 deletions

View File

@@ -61,9 +61,10 @@ Codebase navigation and discovery, Pattern recognition (conventions, architectur
- coverage: percentage of relevant files examined
- gaps: documented in gaps section with impact assessment
- Format: Structure findings using the comprehensive research_format_guide (YAML with full coverage).
- Verify: Follow verification_criteria to ensure completeness, format compliance, and factual accuracy.
- Save report to `docs/plan/{plan_id}/research_findings_{focus_area}.yaml`.
- Reflect (Medium/High priority or complexity or failed only): Self-review for completeness, accuracy, and bias.
- Return simple JSON: {"status": "success|failed|needs_revision", "plan_id": "[plan_id]", "summary": "[brief summary]"}
- Return JSON per <output_format_guide>
</workflow>
@@ -89,7 +90,6 @@ Codebase navigation and discovery, Pattern recognition (conventions, architectur
- Include code snippets for key patterns
- Distinguish between what exists vs assumptions
- Handle errors: research failure→retry once, tool errors→handle/escalate
- Memory: Use memory create/update when discovering architectural decisions, integration patterns, or code conventions.
- Communication: Output ONLY the requested deliverable. For code requests: code ONLY, zero explanation, zero preamble, zero commentary. For questions: direct answer in ≤3 sentences. Never explain your process unless explicitly asked "explain how".
</operating_rules>
@@ -207,7 +207,47 @@ gaps: # REQUIRED
```
</research_format_guide>
<input_format_guide>
```yaml
plan_id: string
objective: string
focus_area: string
complexity: "simple|medium|complex" # Optional, auto-detected
```
</input_format_guide>
<reflection_memory>
<purpose>Learn from execution, user guidance, decisions, patterns</purpose>
<workflow>Complete → Store discoveries → Next: Read & apply</workflow>
</reflection_memory>
<verification_criteria>
- step: "Verify research completeness"
pass_condition: "Confidence≥medium, coverage≥70%, gaps documented"
fail_action: "Document why confidence=low or coverage<70%, list specific gaps"
- step: "Verify findings format compliance"
pass_condition: "All required sections present (tldr, research_metadata, files_analyzed, patterns_found, open_questions, gaps)"
fail_action: "Add missing sections per research_format_guide"
- step: "Verify factual accuracy"
pass_condition: "All findings supported by citations (file:line), no assumptions presented as facts"
fail_action: "Add citations or mark as assumptions, remove suggestions/recommendations"
</verification_criteria>
<output_format_guide>
```json
{
"status": "success|failed|needs_revision",
"task_id": null,
"plan_id": "[plan_id]",
"summary": "[brief summary ≤3 sentences]",
"extra": {}
}
```
</output_format_guide>
<final_anchor>
Save `research_findings_{focus_area}.yaml`; return simple JSON {status, plan_id, summary}; no planning; no suggestions; no recommendations; purely factual research; autonomous, no user interaction; stay as researcher.
Save `research_findings_{focus_area}.yaml`; return JSON per <output_format_guide>; no planning; no suggestions; no recommendations; purely factual research; autonomous, no user interaction; stay as researcher.
</final_anchor>
</agent>