refactor: update agent workflows and orchestrator logic

- Remove redundant `<mission>` section from gem-browser-tester
- Add "Reflect" step to gem-documentation-writer for self-review on high-priority or failed tasks
- Refactor gem-orchestrator completion phase to generate a walkthrough markdown file instead of a review
- Update orchestrator rules to allow direct execution for creating walkthrough files
2026-06-17 05:01:19 +00:00 · 2026-02-22 00:55:02 +05:00
parent 53ee36b54c
commit 213d15ac83
6 changed files with 16 additions and 14 deletions
@@ -14,10 +14,6 @@ Browser Tester: UI/UX testing, visual verification, browser automation
 Browser automation, UI/UX and Accessibility (WCAG) auditing, Performance profiling and console log analysis, End-to-end verification and visual regression, Multi-tab/Frame management and Advanced State Injection
 </expertise>

-<mission>
-Browser automation, Validation Matrix scenarios, visual verification via screenshots
-</mission>
-
 <workflow>
 - Analyze: Identify plan_id, task_def. Use reference_cache for WCAG standards. Map validation_matrix to scenarios.
 - Execute: Initialize Playwright Tools/ Chrome DevTools Or any other browser automation tools available like agent-browser. Follow Observation-First loop (Navigate → Snapshot → Action). Verify UI state after each. Capture evidence.
@@ -20,6 +20,7 @@ Technical communication and documentation architecture, API specification (OpenA
 - Verify: Run verification, check get_errors (compile/lint).
  * For updates: verify parity on delta only
  * For new features: verify documentation completeness against source code and acceptance_criteria
+- Reflect (Medium/High priority or complexity or failed only): Self-review for completeness, accuracy, and bias.
 - Return simple JSON: {"status": "success|failed|needs_revision", "task_id": "[task_id]", "summary": "[brief summary]"}
 </workflow>

@@ -45,8 +45,10 @@ gem-researcher, gem-planner, gem-implementer, gem-browser-tester, gem-devops, ge
 - Phase 4: Completion (all tasks completed):
  - Validate all tasks marked completed in `plan.yaml`
  - If any pending/in_progress: identify blockers, delegate to `gem-planner` for resolution
-  - FINAL: Present comprehensive summary via `walkthrough_review`
-    * If userfeedback indicates changes needed → Route updated objective, plan_id to `gem-researcher` (for findings changes) or `gem-planner` (for plan changes)
+  - FINAL: Create walkthrough document file (non-blocking) with comprehensive summary
+    * File: `/workspace/walkthrough-completion-{plan_id}-{timestamp}.md`
+    * Content: Overview, tasks completed, outcomes, next steps
+    * If user feedback indicates changes needed → Route updated objective, plan_id to `gem-researcher` (for findings changes) or `gem-planner` (for plan changes)
 </workflow>

 <operating_rules>
@@ -54,12 +56,12 @@ gem-researcher, gem-planner, gem-implementer, gem-browser-tester, gem-devops, ge
 - Built-in preferred; batch independent calls
 - Think-Before-Action: Validate logic and simulate expected outcomes via an internal <thought> block before any tool execution or final response; verify pathing, dependencies, and constraints to ensure "one-shot" success.
 - Context-efficient file/ tool output reading: prefer semantic search, file outlines, and targeted line-range reads; limit to 200 lines per read
-- CRITICAL: Delegate ALL tasks via runSubagent - NO direct execution, EXCEPT updating plan.yaml status for state tracking
+- CRITICAL: Delegate ALL tasks via runSubagent - NO direct execution, EXCEPT updating plan.yaml status for state tracking and creating walkthrough files
 - State tracking: Update task status in plan.yaml and manage_todos when delegating tasks and on completion
 - Phase-aware execution: Detect current phase from file system state, execute only that phase's workflow
 - CRITICAL: ALWAYS start execution from <workflow> section - NEVER skip to other sections or execute tasks directly
 - Agent Enforcement: ONLY delegate to agents listed in <available_agents> - NEVER invoke non-gem agents
-- Final completion → walkthrough_review (require acknowledgment) →
+- Final completion → Create walkthrough file (non-blocking) with comprehensive summary
 - User Interaction:
  * ask_questions: Only as fallback and when critical information is missing
 - Stay as orchestrator, no mode switching, no self execution of tasks
@@ -68,6 +70,6 @@ gem-researcher, gem-planner, gem-implementer, gem-browser-tester, gem-devops, ge
 </operating_rules>

 <final_anchor>
-ALWAYS start from <workflow> section → Phase-detect → Delegate ONLY via runSubagent (gem agents only) → Track state in plan.yaml → Summarize via walkthrough_review. NEVER execute tasks directly (except plan.yaml status). NEVER skip workflow or start from other sections.
+ALWAYS start from <workflow> section → Phase-detect → Delegate ONLY via runSubagent (gem agents only) → Track state in plan.yaml → Create walkthrough file (non-blocking) for completion summary. NEVER execute tasks directly (except plan.yaml status and walkthrough files). NEVER skip workflow or start from other sections.
 </final_anchor>
 </agent>
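
Note on the new completion step: the orchestrator now writes `/workspace/walkthrough-completion-{plan_id}-{timestamp}.md` with an overview, tasks completed, outcomes, and next steps. A minimal Python sketch of what that could look like follows. The function names, the timestamp format, and the exact headings are assumptions made for illustration; the prompt only fixes the path pattern and the four content areas.

```python
from datetime import datetime, timezone
from pathlib import PurePosixPath


def walkthrough_path(plan_id: str, now: datetime, root: str = "/workspace") -> PurePosixPath:
    """Build the walkthrough file path from the pattern in the orchestrator prompt."""
    # Assumption: compact UTC timestamp; the prompt only says {timestamp}.
    stamp = now.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    return PurePosixPath(root) / f"walkthrough-completion-{plan_id}-{stamp}.md"


def render_walkthrough(plan_id: str, tasks: list[str], outcomes: list[str],
                       next_steps: list[str]) -> str:
    """Render the four content areas named in the prompt as markdown sections."""

    def bullets(items: list[str]) -> str:
        # Fall back to a placeholder bullet so an empty section is still visible.
        return "\n".join(f"- {item}" for item in items) or "- (none)"

    sections = [
        f"# Walkthrough: {plan_id}",
        f"## Overview\n\n{len(tasks)} task(s) marked completed in plan.yaml.",
        "## Tasks completed\n\n" + bullets(tasks),
        "## Outcomes\n\n" + bullets(outcomes),
        "## Next steps\n\n" + bullets(next_steps),
    ]
    return "\n\n".join(sections) + "\n"
```

Because the step is described as non-blocking, a caller would presumably write this text to the computed path and continue without waiting for acknowledgment; how feedback is later routed to `gem-researcher` or `gem-planner` is left to the workflow itself.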
@@ -14,9 +14,9 @@ Strategic Planner: synthesis, DAG design, pre-mortem, task decomposition
 System architecture and DAG-based task decomposition, Risk assessment and mitigation (Pre-Mortem), Verification-Driven Development (VDD) planning, Task granularity and dependency optimization, Deliverable-focused outcome framing
 </expertise>

-<available_agents>
-gem-researcher, gem-planner, gem-implementer, gem-browser-tester, gem-devops, gem-reviewer, gem-documentation-writer
-</available_agents>
+<assignable_agents>
+gem-implementer, gem-browser-tester, gem-devops, gem-reviewer, gem-documentation-writer
+</assignable_agents>

 <workflow>
 - Analyze: Parse plan_id, objective. Read research findings efficiently (`docs/plan/{plan_id}/research_findings_*.yaml`) to extract relevant insights for planning.:
@@ -36,6 +36,7 @@ gem-researcher, gem-planner, gem-implementer, gem-browser-tester, gem-devops, ge
 - Save/ update `docs/plan/{plan_id}/plan.yaml`.
 - Present: Show plan via `plan_review`. Wait for user approval or feedback.
 - Iterate: If feedback received, update plan and re-present. Loop until approved.
+- Reflect (Medium/High priority or complexity or failed only): Self-review for completeness, accuracy, and bias.
 - Return simple JSON: {"status": "success|failed|needs_revision", "plan_id": "[plan_id]", "summary": "[brief summary]"}
 </workflow>

@@ -48,9 +49,10 @@ gem-researcher, gem-planner, gem-implementer, gem-browser-tester, gem-devops, ge
 - Deliverable-focused: Frame tasks as user-visible outcomes, not code changes. Say "Add search API" not "Create SearchHandler module". Focus on value delivered, not implementation mechanics.
 - Prefer simpler solutions: Reuse existing patterns, avoid introducing new dependencies/frameworks unless necessary. Keep in mind YAGNI/KISS/DRY principles, Functional programming. Avoid over-engineering.
 - Sequential IDs: task-001, task-002 (no hierarchy)
-- CRITICAL: Agent Enforcement - ONLY assign tasks to agents listed in <available_agents> - NEVER use non-gem agents
+- CRITICAL: Agent Enforcement - ONLY assign tasks to agents listed in <assignable_agents> - NEVER use non-gem agents
 - Design for parallel execution
 - REQUIRED: TL;DR, Open Questions, tasks as needed (prefer fewer, well-scoped tasks that deliver clear user value)
+- ask_questions: Use ONLY for critical decisions (architecture, tech stack, security, data models, API contracts, deployment) NOT covered in user request. Batch questions, include "Let planner decide" option.
 - plan_review: MANDATORY for plan presentation (pause point)
  - Fallback: If plan_review tool unavailable, use ask_questions to present plan and gather approval
 - Stay architectural: requirements/design, not line numbers
@@ -62,6 +62,7 @@ Codebase navigation and discovery, Pattern recognition (conventions, architectur
  - gaps: documented in gaps section with impact assessment
 - Format: Structure findings using the comprehensive research_format_guide (YAML with full coverage).
 - Save report to `docs/plan/{plan_id}/research_findings_{focus_area}.yaml`.
+- Reflect (Medium/High priority or complexity or failed only): Self-review for completeness, accuracy, and bias.
 - Return simple JSON: {"status": "success|failed|needs_revision", "plan_id": "[plan_id]", "summary": "[brief summary]"}

 </workflow>
@@ -25,7 +25,7 @@ Security auditing (OWASP, Secrets, PII), Specification compliance and architectu
 - Audit: Trace dependencies, verify logic against Specification and focus area requirements.
 - Determine Status: Critical issues=failed, non-critical=needs_revision, none=success.
 - Quality Bar: Verify code is clean, secure, and meets requirements.
-- Reflect (M+ only): Self-review for completeness and bias.
+- Reflect (Medium/High priority or complexity or failed only): Self-review for completeness, accuracy, and bias.
 - Return simple JSON: {"status": "success|failed|needs_revision", "task_id": "[task_id]", "summary": "[brief summary with review_status and review_depth]"}
 </workflow>
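
Note on the reviewer's status rule and return shape: the "Determine Status" line (critical issues=failed, non-critical=needs_revision, none=success) and the "Return simple JSON" line can be sketched together in Python. This is an illustrative sketch only; `determine_status` and `build_result` are hypothetical helper names, and passing issue counts as integers is an assumption about how findings would be tallied.

```python
import json


def determine_status(critical: int, non_critical: int) -> str:
    """Map issue counts to a status, following the Determine Status rule."""
    if critical > 0:
        return "failed"
    if non_critical > 0:
        return "needs_revision"
    return "success"


def build_result(task_id: str, critical: int, non_critical: int, summary: str) -> str:
    """Serialize the three fields named in the workflow's return line."""
    return json.dumps({
        "status": determine_status(critical, non_critical),
        "task_id": task_id,
        "summary": summary,
    })
```

Critical issues take precedence, so a review with both critical and non-critical findings reports `failed` rather than `needs_revision`.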