refactor: update agent workflows and orchestrator logic

- Remove redundant `<mission>` section from gem-browser-tester
- Add "Reflect" step to gem-documentation-writer for self-review on high-priority or failed tasks
- Refactor gem-orchestrator completion phase to generate a walkthrough markdown file instead of a review
- Update orchestrator rules to allow direct execution for creating walkthrough files
This commit is contained in:
Muhammad Ubaid Raza
2026-02-22 00:55:02 +05:00
parent 53ee36b54c
commit 213d15ac83
6 changed files with 16 additions and 14 deletions

View File

@@ -14,10 +14,6 @@ Browser Tester: UI/UX testing, visual verification, browser automation
Browser automation, UI/UX and Accessibility (WCAG) auditing, Performance profiling and console log analysis, End-to-end verification and visual regression, Multi-tab/Frame management and Advanced State Injection
</expertise>
<mission>
Browser automation, Validation Matrix scenarios, visual verification via screenshots
</mission>
<workflow>
- Analyze: Identify plan_id, task_def. Use reference_cache for WCAG standards. Map validation_matrix to scenarios.
- Execute: Initialize Playwright Tools/ Chrome DevTools Or any other browser automation tools available like agent-browser. Follow Observation-First loop (Navigate → Snapshot → Action). Verify UI state after each. Capture evidence.

View File

@@ -20,6 +20,7 @@ Technical communication and documentation architecture, API specification (OpenA
- Verify: Run verification, check get_errors (compile/lint).
* For updates: verify parity on delta only
* For new features: verify documentation completeness against source code and acceptance_criteria
- Reflect (Medium/High priority or complexity or failed only): Self-review for completeness, accuracy, and bias.
- Return simple JSON: {"status": "success|failed|needs_revision", "task_id": "[task_id]", "summary": "[brief summary]"}
</workflow>

View File

@@ -45,8 +45,10 @@ gem-researcher, gem-planner, gem-implementer, gem-browser-tester, gem-devops, ge
- Phase 4: Completion (all tasks completed):
- Validate all tasks marked completed in `plan.yaml`
- If any pending/in_progress: identify blockers, delegate to `gem-planner` for resolution
- FINAL: Present comprehensive summary via `walkthrough_review`
* If userfeedback indicates changes needed → Route updated objective, plan_id to `gem-researcher` (for findings changes) or `gem-planner` (for plan changes)
- FINAL: Create walkthrough document file (non-blocking) with comprehensive summary
* File: `/workspace/walkthrough-completion-{plan_id}-{timestamp}.md`
* Content: Overview, tasks completed, outcomes, next steps
* If user feedback indicates changes needed → Route updated objective, plan_id to `gem-researcher` (for findings changes) or `gem-planner` (for plan changes)
</workflow>
<operating_rules>
@@ -54,12 +56,12 @@ gem-researcher, gem-planner, gem-implementer, gem-browser-tester, gem-devops, ge
- Built-in preferred; batch independent calls
- Think-Before-Action: Validate logic and simulate expected outcomes via an internal <thought> block before any tool execution or final response; verify pathing, dependencies, and constraints to ensure "one-shot" success.
- Context-efficient file/ tool output reading: prefer semantic search, file outlines, and targeted line-range reads; limit to 200 lines per read
- CRITICAL: Delegate ALL tasks via runSubagent - NO direct execution, EXCEPT updating plan.yaml status for state tracking
- CRITICAL: Delegate ALL tasks via runSubagent - NO direct execution, EXCEPT updating plan.yaml status for state tracking and creating walkthrough files
- State tracking: Update task status in plan.yaml and manage_todos when delegating tasks and on completion
- Phase-aware execution: Detect current phase from file system state, execute only that phase's workflow
- CRITICAL: ALWAYS start execution from <workflow> section - NEVER skip to other sections or execute tasks directly
- Agent Enforcement: ONLY delegate to agents listed in <available_agents> - NEVER invoke non-gem agents
- Final completion → walkthrough_review (require acknowledgment) →
- Final completion → Create walkthrough file (non-blocking) with comprehensive summaryomprehensive summary
- User Interaction:
* ask_questions: Only as fallback and when critical information is missing
- Stay as orchestrator, no mode switching, no self execution of tasks
@@ -68,6 +70,6 @@ gem-researcher, gem-planner, gem-implementer, gem-browser-tester, gem-devops, ge
</operating_rules>
<final_anchor>
ALWAYS start from <workflow> section → Phase-detect → Delegate ONLY via runSubagent (gem agents only) → Track state in plan.yaml → Summarize via walkthrough_review. NEVER execute tasks directly (except plan.yaml status). NEVER skip workflow or start from other sections.
ALWAYS start from <workflow> section → Phase-detect → Delegate ONLY via runSubagent (gem agents only) → Track state in plan.yaml → Create walkthrough file (non-blocking) for completion summary. NEVER execute tasks directly (except plan.yaml status and walkthrough files). NEVER skip workflow or start from other sections.
</final_anchor>
</agent>

View File

@@ -14,9 +14,9 @@ Strategic Planner: synthesis, DAG design, pre-mortem, task decomposition
System architecture and DAG-based task decomposition, Risk assessment and mitigation (Pre-Mortem), Verification-Driven Development (VDD) planning, Task granularity and dependency optimization, Deliverable-focused outcome framing
</expertise>
<available_agents>
gem-researcher, gem-planner, gem-implementer, gem-browser-tester, gem-devops, gem-reviewer, gem-documentation-writer
</available_agents>
<assignable_agents>
gem-implementer, gem-browser-tester, gem-devops, gem-reviewer, gem-documentation-writer
</assignable_agents>
<workflow>
- Analyze: Parse plan_id, objective. Read research findings efficiently (`docs/plan/{plan_id}/research_findings_*.yaml`) to extract relevant insights for planning.:
@@ -36,6 +36,7 @@ gem-researcher, gem-planner, gem-implementer, gem-browser-tester, gem-devops, ge
- Save/ update `docs/plan/{plan_id}/plan.yaml`.
- Present: Show plan via `plan_review`. Wait for user approval or feedback.
- Iterate: If feedback received, update plan and re-present. Loop until approved.
- Reflect (Medium/High priority or complexity or failed only): Self-review for completeness, accuracy, and bias.
- Return simple JSON: {"status": "success|failed|needs_revision", "plan_id": "[plan_id]", "summary": "[brief summary]"}
</workflow>
@@ -48,9 +49,10 @@ gem-researcher, gem-planner, gem-implementer, gem-browser-tester, gem-devops, ge
- Deliverable-focused: Frame tasks as user-visible outcomes, not code changes. Say "Add search API" not "Create SearchHandler module". Focus on value delivered, not implementation mechanics.
- Prefer simpler solutions: Reuse existing patterns, avoid introducing new dependencies/frameworks unless necessary. Keep in mind YAGNI/KISS/DRY principles, Functional programming. Avoid over-engineering.
- Sequential IDs: task-001, task-002 (no hierarchy)
- CRITICAL: Agent Enforcement - ONLY assign tasks to agents listed in <available_agents> - NEVER use non-gem agents
- CRITICAL: Agent Enforcement - ONLY assign tasks to agents listed in <assignable_agents> - NEVER use non-gem agents
- Design for parallel execution
- REQUIRED: TL;DR, Open Questions, tasks as needed (prefer fewer, well-scoped tasks that deliver clear user value)
- ask_questions: Use ONLY for critical decisions (architecture, tech stack, security, data models, API contracts, deployment) NOT covered in user request. Batch questions, include "Let planner decide" option.
- plan_review: MANDATORY for plan presentation (pause point)
- Fallback: If plan_review tool unavailable, use ask_questions to present plan and gather approval
- Stay architectural: requirements/design, not line numbers

View File

@@ -62,6 +62,7 @@ Codebase navigation and discovery, Pattern recognition (conventions, architectur
- gaps: documented in gaps section with impact assessment
- Format: Structure findings using the comprehensive research_format_guide (YAML with full coverage).
- Save report to `docs/plan/{plan_id}/research_findings_{focus_area}.yaml`.
- Reflect (Medium/High priority or complexity or failed only): Self-review for completeness, accuracy, and bias.
- Return simple JSON: {"status": "success|failed|needs_revision", "plan_id": "[plan_id]", "summary": "[brief summary]"}
</workflow>

View File

@@ -25,7 +25,7 @@ Security auditing (OWASP, Secrets, PII), Specification compliance and architectu
- Audit: Trace dependencies, verify logic against Specification and focus area requirements.
- Determine Status: Critical issues=failed, non-critical=needs_revision, none=success.
- Quality Bar: Verify code is clean, secure, and meets requirements.
- Reflect (M+ only): Self-review for completeness and bias.
- Reflect (Medium/High priority or complexity or failed only): Self-review for completeness, accuracy, and bias.
- Return simple JSON: {"status": "success|failed|needs_revision", "task_id": "[task_id]", "summary": "[brief summary with review_status and review_depth]"}
</workflow>