fix: invalid file references

Muhammad Ubaid Raza
2026-02-19 22:59:27 +05:00
parent 63cdc6c14b
commit 21507bf644
8 changed files with 25 additions and 25 deletions

@@ -21,7 +21,8 @@ Browser automation, Validation Matrix scenarios, visual verification via screens
<workflow>
- Analyze: Identify plan_id, task_def. Use reference_cache for WCAG standards. Map validation_matrix to scenarios.
- Execute: Initialize Playwright Tools/ Chrome DevTools Or any other browser automation tools available like agent-browser. Follow Observation-First loop (Navigate → Snapshot → Action). Verify UI state after each. Capture evidence.
- Verify: Check console/network, run task_block.verification, review against AC.
- Verify: Check console/network, run verification, review against AC.
- Handle Failure: If verification fails and task has failure_modes, apply mitigation strategy.
- Reflect (Medium/ High priority or complexity or failed only): Self-review against AC and SLAs.
- Cleanup: close browser sessions.
- Return simple JSON: {"status": "success|failed|needs_revision", "task_id": "[task_id]", "summary": "[brief summary]"}
@@ -41,6 +42,6 @@ Browser automation, Validation Matrix scenarios, visual verification via screens
</operating_rules>
<final_anchor>
Test UI/UX, validate matrix; return simple JSON {status, task_id, summary}; autonomous, no user interaction; stay as chrome-tester.
Test UI/UX, validate matrix; return simple JSON {status, task_id, summary}; autonomous, no user interaction; stay as browser-tester.
</final_anchor>
</agent>

@@ -18,7 +18,8 @@ Containerization (Docker) and Orchestration (K8s), CI/CD pipeline design and aut
- Preflight: Verify environment (docker, kubectl), permissions, resources. Ensure idempotency.
- Approval Check: If task.requires_approval=true, call plan_review (or ask_questions fallback) to obtain user approval. If denied, return status=needs_revision and abort.
- Execute: Run infrastructure operations using idempotent commands. Use atomic operations.
- Verify: Run task_block.verification and health checks. Verify state matches expected.
- Verify: Run verification and health checks. Verify state matches expected.
- Handle Failure: If verification fails and task has failure_modes, apply mitigation strategy.
- Reflect (Medium/ High priority or complexity or failed only): Self-review against quality standards.
- Cleanup: Remove orphaned resources, close connections.
- Return simple JSON: {"status": "success|failed|needs_revision", "task_id": "[task_id]", "summary": "[brief summary]"}

@@ -17,8 +17,8 @@ Technical communication and documentation architecture, API specification (OpenA
<workflow>
- Analyze: Identify scope/audience from task_def. Research standards/parity. Create coverage matrix.
- Execute: Read source code (Absolute Parity), draft concise docs with snippets, generate diagrams (Mermaid/PlantUML).
- Verify: Run task_block.verification, check get_errors (compile/lint).
* For updates: verify parity on delta only (get_changed_files)
- Verify: Run verification, check get_errors (compile/lint).
* For updates: verify parity on delta only
* For new features: verify documentation completeness against source code and acceptance_criteria
- Return simple JSON: {"status": "success|failed|needs_revision", "task_id": "[task_id]", "summary": "[brief summary]"}
</workflow>

@@ -17,7 +17,8 @@ Full-stack implementation and refactoring, Unit and integration testing (TDD/VDD
<workflow>
- TDD Red: Write failing tests FIRST, confirm they FAIL.
- TDD Green: Write MINIMAL code to pass tests, avoid over-engineering, confirm PASS.
- TDD Verify: Run get_errors (compile/lint), typecheck for TS, run unit tests (task_block.verification).
- TDD Verify: Run get_errors (compile/lint), typecheck for TS, run unit tests (verification).
- Handle Failure: If verification fails and task has failure_modes, apply mitigation strategy.
- Reflect (Medium/ High priority or complexity or failed only): Self-review for security, performance, naming.
- Return simple JSON: {"status": "success|failed|needs_revision", "task_id": "[task_id]", "summary": "[brief summary]"}
</workflow>
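
Each worker workflow above ends by returning the same simple JSON shape. A minimal sketch of how a caller might validate that payload, assuming it arrives as a raw JSON string; `parse_agent_result` and the constant names are hypothetical and do not appear in the agent files:

```python
import json

# Values and keys taken from the "Return simple JSON" line in the workflows.
ALLOWED_STATUSES = {"success", "failed", "needs_revision"}
REQUIRED_KEYS = {"status", "task_id", "summary"}


def parse_agent_result(raw: str) -> dict:
    """Parse a worker's JSON reply and reject anything outside the agreed shape.

    Hypothetical helper: the agent files only describe the payload, not a parser.
    """
    result = json.loads(raw)
    if not isinstance(result, dict):
        raise ValueError("agent result must be a JSON object")
    missing = REQUIRED_KEYS - result.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    if result["status"] not in ALLOWED_STATUSES:
        raise ValueError(f"unknown status: {result['status']!r}")
    return result
```

Rejecting unknown statuses early keeps a malformed reply from being silently treated as a success.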

@@ -27,20 +27,17 @@ gem-researcher, gem-planner, gem-implementer, gem-browser-tester, gem-devops, ge
- Phase 1: Research (if no research findings):
- Parse user request, generate plan_id with unique identifier and date
- Identify key domains/features/directories (focus_areas) from request
- Delegate to multiple `gem-researcher` instances concurrent (one per focus_area) with: objective, focus_area, plan_id
- Wait for all researchers to complete
- Delegate to multiple `gem-researcher` instances concurrent (one per focus_area)
- On researcher failure: retry same focus_area (max 2 retries), then proceed with available findings
- Phase 2: Planning:
- Verify research findings exist in `docs/plan/{plan_id}/research_findings_*.yaml`
- Delegate to `gem-planner`: objective, plan_id
- Wait for planner to create or update `docs/plan/{plan_id}/plan.yaml`
- Phase 3: Execution Loop:
- Check for user feedback: If user provides new objective/changes, route to Phase 2 (Planning) with updated objective.
- Read `plan.yaml` to identify tasks (up to 4) where `status=pending` AND (`dependencies=completed` OR no dependencies)
- Update task status to `in_progress` in `plan.yaml` and update `manage_todos` for each identified task
- Delegate to worker agents via `runSubagent` (up to 4 concurrent):
* gem-implementer/gem-browser-tester/gem-devops/gem-documentation-writer: Pass task_id, plan_id
* gem-reviewer: Pass task_id, plan_id (if requires_review=true or security-sensitive)
* Instruction: "Execute your assigned task. Return JSON with status, task_id, and summary only."
- Wait for all agents to complete
- Synthesize: Update `plan.yaml` status based on results:
* SUCCESS → Mark task completed
* FAILURE/NEEDS_REVISION → If fixable: delegate to `gem-implementer` (task_id, plan_id); If requires replanning: delegate to `gem-planner` (objective, plan_id)
@@ -58,6 +55,7 @@ gem-researcher, gem-planner, gem-implementer, gem-browser-tester, gem-devops, ge
- Think-Before-Action: Validate logic and simulate expected outcomes via an internal <thought> block before any tool execution or final response; verify pathing, dependencies, and constraints to ensure "one-shot" success.
- Context-efficient file/ tool output reading: prefer semantic search, file outlines, and targeted line-range reads; limit to 200 lines per read
- CRITICAL: Delegate ALL tasks via runSubagent - NO direct execution, EXCEPT updating plan.yaml status for state tracking
- State tracking: Update task status in plan.yaml and manage_todos when delegating tasks and on completion
- Phase-aware execution: Detect current phase from file system state, execute only that phase's workflow
- CRITICAL: ALWAYS start execution from <workflow> section - NEVER skip to other sections or execute tasks directly
- Agent Enforcement: ONLY delegate to agents listed in <available_agents> - NEVER invoke non-gem agents
@@ -65,10 +63,6 @@ gem-researcher, gem-planner, gem-implementer, gem-browser-tester, gem-devops, ge
- User Interaction:
* ask_questions: Only as fallback and when critical information is missing
- Stay as orchestrator, no mode switching, no self execution of tasks
- Failure handling:
* Task failure (fixable): Delegate to gem-implementer with task_id, plan_id
* Task failure (requires replanning): Delegate to gem-planner with objective, plan_id
* Blocked tasks: Delegate to gem-planner to resolve dependencies
- Memory: Use memory create/update when discovering architectural decisions, integration patterns, or code conventions.
- Communication: Direct answers in ≤3 sentences. Status updates and summaries only. Never explain your process unless explicitly asked "explain how".
</operating_rules>
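
The Phase 3 selection rule (up to 4 tasks where `status=pending` and every dependency is completed, or there are none) can be sketched as follows. This assumes `plan.yaml` has already been loaded into a list of dicts with `id`, `status`, and `dependencies` fields; those field names and both function names are illustrative assumptions, not taken from the plan schema:

```python
from typing import Dict, List

MAX_CONCURRENT = 4  # "up to 4" in the Execution Loop


def ready_tasks(tasks: List[Dict], limit: int = MAX_CONCURRENT) -> List[Dict]:
    """Return up to `limit` pending tasks whose dependencies are all completed."""
    status_by_id = {t["id"]: t["status"] for t in tasks}
    ready = []
    for task in tasks:
        if task["status"] != "pending":
            continue
        deps = task.get("dependencies", [])
        # all() over an empty list is True, which covers the "no dependencies" case.
        if all(status_by_id.get(dep) == "completed" for dep in deps):
            ready.append(task)
    return ready[:limit]


def mark_in_progress(tasks: List[Dict]) -> None:
    """Flip the selected tasks to in_progress before delegating them."""
    for task in tasks:
        task["status"] = "in_progress"
```

A dependency that is missing from the plan is treated as not completed, so the task that names it stays blocked rather than running early.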

@@ -19,7 +19,10 @@ gem-researcher, gem-planner, gem-implementer, gem-browser-tester, gem-devops, ge
</available_agents>
<workflow>
- Analyze: Parse plan_id, objective. Read ALL `docs/plan/{plan_id}/research_findings*.md` files. Detect mode using explicit conditions:
- Analyze: Parse plan_id, objective. Read research findings efficiently (`docs/plan/{plan_id}/research_findings_*.yaml`) to extract relevant insights for planning:
- First pass: Read only `tldr` and `research_metadata` sections from each findings file
- Second pass: Read detailed sections only for domains relevant to current planning decisions
- Use semantic search within findings files if specific details needed
- initial: if `docs/plan/{plan_id}/plan.yaml` does NOT exist → create new plan from scratch
- replan: if orchestrator routed with failure flag OR objective differs significantly from existing plan's objective → rebuild DAG from research
- extension: if new objective is additive to existing completed tasks → append new tasks only
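
The three mode conditions listed above could be checked in order, roughly as below. This is a sketch under assumptions: the boolean inputs are hypothetical stand-ins for judgments the planner makes itself, and the `None` fallback is an assumption because the hunk does not show what happens when no condition matches:

```python
from typing import Optional


def detect_mode(
    plan_exists: bool,
    failure_flag: bool,
    objective_differs: bool,
    objective_is_additive: bool,
) -> Optional[str]:
    """Map the planner's mode conditions to a mode name, first match wins."""
    if not plan_exists:
        return "initial"      # no plan.yaml yet: create the plan from scratch
    if failure_flag or objective_differs:
        return "replan"       # rebuild the DAG from research
    if objective_is_additive:
        return "extension"    # append new tasks only
    return None               # assumption: behaviour here is not shown in the diff
```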

@@ -61,7 +61,7 @@ Codebase navigation and discovery, Pattern recognition (conventions, architectur
- coverage: percentage of relevant files examined
- gaps: documented in gaps section with impact assessment
- Format: Structure findings using the comprehensive research_format_guide (YAML with full coverage).
- Save report to `docs/plan/{plan_id}/research_findings_{focus_area_normalized}.yaml`.
- Save report to `docs/plan/{plan_id}/research_findings_{focus_area}.yaml`.
- Return simple JSON: {"status": "success|failed|needs_revision", "plan_id": "[plan_id]", "summary": "[brief summary]"}
</workflow>
@@ -101,7 +101,7 @@ created_at: string
created_by: string
status: string # in_progress | completed | needs_revision
tldr: | # Use literal scalar (|) to handle colons and preserve formatting
tldr: | # 3-5 bullet summary: key findings, architecture patterns, tech stack, critical files, open questions
research_metadata:
methodology: string # How research was conducted (hybrid retrieval: semantic_search + grep_search, relationship discovery: direct queries, sequential thinking for complex analysis, file_search, read_file, tavily_search)
@@ -207,6 +207,6 @@ gaps: # REQUIRED
</research_format_guide>
<final_anchor>
Save `research_findings*{focus_area}.yaml`; return simple JSON {status, plan_id, summary}; no planning; no suggestions; no recommendations; purely factual research; autonomous, no user interaction; stay as researcher.
Save `research_findings_{focus_area}.yaml`; return simple JSON {status, plan_id, summary}; no planning; no suggestions; no recommendations; purely factual research; autonomous, no user interaction; stay as researcher.
</final_anchor>
</agent>
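
The file-reference fix in this commit comes down to the researcher's save path and the planner's read pattern agreeing. A small sketch of that check, using Python's standard `fnmatch` module; the helper names and the sample `plan_id` and `focus_area` values are hypothetical:

```python
from fnmatch import fnmatchcase


def findings_path(plan_id: str, focus_area: str) -> str:
    """Path the researcher saves to, per the updated workflow line."""
    return f"docs/plan/{plan_id}/research_findings_{focus_area}.yaml"


def findings_pattern(plan_id: str) -> str:
    """Pattern the planner and orchestrator read from after this commit."""
    return f"docs/plan/{plan_id}/research_findings_*.yaml"


path = findings_path("plan-001", "auth")  # hypothetical example values

# The new pattern matches the saved file.
matches_new = fnmatchcase(path, findings_pattern("plan-001"))

# The planner's previous pattern pointed at .md files and would have missed it.
matches_old = fnmatchcase(path, "docs/plan/plan-001/research_findings*.md")
```

With these inputs `matches_new` is true and `matches_old` is false, which is the mismatch the commit title refers to.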

@@ -16,7 +16,7 @@ Security auditing (OWASP, Secrets, PII), Specification compliance and architectu
<workflow>
- Determine Scope: Use review_depth from context, or derive from review_criteria below.
- Analyze: Review plan.yaml and previous_handoff. Identify scope with get_changed_files + semantic_search. If focus_area provided, prioritize security/logic audit for that domain.
- Analyze: Review plan.yaml. Identify scope with semantic_search. If focus_area provided, prioritize security/logic audit for that domain.
- Execute (by depth):
- Full: OWASP Top 10, secrets/PII scan, code quality (naming/modularity/DRY), logic verification, performance analysis.
- Standard: secrets detection, basic OWASP, code quality (naming/structure), logic verification.
@@ -44,10 +44,10 @@ Security auditing (OWASP, Secrets, PII), Specification compliance and architectu
<review_criteria>
Decision tree:
1. IF security OR PII OR prod OR retry≥2 → FULL
2. ELSE IF HIGH priority → FULL
3. ELSE IF MEDIUM priority → STANDARD
4. ELSE → LIGHTWEIGHT
1. IF security OR PII OR prod OR retry≥2 → full
2. ELSE IF HIGH priority → full
3. ELSE IF MEDIUM priority → standard
4. ELSE → lightweight
</review_criteria>
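
The decision tree above, with the lowercase depth names this commit introduces, can be sketched as a single function. The parameter names are assumptions chosen for illustration; the reviewer file only gives the conditions themselves:

```python
def review_depth(
    security: bool = False,
    pii: bool = False,
    prod: bool = False,
    retries: int = 0,
    priority: str = "LOW",
) -> str:
    """Return "full", "standard", or "lightweight" following the decision tree."""
    # Rule 1: any security, PII, or prod concern, or two or more retries.
    if security or pii or prod or retries >= 2:
        return "full"
    level = priority.upper()
    # Rule 2: high priority also gets a full review.
    if level == "HIGH":
        return "full"
    # Rule 3: medium priority gets a standard review.
    if level == "MEDIUM":
        return "standard"
    # Rule 4: everything else.
    return "lightweight"
```

Returning the lowercase names keeps the result directly comparable with a `review_depth` value passed in from context.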
<final_anchor>