awesome-copilot/skills/threat-model-analyst/references/orchestrator.md
Vijay Chegu afba5b86b8 Add threat-model-analyst skill: STRIDE-A threat modeling for repositories (#1177)
* Add threat-model-analyst skill: STRIDE-A threat modeling for repositories

Add a comprehensive threat model analysis skill that performs security audits
using STRIDE-A (STRIDE + Abuse) threat modeling, Zero Trust principles, and
defense-in-depth analysis.

Supports two modes:
- Single analysis: full STRIDE-A threat model producing architecture overviews,
  DFD diagrams, prioritized findings, and executive assessments
- Incremental analysis: security posture diff between baseline report and current
  code, producing standalone reports with embedded comparison

Includes bundled reference assets:
- Orchestrator workflows (full and incremental)
- Analysis principles and verification checklists
- Output format specifications and skeleton templates
- DFD diagram conventions and TMT element taxonomy

* Address PR review comments from Copilot reviewer

- Fix SKILL.md description: use single-quoted scalar, rename mode (2) to
  'Incremental analysis' with accurate description
- Replace 'Compare Mode (Deprecated)' sections with 'Comparing Commits or
  Reports' redirect (no deprecated language for first release)
- Fix skeleton-findings.md: move Tier 1 table rows under header, add
  CONDITIONAL-EMPTY block after END-REPEAT (matching Tier 2/3 structure)
- Fix skeleton-threatmodel.md and skeleton-architecture.md: use 4-backtick
  outer fences to avoid nested fence conflicts with inner mermaid fences
- Fix skeleton-incremental-html.md: correct section count from 9 to 8
- Fix output-formats.md: change status 'open' to 'Open' in JSON example,
  move stride_category warning outside JSON fence as blockquote
- Fix incremental-orchestrator.md: replace stale compare-output-formats.md
  reference with inline color conventions
- Regenerate docs/README.skills.md with updated description

* Address second round of Copilot review comments

- Fix diagram-conventions.md: bidirectional flow notation now uses <-->
  matching orchestrator.md and DFD templates
- Fix tmt-element-taxonomy.md: normalize SE.DF.SSH/LDAP/LDAPS to use
  SE.DF.TMCore.* prefix consistent with all other data flow IDs
- Fix output-formats.md: correct TMT category example from SQLDatabase
  to SQL matching taxonomy, fix component type from 'datastore' to
  'data_store' matching canonical enum, remove DaprSidecar from
  inbound_from per no-standalone-sidecar rule
- Fix 5 skeleton files: clarify VERBATIM instruction to 'copy the
  template content below (excluding the outer code fence)' to prevent
  agents from wrapping output in markdown fences
- Genericize product-specific names in examples: replace edgerag with
  myapp, BitNetManager with TaskProcessor, AzureLocalMCP with MyApp.Core,
  AzureLocalInfra with OnPremInfra, MilvusVectorDB with VectorDB

* Address third round of Copilot review comments

- Fix diagram-conventions.md: second bidirectional two-arrow pattern in
  Quick Reference section now uses <-->
- Fix incremental-orchestrator.md: renumber HTML sections 5-9 to 4-8
  matching skeleton-incremental-html.md 8-section structure
- Fix output-formats.md: add incremental-comparison.html to File List
  as conditional output for incremental mode
- Fix skeleton-inventory.md: add tmt_type, sidecars, and boundary_kind
  fields to match output-formats.md JSON schema example
2026-03-30 07:58:56 +11:00

# Orchestrator — Threat Model Analysis Workflow
This file contains the complete orchestration logic for performing a threat model analysis.
It is the primary workflow document for the `/threat-model-analyst` skill.
## ⚡ Context Budget — Read Files Selectively
**Do NOT read all 10 skill files at session start.** Read only what each phase needs. This preserves context window for the actual codebase analysis.
**Phase 1 (context gathering):** Read this file (`orchestrator.md`) + `analysis-principles.md` + `tmt-element-taxonomy.md`
**Phase 2 (writing reports):** Read the relevant skeleton from `skeletons/` BEFORE writing each file. Read `output-formats.md` + `diagram-conventions.md` for rules — but use the skeleton as the structural template.
- Before `0.1-architecture.md`: read `skeletons/skeleton-architecture.md`
- Before `1.1-threatmodel.mmd`: read `skeletons/skeleton-dfd.md`
- Before `1-threatmodel.md`: read `skeletons/skeleton-threatmodel.md`
- Before `2-stride-analysis.md`: read `skeletons/skeleton-stride-analysis.md`
- Before `3-findings.md`: read `skeletons/skeleton-findings.md`
- Before `0-assessment.md`: read `skeletons/skeleton-assessment.md`
- Before `threat-inventory.json`: read `skeletons/skeleton-inventory.md`
- Before `incremental-comparison.html`: read `skeletons/skeleton-incremental-html.md`
**Phase 3 (verification):** Delegate to a sub-agent and include `verification-checklist.md` in the sub-agent prompt. The sub-agent reads the full checklist with a fresh context window — the parent agent does NOT need to read it.
**Key principle:** Sub-agents get fresh context windows. Delegate verification and JSON generation to sub-agents rather than keeping everything in the parent context.
---
## ✅ Mandatory Rules — READ BEFORE STARTING
These are the required behaviors for every threat model report. Follow each rule exactly:
1. Organize findings by **Exploitability Tier** (Tier 1/2/3), never by severity level
2. Split each component's STRIDE table into Tier 1, Tier 2, Tier 3 sub-sections
3. Include `Exploitability Tier` and `Remediation Effort` on every finding — both are MANDATORY
4. STRIDE summary table MUST include T1, T2, T3 columns
4b. **STRIDE + Abuse Cases categories are exactly:** **S**poofing, **T**ampering, **R**epudiation, **I**nformation Disclosure, **D**enial of Service, **E**levation of Privilege, **A**buse (business logic abuse, workflow manipulation, feature misuse — an extension to standard STRIDE covering misuse of legitimate features). The A is ALWAYS "Abuse" — NEVER "AI Safety", "Authorization", or any other interpretation. Authorization issues belong under E (Elevation of Privilege).
5. `.md` files: start with `# Heading` on line 1. The `create_file` tool writes raw content — no code fences
6. `.mmd` files: start with `%%{init:` on line 1. Raw Mermaid source, no fences
7. The recommendations section MUST be titled exactly `## Action Summary` and MUST include a `### Quick Wins` subsection with a table of Tier 1 low-effort findings
8. K8s sidecars: annotate host container with `<br/>+ Sidecar` — never create separate sidecar nodes (see `diagram-conventions.md` Rule 1)
9. Intra-pod localhost flows are implicit — do NOT draw them in diagrams
10. Action Summary IS the recommendations — no separate `### Key Recommendations` section
11. Include `> **Note on threat counts:**` blockquote in Executive Summary
12. Every finding MUST have CVSS 4.0 (score AND full vector string), CWE (with hyperlink), and OWASP (`:2025` suffix)
13. OWASP suffix is always `:2025` (e.g., `A01:2025 Broken Access Control`)
14. Include Threat Coverage Verification table at end of `3-findings.md` mapping every threat → finding
15. Every component in `0.1-architecture.md` MUST appear in `2-stride-analysis.md`
16. First 3 scenarios in `0.1-architecture.md` MUST have Mermaid sequence diagrams
17. `0-assessment.md` MUST include `## Analysis Context & Assumptions` with `### Needs Verification` and `### Finding Overrides` tables
18. `### Quick Wins` subsection is REQUIRED under Action Summary (include heading with note if none)
19. ALL 7 sections in `0-assessment.md` are MANDATORY: Report Files, Executive Summary, Action Summary, Analysis Context & Assumptions, References Consulted, Report Metadata, Classification Reference
20. **Deployment Classification is BINDING.** In `0.1-architecture.md`, set `Deployment Classification` and fill the Component Exposure Table. If classification is `LOCALHOST_DESKTOP` or `LOCALHOST_SERVICE`: zero T1 findings, zero `Prerequisites = None`, zero `AV:N` for non-listener components. See `analysis-principles.md` Deployment Context table.
21. Finding IDs MUST be sequential top-to-bottom: FIND-01, FIND-02, FIND-03... Renumber after sorting
22. CWE MUST include hyperlink: `[CWE-306](https://cwe.mitre.org/data/definitions/306.html): Missing Authentication`
23. After STRIDE, run the Technology-Specific Security Checklist in `analysis-principles.md`. Every technology in the repo needs at least one finding or documented mitigation
24. CVSS `AV:L` or `PR:H` → finding CANNOT be Tier 1. Downgrade to T2/T3. See CVSS-to-Tier Consistency Check in `analysis-principles.md`
25. Use only `Low`/`Medium`/`High` effort labels. NEVER generate time estimates, sprint phases, or scheduling. See Prohibited Content in `output-formats.md`
26. References Consulted: use the exact two-subsection format from `output-formats.md` — `### Security Standards` (3-column table with full URLs) and `### Component Documentation` (3-column table with URLs)
27. Report Metadata: include ALL fields from `output-formats.md` template — Model, Analysis Started, Analysis Completed, Duration. Run `Get-Date -Format "yyyy-MM-dd HH:mm:ss" -AsUTC` at Step 1 and before writing `0-assessment.md`
28. `## Summary` table in `2-stride-analysis.md` MUST appear at the TOP, immediately after `## Exploitability Tiers`, BEFORE individual component sections
29. Related Threats: every threat ID MUST be a hyperlink to `2-stride-analysis.md#component-anchor`. Format: `[T02.S](2-stride-analysis.md#component-name)`
30. Diagram colors: copy classDef lines VERBATIM from `diagram-conventions.md`. Only allowed fills: `#6baed6` (process), `#fdae61` (external), `#74c476` (datastore). Only allowed strokes: `#2171b5`, `#d94701`, `#238b45`, `#e31a1c`. Use ONLY `%%{init: {'theme': 'base', 'themeVariables': { 'background': '#ffffff', 'primaryColor': '#ffffff', 'lineColor': '#666666' }}}%%` — no other themeVariables keys
31. Summary DFD: after creating `1.1-threatmodel.mmd`, run the POST-DFD GATE in Step 4. The gate and `skeleton-summary-dfd.md` control whether `1.2-threatmodel-summary.mmd` is generated.
32. Report Files table in `0-assessment.md`: list `0-assessment.md` (this document) as the FIRST row, followed by 0.1-architecture.md, 1-threatmodel.md, etc. Use the exact template from `output-formats.md`
33. `threat-inventory.json` MUST be generated for every analysis run (Step 8b). This file enables future comparisons. See `output-formats.md` for schema.
34. **NEVER delete, modify, or remove any existing `threat-model-*` or `threat-model-compare-*` folders** in the repository. Only write to your own timestamped output folder. Cleaning up temporary git worktrees you created is allowed; deleting other report folders is FORBIDDEN.
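For illustration, a hypothetical finding header that satisfies rules 3, 12, 13, 22, and 24 together. Every value below is invented; the literal skeleton in `skeletons/skeleton-findings.md` is the authoritative layout:
```markdown
### FIND-01 — Missing authentication on admin endpoint
- Exploitability Tier: Tier 1
- Remediation Effort: Low
- CVSS 4.0: 9.3 (CVSS:4.0/AV:N/AC:L/AT:N/PR:N/UI:N/VC:H/VI:H/VA:H/SC:N/SI:N/SA:N)
- CWE: [CWE-306](https://cwe.mitre.org/data/definitions/306.html): Missing Authentication for Critical Function
- OWASP: A01:2025 Broken Access Control
```
Note that the vector uses `AV:N` and `PR:N`, which is what keeps a Tier 1 placement legal under rule 24.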
### Rule Precedence (when guidance conflicts)
Apply rules in this order:
1. Literal skeletons in `skeletons/skeleton-*.md` — exact section/table headers and attribute rows
2. Mandatory Rules in `orchestrator.md` (this list)
3. Examples in `output-formats.md` (examples are illustrative, not authoritative when they differ from literal skeletons)
If any conflict is detected, follow the highest-precedence item.
**Post-generation:** The verification sub-agent will scan your output for all known deviations listed in `verification-checklist.md` Phase 0. Fix any failures before finalizing.
---
## Workflow
**Exclusions:** Skip these directories:
- `threat-model-*` (previous reports)
- `node_modules`, `.git`, `dist`, `build`, `vendor`, `__pycache__`
**Pre-work:** Before writing any output file, scan `verification-checklist.md` Phase 1 (Per-File Structural Checks) and Phase 2 (Diagram Rendering Checks). This internalizes the quality gates so output is correct on the first pass — preventing costly rework. Do NOT run the full verification yet; that happens in Step 10.
### ⛔ Sub-Agent Governance (MANDATORY — prevents duplicate work)
Sub-agents are **independent execution contexts** — they have no memory of the parent's state, instructions, or other sub-agents. Without strict governance, sub-agents will independently perform the ENTIRE analysis, creating duplicate report folders and wasting ~15 min compute + ~100K tokens per duplication.
**Rule 1 — Parent owns ALL file creation.** The parent agent is the ONLY agent that calls `create_file` for report files (0.1-architecture.md, stride-analysis.md, findings.md, etc.). Sub-agents NEVER write report files.
**Rule 2 — Sub-agents are READ-ONLY helpers.** Sub-agents may:
- Search source code for specific patterns (e.g., "find all auth-related code")
- Read and analyze files, then return structured data to the parent
- Run verification checks and return PASS/FAIL results
- Execute terminal commands (git diff, grep) and return output
**Rule 3 — Sub-agent prompts must be NARROW and SPECIFIC.** Never tell a sub-agent to "perform threat model analysis" or "generate the report." Instead:
- ✅ "Read these 5 Go files and list every function that handles credentials. Return a table of function name, file, line number."
- ✅ "Run the verification checklist against the files in {folder}. Return PASS/FAIL for each check."
- ✅ "Read threat-inventory.json from {path} and verify all array lengths match metrics. Return mismatches."
- ❌ "Analyze this codebase and write the threat model files."
- ❌ "Generate 0.1-architecture.md and stride-analysis.md for this component."
**Rule 4 — Output folder path.** The parent creates the timestamped output folder in Step 1 and uses that exact path for ALL `create_file` calls. If a sub-agent needs to read previously written report files, pass the folder path in the sub-agent prompt.
**Rule 5 — The ONLY exception** is `threat-inventory.json` generation (Step 8b), where the parent MAY delegate JSON writing to a sub-agent IF the data is too large. In that case, the sub-agent prompt MUST include: (a) the exact output file path, (b) the data to serialize, and (c) explicit instruction: "Write ONLY this one file. Do NOT create any other files or folders."
### Steps
1. **Record start time & gather context**
- Run `Get-Date -Format "yyyy-MM-dd HH:mm:ss" -AsUTC` and store as `START_TIME`
- Get git info: `git remote get-url origin`, `git branch --show-current`, `git rev-parse --short HEAD`, `git log -1 --format="%ai" HEAD` (commit date — NOT today's date), `hostname`
- Map the system: identify components, trust boundaries, data flows
- **Reference:** `analysis-principles.md` for security infrastructure inventory
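On non-Windows shells, the timestamp capture above can be sketched with POSIX `date` (the `Get-Date` form remains the canonical PowerShell command; the git commands are listed for reference and assume a checked-out repository):
```shell
# Capture START_TIME in the same "yyyy-MM-dd HH:mm:ss" UTC format as the Get-Date call
START_TIME=$(date -u +"%Y-%m-%d %H:%M:%S")
echo "START_TIME=$START_TIME"

# Git metadata for Report Metadata (use the commit date, NOT today's date):
#   git remote get-url origin
#   git branch --show-current
#   git rev-parse --short HEAD
#   git log -1 --format="%ai" HEAD
```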
**⛔ DEPLOYMENT CLASSIFICATION (MANDATORY — do this BEFORE analyzing code for threats):**
Determine the system's deployment class from code evidence (see `skeleton-architecture.md` for values).
Record in `0.1-architecture.md` → Deployment Model section. Then fill the **Component Exposure Table** — one row per component showing listen address, auth barrier, external reachability, and minimum prerequisite.
This table is the **single source of truth** for prerequisite floors. No threat or finding may have a lower prerequisite than what the exposure table permits for its component.
**⛔ DETERMINISTIC NAMING — Apply BEFORE writing any files:**
When identifying components, assign each a canonical PascalCase `id`. The naming MUST be deterministic — two independent runs on the same codebase MUST produce the same component IDs.
**⛔ ABSOLUTE RULE: Every component ID MUST be anchored to a real code artifact.**
For every component you identify, you MUST be able to point to a specific class, file, or manifest in the codebase that is the "anchor" for that component. If no such artifact exists, the component does not exist.
**Naming procedure (follow IN ORDER — stop at the first match):**
1. **Primary class name** — Use the EXACT class name from the source code. Do NOT abbreviate, expand, or rephrase it.
   - `TaskProcessor.cs` → `TaskProcessor` (NOT `TaskServer`, NOT `TaskService`)
   - `SessionStore.cs` → `SessionStore` (NOT `FileSessionStore`, NOT `SessionService`)
   - `TerminalUserInterface.cs` → `TerminalUserInterface` (NOT `TerminalUI`)
   - `PowerShellCommandExecutor.cs` → `PowerShellCommandExecutor` (NOT `PowerShellExecutor`)
   - `ResponsesAPIService.cs` → `ResponsesAPIService` (NOT `LLMService` — that's a DIFFERENT class)
   - `MCPHost.cs` → `MCPHost` (NOT `OrchestrationHost`)
2. **Primary script name** — `Import-Images.ps1` → `ImportImages`
3. **Primary config/manifest name** — `Dockerfile` → `DockerContainer`, `values.yaml` → `HelmChart`
4. **Directory name** (if component spans multiple files) — `src/ParquetParsing/` → `ParquetParser`
5. **Technology name** (for external services/datastores) — "Azure OpenAI" → `AzureOpenAI`, "Redis" → `Redis`
6. **External actor role** — `Operator`, `EndUser` (never drop these)
**⛔ Helm/Kubernetes Deployment Naming (CRITICAL for comparison stability):**
When a component is deployed via Helm chart or Kubernetes manifests, use the **Kubernetes workload name** (from the Deployment/StatefulSet metadata.name) as the component ID — NOT the Helm template filename or directory structure:
- Look at `metadata.name` in deployment YAML → use that as the component ID (PascalCase normalized)
- Example: `metadata.name: devportal` in `templates/knowledge/devportal-deployment.yml` → component ID is `DevPortal`
- Example: `metadata.name: phi-model` in `templates/knowledge/phi-deployment.yml` → component ID is `PhiModel`
- **Why:** Helm templates frequently get reorganized (e.g., moved from `templates/` to `templates/knowledge/`) but the Kubernetes workload name stays the same. Using the workload name ensures the component ID survives directory reorganizations.
- `source_files` MUST include the deployment YAML path AND the application source code path (e.g., both `helmchart/myapp/templates/knowledge/devportal-deployment.yml` AND `developer-portal/src/`)
- `source_directories` MUST include BOTH the Helm template directory AND the source code directory
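The workload-name rule can be sketched in a few lines of Python (stdlib only; the manifest is inlined here as a hypothetical string, and the final PascalCase normalization of `devportal` → `DevPortal` still needs human word-splitting):
```python
import re

# Hypothetical deployment manifest; in practice, read from the Helm templates directory
manifest = """\
apiVersion: apps/v1
kind: Deployment
metadata:
  name: devportal
  namespace: knowledge
"""

# The name directly under metadata: is the Kubernetes workload name --
# this, not the template filename, anchors the component ID
match = re.search(r"metadata:\s*\n\s+name:\s*(\S+)", manifest)
workload_name = match.group(1)
print(workload_name)  # devportal
```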
**External Service Anchoring (for components without repo source code):**
External services (cloud APIs, managed databases, SaaS endpoints) don't have source files in the repository. Anchor them to their **integration point** in the codebase:
- `source_files` → the client class or config file that defines the connection (e.g., `src/MCP/appsettings.json` for Azure OpenAI connection config, `helmchart/values.yaml` for Redis endpoint config)
- `source_directories` → the directory containing the integration code (e.g., `src/MCP/Core/Services/LLM/` for the LLM client)
- `class_names` → the CLIENT class in YOUR repo that talks to the service (e.g., `ResponsesAPIService`), NOT the vendor's SDK class (e.g., NOT `OpenAIClient`). If no dedicated client class exists, leave empty.
- `namespace` → leave empty `""` (external services don't have repo namespaces)
- `config_keys` → the env vars / config keys for the service connection (e.g., `["AZURE_OPENAI_ENDPOINT", "RESPONSES_API_DEPLOYMENT"]`). These are the most stable anchors for external services.
- `api_routes` → leave empty (external services expose their own routes, not yours)
- `dependencies` → the SDK package used (e.g., `["Azure.AI.OpenAI"]` for NuGet, `["pymilvus"]` for pip)
**Why this matters:** External services frequently change display names across LLM runs (e.g., "Azure OpenAI" vs "GPT-4 Endpoint" vs "LLM Backend"). The `config_keys` and `dependencies` fields are what make them matchable across runs.
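Putting those bullets together, one possible inventory entry for an external service might look like this (field names follow the list above; values are illustrative, and the authoritative schema lives in `output-formats.md`):
```json
{
  "id": "AzureOpenAI",
  "fingerprint": {
    "source_files": ["src/MCP/appsettings.json"],
    "source_directories": ["src/MCP/Core/Services/LLM/"],
    "class_names": ["ResponsesAPIService"],
    "namespace": "",
    "config_keys": ["AZURE_OPENAI_ENDPOINT", "RESPONSES_API_DEPLOYMENT"],
    "api_routes": [],
    "dependencies": ["Azure.AI.OpenAI"]
  }
}
```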
**⛔ FORBIDDEN naming patterns — NEVER use these:**
- NEVER invent abstract names that don't correspond to a real class: `ConfigurationStore`, `LocalFileSystem`, `DataLayer`, `IngestionPipeline`, `BackendServer`
- NEVER abbreviate a class name: `TerminalUI` for `TerminalUserInterface`, `PSExecutor` for `PowerShellCommandExecutor`
- NEVER substitute a synonym: `TaskServer` for `TaskProcessor`, `LLMService` for `ResponsesAPIService`
- NEVER merge two separate classes into one component: `ResponsesAPIService` and `LLMService` are two different classes → two different components
- NEVER create a component for something that doesn't exist in the code: if there's no Windows Registry access code, don't create a `WindowsRegistry` component
- NEVER rename between runs: if you called it `TaskProcessor` in run 1, it MUST be `TaskProcessor` in run 2
**⛔ COMPONENT ANCHOR VERIFICATION (MANDATORY — do this BEFORE Step 2):**
After identifying all components, create a mental checklist:
```
For EACH component:
Q: What is the EXACT filename or class that anchors this component?
A: [must cite a real file path, e.g., "src/Core/TaskProcessor.cs"]
If you cannot cite a real file → DELETE the component from your list
```
This verification catches invented components like `WindowsRegistry` (no registry code exists), `ConfigurationStore` (no such class), `LocalFileSystem` (abstract concept, not a class).
**⛔ COMPONENT SELECTION STABILITY (when multiple related classes exist):**
Many systems have clusters of related classes (e.g., `CredentialManager`, `AzureCredentialProvider`, `AzureAuthenticationHandler`). To ensure deterministic selection:
- **Pick the class that OWNS the security-relevant behavior** — the one that makes the trust decision, holds the credential, or processes the data
- **Prefer the class registered in dependency injection** over helpers/utilities
- **Prefer the higher-level orchestrator** over its internal implementation classes
- **Once you pick a class, its alternatives become aliases** — add them to the `aliases` array, not as separate components
- **Example**: If `CredentialManager` orchestrates credential lookup and uses `AzureCredentialProvider` internally, `CredentialManager` is the component and `AzureCredentialProvider` is an alias
- **Example**: Do NOT include both `SessionStore` and `SessionFiles` — `SessionStore` is the class, `SessionFiles` is an abstract concept
- **Count rule**: Two runs on the same code MUST produce the same number of components (±1 for edge cases). A difference of ≥3 components indicates the selection rules were not followed.
**⛔ STABILITY ANCHORS (for comparison matching):**
When recording each component in `threat-inventory.json`, the `fingerprint` fields `source_directories`, `class_names`, and `namespace` serve as **stability anchors** — immutable identifiers that persist even when:
- The class is renamed (directory stays the same)
- The file is moved to a different directory (class name stays the same)
- The component ID changes between analysis runs (namespace stays the same)
The comparison matching algorithm relies on these anchors MORE than on the component `id` field. Therefore:
- `source_directories` MUST be populated for every process-type component (never empty `[]`)
- `class_names` MUST include at least the primary class name
- `namespace` MUST be the actual code namespace (e.g., `MyApp.Core.Servers.Health`), not a made-up grouping
- These fields are what make a component identifiable across independent analysis runs, even if two LLMs pick different display names
**⛔ COMPONENT ELIGIBILITY — What qualifies as a threat model component:**
A class/service becomes a threat model component ONLY if it meets ALL of these criteria:
1. **It crosses a trust boundary OR handles security-sensitive data** (credentials, user input, network I/O, file I/O, process execution)
2. **It is a top-level service**, not an internal helper (registered in DI, or the main entry point, or an agent with its own responsibility)
3. **It would appear in a deployment diagram** — you could point to it and say "this runs here, talks to that"
**ALWAYS include these component types (if they exist in the code):**
- ALL agent classes (HealthAgent, InfrastructureAgent, InvestigatorAgent, SupportabilityAgent, etc.)
- ALL MCP server classes (HealthServer, InfrastructureServer, etc.)
- The main host/orchestrator (MCPHost, etc.)
- ALL external service connections (AzureOpenAI, AzureAD, etc.)
- ALL credential/auth managers
- The user interface entry point
- ALL tool execution services (PowerShellCommandExecutor, etc.)
- ALL session/state persistence services
- ALL LLM service classes (ResponsesAPIService, LLMService — if they are separate classes, they are separate components)
- External actors (Operator, EndUser)
**NEVER include these as separate components:**
- Loggers (LocalFileLogger, TelemetryLogger) — these are cross-cutting concerns, not threat model components
- Static helper classes
- Model/DTO classes
- Configuration builders (unless they handle secrets)
- Infrastructure-as-code classes that don't exist at runtime (AzureStackHCI cluster reference, deployment scripts)
**The goal:** Every run on the same code should identify the SAME set of ~12-20 components. If you're including a logger or excluding an agent, you're doing it wrong.
**Boundary naming rules:**
- Boundary IDs MUST be PascalCase (never `Layer`, `Zone`, `Group`, `Tier` suffixes)
- Derive from deployment topology, NOT from code architecture layers
- **Deployment topology determines boundaries:**
- Single-process app → **EXACTLY 2 boundaries**: `Application` (the process) + `External` (external services). NEVER use 1 boundary. NEVER use 3+ boundaries. This is mandatory for single-process apps.
- Multi-container app → boundaries per container/pod
- K8s deployment → `K8sCluster` + per-namespace boundaries if relevant
- Client-server → `Client` + `Server`
- **K8s multi-service deployments (CRITICAL for microservice architectures):**
When a K8s namespace contains multiple Deployments/StatefulSets with DIFFERENT security characteristics, create sub-boundaries based on workload type:
- `BackendServices` — API services (FastAPI, Express, etc.) that handle user requests
- `DataStorage` — Databases and persistent storage (Redis, Milvus, PostgreSQL, NFS) — these have different access controls, persistence, and backup policies
- `MLModels` — ML model servers running on GPU nodes — these have different compute resources, attack surfaces (adversarial inputs), and scaling characteristics
- `Agentic` — Agent runtime/manager services if present
- The outer `K8sCluster` contains these sub-boundaries
- **This is NOT "code layers"** — each sub-boundary represents a different Kubernetes Deployment/StatefulSet with its own security context, resource limits, and network policies
- **Test**: If two components are in DIFFERENT Kubernetes Deployments with different service accounts, different network exposure, or different resource requirements → they SHOULD be in different sub-boundaries
- **FORBIDDEN boundary schemes (for SINGLE-PROCESS apps only):**
- Do NOT create boundaries based on code layers: `PresentationBoundary`, `OrchestrationBoundary`, `AgentBoundary`, `ServiceBoundary` are CODE LAYERS, not deployment boundaries. All these run in the SAME process.
- Do NOT split a single process into 4+ boundaries. If all components run in one .exe, they are in ONE boundary.
- **Example**: An application where `TerminalUserInterface`, `MCPHost`, `HealthAgent`, `ResponsesAPIService` all run in the same process → they are ALL in `Application`. External services like `AzureOpenAI` are in `External`.
- Two runs on the same code MUST produce the same number of boundaries (±1). A difference of ≥2 boundaries is WRONG.
- NEVER create boundaries based on code layers (Presentation/Business/Data) — boundaries represent DEPLOYMENT trust boundaries, not code architecture
**Boundary count locking:**
- After identifying boundaries, LOCK the count. Two runs on the same code MUST produce the same number of boundaries (±1 acceptable if one run identifies an edge boundary the other doesn't)
- A 4-boundary vs 7-boundary difference on the same code is WRONG and indicates the naming rules were not followed
**Additional naming rules:**
- The SAME component must get the SAME `id` regardless of which LLM model runs the analysis or how many times it runs
- External actors (`Operator`, `AzureDataStudio`, etc.) are ALWAYS included — never drop them
- Datastores representing distinct storage (files, database) are ALWAYS separate components — never merge them
- Lock the component list before Step 2. Use these exact IDs in ALL subsequent files (architecture, DFD, STRIDE, findings, JSON)
- If two classes exist as separate files (e.g., `ResponsesAPIService.cs` and `LLMService.cs`), they are TWO components even if they seem related
**⛔ DATA FLOW COMPLETENESS (MANDATORY — ensures consistent flow enumeration across runs):**
Data flows MUST be enumerated exhaustively. Two independent analyses of the same codebase MUST produce the same set of flows. To achieve this:
**⛔ RETURN FLOW MODELING RULE (addresses 24% variance in flow counts):**
- **DO NOT model separate return flows.** A request-response pair is ONE bidirectional flow (use `<-->` in Mermaid).
- Example: `DF01: Operator <--> TUI` (one flow for input and output)
- Example: `DF03: MCPHost <--> HealthAgent` (one flow for delegation and result)
- **DO model separate flows ONLY when the two directions use different protocols or semantics** (e.g., HTTP request vs WebSocket push-back).
- **Why:** When runs independently decide whether to create 1 flow or 2 flows per interaction, the flow count varies by 20-30%. This rule eliminates that variance.
- **Flow count formula:** `# flows ≈ # unique component-to-component interactions`. If component A talks to component B, that is 1 flow, not 2.
**Flow completeness checklist (use `<-->` bidirectional flows per the return flow rule above):**
1. **Ingress/reverse proxy flows**: `DF_EndUser_to_NginxIngress` (bidirectional `<-->`), `DF_NginxIngress_to_Backend` (bidirectional `<-->`). Each is ONE flow, not two.
2. **Database/datastore flows**: `DF_Service_to_Redis` (bidirectional `<-->`). ONE flow per service-datastore pair.
3. **Auth provider flows**: `DF_Service_to_AzureAD` (bidirectional `<-->`). ONE flow per service-auth pair.
4. **Admin access flows**: `DF_Operator_to_Service` (bidirectional `<-->`). ONE per admin interaction.
5. **Flow count locking**: After enumerating flows, LOCK the count. Two runs on the same code MUST produce the same number of flows (±3 acceptable). A difference of >5 flows indicates incomplete enumeration.
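The formula above amounts to counting unique unordered component pairs. A sketch (hypothetical helper, not part of the skill; real enumeration is done by inspection):
```python
def flow_count(interactions):
    """One bidirectional flow per unordered component pair -- return flows never add to the count."""
    return len({frozenset(pair) for pair in interactions})

# Request and response between the same pair collapse into a single flow:
calls = [
    ("Operator", "TUI"),          # request
    ("TUI", "Operator"),          # response -- same flow, not a second one
    ("MCPHost", "HealthAgent"),   # delegation + result, one flow
]
print(flow_count(calls))  # 2
```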
**⛔ EXTERNAL ENTITY INCLUSION RULES (addresses variance in which externals are modeled):**
- **ALWAYS include `AzureAD` (or `EntraID`) as an external entity** if the code acquires tokens from Azure AD / Microsoft Entra ID (look for `ChainedTokenCredential`, `ManagedIdentityCredential`, `AzureCliCredential`, MSAL, or any OAuth2/OIDC flow).
- **ALWAYS include the infrastructure target** (e.g., `OnPremInfra`, `HCICluster`) as an external entity if the code sends commands to external infrastructure via PowerShell, REST, or WMI.
- **ALWAYS include `AzureOpenAI`** (or equivalent LLM endpoint) if the code calls a cloud LLM API.
- **ALWAYS include `Operator`** as an external actor for CLI/TUI tools, admin tools, or operator consoles.
- **Rule of thumb:** If the code has a client class or config for a service, that service is an external entity.
**⛔ TMT CATEGORY RULES (addresses category inconsistency across runs):**
- **Tool servers** that expose APIs callable by agents → `SE.P.TMCore.WebSvc` (NOT `SE.P.TMCore.NetApp`)
- **Network-level services** that handle connections/sockets → `SE.P.TMCore.NetApp`
- **Services that execute OS commands** (PowerShell, bash) → `SE.P.TMCore.OSProcess`
- **Services that store data to disk** (SessionStore, FileLogger) → `SE.DS.TMCore.FS` (classify as Data Store, NOT Process)
- **Rule:** If a class's primary purpose is persisting data, it is a Data Store. If it does computation or orchestration, it is a Process. Never switch between runs.
**⛔ DFD DIRECTION (MANDATORY — addresses layout variance):**
- ALL DFDs MUST use `flowchart LR` (left-to-right). NEVER use `flowchart TB`.
- ALL summary DFDs MUST also use `flowchart LR`.
- This is immutable — do not change based on aesthetics or diagram shape.
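A minimal DFD fragment combining the direction rule, the `<-->` return-flow rule, and the rule 30 init line (component names are the hypothetical examples used elsewhere in this document; copy the real classDef lines verbatim from `diagram-conventions.md`):
```mermaid
%%{init: {'theme': 'base', 'themeVariables': { 'background': '#ffffff', 'primaryColor': '#ffffff', 'lineColor': '#666666' }}}%%
flowchart LR
    Operator[Operator]:::external
    TUI[TerminalUserInterface]:::process
    SessionStore[(SessionStore)]:::datastore
    Operator <-->|DF01: commands / results| TUI
    TUI <-->|DF02: session read / write| SessionStore
    classDef process fill:#6baed6,stroke:#2171b5
    classDef external fill:#fdae61,stroke:#d94701
    classDef datastore fill:#74c476,stroke:#238b45
```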
**Acronym rules for PascalCase:**
- Preserve well-known acronyms as ALL-CAPS: `API`, `NFS`, `LLM`, `SQL`, `HCI`, `AD`, `UI`, `DB`
- Examples: `IngestionAPI` (not `IngestionApi`), `NFSServer` (not `NfsServer`), `AzureAD` (not `AzureAd`), `VectorDBAPI` (not `VectorDbApi`)
- Single-word technologies keep standard casing: `Redis`, `Milvus`, `PostgreSQL`, `Nginx`
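These casing rules are mechanical enough to enforce in code. A minimal sketch (illustrative helper, not part of the skill's bundled tooling) that normalizes an identifier to PascalCase while preserving the acronyms listed above; the acronym set is an assumption taken from the examples and should be extended per project (e.g. add `MCP`):

```python
import re

# Well-known acronyms to preserve ALL-CAPS (from the rules above; extend per
# project, e.g. add "MCP").
ACRONYMS = {"API", "NFS", "LLM", "SQL", "HCI", "AD", "UI", "DB"}

def to_pascal_case(name: str) -> str:
    """Normalize kebab-case, snake_case, or camelCase to PascalCase."""
    words = []
    for part in re.split(r"[-_\s]+", name):
        # Split camelCase/PascalCase runs and digit groups into words.
        words.extend(
            re.findall(r"[A-Z]+(?=[A-Z][a-z])|[A-Z]?[a-z]+|[A-Z]+|\d+", part))
    return "".join(w.upper() if w.upper() in ACRONYMS else w.capitalize()
                   for w in words)
```

For example, `to_pascal_case("azure-ad")` yields `AzureAD` and `to_pascal_case("nfs_server")` yields `NFSServer`, matching the examples above.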
**Common technology naming (use EXACTLY these IDs for well-known infrastructure):**
- Redis cache/state: `Redis` (never `DaprStateStore`, `RedisCache`, `StateStore`)
- Milvus vector DB: `Milvus` (never `MilvusVectorDb`, `VectorDB`)
- NGINX ingress: `NginxIngress` (never `IngressNginx`)
- Azure AD/Entra: `AzureAD` (never `AzureAd`, `EntraID`)
- PostgreSQL: `PostgreSQL` (never `PostgresDb`, `Postgres`)
- User/Operator: `Operator` for admin users, `EndUser` for end users
- Azure OpenAI: `AzureOpenAI` (never `OpenAIService`, `LLMEndpoint`)
- NFS: `NFSServer` (never `NfsServer`, `FileShare`)
- If two LLM models are separate deployments, keep them separate (never merge `MistralLLM` + `PhiLLM` into `LocalLlm`)
**BUT: for application-specific classes, use the EXACT class name from the code, NOT a technology label:**
- `ResponsesAPIService.cs` → `ResponsesAPIService` (NOT `OpenAIService` — the class IS named ResponsesAPIService)
- `TaskProcessor.cs` → `TaskProcessor` (NOT `LocalLLM` — the class IS named TaskProcessor)
- `SessionStore.cs` → `SessionStore` (NOT `StatePersistence` — the class IS named SessionStore)
**Component granularity rules (CRITICAL for stability):**
- Model components at the **technology/service level**, not the script/file level
- A Docker container running Kusto is `KustoContainer` — NOT decomposed into `KustoService` + `IngestLogs` + `KustoDataDirectory`
- A Moby Docker engine is `MobyDockerEngine` — NOT `InstallMoby` (the installer script is evidence, not the component)
- An installer for a tool is `SetupInstaller` — NOT renamed to `InstallAzureEdgeDiagnosticTool` (script filename)
- Rule: if a component has one primary function (e.g., "run Kusto queries"), model it as ONE component regardless of how many scripts/files implement it
- Scripts are EVIDENCE for components, not components themselves
- Keep the same granularity across runs — never split a single component into sub-components or merge sub-components between runs
**⛔ COMPONENT ID FORMAT (MANDATORY — addresses casing variance):**
- ALL component IDs MUST be PascalCase. NEVER use kebab-case, snake_case, or camelCase.
- Examples: `HealthAgent` (not `health-agent`), `AzureAD` (not `azure-ad`), `MCPHost` (not `mcp-host`)
- This applies to ALL artifacts: 0.1-architecture.md, 1-threatmodel.md, DFD mermaid, STRIDE, findings, JSON.
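The format rule reduces to one regex over every ID in the artifacts. A minimal sketch of such a check:

```python
import re

# PascalCase: leading capital, then letters/digits only. Rejects kebab-case,
# snake_case, and camelCase (lowercase first letter).
PASCAL_RE = re.compile(r"^[A-Z][A-Za-z0-9]*$")

def invalid_component_ids(ids):
    """Return the IDs that violate the PascalCase rule."""
    return [i for i in ids if not PASCAL_RE.fullmatch(i)]
```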
**⛔ STRIDE SCOPE RULE (addresses external entity analysis variance):**
- STRIDE analysis in `2-stride-analysis.md` MUST include sections for ALL elements in the Element Table EXCEPT external actors (Operator, EndUser).
- External services (AzureOpenAI, AzureAD, OnPremInfra) DO get STRIDE sections — they are attack surfaces from YOUR system's perspective.
- External actors (human users) do NOT get STRIDE sections — they are threat SOURCES, not targets.
- This means: if you have 20 elements total and 1 is an external actor, you write 19 STRIDE sections.
**⛔ STRIDE DEPTH CONSISTENCY (addresses threat count variance):**
- Each component MUST get ALL 7 STRIDE-A categories analyzed (S, T, R, I, D, E, A).
- Each STRIDE category MUST be explicitly addressed per component: either with one or more concrete threats, OR with an explicit `N/A — {1-sentence justification}` row explaining why that category does not apply to this specific component.
- A category may produce 0, 1, 2, 3, or more threats — the count depends on the component's actual attack surface. Do NOT cap at 1 threat per category. Components with rich security surfaces (API services, auth managers, command executors, LLM clients) should typically have 2-4 threats per relevant STRIDE category. Only simple components (static config, read-only data stores) should have mostly 0-1.
- **Expected distribution:** For a 15-component system: ~30% of STRIDE cells should be 0 (with N/A), ~40% should be 1, ~25% should be 2, ~5% should be 3+. If ALL cells are 0 or 1 (binary pattern) → the analysis is too shallow. Go back and identify additional threat vectors.
- N/A entries do NOT count toward threat totals in the Summary table. Only concrete threat rows count.
- The Summary table S/T/R/I/D/E/A columns show the COUNT of concrete threats per category (0 is valid if N/A was justified).
- This ensures comprehensive coverage while producing accurate, non-inflated threat counts.
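The expected distribution can be self-checked mechanically. A sketch, assuming the per-cell counts have been collected into a dict keyed by (component, STRIDE category):

```python
from collections import Counter

def stride_depth_report(cells: dict):
    """Return per-bucket fractions and a shallow-analysis flag.

    `cells` maps (component_id, stride_category) -> concrete threat count;
    N/A rows count as 0. Bucket 3 stands for "3 or more".
    """
    n = len(cells)
    buckets = Counter(min(c, 3) for c in cells.values())
    fractions = {k: buckets.get(k, 0) / n for k in (0, 1, 2, 3)}
    # All cells 0 or 1 is the binary pattern the rule above forbids.
    shallow = fractions[2] + fractions[3] == 0
    return fractions, shallow
```

If `shallow` comes back `True`, go back and identify additional threat vectors before proceeding.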
2. **Write architecture overview** (`0.1-architecture.md`)
- **Read `skeletons/skeleton-architecture.md` first** — copy skeleton structure, fill `[FILL]` placeholders
- System purpose, key components, top scenarios, tech stack, deployment
- **Use the exact component IDs locked in Step 1** — do not rename or merge components
- **Reference:** `output-formats.md` for template, `diagram-conventions.md` for architecture diagram styles
3. **Inventory security infrastructure**
- Identify security-enabling components before flagging gaps
- **Reference:** `analysis-principles.md` Security Infrastructure Inventory table
4. **Produce threat model DFD** (`1.1-threatmodel.mmd`, `1.2-threatmodel-summary.mmd`, `1-threatmodel.md`)
- **Read `skeletons/skeleton-dfd.md`, `skeletons/skeleton-summary-dfd.md`, and `skeletons/skeleton-threatmodel.md` first**
- **Reference:** `diagram-conventions.md` for DFD styles, `tmt-element-taxonomy.md` for element classification
- ⚠️ **BEFORE FINALIZING:** Run the Pre-Render Checklist from `diagram-conventions.md`
⛔ **POST-DFD GATE — Run IMMEDIATELY after creating `1.1-threatmodel.mmd`:**
1. Count elements (nodes with `((...))`, `[(...)]`, `["..."]`) in `1.1-threatmodel.mmd`
2. Count boundaries (`subgraph` lines)
3. If elements > 15 OR boundaries > 4:
→ You MUST create `1.2-threatmodel-summary.mmd` using `skeleton-summary-dfd.md` NOW
→ Do NOT proceed to `1-threatmodel.md` until the summary file exists
4. If threshold NOT met → skip summary, proceed to `1-threatmodel.md`
5. Create `1-threatmodel.md` (include Summary View section if summary was generated)
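The gate's two counts can be approximated straight from the mermaid source. A rough sketch, assuming the three node shapes named in item 1 are the only ones used:

```python
import re

def needs_summary_dfd(mermaid: str) -> bool:
    """Apply the post-DFD gate thresholds to a mermaid DFD source string."""
    # Count node-shape openers: ((...)), [(...)], ["..."]
    elements = len(re.findall(r'\(\(|\[\(|\["', mermaid))
    boundaries = len(re.findall(r"^\s*subgraph\b", mermaid, re.MULTILINE))
    return elements > 15 or boundaries > 4
```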
5. **Enumerate threats** per element and flow using STRIDE-A (`2-stride-analysis.md`)
- **Read `skeletons/skeleton-stride-analysis.md` first** — use Summary table and per-component structure
- **Reference:** `analysis-principles.md` for tier definitions, `output-formats.md` for STRIDE template
- **⛔ PREREQUISITE FLOOR CHECK (per threat):** Before assigning a prerequisite to any threat, look up the component's `Min Prerequisite` and `Derived Tier` in the Component Exposure Table (`0.1-architecture.md`). The threat's prerequisite MUST be ≥ the component's floor. The threat's tier MUST be ≥ the component's derived tier (i.e., if component is T2, no threat can be T1). Use the canonical prerequisite→tier mapping from `analysis-principles.md`.
6. **For each threat:** cite files/functions/endpoints, propose mitigations, provide verification steps
7. **Verify findings** — confirm each finding against actual configuration before documenting
- **Reference:** `analysis-principles.md` Finding Validation Checklist
7b. **Technology sweep** — Run the Technology-Specific Security Checklist from `analysis-principles.md`
- For every technology found in the repo (Redis, Milvus, PostgreSQL, Docker, K8s, ML models, LLMs, NFS, CI/CD, etc.), verify you have at least one finding or explicit mitigation
- This step catches gaps that component-level STRIDE misses (e.g., database auth defaults, container hardening, key management)
- Add any missing findings before proceeding to Step 8
8. **Compile findings** (`3-findings.md`)
- **Reference:** `output-formats.md` for findings template and Related Threats link format
- **Reference:** `skeletons/skeleton-findings.md` — read this skeleton, copy VERBATIM, fill in `[FILL]` placeholders for each finding
⛔ **PRE-WRITE GATE — Verify before calling `create_file` for `3-findings.md`:**
1. Finding IDs: `### FIND-01:`, `### FIND-02:` — sequential, `FIND-` prefix (NOT `F01` or `F-01`)
2. CVSS prefix: every vector starts with `CVSS:4.0/` (NOT bare `AV:N/AC:L/...`)
3. Related Threats: each threat ID is a separate hyperlink `[TNN.X](2-stride-analysis.md#anchor)` (NOT plain text)
4. Sub-sections: `#### Description`, `#### Evidence`, `#### Remediation`, `#### Verification` (NOT `Recommendation`)
5. Sort: within each tier → Critical → Important → Moderate → Low → higher CVSS first
6. All 10 mandatory attribute rows present per finding
7. **Deployment context gate (FAIL-CLOSED):** Read `0.1-architecture.md` Deployment Classification and Component Exposure Table.
If classification is `LOCALHOST_DESKTOP` or `LOCALHOST_SERVICE`:
- ZERO findings may have `Exploitation Prerequisites` = `None` → fix to `Local Process Access` (T2) or `Host/OS Access` (T3)
- ZERO findings may be in `## Tier 1` → downgrade to T2/T3 based on prerequisite
- ZERO CVSS vectors may use `AV:N` unless the **specific component** has `Reachability = External` in the Component Exposure Table → fix to `AV:L`
For ALL deployment classifications:
- For EACH finding, look up its Component in the exposure table. The finding's prerequisite MUST be ≥ the component's `Min Prerequisite`. The finding's tier MUST be ≥ the component's `Derived Tier`.
- Prerequisites MUST use only canonical values: `None`, `Authenticated User`, `Privileged User`, `Internal Network`, `Local Process Access`, `Host/OS Access`, `Admin Credentials`, `Physical Access`, `{Component} Compromise`. ⛔ `Application Access` and `Host Access` are FORBIDDEN.
If ANY violation exists → **DO NOT WRITE THE FILE.** Fix all violations first.
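The canonical-value rule lends itself to a set-membership check, with `{Component} Compromise` as the one pattern-valued entry. A minimal sketch:

```python
import re

CANONICAL_PREREQS = {
    "None", "Authenticated User", "Privileged User", "Internal Network",
    "Local Process Access", "Host/OS Access", "Admin Credentials",
    "Physical Access",
}
# "{Component} Compromise" with a PascalCase component ID.
COMPROMISE_RE = re.compile(r"^[A-Z][A-Za-z0-9]* Compromise$")

def invalid_prerequisites(values):
    """Return prerequisite strings that are not canonical."""
    return [v for v in values
            if v not in CANONICAL_PREREQS and not COMPROMISE_RE.fullmatch(v)]
```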
⛔ **Fail-fast gate:** Immediately after writing, run the Inline Quick-Checks for `3-findings.md` from `verification-checklist.md`. Fix before proceeding.
⛔ **MANDATORY: All 3 tier sections must be present.** Even if a tier has zero findings, include the heading with a note:
- `## Tier 1 — Direct Exposure (No Prerequisites)` → `*No Tier 1 findings identified for this repository.*`
- This ensures structural consistency for comparison matching and validation.
⛔ **COVERAGE VERIFICATION FEEDBACK LOOP (MANDATORY):**
After writing the Threat Coverage Verification table at the end of `3-findings.md`:
1. **Scan the table you just wrote.** Count how many threats have status `✅ Covered` vs `🔄 Mitigated by Platform` vs `⚠️ Needs Review` vs `⚠️ Accepted Risk`.
2. **If ANY threat has `⚠️ Accepted Risk`** → FAIL. The tool cannot accept risks. Go back and create a finding for each one.
3. **If Platform ratio > 20%** → SUSPECT. Re-examine each `🔄 Mitigated by Platform` entry: is the mitigation truly from an EXTERNAL system managed by a DIFFERENT team? If the mitigation is the repo's own code (auth middleware, file permissions, TLS config, localhost binding), reclassify as `Open` and create a finding.
4. **If ANY `Open` threat in `2-stride-analysis.md` has NO corresponding finding** → create a finding NOW. Use the threat's description as the finding title, the mitigation column as the remediation guidance, and assign severity based on STRIDE category.
5. **Update `3-findings.md`** with the newly created findings. Renumber sequentially. Update the Coverage table to show `✅ Covered` for each.
6. **This loop is the ENTIRE POINT of the Coverage table** — it's not documentation, it's a self-check that forces complete coverage. If you write the table and don't act on gaps, you've wasted the effort.
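Steps 2 and 3 of the loop reduce to two statistics over the table's status column. A sketch, assuming the status strings contain the labels used above:

```python
def coverage_gate(statuses):
    """Evaluate the two automatic checks from the coverage feedback loop."""
    accepted = sum("Accepted Risk" in s for s in statuses)
    platform = sum("Mitigated by Platform" in s for s in statuses)
    ratio = platform / len(statuses) if statuses else 0.0
    return {
        "fail_accepted_risk": accepted > 0,  # step 2: hard fail
        "suspect_platform": ratio > 0.20,    # step 3: re-examine each entry
    }
```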
8b. **Generate threat inventory** (`threat-inventory.json`)
- **Read `skeletons/skeleton-inventory.md` first** — use exact field names and schema structure
- After writing all markdown reports, compile a structured JSON inventory of all components, boundaries, data flows, threats, and findings
- Use canonical PascalCase IDs for components (derived from class/file names) and keep display labels separate
- Use canonical flow IDs: `DF_{Source}_to_{Target}`
- Include identity keys on every threat and finding for future matching
- Include deterministic identity fields for component and boundary matching across runs:
- Component: `aliases`, `boundary_kind`, `fingerprint`
- Boundary: `kind`, `aliases`, `contains_fingerprint`
- Build `fingerprint` from stable evidence (source files, endpoint neighbors, protocols, type) — never from prose wording
- Normalize synonyms to the same canonical component ID (example: `SupportAgent` and `SupportabilityAgent` → `SupportabilityAgent`) and store alternate names in `aliases`
- Sort arrays deterministically before writing JSON:
- `components` by `id`
- `boundaries` by `id`
- `flows` by `id`
- `threats` by `id` then `identity_key.component_id`
- `findings` by `id` then `identity_key.component_id`
- Extract metrics (totals, per-tier counts, per-STRIDE-category counts)
- Include git metadata (commit SHA, branch, date) and analysis metadata (model, timestamps)
- **Reference:** `output-formats.md` for the `threat-inventory.json` schema
- **This file is NOT linked in 0-assessment.md** but is always present in the output folder
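The deterministic sort can be applied in one place just before serialization. A minimal sketch, assuming the `id` and `identity_key.component_id` fields from the schema:

```python
import json

def write_inventory(inv: dict, path: str) -> None:
    """Sort all arrays deterministically, then serialize the inventory."""
    for key in ("components", "boundaries", "flows"):
        inv[key] = sorted(inv[key], key=lambda x: x["id"])
    for key in ("threats", "findings"):
        inv[key] = sorted(
            inv[key],
            key=lambda x: (x["id"], x["identity_key"]["component_id"]))
    with open(path, "w", encoding="utf-8") as f:
        json.dump(inv, f, indent=2, ensure_ascii=False)
```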
⛔ **PRE-WRITE SIZE CHECK (MANDATORY — before calling `create_file` for JSON):**
Before writing `threat-inventory.json`, count the data you plan to include:
- Count total threats from `2-stride-analysis.md` (grep `^\| T\d+\.`)
- Count total findings from `3-findings.md` (grep `### FIND-`)
- Count total components from `0.1-architecture.md`
- **If threats > 50 OR findings > 15:** DO NOT use a single `create_file` call.
Instead, use one of: (a) delegate to sub-agent, (b) Python extraction script, (c) chunked write strategy.
- **If threats ≤ 50 AND findings ≤ 15:** single `create_file` is acceptable, but keep entries minimal (1-sentence description/mitigation fields).
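The size check can use the same greps, applied in code. A sketch that counts from the markdown report contents and applies the thresholds above:

```python
import re

def single_write_ok(stride_md: str, findings_md: str) -> bool:
    """True if the inventory is small enough for one create_file call."""
    threats = set(re.findall(r"^\| (T\d+\.\d+)", stride_md, re.MULTILINE))
    findings = re.findall(r"^### FIND-\d+", findings_md, re.MULTILINE)
    return len(threats) <= 50 and len(findings) <= 15
```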
⛔ **POST-WRITE VALIDATION (MANDATORY — JSON Array Completeness):**
After writing `threat-inventory.json`, immediately verify:
- `threats.length == metrics.total_threats` — if mismatch, the threats array was truncated during generation. Rebuild by re-reading `2-stride-analysis.md` and extracting every threat row.
- `findings.length == metrics.total_findings` — if mismatch, rebuild from `3-findings.md`.
- `components.length == metrics.total_components` — if mismatch, rebuild from architecture/element tables.
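These three comparisons can run as a single check against the written file. A sketch using the metric field names above:

```python
import json

def array_mismatches(path: str):
    """Return the names of arrays whose length disagrees with metrics."""
    with open(path, encoding="utf-8") as f:
        inv = json.load(f)
    expected = {
        "threats": inv["metrics"]["total_threats"],
        "findings": inv["metrics"]["total_findings"],
        "components": inv["metrics"]["total_components"],
    }
    return [k for k, n in expected.items() if len(inv[k]) != n]
```

A non-empty result means the corresponding array was truncated and must be rebuilt from the markdown sources.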
⛔ **CROSS-FILE THREAT COUNT VERIFICATION (MANDATORY — catches dropped threats):**
The JSON `threats.length` can match `metrics.total_threats` but BOTH can be wrong if threats were dropped during JSON generation. To catch this:
- Count threat rows in `2-stride-analysis.md`: grep for `^\| T\d+\.` and count unique threat IDs
- Compare this count to `threats.length` in the JSON
- If the markdown has MORE threats than the JSON → the JSON dropped threats. Rebuild the JSON by re-extracting ALL threats from `2-stride-analysis.md`.
- This is the #2 quality issue observed in testing (after truncation). Large repos (114+ threats) frequently have 1-3 threats dropped when sub-agents write the JSON from memory instead of re-reading the STRIDE file.
⛔ **FIELD NAME COMPLIANCE GATE (MANDATORY — run immediately after array check):**
Read the first component and first threat from the JSON just written and verify these EXACT field names:
- `components[0]` has key `"display"` (NOT `"display_name"`, NOT `"name"`) → if wrong, find-replace ALL occurrences
- `threats[0]` has key `"stride_category"` (NOT `"category"`) → if wrong, find-replace ALL occurrences
- `threats[0].identity_key` has key `"component_id"` (threat→component link must be INSIDE `identity_key`, NOT a top-level `component_id` field on the threat) → if wrong, restructure
- `threats[0]` has BOTH `"title"` (short name, e.g., "Information Disclosure — Redis unencrypted traffic") AND `"description"` (longer prose). If only `description` exists without `title`, create `title` from the first sentence of `description`. If `name` or `threat_name` exists instead of `title`, find-replace to `title`
- **Why this matters:** Downstream tooling depends on these exact field names. Wrong names cause zero-value heatmaps, broken component matching, and empty display labels in comparison reports.
- **If ANY field name is wrong:** fix it NOW with find-replace on the JSON file before proceeding. Do NOT leave it for verification.
- **This is the #1 quality issue observed in testing.** Large repos (20+ components, 80+ threats) frequently have truncated JSON arrays because the model runs out of output tokens. If ANY array is truncated, you MUST rebuild it before proceeding. Do NOT finalize with mismatched counts.
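The field-name gate only needs to read the first entry of each array. A sketch of that spot-check:

```python
def field_name_violations(inv: dict):
    """Spot-check the first component and threat for exact field names."""
    problems = []
    comp, threat = inv["components"][0], inv["threats"][0]
    if "display" not in comp:
        problems.append("components[0] missing 'display'")
    if "stride_category" not in threat:
        problems.append("threats[0] missing 'stride_category'")
    if "component_id" not in threat.get("identity_key", {}):
        problems.append("threats[0].identity_key missing 'component_id'")
    if not ("title" in threat and "description" in threat):
        problems.append("threats[0] missing 'title' or 'description'")
    return problems
```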
⛔ **HARD GATE — TRUNCATION RECOVERY (MANDATORY):**
If post-write validation detects ANY array mismatch:
1. **DELETE** the truncated `threat-inventory.json` immediately
2. **DO NOT attempt to patch** the truncated file — partial JSON is unreliable
3. **Regenerate using one of these strategies** (in preference order):
a. **Delegate to a sub-agent** — hand the sub-agent the output folder path and instruct it to read `2-stride-analysis.md` and `3-findings.md`, then write `threat-inventory.json`. The sub-agent has a fresh context window.
b. **Python extraction script** — write a Python script that reads the markdown files, extracts threats/findings via regex, and writes the JSON. Run the script via terminal.
c. **Chunked write** — use the Large Repo Strategy below.
4. **Re-validate** after regeneration — if still mismatched, repeat with the next strategy
5. **NEVER proceed to Step 9 (assessment) or Step 10 (verification) with mismatched counts**
⛔ **LARGE REPO STRATEGY (MANDATORY for repos with >60 threats):**
For repos producing more than ~60 threats, the JSON file can exceed output token limits if generated in one pass. Use this chunked approach:
1. **Write metadata + components + boundaries + flows + metrics first** — these are small arrays
2. **Append threats in batches** — write threats array with ~20 threats per append operation. Use `replace_string_in_file` to add batches to the existing file rather than writing the entire JSON in one `create_file` call.
3. **Append findings** — similarly batch if >15 findings
4. **Final validation** — read the completed file and verify all array lengths match metrics
**Alternative approach:** If chunked writing is not feasible, keep each threat/finding entry minimal:
- `description` field: max 1 sentence (not full prose paragraphs)
- `mitigation` field: max 1 sentence
- Remove redundant fields that duplicate markdown content
- The JSON is for MATCHING, not for reading — brevity is key
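Assuming threats and findings are the only large arrays, the batched append can be sketched as ordinary buffered writes (illustrative only; in practice the batches would go through the file-edit tool rather than a single open file handle):

```python
import json

def write_chunked(inv: dict, path: str, batch: int = 20) -> None:
    """Write small sections first, then append threats in batches."""
    head = {k: v for k, v in inv.items() if k not in ("threats", "findings")}
    with open(path, "w", encoding="utf-8") as f:
        # Drop the closing "\n}" so arrays can be appended (head is
        # assumed non-empty: metadata, components, metrics, etc.).
        f.write(json.dumps(head, indent=2)[:-2])
        f.write(',\n  "threats": [\n')
        threats = inv["threats"]
        for i in range(0, len(threats), batch):
            chunk = ",\n".join(json.dumps(t) for t in threats[i:i + batch])
            f.write(chunk + (",\n" if i + batch < len(threats) else "\n"))
        f.write('  ],\n  "findings": '
                + json.dumps(inv["findings"]) + "\n}\n")
```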
9. **Write assessment** (`0-assessment.md`)
- **Reference:** `output-formats.md` for assessment template
- **Reference:** `skeletons/skeleton-assessment.md` — read this skeleton, copy VERBATIM, fill in `[FILL]` placeholders
- ⚠️ **ALL 7 sections are MANDATORY:** Report Files, Executive Summary, Action Summary (with Quick Wins), Analysis Context & Assumptions (with Needs Verification + Finding Overrides), References Consulted, Report Metadata, Classification Reference
- Do NOT add extra sections like "Severity Distribution", "Architecture Risk Areas", "Methodology Notes", or "Deliverables" — these are NOT in the template
⛔ **PRE-WRITE GATE — Verify before calling `create_file` for `0-assessment.md`:**
1. Exactly 7 sections: Report Files, Executive Summary, Action Summary, Analysis Context & Assumptions (with `&`), References Consulted, Report Metadata, Classification Reference
2. `---` horizontal rules between EVERY pair of `## ` sections (minimum 6)
3. `### Quick Wins`, `### Needs Verification`, `### Finding Overrides` all present
4. References: TWO subsections (`### Security Standards` + `### Component Documentation`) with 3-column tables and full URLs
5. ALL metadata values wrapped in backticks; ALL fields present (Model, Analysis Started, Analysis Completed, Duration)
6. Element/finding/threat counts match actual counts from other files
⛔ **Fail-fast gate:** Immediately after writing, run the Inline Quick-Checks for `0-assessment.md` from `verification-checklist.md`. Fix before proceeding.
10. **Final verification** — iterative correction loop
This step runs verification and fixes in a loop until all checks pass. Do NOT finalize with any failures remaining.
**Pass 1 — Comprehensive verification:**
- Delegate to a verification sub-agent with the content of `verification-checklist.md` + the output folder path
- Sub-agent runs ALL phases of `verification-checklist.md` and reports PASS/FAIL with evidence
- If ANY check fails:
1. Fix the failed file(s) using the available file-edit tool
2. Re-run ONLY the failed checks against the fixed file(s)
3. Repeat until the failed checks pass
**Pass 2 — Regression check (if Pass 1 had fixes):**
- Re-run Phase 3 (cross-file consistency) to ensure fixes didn't break other files
- If new failures appear, fix and re-verify
**Exit condition:** ALL phases report 0 failures. Only then mark the analysis as complete.
**Sub-agent context management:**
- Include the relevant phase content from `verification-checklist.md` in the sub-agent prompt
- Include the output folder path so the sub-agent can read files
- Sub-agent output MUST include: phase name, total checks, passed, failed, and for each failure: check ID, file, evidence, exact fix instruction. Do not return "looks good" without counts.
---
## Tool Usage
### Progress Tracking (todo)
- Create todos at start for each major phase
- Mark in-progress before starting each phase
- Mark completed immediately after finishing each phase
### Sub-task Delegation (agent)
Delegate NARROW, READ-ONLY tasks to sub-agents (see Sub-Agent Governance above). Allowed delegations:
- **Context gathering:** "Search for auth patterns in these directories and return a summary"
- **Code analysis:** "Read these files and identify security-relevant APIs, credentials, and trust boundaries"
- **Verification:** Hand the verification sub-agent the content of `verification-checklist.md` and the output folder path. It reads the files and returns PASS/FAIL results. The PARENT fixes any failures.
- **JSON generation (exception):** For large repos, delegate `threat-inventory.json` writing with exact file path and pre-computed data
**NEVER delegate:** "Write 0.1-architecture.md", "Generate the STRIDE analysis", "Perform the threat model analysis", or any prompt that would cause the sub-agent to independently produce report files.
---
## Verification Checklist (Final Step)
The full verification checklist is in `verification-checklist.md`. It contains 9 phases:
> **Authority hierarchy:** `orchestrator.md` defines the AUTHORING rules (what to do when writing reports). `verification-checklist.md` defines the CHECKING rules (what to verify after writing). Some rules appear in both files for visibility — if they ever conflict, `orchestrator.md` rules take precedence for authoring decisions, and `verification-checklist.md` takes precedence for pass/fail criteria. For the complete list of all structural, diagram, and consistency checks, always consult `verification-checklist.md` — it is the single source of truth for quality gates.
0. **Phase 0 — Common Deviation Scan**: Known deviation patterns with WRONG→CORRECT examples
1. **Phase 1 — Per-File Structural Checks**: Section order, required content, formatting
2. **Phase 2 — Diagram Rendering Checks**: Mermaid init blocks, classDef, styles, syntax
3. **Phase 3 — Cross-File Consistency Checks**: Component coverage, DF mapping, threat-to-finding traceability
4. **Phase 4 — Evidence Quality Checks**: Evidence concreteness, verify-before-flagging compliance
5. **Phase 5 — JSON Schema Validation**: Schema fields, array completeness, metrics consistency
6. **Phase 6 — Deterministic Identity**: Component ID stability, boundary naming, flow ID consistency
7. **Phase 7 — Evidence-Based Prerequisites**: Prerequisite deployment evidence, coverage completeness
8. **Phase 8 — Comparison HTML** (incremental only): HTML structure, change annotations, CSS
**Inline Quick-Checks:** `verification-checklist.md` also contains Inline Quick-Checks that MUST be run immediately after writing each file (before Step 10). These catch errors while content is still in active context.
**Two-pass usage:**
- **Before writing (Workflow pre-work):** Scan Phase 1 and Phase 2 to internalize structural and diagram quality gates. This prevents rework.
- **After writing (Step 10):** Run ALL checklist phases comprehensively against the completed output. Phase 0 is the most critical — it catches the deviations that persist across runs. Fix any failures before finalizing.
**Delegation:** Hand the verification sub-agent the content of `verification-checklist.md` and the output folder. It will run all checks and produce a PASS/FAIL summary. Fix any failures before finalizing.
---
## Starting the Analysis
If no folder path is provided, analyze the entire repository from its root.