Files
awesome-copilot/skills/threat-model-analyst/references/orchestrator.md
Vijay Chegu afba5b86b8 Add threat-model-analyst skill: STRIDE-A threat modeling for repositories (#1177)
* Add threat-model-analyst skill: STRIDE-A threat modeling for repositories

Add a comprehensive threat model analysis skill that performs security audits
using STRIDE-A (STRIDE + Abuse) threat modeling, Zero Trust principles, and
defense-in-depth analysis.

Supports two modes:
- Single analysis: full STRIDE-A threat model producing architecture overviews,
  DFD diagrams, prioritized findings, and executive assessments
- Incremental analysis: security posture diff between baseline report and current
  code, producing standalone reports with embedded comparison

Includes bundled reference assets:
- Orchestrator workflows (full and incremental)
- Analysis principles and verification checklists
- Output format specifications and skeleton templates
- DFD diagram conventions and TMT element taxonomy

* Address PR review comments from Copilot reviewer

- Fix SKILL.md description: use single-quoted scalar, rename mode (2) to
  'Incremental analysis' with accurate description
- Replace 'Compare Mode (Deprecated)' sections with 'Comparing Commits or
  Reports' redirect (no deprecated language for first release)
- Fix skeleton-findings.md: move Tier 1 table rows under header, add
  CONDITIONAL-EMPTY block after END-REPEAT (matching Tier 2/3 structure)
- Fix skeleton-threatmodel.md and skeleton-architecture.md: use 4-backtick
  outer fences to avoid nested fence conflicts with inner mermaid fences
- Fix skeleton-incremental-html.md: correct section count from 9 to 8
- Fix output-formats.md: change status 'open' to 'Open' in JSON example,
  move stride_category warning outside JSON fence as blockquote
- Fix incremental-orchestrator.md: replace stale compare-output-formats.md
  reference with inline color conventions
- Regenerate docs/README.skills.md with updated description

* Address second round of Copilot review comments

- Fix diagram-conventions.md: bidirectional flow notation now uses <-->
  matching orchestrator.md and DFD templates
- Fix tmt-element-taxonomy.md: normalize SE.DF.SSH/LDAP/LDAPS to use
  SE.DF.TMCore.* prefix consistent with all other data flow IDs
- Fix output-formats.md: correct TMT category example from SQLDatabase
  to SQL matching taxonomy, fix component type from 'datastore' to
  'data_store' matching canonical enum, remove DaprSidecar from
  inbound_from per no-standalone-sidecar rule
- Fix 5 skeleton files: clarify VERBATIM instruction to 'copy the
  template content below (excluding the outer code fence)' to prevent
  agents from wrapping output in markdown fences
- Genericize product-specific names in examples: replace edgerag with
  myapp, BitNetManager with TaskProcessor, AzureLocalMCP with MyApp.Core,
  AzureLocalInfra with OnPremInfra, MilvusVectorDB with VectorDB

* Address third round of Copilot review comments

- Fix diagram-conventions.md: second bidirectional two-arrow pattern in
  Quick Reference section now uses <-->
- Fix incremental-orchestrator.md: renumber HTML sections 5-9 to 4-8
  matching skeleton-incremental-html.md 8-section structure
- Fix output-formats.md: add incremental-comparison.html to File List
  as conditional output for incremental mode
- Fix skeleton-inventory.md: add tmt_type, sidecars, and boundary_kind
  fields to match output-formats.md JSON schema example
2026-03-30 07:58:56 +11:00

54 KiB
Raw Blame History

Orchestrator — Threat Model Analysis Workflow

This file contains the complete orchestration logic for performing a threat model analysis. It is the primary workflow document for the /threat-model-analyst skill.

Context Budget — Read Files Selectively

Do NOT read all 10 skill files at session start. Read only what each phase needs. This preserves context window for the actual codebase analysis.

Phase 1 (context gathering): Read this file (orchestrator.md) + analysis-principles.md + tmt-element-taxonomy.md Phase 2 (writing reports): Read the relevant skeleton from skeletons/ BEFORE writing each file. Read output-formats.md + diagram-conventions.md for rules — but use the skeleton as the structural template.

  • Before 0.1-architecture.md: read skeletons/skeleton-architecture.md
  • Before 1.1-threatmodel.mmd: read skeletons/skeleton-dfd.md
  • Before 1-threatmodel.md: read skeletons/skeleton-threatmodel.md
  • Before 2-stride-analysis.md: read skeletons/skeleton-stride-analysis.md
  • Before 3-findings.md: read skeletons/skeleton-findings.md
  • Before 0-assessment.md: read skeletons/skeleton-assessment.md
  • Before threat-inventory.json: read skeletons/skeleton-inventory.md
  • Before incremental-comparison.html: read skeletons/skeleton-incremental-html.md Phase 3 (verification): Delegate to a sub-agent and include verification-checklist.md in the sub-agent prompt. The sub-agent reads the full checklist with a fresh context window — the parent agent does NOT need to read it.

Key principle: Sub-agents get fresh context windows. Delegate verification and JSON generation to sub-agents rather than keeping everything in the parent context.


Mandatory Rules — READ BEFORE STARTING

These are the required behaviors for every threat model report. Follow each rule exactly:

  1. Organize findings by Exploitability Tier (Tier 1/2/3), never by severity level
  2. Split each component's STRIDE table into Tier 1, Tier 2, Tier 3 sub-sections
  3. Include Exploitability Tier and Remediation Effort on every finding — both are MANDATORY
  4. STRIDE summary table MUST include T1, T2, T3 columns 4b. STRIDE + Abuse Cases categories are exactly: Spoofing, Tampering, Repudiation, Information Disclosure, Denial of Service, Elevation of Privilege, Abuse (business logic abuse, workflow manipulation, feature misuse — an extension to standard STRIDE covering misuse of legitimate features). The A is ALWAYS "Abuse" — NEVER "AI Safety", "Authorization", or any other interpretation. Authorization issues belong under E (Elevation of Privilege).
  5. .md files: start with # Heading on line 1. The create_file tool writes raw content — no code fences
  6. .mmd files: start with %%{init: on line 1. Raw Mermaid source, no fences
  7. Section MUST be titled exactly ## Action Summary. Include ### Quick Wins subsection with Tier 1 low-effort findings table
  8. K8s sidecars: annotate host container with <br/>+ Sidecar — never create separate sidecar nodes (see diagram-conventions.md Rule 1)
  9. Intra-pod localhost flows are implicit — do NOT draw them in diagrams
  10. Action Summary IS the recommendations — no separate ### Key Recommendations section
  11. Include > **Note on threat counts:** blockquote in Executive Summary
  12. Every finding MUST have CVSS 4.0 (score AND full vector string), CWE (with hyperlink), and OWASP (:2025 suffix)
  13. OWASP suffix is always :2025 (e.g., A01:2025 Broken Access Control)
  14. Include Threat Coverage Verification table at end of 3-findings.md mapping every threat → finding
  15. Every component in 0.1-architecture.md MUST appear in 2-stride-analysis.md
  16. First 3 scenarios in 0.1-architecture.md MUST have Mermaid sequence diagrams
  17. 0-assessment.md MUST include ## Analysis Context & Assumptions with ### Needs Verification and ### Finding Overrides tables
  18. ### Quick Wins subsection is REQUIRED under Action Summary (include heading with note if none)
  19. ALL 7 sections in 0-assessment.md are MANDATORY: Report Files, Executive Summary, Action Summary, Analysis Context & Assumptions, References Consulted, Report Metadata, Classification Reference
  20. Deployment Classification is BINDING. In 0.1-architecture.md, set Deployment Classification and fill the Component Exposure Table. If classification is LOCALHOST_DESKTOP or LOCALHOST_SERVICE: zero T1 findings, zero Prerequisites = None, zero AV:N for non-listener components. See analysis-principles.md Deployment Context table.
  21. Finding IDs MUST be sequential top-to-bottom: FIND-01, FIND-02, FIND-03... Renumber after sorting
  22. CWE MUST include hyperlink: [CWE-306](https://cwe.mitre.org/data/definitions/306.html): Missing Authentication
  23. After STRIDE, run the Technology-Specific Security Checklist in analysis-principles.md. Every technology in the repo needs at least one finding or documented mitigation
  24. CVSS AV:L or PR:H → finding CANNOT be Tier 1. Downgrade to T2/T3. See CVSS-to-Tier Consistency Check in analysis-principles.md
  25. Use only Low/Medium/High effort labels. NEVER generate time estimates, sprint phases, or scheduling. See Prohibited Content in output-formats.md
  26. References Consulted: use the exact two-subsection format from output-formats.md### Security Standards (3-column table with full URLs) and ### Component Documentation (3-column table with URLs)
  27. Report Metadata: include ALL fields from output-formats.md template — Model, Analysis Started, Analysis Completed, Duration. Run Get-Date -Format "yyyy-MM-dd HH:mm:ss" -AsUTC at Step 1 and before writing 0-assessment.md
  28. ## Summary table in 2-stride-analysis.md MUST appear at the TOP, immediately after ## Exploitability Tiers, BEFORE individual component sections
  29. Related Threats: every threat ID MUST be a hyperlink to 2-stride-analysis.md#component-anchor. Format: [T02.S](2-stride-analysis.md#component-name)
  30. Diagram colors: copy classDef lines VERBATIM from diagram-conventions.md. Only allowed fills: #6baed6 (process), #fdae61 (external), #74c476 (datastore). Only allowed strokes: #2171b5, #d94701, #238b45, #e31a1c. Use ONLY %%{init: {'theme': 'base', 'themeVariables': { 'background': '#ffffff', 'primaryColor': '#ffffff', 'lineColor': '#666666' }}}%% — no other themeVariables keys
  31. Summary DFD: after creating 1.1-threatmodel.mmd, run the POST-DFD GATE in Step 4. The gate and skeleton-summary-dfd.md control whether 1.2-threatmodel-summary.mmd is generated.
  32. Report Files table in 0-assessment.md: list 0-assessment.md (this document) as the FIRST row, followed by 0.1-architecture.md, 1-threatmodel.md, etc. Use the exact template from output-formats.md
  33. threat-inventory.json MUST be generated for every analysis run (Step 8b). This file enables future comparisons. See output-formats.md for schema.
  34. NEVER delete, modify, or remove any existing threat-model-* or threat-model-compare-* folders in the repository. Only write to your own timestamped output folder. Cleaning up temporary git worktrees you created is allowed; deleting other report folders is FORBIDDEN.

Rule Precedence (when guidance conflicts)

Apply rules in this order:

  1. Literal skeletons in skeletons/skeleton-*.md — exact section/table headers and attribute rows
  2. Mandatory Rules in orchestrator.md (this list)
  3. Examples in output-formats.md (examples are illustrative, not authoritative when they differ from literal skeletons)

If any conflict is detected, follow the highest-precedence item.

Post-generation: The verification sub-agent will scan your output for all known deviations listed in verification-checklist.md Phase 0. Fix any failures before finalizing.


Workflow

Exclusions: Skip these directories:

  • threat-model-* (previous reports)
  • node_modules, .git, dist, build, vendor, __pycache__

Pre-work: Before writing any output file, scan verification-checklist.md Phase 1 (Per-File Structural Checks) and Phase 2 (Diagram Rendering Checks). This internalizes the quality gates so output is correct on the first pass — preventing costly rework. Do NOT run the full verification yet; that happens in Step 10.

Sub-Agent Governance (MANDATORY — prevents duplicate work)

Sub-agents are independent execution contexts — they have no memory of the parent's state, instructions, or other sub-agents. Without strict governance, sub-agents will independently perform the ENTIRE analysis, creating duplicate report folders and wasting ~15 min compute + ~100K tokens per duplication.

Rule 1 — Parent owns ALL file creation. The parent agent is the ONLY agent that calls create_file for report files (0.1-architecture.md, stride-analysis.md, findings.md, etc.). Sub-agents NEVER write report files.

Rule 2 — Sub-agents are READ-ONLY helpers. Sub-agents may:

  • Search source code for specific patterns (e.g., "find all auth-related code")
  • Read and analyze files, then return structured data to the parent
  • Run verification checks and return PASS/FAIL results
  • Execute terminal commands (git diff, grep) and return output

Rule 3 — Sub-agent prompts must be NARROW and SPECIFIC. Never tell a sub-agent to "perform threat model analysis" or "generate the report." Instead:

  • "Read these 5 Go files and list every function that handles credentials. Return a table of function name, file, line number."
  • "Run the verification checklist against the files in {folder}. Return PASS/FAIL for each check."
  • "Read threat-inventory.json from {path} and verify all array lengths match metrics. Return mismatches."
  • "Analyze this codebase and write the threat model files."
  • "Generate 0.1-architecture.md and stride-analysis.md for this component."

Rule 4 — Output folder path. The parent creates the timestamped output folder in Step 1 and uses that exact path for ALL create_file calls. If a sub-agent needs to read previously written report files, pass the folder path in the sub-agent prompt.

Rule 5 — The ONLY exception is threat-inventory.json generation (Step 8b), where the parent MAY delegate JSON writing to a sub-agent IF the data is too large. In that case, the sub-agent prompt MUST include: (a) the exact output file path, (b) the data to serialize, and (c) explicit instruction: "Write ONLY this one file. Do NOT create any other files or folders."

Steps

  1. Record start time & gather context

    • Run Get-Date -Format "yyyy-MM-dd HH:mm:ss" -AsUTC and store as START_TIME
    • Get git info: git remote get-url origin, git branch --show-current, git rev-parse --short HEAD, git log -1 --format="%ai" HEAD (commit date — NOT today's date), hostname
    • Map the system: identify components, trust boundaries, data flows
    • Reference: analysis-principles.md for security infrastructure inventory

    DEPLOYMENT CLASSIFICATION (MANDATORY — do this BEFORE analyzing code for threats): Determine the system's deployment class from code evidence (see skeleton-architecture.md for values). Record in 0.1-architecture.md → Deployment Model section. Then fill the Component Exposure Table — one row per component showing listen address, auth barrier, external reachability, and minimum prerequisite. This table is the single source of truth for prerequisite floors. No threat or finding may have a lower prerequisite than what the exposure table permits for its component.

    DETERMINISTIC NAMING — Apply BEFORE writing any files:

    When identifying components, assign each a canonical PascalCase id. The naming MUST be deterministic — two independent runs on the same codebase MUST produce the same component IDs.

    ABSOLUTE RULE: Every component ID MUST be anchored to a real code artifact. For every component you identify, you MUST be able to point to a specific class, file, or manifest in the codebase that is the "anchor" for that component. If no such artifact exists, the component does not exist.

    Naming procedure (follow IN ORDER — stop at the first match):

    1. Primary class name — Use the EXACT class name from the source code. Do NOT abbreviate, expand, or rephrase it.
      • TaskProcessor.csTaskProcessor (NOT TaskServer, NOT TaskService)
      • SessionStore.csSessionStore (NOT FileSessionStore, NOT SessionService)
      • TerminalUserInterface.csTerminalUserInterface (NOT TerminalUI)
      • PowerShellCommandExecutor.csPowerShellCommandExecutor (NOT PowerShellExecutor)
      • ResponsesAPIService.csResponsesAPIService (NOT LLMService — that's a DIFFERENT class)
      • MCPHost.csMCPHost (NOT OrchestrationHost)
    2. Primary script nameImport-Images.ps1ImportImages
    3. Primary config/manifest nameDockerfileDockerContainer, values.yamlHelmChart
    4. Directory name (if component spans multiple files) → src/ParquetParsing/ParquetParser
    5. Technology name (for external services/datastores) → "Azure OpenAI" → AzureOpenAI, "Redis" → Redis
    6. External actor roleOperator, EndUser (never drop these)

    Helm/Kubernetes Deployment Naming (CRITICAL for comparison stability): When a component is deployed via Helm chart or Kubernetes manifests, use the Kubernetes workload name (from the Deployment/StatefulSet metadata.name) as the component ID — NOT the Helm template filename or directory structure:

    • Look at metadata.name in deployment YAML → use that as the component ID (PascalCase normalized)
    • Example: metadata.name: devportal in templates/knowledge/devportal-deployment.yml → component ID is DevPortal
    • Example: metadata.name: phi-model in templates/knowledge/phi-deployment.yml → component ID is PhiModel
    • Why: Helm templates frequently get reorganized (e.g., moved from templates/ to templates/knowledge/) but the Kubernetes workload name stays the same. Using the workload name ensures the component ID survives directory reorganizations.
    • source_files MUST include the deployment YAML path AND the application source code path (e.g., both helmchart/myapp/templates/knowledge/devportal-deployment.yml AND developer-portal/src/)
    • source_directories MUST include BOTH the Helm template directory AND the source code directory

    External Service Anchoring (for components without repo source code): External services (cloud APIs, managed databases, SaaS endpoints) don't have source files in the repository. Anchor them to their integration point in the codebase:

    • source_files → the client class or config file that defines the connection (e.g., src/MCP/appsettings.json for Azure OpenAI connection config, helmchart/values.yaml for Redis endpoint config)
    • source_directories → the directory containing the integration code (e.g., src/MCP/Core/Services/LLM/ for the LLM client)
    • class_names → the CLIENT class in YOUR repo that talks to the service (e.g., ResponsesAPIService), NOT the vendor's SDK class (e.g., NOT OpenAIClient). If no dedicated client class exists, leave empty.
    • namespace → leave empty "" (external services don't have repo namespaces)
    • config_keys → the env vars / config keys for the service connection (e.g., ["AZURE_OPENAI_ENDPOINT", "RESPONSES_API_DEPLOYMENT"]). These are the most stable anchors for external services.
    • api_routes → leave empty (external services expose their own routes, not yours)
    • dependencies → the SDK package used (e.g., ["Azure.AI.OpenAI"] for NuGet, ["pymilvus"] for pip)

    Why this matters: External services frequently change display names across LLM runs (e.g., "Azure OpenAI" vs "GPT-4 Endpoint" vs "LLM Backend"). The config_keys and dependencies fields are what make them matchable across runs.

    FORBIDDEN naming patterns — NEVER use these:

    • NEVER invent abstract names that don't correspond to a real class: ConfigurationStore, LocalFileSystem, DataLayer, IngestionPipeline, BackendServer
    • NEVER abbreviate a class name: TerminalUI for TerminalUserInterface, PSExecutor for PowerShellCommandExecutor
    • NEVER substitute a synonym: TaskServer for TaskProcessor, LLMService for ResponsesAPIService
    • NEVER merge two separate classes into one component: ResponsesAPIService and LLMService are two different classes → two different components
    • NEVER create a component for something that doesn't exist in the code: if there's no Windows Registry access code, don't create a WindowsRegistry component
    • NEVER rename between runs: if you called it TaskProcessor in run 1, it MUST be TaskProcessor in run 2

    COMPONENT ANCHOR VERIFICATION (MANDATORY — do this BEFORE Step 2): After identifying all components, create a mental checklist:

    For EACH component:
      Q: What is the EXACT filename or class that anchors this component?
      A: [must cite a real file path, e.g., "src/Core/TaskProcessor.cs"]
      If you cannot cite a real file → DELETE the component from your list
    

    This verification catches invented components like WindowsRegistry (no registry code exists), ConfigurationStore (no such class), LocalFileSystem (abstract concept, not a class).

    COMPONENT SELECTION STABILITY (when multiple related classes exist): Many systems have clusters of related classes (e.g., CredentialManager, AzureCredentialProvider, AzureAuthenticationHandler). To ensure deterministic selection:

    • Pick the class that OWNS the security-relevant behavior — the one that makes the trust decision, holds the credential, or processes the data
    • Prefer the class registered in dependency injection over helpers/utilities
    • Prefer the higher-level orchestrator over its internal implementation classes
    • Once you pick a class, its alternatives become aliases — add them to the aliases array, not as separate components
    • Example: If CredentialManager orchestrates credential lookup and uses AzureCredentialProvider internally, CredentialManager is the component and AzureCredentialProvider is an alias
    • Example: Do NOT include both SessionStore and SessionFilesSessionStore is the class, SessionFiles is an abstract concept
    • Count rule: Two runs on the same code MUST produce the same number of components (±1 for edge cases). A difference of ≥3 components indicates the selection rules were not followed.

    STABILITY ANCHORS (for comparison matching): When recording each component in threat-inventory.json, the fingerprint fields source_directories, class_names, and namespace serve as stability anchors — immutable identifiers that persist even when:

    • The class is renamed (directory stays the same)
    • The file is moved to a different directory (class name stays the same)
    • The component ID changes between analysis runs (namespace stays the same) The comparison matching algorithm relies on these anchors MORE than on the component id field. Therefore:
    • source_directories MUST be populated for every process-type component (never empty [])
    • class_names MUST include at least the primary class name
    • namespace MUST be the actual code namespace (e.g., MyApp.Core.Servers.Health), not a made-up grouping
    • These fields are what make a component identifiable across independent analysis runs, even if two LLMs pick different display names

    COMPONENT ELIGIBILITY — What qualifies as a threat model component: A class/service becomes a threat model component ONLY if it meets ALL of these criteria:

    1. It crosses a trust boundary OR handles security-sensitive data (credentials, user input, network I/O, file I/O, process execution)
    2. It is a top-level service, not an internal helper (registered in DI, or the main entry point, or an agent with its own responsibility)
    3. It would appear in a deployment diagram — you could point to it and say "this runs here, talks to that"

    ALWAYS include these component types (if they exist in the code):

    • ALL agent classes (HealthAgent, InfrastructureAgent, InvestigatorAgent, SupportabilityAgent, etc.)
    • ALL MCP server classes (HealthServer, InfrastructureServer, etc.)
    • The main host/orchestrator (MCPHost, etc.)
    • ALL external service connections (AzureOpenAI, AzureAD, etc.)
    • ALL credential/auth managers
    • The user interface entry point
    • ALL tool execution services (PowerShellCommandExecutor, etc.)
    • ALL session/state persistence services
    • ALL LLM service classes (ResponsesAPIService, LLMService — if they are separate classes, they are separate components)
    • External actors (Operator, EndUser)

    NEVER include these as separate components:

    • Loggers (LocalFileLogger, TelemetryLogger) — these are cross-cutting concerns, not threat model components
    • Static helper classes
    • Model/DTO classes
    • Configuration builders (unless they handle secrets)
    • Infrastructure-as-code classes that don't exist at runtime (AzureStackHCI cluster reference, deployment scripts)

    The goal: Every run on the same code should identify the SAME set of ~12-20 components. If you're including a logger or excluding an agent, you're doing it wrong.

    Boundary naming rules:

    • Boundary IDs MUST be PascalCase (never Layer, Zone, Group, Tier suffixes)
    • Derive from deployment topology, NOT from code architecture layers
    • Deployment topology determines boundaries:
      • Single-process app → EXACTLY 2 boundaries: Application (the process) + External (external services). NEVER use 1 boundary. NEVER use 3+ boundaries. This is mandatory for single-process apps.
      • Multi-container app → boundaries per container/pod
      • K8s deployment → K8sCluster + per-namespace boundaries if relevant
      • Client-server → Client + Server
    • K8s multi-service deployments (CRITICAL for microservice architectures): When a K8s namespace contains multiple Deployments/StatefulSets with DIFFERENT security characteristics, create sub-boundaries based on workload type:
      • BackendServices — API services (FastAPI, Express, etc.) that handle user requests
      • DataStorage — Databases and persistent storage (Redis, Milvus, PostgreSQL, NFS) — these have different access controls, persistence, and backup policies
      • MLModels — ML model servers running on GPU nodes — these have different compute resources, attack surfaces (adversarial inputs), and scaling characteristics
      • Agentic — Agent runtime/manager services if present
      • The outer K8sCluster contains these sub-boundaries
      • This is NOT "code layers" — each sub-boundary represents a different Kubernetes Deployment/StatefulSet with its own security context, resource limits, and network policies
      • Test: If two components are in DIFFERENT Kubernetes Deployments with different service accounts, different network exposure, or different resource requirements → they SHOULD be in different sub-boundaries
    • FORBIDDEN boundary schemes (for SINGLE-PROCESS apps only):
      • Do NOT create boundaries based on code layers: PresentationBoundary, OrchestrationBoundary, AgentBoundary, ServiceBoundary are CODE LAYERS, not deployment boundaries. All these run in the SAME process.
      • Do NOT split a single process into 4+ boundaries. If all components run in one .exe, they are in ONE boundary.
    • Example: An application where TerminalUserInterface, MCPHost, HealthAgent, ResponsesAPIService all run in the same process → they are ALL in Application. External services like AzureOpenAI are in External.
    • Two runs on the same code MUST produce the same number of boundaries (±1). A difference of ≥2 boundaries is WRONG.
    • NEVER create boundaries based on code layers (Presentation/Business/Data) — boundaries represent DEPLOYMENT trust boundaries, not code architecture

    Boundary count locking:

    • After identifying boundaries, LOCK the count. Two runs on the same code MUST produce the same number of boundaries (±1 acceptable if one run identifies an edge boundary the other doesn't)
    • A 4-boundary vs 7-boundary difference on the same code is WRONG and indicates the naming rules were not followed

    Additional naming rules:

    • The SAME component must get the SAME id regardless of which LLM model runs the analysis or how many times it runs
    • External actors (Operator, AzureDataStudio, etc.) are ALWAYS included — never drop them
    • Datastores representing distinct storage (files, database) are ALWAYS separate components — never merge them
    • Lock the component list before Step 2. Use these exact IDs in ALL subsequent files (architecture, DFD, STRIDE, findings, JSON)
    • If two classes exist as separate files (e.g., ResponsesAPIService.cs and LLMService.cs), they are TWO components even if they seem related

    DATA FLOW COMPLETENESS (MANDATORY — ensures consistent flow enumeration across runs): Data flows MUST be enumerated exhaustively. Two independent analyses of the same codebase MUST produce the same set of flows. To achieve this:

    RETURN FLOW MODELING RULE (addresses 24% variance in flow counts):

    • DO NOT model separate return flows. A request-response pair is ONE bidirectional flow (use <--> in Mermaid).
    • Example: DF01: Operator <--> TUI (one flow for input and output)
    • Example: DF03: MCPHost <--> HealthAgent (one flow for delegation and result)
    • DO model separate flows ONLY when the two directions use different protocols or semantics (e.g., HTTP request vs WebSocket push-back).
    • Why: When runs independently decide whether to create 1 flow or 2 flows per interaction, the flow count varies by 20-30%. This rule eliminates that variance.
    • Flow count formula: # flows ≈ # unique component-to-component interactions. If component A talks to component B, that is 1 flow, not 2.

    Flow completeness checklist (use <--> bidirectional flows per the return flow rule above):

    1. Ingress/reverse proxy flows: DF_EndUser_to_NginxIngress (bidirectional <-->), DF_NginxIngress_to_Backend (bidirectional <-->). Each is ONE flow, not two.
    2. Database/datastore flows: DF_Service_to_Redis (bidirectional <-->). ONE flow per service-datastore pair.
    3. Auth provider flows: DF_Service_to_AzureAD (bidirectional <-->). ONE flow per service-auth pair.
    4. Admin access flows: DF_Operator_to_Service (bidirectional <-->). ONE per admin interaction.
    5. Flow count locking: After enumerating flows, LOCK the count. Two runs on the same code MUST produce the same number of flows (±3 acceptable). A difference of >5 flows indicates incomplete enumeration.

    EXTERNAL ENTITY INCLUSION RULES (addresses variance in which externals are modeled):

    • ALWAYS include AzureAD (or EntraID) as an external entity if the code acquires tokens from Azure AD / Microsoft Entra ID (look for ChainedTokenCredential, ManagedIdentityCredential, AzureCliCredential, MSAL, or any OAuth2/OIDC flow).
    • ALWAYS include the infrastructure target (e.g., OnPremInfra, HCICluster) as an external entity if the code sends commands to external infrastructure via PowerShell, REST, or WMI.
    • ALWAYS include AzureOpenAI (or equivalent LLM endpoint) if the code calls a cloud LLM API.
    • ALWAYS include Operator as an external actor for CLI/TUI tools, admin tools, or operator consoles.
    • Rule of thumb: If the code has a client class or config for a service, that service is an external entity.

    TMT CATEGORY RULES (addresses category inconsistency across runs):

    • Tool servers that expose APIs callable by agents → SE.P.TMCore.WebSvc (NOT SE.P.TMCore.NetApp)
    • Network-level services that handle connections/sockets → SE.P.TMCore.NetApp
    • Services that execute OS commands (PowerShell, bash) → SE.P.TMCore.OSProcess
    • Services that store data to disk (SessionStore, FileLogger) → SE.DS.TMCore.FS (classify as Data Store, NOT Process)
    • Rule: If a class's primary purpose is persisting data, it is a Data Store. If it does computation or orchestration, it is a Process. Never switch between runs.

    DFD DIRECTION (MANDATORY — addresses layout variance):

    • ALL DFDs MUST use flowchart LR (left-to-right). NEVER use flowchart TB.
    • ALL summary DFDs MUST also use flowchart LR.
    • This is immutable — do not change based on aesthetics or diagram shape.

    Acronym rules for PascalCase:

    • Preserve well-known acronyms as ALL-CAPS: API, NFS, LLM, SQL, HCI, AD, UI, DB
    • Examples: IngestionAPI (not IngestionApi), NFSServer (not NfsServer), AzureAD (not AzureAd), VectorDBAPI (not VectorDbApi)
    • Single-word technologies keep standard casing: Redis, Milvus, PostgreSQL, Nginx

    Common technology naming (use EXACTLY these IDs for well-known infrastructure):

    • Redis cache/state: Redis (never DaprStateStore, RedisCache, StateStore)
    • Milvus vector DB: Milvus (never MilvusVectorDb, VectorDB)
    • NGINX ingress: NginxIngress (never IngressNginx)
    • Azure AD/Entra: AzureAD (never AzureAd, EntraID)
    • PostgreSQL: PostgreSQL (never PostgresDb, Postgres)
    • User/Operator: Operator for admin users, EndUser for end users
    • Azure OpenAI: AzureOpenAI (never OpenAIService, LLMEndpoint)
    • NFS: NFSServer (never NfsServer, FileShare)
    • If two LLM models are separate deployments, keep them separate (never merge MistralLLM + PhiLLM into LocalLlm)

    BUT: for application-specific classes, use the EXACT class name from the code, NOT a technology label:

    • ResponsesAPIService.csResponsesAPIService (NOT OpenAIService — the class IS named ResponsesAPIService)
    • TaskProcessor.csTaskProcessor (NOT LocalLLM — the class IS named TaskProcessor)
    • SessionStore.csSessionStore (NOT StatePersistence — the class IS named SessionStore) Component granularity rules (CRITICAL for stability):
    • Model components at the technology/service level, not the script/file level
    • A Docker container running Kusto is KustoContainer — NOT decomposed into KustoService + IngestLogs + KustoDataDirectory
    • A Moby Docker engine is MobyDockerEngine — NOT InstallMoby (the installer script is evidence, not the component)
    • An installer for a tool is SetupInstaller — NOT renamed to InstallAzureEdgeDiagnosticTool (script filename)
    • Rule: if a component has one primary function (e.g., "run Kusto queries"), model it as ONE component regardless of how many scripts/files implement it
    • Scripts are EVIDENCE for components, not components themselves
    • Keep the same granularity across runs — never split a single component into sub-components or merge sub-components between runs

    COMPONENT ID FORMAT (MANDATORY — addresses casing variance):

    • ALL component IDs MUST be PascalCase. NEVER use kebab-case, snake_case, or camelCase.
    • Examples: HealthAgent (not health-agent), AzureAD (not azure-ad), MCPHost (not mcp-host)
    • This applies to ALL artifacts: 0.1-architecture.md, 1-threatmodel.md, DFD mermaid, STRIDE, findings, JSON.

    STRIDE SCOPE RULE (addresses external entity analysis variance):

    • STRIDE analysis in 2-stride-analysis.md MUST include sections for ALL elements in the Element Table EXCEPT external actors (Operator, EndUser).
    • External services (AzureOpenAI, AzureAD, OnPremInfra) DO get STRIDE sections — they are attack surfaces from YOUR system's perspective.
    • External actors (human users) do NOT get STRIDE sections — they are threat SOURCES, not targets.
    • This means: if you have 20 elements total and 1 is an external actor, you write 19 STRIDE sections.

    STRIDE DEPTH CONSISTENCY (addresses threat count variance):

    • Each component MUST get ALL 7 STRIDE-A categories analyzed (S, T, R, I, D, E, A).
    • Each STRIDE category MUST be explicitly addressed per component: either with one or more concrete threats, OR with an explicit N/A — {1-sentence justification} row explaining why that category does not apply to this specific component.
    • A category may produce 0, 1, 2, 3, or more threats — the count depends on the component's actual attack surface. Do NOT cap at 1 threat per category. Components with rich security surfaces (API services, auth managers, command executors, LLM clients) should typically have 2-4 threats per relevant STRIDE category. Only simple components (static config, read-only data stores) should have mostly 0-1.
    • Expected distribution: For a 15-component system: ~30% of STRIDE cells should be 0 (with N/A), ~40% should be 1, ~25% should be 2, ~5% should be 3+. If ALL cells are 0 or 1 (binary pattern) → the analysis is too shallow. Go back and identify additional threat vectors.
    • N/A entries do NOT count toward threat totals in the Summary table. Only concrete threat rows count.
    • The Summary table S/T/R/I/D/E/A columns show the COUNT of concrete threats per category (0 is valid if N/A was justified).
    • This ensures comprehensive coverage while producing accurate, non-inflated threat counts.
  2. Write architecture overview (0.1-architecture.md)

    • Read skeletons/skeleton-architecture.md first — copy skeleton structure, fill [FILL] placeholders
    • System purpose, key components, top scenarios, tech stack, deployment
    • Use the exact component IDs locked in Step 1 — do not rename or merge components
    • Reference: output-formats.md for template, diagram-conventions.md for architecture diagram styles
  3. Inventory security infrastructure

    • Identify security-enabling components before flagging gaps
    • Reference: analysis-principles.md Security Infrastructure Inventory table
  4. Produce threat model DFD (1.1-threatmodel.mmd, 1.2-threatmodel-summary.mmd, 1-threatmodel.md)

    • Read skeletons/skeleton-dfd.md, skeletons/skeleton-summary-dfd.md, and skeletons/skeleton-threatmodel.md first
    • Reference: diagram-conventions.md for DFD styles, tmt-element-taxonomy.md for element classification
    • ⚠️ BEFORE FINALIZING: Run the Pre-Render Checklist from diagram-conventions.md

    POST-DFD GATE — Run IMMEDIATELY after creating 1.1-threatmodel.mmd:

    1. Count elements (nodes with ((...)), [(...), ["..."]) in 1.1-threatmodel.mmd
    2. Count boundaries (subgraph lines)
    3. If elements > 15 OR boundaries > 4: → You MUST create 1.2-threatmodel-summary.mmd using skeleton-summary-dfd.md NOW → Do NOT proceed to 1-threatmodel.md until the summary file exists
    4. If threshold NOT met → skip summary, proceed to 1-threatmodel.md
    5. Create 1-threatmodel.md (include Summary View section if summary was generated)
  5. Enumerate threats per element and flow using STRIDE-A (2-stride-analysis.md)

    • Read skeletons/skeleton-stride-analysis.md first — use Summary table and per-component structure
    • Reference: analysis-principles.md for tier definitions, output-formats.md for STRIDE template
    • PREREQUISITE FLOOR CHECK (per threat): Before assigning a prerequisite to any threat, look up the component's Min Prerequisite and Derived Tier in the Component Exposure Table (0.1-architecture.md). The threat's prerequisite MUST be ≥ the component's floor. The threat's tier MUST be ≥ the component's derived tier (i.e., if component is T2, no threat can be T1). Use the canonical prerequisite→tier mapping from analysis-principles.md.
  6. For each threat: cite files/functions/endpoints, propose mitigations, provide verification steps

  7. Verify findings — confirm each finding against actual configuration before documenting

    • Reference: analysis-principles.md Finding Validation Checklist

7b. Technology sweep — Run the Technology-Specific Security Checklist from analysis-principles.md

  • For every technology found in the repo (Redis, Milvus, PostgreSQL, Docker, K8s, ML models, LLMs, NFS, CI/CD, etc.), verify you have at least one finding or explicit mitigation
  • This step catches gaps that component-level STRIDE misses (e.g., database auth defaults, container hardening, key management)
  • Add any missing findings before proceeding to Step 8
  1. Compile findings (3-findings.md)

    • Reference: output-formats.md for findings template and Related Threats link format
    • Reference: skeletons/skeleton-findings.md — read this skeleton, copy VERBATIM, fill in [FILL] placeholders for each finding

    PRE-WRITE GATE — Verify before calling create_file for 3-findings.md:

    1. Finding IDs: ### FIND-01:, ### FIND-02: — sequential, FIND- prefix (NOT F01 or F-01)
    2. CVSS prefix: every vector starts with CVSS:4.0/ (NOT bare AV:N/AC:L/...)
    3. Related Threats: each threat ID is a separate hyperlink [TNN.X](2-stride-analysis.md#anchor) (NOT plain text)
    4. Sub-sections: #### Description, #### Evidence, #### Remediation, #### Verification (NOT Recommendation)
    5. Sort: within each tier → Critical → Important → Moderate → Low → higher CVSS first
    6. All 10 mandatory attribute rows present per finding
    7. Deployment context gate (FAIL-CLOSED): Read 0.1-architecture.md Deployment Classification and Component Exposure Table. If classification is LOCALHOST_DESKTOP or LOCALHOST_SERVICE:
      • ZERO findings may have Exploitation Prerequisites = None → fix to Local Process Access (T2) or Host/OS Access (T3)
      • ZERO findings may be in ## Tier 1 → downgrade to T2/T3 based on prerequisite
      • ZERO CVSS vectors may use AV:N unless the specific component has Reachability = External in the Component Exposure Table → fix to AV:L For ALL deployment classifications:
      • For EACH finding, look up its Component in the exposure table. The finding's prerequisite MUST be ≥ the component's Min Prerequisite. The finding's tier MUST be ≥ the component's Derived Tier.
      • Prerequisites MUST use only canonical values: None, Authenticated User, Privileged User, Internal Network, Local Process Access, Host/OS Access, Admin Credentials, Physical Access, {Component} Compromise. Application Access and Host Access are FORBIDDEN. If ANY violation exists → DO NOT WRITE THE FILE. Fix all violations first.

    Fail-fast gate: Immediately after writing, run the Inline Quick-Checks for 3-findings.md from verification-checklist.md. Fix before proceeding.

    MANDATORY: All 3 tier sections must be present. Even if a tier has zero findings, include the heading with a note:

    • ## Tier 1 — Direct Exposure (No Prerequisites)*No Tier 1 findings identified for this repository.*
    • This ensures structural consistency for comparison matching and validation.

    COVERAGE VERIFICATION FEEDBACK LOOP (MANDATORY): After writing the Threat Coverage Verification table at the end of 3-findings.md:

    1. Scan the table you just wrote. Count how many threats have status ✅ Covered vs 🔄 Mitigated by Platform vs ⚠️ Needs Review vs ⚠️ Accepted Risk.
    2. If ANY threat has ⚠️ Accepted Risk → FAIL. The tool cannot accept risks. Go back and create a finding for each one.
    3. If Platform ratio > 20% → SUSPECT. Re-examine each 🔄 Mitigated by Platform entry: is the mitigation truly from an EXTERNAL system managed by a DIFFERENT team? If the mitigation is the repo's own code (auth middleware, file permissions, TLS config, localhost binding), reclassify as Open and create a finding.
    4. If ANY Open threat in 2-stride-analysis.md has NO corresponding finding → create a finding NOW. Use the threat's description as the finding title, the mitigation column as the remediation guidance, and assign severity based on STRIDE category.
    5. Update 3-findings.md with the newly created findings. Renumber sequentially. Update the Coverage table to show ✅ Covered for each.
    6. This loop is the ENTIRE POINT of the Coverage table — it's not documentation, it's a self-check that forces complete coverage. If you write the table and don't act on gaps, you've wasted the effort.

8b. Generate threat inventory (threat-inventory.json)

  • Read skeletons/skeleton-inventory.md first — use exact field names and schema structure
  • After writing all markdown reports, compile a structured JSON inventory of all components, boundaries, data flows, threats, and findings - Use canonical PascalCase IDs for components (derived from class/file names) and keep display labels separate
  • Use canonical flow IDs: DF_{Source}_to_{Target} - Include identity keys on every threat and finding for future matching - Include deterministic identity fields for component and boundary matching across runs:
    • Component: aliases, boundary_kind, fingerprint
    • Boundary: kind, aliases, contains_fingerprint - Build fingerprint from stable evidence (source files, endpoint neighbors, protocols, type) — never from prose wording - Normalize synonyms to the same canonical component ID (example: SupportAgent and SupportabilityAgentSupportabilityAgent) and store alternate names in aliases - Sort arrays deterministically before writing JSON:
    • components by id
    • boundaries by id
    • flows by id
    • threats by id then identity_key.component_id
    • findings by id then identity_key.component_id
  • Extract metrics (totals, per-tier counts, per-STRIDE-category counts)
  • Include git metadata (commit SHA, branch, date) and analysis metadata (model, timestamps)
  • Reference: output-formats.md for the threat-inventory.json schema
  • This file is NOT linked in 0-assessment.md but is always present in the output folder

PRE-WRITE SIZE CHECK (MANDATORY — before calling create_file for JSON): Before writing threat-inventory.json, count the data you plan to include:

  • Count total threats from 2-stride-analysis.md (grep ^\| T\d+\.)
  • Count total findings from 3-findings.md (grep ### FIND-)
  • Count total components from 0.1-architecture.md
  • If threats > 50 OR findings > 15: DO NOT use a single create_file call. Instead, use one of: (a) delegate to sub-agent, (b) Python extraction script, (c) chunked write strategy.
  • If threats ≤ 50 AND findings ≤ 15: single create_file is acceptable, but keep entries minimal (1-sentence description/mitigation fields).

POST-WRITE VALIDATION (MANDATORY — JSON Array Completeness): After writing threat-inventory.json, immediately verify:

  • threats.length == metrics.total_threats — if mismatch, the threats array was truncated during generation. Rebuild by re-reading 2-stride-analysis.md and extracting every threat row.
  • findings.length == metrics.total_findings — if mismatch, rebuild from 3-findings.md.
  • components.length == metrics.total_components — if mismatch, rebuild from architecture/element tables.

CROSS-FILE THREAT COUNT VERIFICATION (MANDATORY — catches dropped threats): The JSON threats.length can match metrics.total_threats but BOTH can be wrong if threats were dropped during JSON generation. To catch this:

  • Count threat rows in 2-stride-analysis.md: grep for ^\| T\d+\. and count unique threat IDs
  • Compare this count to threats.length in the JSON
  • If the markdown has MORE threats than the JSON → the JSON dropped threats. Rebuild the JSON by re-extracting ALL threats from 2-stride-analysis.md.
  • This is the #2 quality issue observed in testing (after truncation). Large repos (114+ threats) frequently have 1-3 threats dropped when sub-agents write the JSON from memory instead of re-reading the STRIDE file.

FIELD NAME COMPLIANCE GATE (MANDATORY — run immediately after array check): Read the first component and first threat from the JSON just written and verify these EXACT field names:

  • components[0] has key "display" (NOT "display_name", NOT "name") → if wrong, find-replace ALL occurrences

  • threats[0] has key "stride_category" (NOT "category") → if wrong, find-replace ALL occurrences

  • threats[0].identity_key has key "component_id" (threat→component link must be INSIDE identity_key, NOT a top-level component_id field on the threat) → if wrong, restructure

  • threats[0] has BOTH "title" (short name, e.g., "Information Disclosure — Redis unencrypted traffic") AND "description" (longer prose). If only description exists without title, create title from the first sentence of description. If name or threat_name exists instead of title, find-replace to title

  • Why this matters: Downstream tooling depends on these exact field names. Wrong names cause zero-value heatmaps, broken component matching, and empty display labels in comparison reports.

  • If ANY field name is wrong: fix it NOW with find-replace on the JSON file before proceeding. Do NOT leave it for verification.

  • This is the #1 quality issue observed in testing. Large repos (20+ components, 80+ threats) frequently have truncated JSON arrays because the model runs out of output tokens. If ANY array is truncated, you MUST rebuild it before proceeding. Do NOT finalize with mismatched counts.

HARD GATE — TRUNCATION RECOVERY (MANDATORY): If post-write validation detects ANY array mismatch:

  1. DELETE the truncated threat-inventory.json immediately
  2. DO NOT attempt to patch the truncated file — partial JSON is unreliable
  3. Regenerate using one of these strategies (in preference order): a. Delegate to a sub-agent — hand the sub-agent the output folder path and instruct it to read 2-stride-analysis.md and 3-findings.md, then write threat-inventory.json. The sub-agent has a fresh context window. b. Python extraction script — write a Python script that reads the markdown files, extracts threats/findings via regex, and writes the JSON. Run the script via terminal. c. Chunked write — use the Large Repo Strategy below.
  4. Re-validate after regeneration — if still mismatched, repeat with the next strategy
  5. NEVER proceed to Step 9 (assessment) or Step 10 (verification) with mismatched counts

LARGE REPO STRATEGY (MANDATORY for repos with >60 threats): For repos producing more than ~60 threats, the JSON file can exceed output token limits if generated in one pass. Use this chunked approach:

  1. Write metadata + components + boundaries + flows + metrics first — these are small arrays
  2. Append threats in batches — write threats array with ~20 threats per append operation. Use replace_string_in_file to add batches to the existing file rather than writing the entire JSON in one create_file call.
  3. Append findings — similarly batch if >15 findings
  4. Final validation — read the completed file and verify all array lengths match metrics

Alternative approach: If chunked writing is not feasible, keep each threat/finding entry minimal:

  • description field: max 1 sentence (not full prose paragraphs)
  • mitigation field: max 1 sentence
  • Remove redundant fields that duplicate markdown content
  • The JSON is for MATCHING, not for reading — brevity is key
  1. Write assessment (0-assessment.md)

    • Reference: output-formats.md for assessment template
    • Reference: skeletons/skeleton-assessment.md — read this skeleton, copy VERBATIM, fill in [FILL] placeholders
    • ⚠️ ALL 7 sections are MANDATORY: Report Files, Executive Summary, Action Summary (with Quick Wins), Analysis Context & Assumptions (with Needs Verification + Finding Overrides), References Consulted, Report Metadata, Classification Reference
    • Do NOT add extra sections like "Severity Distribution", "Architecture Risk Areas", "Methodology Notes", or "Deliverables" — these are NOT in the template

    PRE-WRITE GATE — Verify before calling create_file for 0-assessment.md:

    1. Exactly 7 sections: Report Files, Executive Summary, Action Summary, Analysis Context & Assumptions (with &), References Consulted, Report Metadata, Classification Reference
    2. --- horizontal rules between EVERY pair of ## sections (minimum 6)
    3. ### Quick Wins, ### Needs Verification, ### Finding Overrides all present
    4. References: TWO subsections (### Security Standards + ### Component Documentation) with 3-column tables and full URLs
    5. ALL metadata values wrapped in backticks; ALL fields present (Model, Analysis Started, Analysis Completed, Duration)
    6. Element/finding/threat counts match actual counts from other files

    Fail-fast gate: Immediately after writing, run the Inline Quick-Checks for 0-assessment.md from verification-checklist.md. Fix before proceeding.

  2. Final verification — iterative correction loop

    This step runs verification and fixes in a loop until all checks pass. Do NOT finalize with any failures remaining.

    Pass 1 — Comprehensive verification:

    • Delegate to a verification sub-agent with the content of verification-checklist.md + the output folder path
    • Sub-agent runs ALL Phase 05 checks and reports PASS/FAIL with evidence
    • If ANY check fails:
      1. Fix the failed file(s) using the available file-edit tool
      2. Re-run ONLY the failed checks against the fixed file(s)
      3. Repeat until the failed checks pass

    Pass 2 — Regression check (if Pass 1 had fixes):

    • Re-run Phase 3 (cross-file consistency) to ensure fixes didnt break other files
    • If new failures appear, fix and re-verify

    Exit condition: ALL phases report 0 failures. Only then mark the analysis as complete.

    Sub-agent context management:

    • Include the relevant phase content from verification-checklist.md in the sub-agent prompt
    • Include the output folder path so the sub-agent can read files
    • Sub-agent output MUST include: phase name, total checks, passed, failed, and for each failure: check ID, file, evidence, exact fix instruction. Do not return "looks good" without counts.

Tool Usage

Progress Tracking (todo)

  • Create todos at start for each major phase
  • Mark in-progress before starting each phase
  • Mark completed immediately after finishing each phase

Sub-task Delegation (agent)

Delegate NARROW, READ-ONLY tasks to sub-agents (see Sub-Agent Governance above). Allowed delegations:

  • Context gathering: "Search for auth patterns in these directories and return a summary"
  • Code analysis: "Read these files and identify security-relevant APIs, credentials, and trust boundaries"
  • Verification: Hand the verification sub-agent the content of verification-checklist.md and the output folder path. It reads the files and returns PASS/FAIL results. The PARENT fixes any failures.
  • JSON generation (exception): For large repos, delegate threat-inventory.json writing with exact file path and pre-computed data

NEVER delegate: "Write 0.1-architecture.md", "Generate the STRIDE analysis", "Perform the threat model analysis", or any prompt that would cause the sub-agent to independently produce report files.


Verification Checklist (Final Step)

The full verification checklist is in verification-checklist.md. It contains 9 phases:

Authority hierarchy: orchestrator.md defines the AUTHORING rules (what to do when writing reports). verification-checklist.md defines the CHECKING rules (what to verify after writing). Some rules appear in both files for visibility — if they ever conflict, orchestrator.md rules take precedence for authoring decisions, and verification-checklist.md takes precedence for pass/fail criteria. For the complete list of all structural, diagram, and consistency checks, always consult verification-checklist.md — it is the single source of truth for quality gates.

  1. Phase 0 — Common Deviation Scan: Known deviation patterns with WRONG→CORRECT examples
  2. Phase 1 — Per-File Structural Checks: Section order, required content, formatting
  3. Phase 2 — Diagram Rendering Checks: Mermaid init blocks, classDef, styles, syntax
  4. Phase 3 — Cross-File Consistency Checks: Component coverage, DF mapping, threat-to-finding traceability
  5. Phase 4 — Evidence Quality Checks: Evidence concreteness, verify-before-flagging compliance
  6. Phase 5 — JSON Schema Validation: Schema fields, array completeness, metrics consistency
  7. Phase 6 — Deterministic Identity: Component ID stability, boundary naming, flow ID consistency
  8. Phase 7 — Evidence-Based Prerequisites: Prerequisite deployment evidence, coverage completeness
  9. Phase 8 — Comparison HTML (incremental only): HTML structure, change annotations, CSS

Inline Quick-Checks: verification-checklist.md also contains Inline Quick-Checks that MUST be run immediately after writing each file (before Step 10). These catch errors while content is still in active context.

Two-pass usage:

  • Before writing (Workflow pre-work): Scan Phase 1 and Phase 2 to internalize structural and diagram quality gates. This prevents rework.
  • After writing (Step 10): Run ALL Phase 04 checks comprehensively against the completed output. Phase 0 is the most critical — it catches the deviations that persist across runs. Fix any failures before finalizing.

Delegation: Hand the verification sub-agent the content of verification-checklist.md and the output folder. It will run all checks and produce a PASS/FAIL summary. Fix any failures before finalizing.


Starting the Analysis

If no folder path is provided, analyze the entire repository from its root.