Orchestrator — Threat Model Analysis Workflow
This file contains the complete orchestration logic for performing a threat model analysis.
It is the primary workflow document for the /threat-model-analyst skill.
⚡ Context Budget — Read Files Selectively
Do NOT read all 10 skill files at session start. Read only what each phase needs. This preserves context window for the actual codebase analysis.
Phase 1 (context gathering): Read this file (orchestrator.md) + analysis-principles.md + tmt-element-taxonomy.md
Phase 2 (writing reports): Read the relevant skeleton from skeletons/ BEFORE writing each file. Read output-formats.md + diagram-conventions.md for rules — but use the skeleton as the structural template.
- Before `0.1-architecture.md`: read `skeletons/skeleton-architecture.md`
- Before `1.1-threatmodel.mmd`: read `skeletons/skeleton-dfd.md`
- Before `1-threatmodel.md`: read `skeletons/skeleton-threatmodel.md`
- Before `2-stride-analysis.md`: read `skeletons/skeleton-stride-analysis.md`
- Before `3-findings.md`: read `skeletons/skeleton-findings.md`
- Before `0-assessment.md`: read `skeletons/skeleton-assessment.md`
- Before `threat-inventory.json`: read `skeletons/skeleton-inventory.md`
- Before `incremental-comparison.html`: read `skeletons/skeleton-incremental-html.md`

Phase 3 (verification): Delegate to a sub-agent and include `verification-checklist.md` in the sub-agent prompt. The sub-agent reads the full checklist with a fresh context window — the parent agent does NOT need to read it.
Key principle: Sub-agents get fresh context windows. Delegate verification and JSON generation to sub-agents rather than keeping everything in the parent context.
✅ Mandatory Rules — READ BEFORE STARTING
These are the required behaviors for every threat model report. Follow each rule exactly:
- Organize findings by Exploitability Tier (Tier 1/2/3), never by severity level
- Split each component's STRIDE table into Tier 1, Tier 2, Tier 3 sub-sections
- Include `Exploitability Tier` and `Remediation Effort` on every finding — both are MANDATORY
- STRIDE summary table MUST include T1, T2, T3 columns
- STRIDE + Abuse Cases categories are exactly: Spoofing, Tampering, Repudiation, Information Disclosure, Denial of Service, Elevation of Privilege, Abuse (business logic abuse, workflow manipulation, feature misuse — an extension to standard STRIDE covering misuse of legitimate features). The A is ALWAYS "Abuse" — NEVER "AI Safety", "Authorization", or any other interpretation. Authorization issues belong under E (Elevation of Privilege).
- `.md` files: start with `# Heading` on line 1. The `create_file` tool writes raw content — no code fences
- `.mmd` files: start with `%%{init:` on line 1. Raw Mermaid source, no fences
- Section MUST be titled exactly `## Action Summary`. Include `### Quick Wins` subsection with Tier 1 low-effort findings table
- K8s sidecars: annotate host container with `<br/>+ Sidecar` — never create separate sidecar nodes (see `diagram-conventions.md` Rule 1)
- Intra-pod localhost flows are implicit — do NOT draw them in diagrams
- Action Summary IS the recommendations — no separate `### Key Recommendations` section
- Include `> **Note on threat counts:**` blockquote in Executive Summary
- Every finding MUST have CVSS 4.0 (score AND full vector string), CWE (with hyperlink), and OWASP (`:2025` suffix)
- OWASP suffix is always `:2025` (e.g., `A01:2025 – Broken Access Control`)
- Include Threat Coverage Verification table at end of `3-findings.md` mapping every threat → finding
- Every component in `0.1-architecture.md` MUST appear in `2-stride-analysis.md`
- First 3 scenarios in `0.1-architecture.md` MUST have Mermaid sequence diagrams
- `0-assessment.md` MUST include `## Analysis Context & Assumptions` with `### Needs Verification` and `### Finding Overrides` tables
- `### Quick Wins` subsection is REQUIRED under Action Summary (include heading with note if none)
- ALL 7 sections in `0-assessment.md` are MANDATORY: Report Files, Executive Summary, Action Summary, Analysis Context & Assumptions, References Consulted, Report Metadata, Classification Reference
- Deployment Classification is BINDING. In `0.1-architecture.md`, set `Deployment Classification` and fill the Component Exposure Table. If classification is `LOCALHOST_DESKTOP` or `LOCALHOST_SERVICE`: zero T1 findings, zero `Prerequisites = None`, zero `AV:N` for non-listener components. See `analysis-principles.md` Deployment Context table.
- Finding IDs MUST be sequential top-to-bottom: FIND-01, FIND-02, FIND-03... Renumber after sorting
- CWE MUST include hyperlink: `[CWE-306](https://cwe.mitre.org/data/definitions/306.html): Missing Authentication`
- After STRIDE, run the Technology-Specific Security Checklist in `analysis-principles.md`. Every technology in the repo needs at least one finding or documented mitigation
- CVSS `AV:L` or `PR:H` → finding CANNOT be Tier 1. Downgrade to T2/T3. See CVSS-to-Tier Consistency Check in `analysis-principles.md`
- Use only `Low`/`Medium`/`High` effort labels. NEVER generate time estimates, sprint phases, or scheduling. See Prohibited Content in `output-formats.md`
- References Consulted: use the exact two-subsection format from `output-formats.md` — `### Security Standards` (3-column table with full URLs) and `### Component Documentation` (3-column table with URLs)
- Report Metadata: include ALL fields from `output-formats.md` template — Model, Analysis Started, Analysis Completed, Duration. Run `Get-Date -Format "yyyy-MM-dd HH:mm:ss" -AsUTC` at Step 1 and before writing `0-assessment.md`
- `## Summary` table in `2-stride-analysis.md` MUST appear at the TOP, immediately after `## Exploitability Tiers`, BEFORE individual component sections
- Related Threats: every threat ID MUST be a hyperlink to `2-stride-analysis.md#component-anchor`. Format: `[T02.S](2-stride-analysis.md#component-name)`
- Diagram colors: copy classDef lines VERBATIM from `diagram-conventions.md`. Only allowed fills: `#6baed6` (process), `#fdae61` (external), `#74c476` (datastore). Only allowed strokes: `#2171b5`, `#d94701`, `#238b45`, `#e31a1c`. Use ONLY `%%{init: {'theme': 'base', 'themeVariables': { 'background': '#ffffff', 'primaryColor': '#ffffff', 'lineColor': '#666666' }}}%%` — no other themeVariables keys
- Summary DFD: after creating `1.1-threatmodel.mmd`, run the POST-DFD GATE in Step 4. The gate and `skeleton-summary-dfd.md` control whether `1.2-threatmodel-summary.mmd` is generated.
- Report Files table in `0-assessment.md`: list `0-assessment.md` (this document) as the FIRST row, followed by `0.1-architecture.md`, `1-threatmodel.md`, etc. Use the exact template from `output-formats.md`
- `threat-inventory.json` MUST be generated for every analysis run (Step 8b). This file enables future comparisons. See `output-formats.md` for schema.
- NEVER delete, modify, or remove any existing `threat-model-*` or `threat-model-compare-*` folders in the repository. Only write to your own timestamped output folder. Cleaning up temporary git worktrees you created is allowed; deleting other report folders is FORBIDDEN.
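The init directive and color rules above combine into a DFD header like the sketch below. The classDef names and the fill/stroke pairings shown here are assumptions inferred from the color list — copy the authoritative classDef lines verbatim from `diagram-conventions.md`:

```mermaid
%%{init: {'theme': 'base', 'themeVariables': { 'background': '#ffffff', 'primaryColor': '#ffffff', 'lineColor': '#666666' }}}%%
flowchart LR
    Operator[Operator]:::external <--> TUI[TerminalUserInterface]:::process
    TUI <--> SessionStore[(SessionStore)]:::datastore

    classDef process fill:#6baed6,stroke:#2171b5
    classDef external fill:#fdae61,stroke:#d94701
    classDef datastore fill:#74c476,stroke:#238b45
```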
Rule Precedence (when guidance conflicts)
Apply rules in this order:
1. Literal skeletons in `skeletons/skeleton-*.md` — exact section/table headers and attribute rows
2. Mandatory Rules in `orchestrator.md` (this list)
3. Examples in `output-formats.md` (examples are illustrative, not authoritative when they differ from literal skeletons)
If any conflict is detected, follow the highest-precedence item.
Post-generation: The verification sub-agent will scan your output for all known deviations listed in verification-checklist.md Phase 0. Fix any failures before finalizing.
Workflow
Exclusions: Skip these directories:
- `threat-model-*` (previous reports)
- `node_modules`, `.git`, `dist`, `build`, `vendor`, `__pycache__`
Pre-work: Before writing any output file, scan verification-checklist.md Phase 1 (Per-File Structural Checks) and Phase 2 (Diagram Rendering Checks). This internalizes the quality gates so output is correct on the first pass — preventing costly rework. Do NOT run the full verification yet; that happens in Step 10.
⛔ Sub-Agent Governance (MANDATORY — prevents duplicate work)
Sub-agents are independent execution contexts — they have no memory of the parent's state, instructions, or other sub-agents. Without strict governance, sub-agents will independently perform the ENTIRE analysis, creating duplicate report folders and wasting ~15 min compute + ~100K tokens per duplication.
Rule 1 — Parent owns ALL file creation. The parent agent is the ONLY agent that calls create_file for report files (0.1-architecture.md, stride-analysis.md, findings.md, etc.). Sub-agents NEVER write report files.
Rule 2 — Sub-agents are READ-ONLY helpers. Sub-agents may:
- Search source code for specific patterns (e.g., "find all auth-related code")
- Read and analyze files, then return structured data to the parent
- Run verification checks and return PASS/FAIL results
- Execute terminal commands (git diff, grep) and return output
Rule 3 — Sub-agent prompts must be NARROW and SPECIFIC. Never tell a sub-agent to "perform threat model analysis" or "generate the report." Instead:
- ✅ "Read these 5 Go files and list every function that handles credentials. Return a table of function name, file, line number."
- ✅ "Run the verification checklist against the files in {folder}. Return PASS/FAIL for each check."
- ✅ "Read threat-inventory.json from {path} and verify all array lengths match metrics. Return mismatches."
- ❌ "Analyze this codebase and write the threat model files."
- ❌ "Generate 0.1-architecture.md and stride-analysis.md for this component."
Rule 4 — Output folder path. The parent creates the timestamped output folder in Step 1 and uses that exact path for ALL create_file calls. If a sub-agent needs to read previously written report files, pass the folder path in the sub-agent prompt.
Rule 5 — The ONLY exception is threat-inventory.json generation (Step 8b), where the parent MAY delegate JSON writing to a sub-agent IF the data is too large. In that case, the sub-agent prompt MUST include: (a) the exact output file path, (b) the data to serialize, and (c) explicit instruction: "Write ONLY this one file. Do NOT create any other files or folders."
Steps
1. Record start time & gather context
   - Run `Get-Date -Format "yyyy-MM-dd HH:mm:ss" -AsUTC` and store as `START_TIME`
   - Get git info: `git remote get-url origin`, `git branch --show-current`, `git rev-parse --short HEAD`, `git log -1 --format="%ai" HEAD` (commit date — NOT today's date), `hostname`
   - Map the system: identify components, trust boundaries, data flows
   - Reference: `analysis-principles.md` for security infrastructure inventory
⛔ DEPLOYMENT CLASSIFICATION (MANDATORY — do this BEFORE analyzing code for threats): Determine the system's deployment class from code evidence (see `skeleton-architecture.md` for values). Record in `0.1-architecture.md` → Deployment Model section. Then fill the Component Exposure Table — one row per component showing listen address, auth barrier, external reachability, and minimum prerequisite. This table is the single source of truth for prerequisite floors. No threat or finding may have a lower prerequisite than what the exposure table permits for its component.

⛔ DETERMINISTIC NAMING — Apply BEFORE writing any files: When identifying components, assign each a canonical PascalCase `id`. The naming MUST be deterministic — two independent runs on the same codebase MUST produce the same component IDs.

⛔ ABSOLUTE RULE: Every component ID MUST be anchored to a real code artifact. For every component you identify, you MUST be able to point to a specific class, file, or manifest in the codebase that is the "anchor" for that component. If no such artifact exists, the component does not exist.
Naming procedure (follow IN ORDER — stop at the first match):
1. Primary class name — Use the EXACT class name from the source code. Do NOT abbreviate, expand, or rephrase it.
   - `TaskProcessor.cs` → `TaskProcessor` (NOT `TaskServer`, NOT `TaskService`)
   - `SessionStore.cs` → `SessionStore` (NOT `FileSessionStore`, NOT `SessionService`)
   - `TerminalUserInterface.cs` → `TerminalUserInterface` (NOT `TerminalUI`)
   - `PowerShellCommandExecutor.cs` → `PowerShellCommandExecutor` (NOT `PowerShellExecutor`)
   - `ResponsesAPIService.cs` → `ResponsesAPIService` (NOT `LLMService` — that's a DIFFERENT class)
   - `MCPHost.cs` → `MCPHost` (NOT `OrchestrationHost`)
2. Primary script name → `Import-Images.ps1` → `ImportImages`
3. Primary config/manifest name → `Dockerfile` → `DockerContainer`, `values.yaml` → `HelmChart`
4. Directory name (if component spans multiple files) → `src/ParquetParsing/` → `ParquetParser`
5. Technology name (for external services/datastores) → "Azure OpenAI" → `AzureOpenAI`, "Redis" → `Redis`
6. External actor role → `Operator`, `EndUser` (never drop these)
⛔ Helm/Kubernetes Deployment Naming (CRITICAL for comparison stability): When a component is deployed via Helm chart or Kubernetes manifests, use the Kubernetes workload name (from the Deployment/StatefulSet metadata.name) as the component ID — NOT the Helm template filename or directory structure:
- Look at `metadata.name` in deployment YAML → use that as the component ID (PascalCase normalized)
- Example: `metadata.name: devportal` in `templates/knowledge/devportal-deployment.yml` → component ID is `DevPortal`
- Example: `metadata.name: phi-model` in `templates/knowledge/phi-deployment.yml` → component ID is `PhiModel`
- Why: Helm templates frequently get reorganized (e.g., moved from `templates/` to `templates/knowledge/`) but the Kubernetes workload name stays the same. Using the workload name ensures the component ID survives directory reorganizations.
- `source_files` MUST include the deployment YAML path AND the application source code path (e.g., both `helmchart/myapp/templates/knowledge/devportal-deployment.yml` AND `developer-portal/src/`)
- `source_directories` MUST include BOTH the Helm template directory AND the source code directory
External Service Anchoring (for components without repo source code): External services (cloud APIs, managed databases, SaaS endpoints) don't have source files in the repository. Anchor them to their integration point in the codebase:
- `source_files` → the client class or config file that defines the connection (e.g., `src/MCP/appsettings.json` for Azure OpenAI connection config, `helmchart/values.yaml` for Redis endpoint config)
- `source_directories` → the directory containing the integration code (e.g., `src/MCP/Core/Services/LLM/` for the LLM client)
- `class_names` → the CLIENT class in YOUR repo that talks to the service (e.g., `ResponsesAPIService`), NOT the vendor's SDK class (e.g., NOT `OpenAIClient`). If no dedicated client class exists, leave empty.
- `namespace` → leave empty `""` (external services don't have repo namespaces)
- `config_keys` → the env vars / config keys for the service connection (e.g., `["AZURE_OPENAI_ENDPOINT", "RESPONSES_API_DEPLOYMENT"]`). These are the most stable anchors for external services.
- `api_routes` → leave empty (external services expose their own routes, not yours)
- `dependencies` → the SDK package used (e.g., `["Azure.AI.OpenAI"]` for NuGet, `["pymilvus"]` for pip)
Why this matters: External services frequently change display names across LLM runs (e.g., "Azure OpenAI" vs "GPT-4 Endpoint" vs "LLM Backend"). The `config_keys` and `dependencies` fields are what make them matchable across runs.

⛔ FORBIDDEN naming patterns — NEVER use these:
- NEVER invent abstract names that don't correspond to a real class: `ConfigurationStore`, `LocalFileSystem`, `DataLayer`, `IngestionPipeline`, `BackendServer`
- NEVER abbreviate a class name: `TerminalUI` for `TerminalUserInterface`, `PSExecutor` for `PowerShellCommandExecutor`
- NEVER substitute a synonym: `TaskServer` for `TaskProcessor`, `LLMService` for `ResponsesAPIService`
- NEVER merge two separate classes into one component: `ResponsesAPIService` and `LLMService` are two different classes → two different components
- NEVER create a component for something that doesn't exist in the code: if there's no Windows Registry access code, don't create a `WindowsRegistry` component
- NEVER rename between runs: if you called it `TaskProcessor` in run 1, it MUST be `TaskProcessor` in run 2
⛔ COMPONENT ANCHOR VERIFICATION (MANDATORY — do this BEFORE Step 2): After identifying all components, create a mental checklist:
For EACH component:
Q: What is the EXACT filename or class that anchors this component?
A: [must cite a real file path, e.g., "src/Core/TaskProcessor.cs"]
If you cannot cite a real file → DELETE the component from your list

This verification catches invented components like `WindowsRegistry` (no registry code exists), `ConfigurationStore` (no such class), `LocalFileSystem` (abstract concept, not a class).

⛔ COMPONENT SELECTION STABILITY (when multiple related classes exist): Many systems have clusters of related classes (e.g., `CredentialManager`, `AzureCredentialProvider`, `AzureAuthenticationHandler`). To ensure deterministic selection:

- Pick the class that OWNS the security-relevant behavior — the one that makes the trust decision, holds the credential, or processes the data
- Prefer the class registered in dependency injection over helpers/utilities
- Prefer the higher-level orchestrator over its internal implementation classes
- Once you pick a class, its alternatives become aliases — add them to the `aliases` array, not as separate components
- Example: If `CredentialManager` orchestrates credential lookup and uses `AzureCredentialProvider` internally, `CredentialManager` is the component and `AzureCredentialProvider` is an alias
- Example: Do NOT include both `SessionStore` and `SessionFiles` — `SessionStore` is the class, `SessionFiles` is an abstract concept
- Count rule: Two runs on the same code MUST produce the same number of components (±1 for edge cases). A difference of ≥3 components indicates the selection rules were not followed.
⛔ STABILITY ANCHORS (for comparison matching): When recording each component in `threat-inventory.json`, the `fingerprint` fields `source_directories`, `class_names`, and `namespace` serve as stability anchors — immutable identifiers that persist even when:
threat-inventory.json, thefingerprintfieldssource_directories,class_names, andnamespaceserve as stability anchors — immutable identifiers that persist even when:- The class is renamed (directory stays the same)
- The file is moved to a different directory (class name stays the same)
- The component ID changes between analysis runs (namespace stays the same)
The comparison matching algorithm relies on these anchors MORE than on the component `id` field. Therefore:

- `source_directories` MUST be populated for every process-type component (never empty `[]`)
- `class_names` MUST include at least the primary class name
- `namespace` MUST be the actual code namespace (e.g., `MyApp.Core.Servers.Health`), not a made-up grouping
- These fields are what make a component identifiable across independent analysis runs, even if two LLMs pick different display names
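Putting the anchor fields together, a component entry in `threat-inventory.json` might look like the sketch below. The field names come from this section, but the nesting under `fingerprint` and all values are illustrative — the authoritative schema lives in `output-formats.md`:

```json
{
  "id": "ResponsesAPIService",
  "type": "process",
  "aliases": ["LLMClient"],
  "fingerprint": {
    "source_files": ["src/MCP/Core/Services/LLM/ResponsesAPIService.cs"],
    "source_directories": ["src/MCP/Core/Services/LLM/"],
    "class_names": ["ResponsesAPIService"],
    "namespace": "MyApp.Core.Services.LLM",
    "config_keys": ["AZURE_OPENAI_ENDPOINT", "RESPONSES_API_DEPLOYMENT"],
    "api_routes": [],
    "dependencies": ["Azure.AI.OpenAI"]
  }
}
```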
⛔ COMPONENT ELIGIBILITY — What qualifies as a threat model component: A class/service becomes a threat model component ONLY if it meets ALL of these criteria:
- It crosses a trust boundary OR handles security-sensitive data (credentials, user input, network I/O, file I/O, process execution)
- It is a top-level service, not an internal helper (registered in DI, or the main entry point, or an agent with its own responsibility)
- It would appear in a deployment diagram — you could point to it and say "this runs here, talks to that"
ALWAYS include these component types (if they exist in the code):
- ALL agent classes (HealthAgent, InfrastructureAgent, InvestigatorAgent, SupportabilityAgent, etc.)
- ALL MCP server classes (HealthServer, InfrastructureServer, etc.)
- The main host/orchestrator (MCPHost, etc.)
- ALL external service connections (AzureOpenAI, AzureAD, etc.)
- ALL credential/auth managers
- The user interface entry point
- ALL tool execution services (PowerShellCommandExecutor, etc.)
- ALL session/state persistence services
- ALL LLM service classes (ResponsesAPIService, LLMService — if they are separate classes, they are separate components)
- External actors (Operator, EndUser)
NEVER include these as separate components:
- Loggers (LocalFileLogger, TelemetryLogger) — these are cross-cutting concerns, not threat model components
- Static helper classes
- Model/DTO classes
- Configuration builders (unless they handle secrets)
- Infrastructure-as-code classes that don't exist at runtime (AzureStackHCI cluster reference, deployment scripts)
The goal: Every run on the same code should identify the SAME set of ~12-20 components. If you're including a logger or excluding an agent, you're doing it wrong.
Boundary naming rules:
- Boundary IDs MUST be PascalCase (never `Layer`, `Zone`, `Group`, `Tier` suffixes)
- Derive from deployment topology, NOT from code architecture layers
- Deployment topology determines boundaries:
  - Single-process app → EXACTLY 2 boundaries: `Application` (the process) + `External` (external services). NEVER use 1 boundary. NEVER use 3+ boundaries. This is mandatory for single-process apps.
  - Multi-container app → boundaries per container/pod
  - K8s deployment → `K8sCluster` + per-namespace boundaries if relevant
  - Client-server → `Client` + `Server`
- K8s multi-service deployments (CRITICAL for microservice architectures):
When a K8s namespace contains multiple Deployments/StatefulSets with DIFFERENT security characteristics, create sub-boundaries based on workload type:
  - `BackendServices` — API services (FastAPI, Express, etc.) that handle user requests
  - `DataStorage` — Databases and persistent storage (Redis, Milvus, PostgreSQL, NFS) — these have different access controls, persistence, and backup policies
  - `MLModels` — ML model servers running on GPU nodes — these have different compute resources, attack surfaces (adversarial inputs), and scaling characteristics
  - `Agentic` — Agent runtime/manager services if present
  - The outer `K8sCluster` contains these sub-boundaries
  - This is NOT "code layers" — each sub-boundary represents a different Kubernetes Deployment/StatefulSet with its own security context, resource limits, and network policies
- Test: If two components are in DIFFERENT Kubernetes Deployments with different service accounts, different network exposure, or different resource requirements → they SHOULD be in different sub-boundaries
- FORBIDDEN boundary schemes (for SINGLE-PROCESS apps only):
  - Do NOT create boundaries based on code layers: `PresentationBoundary`, `OrchestrationBoundary`, `AgentBoundary`, `ServiceBoundary` are CODE LAYERS, not deployment boundaries. All these run in the SAME process.
  - Do NOT split a single process into 4+ boundaries. If all components run in one .exe, they are in ONE boundary.
- Example: An application where `TerminalUserInterface`, `MCPHost`, `HealthAgent`, `ResponsesAPIService` all run in the same process → they are ALL in `Application`. External services like `AzureOpenAI` are in `External`.
- Two runs on the same code MUST produce the same number of boundaries (±1). A difference of ≥2 boundaries is WRONG.
- NEVER create boundaries based on code layers (Presentation/Business/Data) — boundaries represent DEPLOYMENT trust boundaries, not code architecture
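For the single-process example above, the two-boundary scheme renders like this in Mermaid (component names taken from that example; a sketch of the boundary structure, not a full DFD):

```mermaid
flowchart LR
    subgraph Application
        TerminalUserInterface <--> MCPHost
        MCPHost <--> ResponsesAPIService
    end
    subgraph External
        AzureOpenAI
    end
    ResponsesAPIService <--> AzureOpenAI
```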
Boundary count locking:
- After identifying boundaries, LOCK the count. Two runs on the same code MUST produce the same number of boundaries (±1 acceptable if one run identifies an edge boundary the other doesn't)
- A 4-boundary vs 7-boundary difference on the same code is WRONG and indicates the naming rules were not followed
Additional naming rules:
- The SAME component must get the SAME `id` regardless of which LLM model runs the analysis or how many times it runs
- External actors (`Operator`, `AzureDataStudio`, etc.) are ALWAYS included — never drop them
- Datastores representing distinct storage (files, database) are ALWAYS separate components — never merge them
- Lock the component list before Step 2. Use these exact IDs in ALL subsequent files (architecture, DFD, STRIDE, findings, JSON)
- If two classes exist as separate files (e.g., `ResponsesAPIService.cs` and `LLMService.cs`), they are TWO components even if they seem related
⛔ DATA FLOW COMPLETENESS (MANDATORY — ensures consistent flow enumeration across runs): Data flows MUST be enumerated exhaustively. Two independent analyses of the same codebase MUST produce the same set of flows. To achieve this:
⛔ RETURN FLOW MODELING RULE (addresses 24% variance in flow counts):
- DO NOT model separate return flows. A request-response pair is ONE bidirectional flow (use `<-->` in Mermaid).
- Example: `DF01: Operator <--> TUI` (one flow for input and output)
- Example: `DF03: MCPHost <--> HealthAgent` (one flow for delegation and result)
- DO model separate flows ONLY when the two directions use different protocols or semantics (e.g., HTTP request vs WebSocket push-back).
- Why: When runs independently decide whether to create 1 flow or 2 flows per interaction, the flow count varies by 20-30%. This rule eliminates that variance.
- Flow count formula: `# flows ≈ # unique component-to-component interactions`. If component A talks to component B, that is 1 flow, not 2.
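The one-flow-per-interaction rule looks like this in Mermaid (flow IDs and labels taken from the examples above; illustrative only):

```mermaid
flowchart LR
    Operator <-->|"DF01: input / output"| TUI
    TUI <-->|"DF02"| MCPHost
    MCPHost <-->|"DF03: delegation / result"| HealthAgent
```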
Flow completeness checklist (use `<-->` bidirectional flows per the return flow rule above):

- Ingress/reverse proxy flows: `DF_EndUser_to_NginxIngress` (bidirectional `<-->`), `DF_NginxIngress_to_Backend` (bidirectional `<-->`). Each is ONE flow, not two.
- Database/datastore flows: `DF_Service_to_Redis` (bidirectional `<-->`). ONE flow per service-datastore pair.
- Auth provider flows: `DF_Service_to_AzureAD` (bidirectional `<-->`). ONE flow per service-auth pair.
- Admin access flows: `DF_Operator_to_Service` (bidirectional `<-->`). ONE per admin interaction.
- Flow count locking: After enumerating flows, LOCK the count. Two runs on the same code MUST produce the same number of flows (±3 acceptable). A difference of >5 flows indicates incomplete enumeration.
⛔ EXTERNAL ENTITY INCLUSION RULES (addresses variance in which externals are modeled):
- ALWAYS include `AzureAD` (or `EntraID`) as an external entity if the code acquires tokens from Azure AD / Microsoft Entra ID (look for `ChainedTokenCredential`, `ManagedIdentityCredential`, `AzureCliCredential`, MSAL, or any OAuth2/OIDC flow).
- ALWAYS include the infrastructure target (e.g., `OnPremInfra`, `HCICluster`) as an external entity if the code sends commands to external infrastructure via PowerShell, REST, or WMI.
- ALWAYS include `AzureOpenAI` (or equivalent LLM endpoint) if the code calls a cloud LLM API.
- ALWAYS include `Operator` as an external actor for CLI/TUI tools, admin tools, or operator consoles.
- Rule of thumb: If the code has a client class or config for a service, that service is an external entity.
⛔ TMT CATEGORY RULES (addresses category inconsistency across runs):
- Tool servers that expose APIs callable by agents → `SE.P.TMCore.WebSvc` (NOT `SE.P.TMCore.NetApp`)
- Network-level services that handle connections/sockets → `SE.P.TMCore.NetApp`
- Services that execute OS commands (PowerShell, bash) → `SE.P.TMCore.OSProcess`
- Services that store data to disk (SessionStore, FileLogger) → `SE.DS.TMCore.FS` (classify as Data Store, NOT Process)
- Rule: If a class's primary purpose is persisting data, it is a Data Store. If it does computation or orchestration, it is a Process. Never switch between runs.
⛔ DFD DIRECTION (MANDATORY — addresses layout variance):
- ALL DFDs MUST use `flowchart LR` (left-to-right). NEVER use `flowchart TB`.
- ALL summary DFDs MUST also use `flowchart LR`.
- This is immutable — do not change based on aesthetics or diagram shape.
Acronym rules for PascalCase:
- Preserve well-known acronyms as ALL-CAPS: `API`, `NFS`, `LLM`, `SQL`, `HCI`, `AD`, `UI`, `DB`
- Examples: `IngestionAPI` (not `IngestionApi`), `NFSServer` (not `NfsServer`), `AzureAD` (not `AzureAd`), `VectorDBAPI` (not `VectorDbApi`)
- Single-word technologies keep standard casing: `Redis`, `Milvus`, `PostgreSQL`, `Nginx`
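The acronym-preserving PascalCase rules can be sketched as a small helper. This is a hypothetical illustration, not part of the skill: the function name `to_component_id` and the `ACRONYMS` set are assumptions (the set mirrors the acronym list above), and it only splits on explicit delimiters — compound workload names like `devportal` still need manual casing per the Helm naming rules.

```python
import re

# Assumed acronym set, mirroring the list above; extend per codebase.
ACRONYMS = {"API", "NFS", "LLM", "SQL", "HCI", "AD", "UI", "DB"}

def to_component_id(name: str) -> str:
    """Normalize a kebab/snake/space-separated name to PascalCase,
    keeping well-known acronyms ALL-CAPS."""
    parts = re.split(r"[-_\s]+", name.strip())
    out = []
    for part in parts:
        if part.upper() in ACRONYMS:
            out.append(part.upper())        # e.g. "ad" -> "AD"
        else:
            out.append(part[:1].upper() + part[1:])  # e.g. "redis" -> "Redis"
    return "".join(out)
```

For example, `to_component_id("azure-ad")` yields `AzureAD` and `to_component_id("vector-db-api")` yields `VectorDBAPI`, matching the casing examples above.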
Common technology naming (use EXACTLY these IDs for well-known infrastructure):
- Redis cache/state: `Redis` (never `DaprStateStore`, `RedisCache`, `StateStore`)
- Milvus vector DB: `Milvus` (never `MilvusVectorDb`, `VectorDB`)
- NGINX ingress: `NginxIngress` (never `IngressNginx`)
- Azure AD/Entra: `AzureAD` (never `AzureAd`, `EntraID`)
- PostgreSQL: `PostgreSQL` (never `PostgresDb`, `Postgres`)
- User/Operator: `Operator` for admin users, `EndUser` for end users
- Azure OpenAI: `AzureOpenAI` (never `OpenAIService`, `LLMEndpoint`)
- NFS: `NFSServer` (never `NfsServer`, `FileShare`)
- If two LLM models are separate deployments, keep them separate (never merge `MistralLLM` + `PhiLLM` into `LocalLlm`)
BUT: for application-specific classes, use the EXACT class name from the code, NOT a technology label:
- `ResponsesAPIService.cs` → `ResponsesAPIService` (NOT `OpenAIService` — the class IS named ResponsesAPIService)
- `TaskProcessor.cs` → `TaskProcessor` (NOT `LocalLLM` — the class IS named TaskProcessor)
- `SessionStore.cs` → `SessionStore` (NOT `StatePersistence` — the class IS named SessionStore)

Component granularity rules (CRITICAL for stability):

- Model components at the technology/service level, not the script/file level
- A Docker container running Kusto is `KustoContainer` — NOT decomposed into `KustoService` + `IngestLogs` + `KustoDataDirectory`
- A Moby Docker engine is `MobyDockerEngine` — NOT `InstallMoby` (the installer script is evidence, not the component)
- An installer for a tool is `SetupInstaller` — NOT renamed to `InstallAzureEdgeDiagnosticTool` (script filename)
- Rule: if a component has one primary function (e.g., "run Kusto queries"), model it as ONE component regardless of how many scripts/files implement it
- Scripts are EVIDENCE for components, not components themselves
- Keep the same granularity across runs — never split a single component into sub-components or merge sub-components between runs
⛔ COMPONENT ID FORMAT (MANDATORY — addresses casing variance):
- ALL component IDs MUST be PascalCase. NEVER use kebab-case, snake_case, or camelCase.
- Examples: `HealthAgent` (not `health-agent`), `AzureAD` (not `azure-ad`), `MCPHost` (not `mcp-host`)
- This applies to ALL artifacts: 0.1-architecture.md, 1-threatmodel.md, DFD mermaid, STRIDE, findings, JSON.
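The normalization rule above can be sketched as a small helper. This is an illustrative sketch only, not part of the skill's tooling: the acronym set shown is the subset listed earlier in this document (a real run would extend it per repository, e.g. with `MCP`), and the function name `to_pascal_case` is hypothetical.

```python
import re

# Well-known acronyms preserved as ALL-CAPS (illustrative subset from the rules above).
ACRONYMS = {"API", "NFS", "LLM", "SQL", "HCI", "AD", "UI", "DB"}

def to_pascal_case(raw: str) -> str:
    """Normalize a kebab-case/snake_case/camelCase ID to PascalCase,
    keeping well-known acronyms in ALL-CAPS."""
    parts = re.split(r"[-_\s]+", raw)
    words = []
    for part in parts:
        # Split camelCase/PascalCase humps, keeping acronym runs (NFS, API) together.
        words.extend(re.findall(r"[A-Z]+(?=[A-Z][a-z]|\b)|[A-Z]?[a-z]+|\d+", part))
    return "".join(w.upper() if w.upper() in ACRONYMS else w.capitalize() for w in words)
```

For example, `to_pascal_case("health-agent")` yields `HealthAgent` and `to_pascal_case("azure-ad")` yields `AzureAD`, matching the mandated IDs.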
⛔ STRIDE SCOPE RULE (addresses external entity analysis variance):
- STRIDE analysis in `2-stride-analysis.md` MUST include sections for ALL elements in the Element Table EXCEPT external actors (Operator, EndUser).
- External services (AzureOpenAI, AzureAD, OnPremInfra) DO get STRIDE sections — they are attack surfaces from YOUR system's perspective.
- External actors (human users) do NOT get STRIDE sections — they are threat SOURCES, not targets.
- This means: if you have 20 elements total and 1 is an external actor, you write 19 STRIDE sections.
⛔ STRIDE DEPTH CONSISTENCY (addresses threat count variance):
- Each component MUST get ALL 7 STRIDE-A categories analyzed (S, T, R, I, D, E, A).
- Each STRIDE category MUST be explicitly addressed per component: either with one or more concrete threats, OR with an explicit `N/A — {1-sentence justification}` row explaining why that category does not apply to this specific component.
- A category may produce 0, 1, 2, 3, or more threats — the count depends on the component's actual attack surface. Do NOT cap at 1 threat per category. Components with rich security surfaces (API services, auth managers, command executors, LLM clients) should typically have 2-4 threats per relevant STRIDE category. Only simple components (static config, read-only data stores) should have mostly 0-1.
- Expected distribution for a 15-component system: ~30% of STRIDE cells should be 0 (with N/A), ~40% should be 1, ~25% should be 2, ~5% should be 3+. If ALL cells are 0 or 1 (binary pattern) → the analysis is too shallow. Go back and identify additional threat vectors.
- N/A entries do NOT count toward threat totals in the Summary table. Only concrete threat rows count.
- The Summary table S/T/R/I/D/E/A columns show the COUNT of concrete threats per category (0 is valid if N/A was justified).
- This ensures comprehensive coverage while producing accurate, non-inflated threat counts.
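The "binary pattern" red flag above is mechanically checkable. A minimal sketch, assuming the per-(component, category) concrete-threat counts have already been extracted into a flat list (the function name `stride_depth_check` is hypothetical):

```python
def stride_depth_check(cells):
    """Flag a too-shallow STRIDE analysis.

    `cells` is a flat list of per-(component, category) concrete-threat
    counts; 0 means an explicit N/A row was written instead.
    Returns a warning string, or None if the distribution looks plausible.
    """
    if not cells:
        return "no STRIDE cells found"
    # An all-0/1 pattern across every cell suggests the analysis capped
    # each category at one threat instead of following the attack surface.
    if all(c in (0, 1) for c in cells):
        return "all cells are 0 or 1 — analysis is likely too shallow"
    return None
```

A distribution like `[0, 1, 2, 3, 1, 0]` passes; `[0, 1, 1, 0]` is flagged for re-examination.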
2. Write architecture overview (`0.1-architecture.md`)
   - Read `skeletons/skeleton-architecture.md` first — copy skeleton structure, fill `[FILL]` placeholders
   - System purpose, key components, top scenarios, tech stack, deployment
   - Use the exact component IDs locked in Step 1 — do not rename or merge components
   - Reference: `output-formats.md` for template, `diagram-conventions.md` for architecture diagram styles
3. Inventory security infrastructure
   - Identify security-enabling components before flagging gaps
   - Reference: `analysis-principles.md` Security Infrastructure Inventory table
4. Produce threat model DFD (`1.1-threatmodel.mmd`, `1.2-threatmodel-summary.mmd`, `1-threatmodel.md`)
   - Read `skeletons/skeleton-dfd.md`, `skeletons/skeleton-summary-dfd.md`, and `skeletons/skeleton-threatmodel.md` first
   - Reference: `diagram-conventions.md` for DFD styles, `tmt-element-taxonomy.md` for element classification
   - ⚠️ BEFORE FINALIZING: Run the Pre-Render Checklist from `diagram-conventions.md`

   ⛔ POST-DFD GATE — Run IMMEDIATELY after creating `1.1-threatmodel.mmd`:
   - Count elements (nodes with `((...))`, `[(...)]`, `["..."]`) in `1.1-threatmodel.mmd`
   - Count boundaries (`subgraph` lines)
   - If elements > 15 OR boundaries > 4:
     → You MUST create `1.2-threatmodel-summary.mmd` using `skeleton-summary-dfd.md` NOW
     → Do NOT proceed to `1-threatmodel.md` until the summary file exists
   - If threshold NOT met → skip summary, proceed to `1-threatmodel.md`
   - Create `1-threatmodel.md` (include Summary View section if summary was generated)
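The POST-DFD gate's counting step can be approximated with two regexes over the mermaid source. A sketch under stated assumptions: it only recognizes the three node shapes named in the gate, so a DFD using other mermaid shapes would need extra patterns.

```python
import re

def post_dfd_gate(mermaid_src: str, max_elements: int = 15, max_boundaries: int = 4) -> bool:
    """Return True when a summary DFD (1.2-threatmodel-summary.mmd) is required."""
    # Node shapes from the gate: ((...)) circles, [(...)] cylinders, ["..."] rectangles.
    elements = len(re.findall(r'\(\(.*?\)\)|\[\(.*?\)\]|\[".*?"\]', mermaid_src))
    # One boundary per `subgraph` line.
    boundaries = len(re.findall(r'^\s*subgraph\b', mermaid_src, flags=re.MULTILINE))
    return elements > max_elements or boundaries > max_boundaries
```

On a small diagram (3 nodes, 1 subgraph) this returns `False`, so the summary DFD is skipped and the workflow proceeds straight to `1-threatmodel.md`.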
5. Enumerate threats per element and flow using STRIDE-A (`2-stride-analysis.md`)
   - Read `skeletons/skeleton-stride-analysis.md` first — use Summary table and per-component structure
   - Reference: `analysis-principles.md` for tier definitions, `output-formats.md` for STRIDE template
   - ⛔ PREREQUISITE FLOOR CHECK (per threat): Before assigning a prerequisite to any threat, look up the component's `Min Prerequisite` and `Derived Tier` in the Component Exposure Table (`0.1-architecture.md`). The threat's prerequisite MUST be ≥ the component's floor. The threat's tier MUST be ≥ the component's derived tier (i.e., if component is T2, no threat can be T1). Use the canonical prerequisite→tier mapping from `analysis-principles.md`.
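The floor check reduces to an ordering comparison once prerequisites are mapped to tiers. The mapping below is an illustrative assumption for the sketch; the authoritative mapping lives in `analysis-principles.md` and must be used in real runs.

```python
# Assumed prerequisite → tier mapping, for illustration only.
PREREQ_TIER = {
    "None": 1,
    "Authenticated User": 2,
    "Internal Network": 2,
    "Local Process Access": 2,
    "Privileged User": 3,
    "Host/OS Access": 3,
    "Admin Credentials": 3,
    "Physical Access": 3,
}

def check_floor(threat_prereq: str, component_min_prereq: str) -> bool:
    """A threat's tier may never fall below its component's derived tier."""
    return PREREQ_TIER[threat_prereq] >= PREREQ_TIER[component_min_prereq]
```

So a component whose floor is `Local Process Access` (T2) rejects any threat claiming prerequisite `None` (T1).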
6. For each threat: cite files/functions/endpoints, propose mitigations, provide verification steps

7. Verify findings — confirm each finding against actual configuration before documenting
   - Reference: `analysis-principles.md` Finding Validation Checklist
7b. Technology sweep — Run the Technology-Specific Security Checklist from `analysis-principles.md`
   - For every technology found in the repo (Redis, Milvus, PostgreSQL, Docker, K8s, ML models, LLMs, NFS, CI/CD, etc.), verify you have at least one finding or explicit mitigation
   - This step catches gaps that component-level STRIDE misses (e.g., database auth defaults, container hardening, key management)
   - Add any missing findings before proceeding to Step 8
8. Compile findings (`3-findings.md`)
   - Reference: `output-formats.md` for findings template and Related Threats link format
   - Reference: `skeletons/skeleton-findings.md` — read this skeleton, copy VERBATIM, fill in `[FILL]` placeholders for each finding
⛔ PRE-WRITE GATE — Verify before calling `create_file` for `3-findings.md`:
- Finding IDs: `### FIND-01:`, `### FIND-02:` — sequential, `FIND-` prefix (NOT `F01` or `F-01`)
- CVSS prefix: every vector starts with `CVSS:4.0/` (NOT bare `AV:N/AC:L/...`)
- Related Threats: each threat ID is a separate hyperlink `[TNN.X](2-stride-analysis.md#anchor)` (NOT plain text)
- Sub-sections: `#### Description`, `#### Evidence`, `#### Remediation`, `#### Verification` (NOT `Recommendation`)
- Sort: within each tier → Critical → Important → Moderate → Low → higher CVSS first
- All 10 mandatory attribute rows present per finding
- Deployment context gate (FAIL-CLOSED): Read `0.1-architecture.md` Deployment Classification and Component Exposure Table. If classification is `LOCALHOST_DESKTOP` or `LOCALHOST_SERVICE`:
  - ZERO findings may have `Exploitation Prerequisites` = `None` → fix to `Local Process Access` (T2) or `Host/OS Access` (T3)
  - ZERO findings may be in `## Tier 1` → downgrade to T2/T3 based on prerequisite
  - ZERO CVSS vectors may use `AV:N` unless the specific component has `Reachability = External` in the Component Exposure Table → fix to `AV:L`
- For ALL deployment classifications:
  - For EACH finding, look up its Component in the exposure table. The finding's prerequisite MUST be ≥ the component's `Min Prerequisite`. The finding's tier MUST be ≥ the component's `Derived Tier`.
  - Prerequisites MUST use only canonical values: `None`, `Authenticated User`, `Privileged User`, `Internal Network`, `Local Process Access`, `Host/OS Access`, `Admin Credentials`, `Physical Access`, `{Component} Compromise`. ⛔ `Application Access` and `Host Access` are FORBIDDEN.
- If ANY violation exists → DO NOT WRITE THE FILE. Fix all violations first.
⛔ Fail-fast gate: Immediately after writing, run the Inline Quick-Checks for `3-findings.md` from `verification-checklist.md`. Fix before proceeding.

⛔ MANDATORY: All 3 tier sections must be present. Even if a tier has zero findings, include the heading with a note: `## Tier 1 — Direct Exposure (No Prerequisites)` → *No Tier 1 findings identified for this repository.*
- This ensures structural consistency for comparison matching and validation.
⛔ COVERAGE VERIFICATION FEEDBACK LOOP (MANDATORY): After writing the Threat Coverage Verification table at the end of `3-findings.md`:
- Scan the table you just wrote. Count how many threats have status `✅ Covered` vs `🔄 Mitigated by Platform` vs `⚠️ Needs Review` vs `⚠️ Accepted Risk`.
- If ANY threat has `⚠️ Accepted Risk` → FAIL. The tool cannot accept risks. Go back and create a finding for each one.
- If Platform ratio > 20% → SUSPECT. Re-examine each `🔄 Mitigated by Platform` entry: is the mitigation truly from an EXTERNAL system managed by a DIFFERENT team? If the mitigation is the repo's own code (auth middleware, file permissions, TLS config, localhost binding), reclassify as `Open` and create a finding.
- If ANY `Open` threat in `2-stride-analysis.md` has NO corresponding finding → create a finding NOW. Use the threat's description as the finding title, the mitigation column as the remediation guidance, and assign severity based on STRIDE category.
- Update `3-findings.md` with the newly created findings. Renumber sequentially. Update the Coverage table to show `✅ Covered` for each.
- This loop is the ENTIRE POINT of the Coverage table — it's not documentation, it's a self-check that forces complete coverage. If you write the table and don't act on gaps, you've wasted the effort.
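The PRE-WRITE GATE's ID and CVSS checks lend themselves to a small lint pass over the draft markdown before `create_file` is called. A hedged sketch (the function name `lint_findings` is hypothetical, and it covers only two of the gate's bullets; the link-format and sub-section checks would follow the same pattern):

```python
import re

def lint_findings(markdown: str) -> list:
    """Return violations of the 3-findings.md pre-write gate (partial checks)."""
    problems = []
    ids = re.findall(r"^### (FIND-\d+):", markdown, flags=re.MULTILINE)
    # IDs must carry the FIND- prefix, not F01 / F-01.
    if re.search(r"^### F-?\d+:", markdown, flags=re.MULTILINE):
        problems.append("finding ID missing FIND- prefix")
    # IDs must be sequential: FIND-01, FIND-02, ...
    expected = [f"FIND-{i:02d}" for i in range(1, len(ids) + 1)]
    if ids != expected:
        problems.append("finding IDs not sequential")
    # Every CVSS vector must start with the CVSS:4.0/ prefix.
    for m in re.finditer(r"AV:[NALP]/", markdown):
        start = max(0, m.start() - 9)
        if markdown[start:m.start()] != "CVSS:4.0/":
            problems.append("bare CVSS vector without CVSS:4.0/ prefix")
            break
    return problems
```

An empty return list means these two gate bullets pass; any entry means the draft must be fixed before the file is written.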
8b. Generate threat inventory (`threat-inventory.json`)
   - Read `skeletons/skeleton-inventory.md` first — use exact field names and schema structure
   - After writing all markdown reports, compile a structured JSON inventory of all components, boundaries, data flows, threats, and findings
   - Use canonical PascalCase IDs for components (derived from class/file names) and keep display labels separate
   - Use canonical flow IDs: `DF_{Source}_to_{Target}`
   - Include identity keys on every threat and finding for future matching
   - Include deterministic identity fields for component and boundary matching across runs:
     - Component: `aliases`, `boundary_kind`, `fingerprint`
     - Boundary: `kind`, `aliases`, `contains_fingerprint`
   - Build `fingerprint` from stable evidence (source files, endpoint neighbors, protocols, type) — never from prose wording
   - Normalize synonyms to the same canonical component ID (example: `SupportAgent` and `SupportabilityAgent` → `SupportabilityAgent`) and store alternate names in `aliases`
   - Sort arrays deterministically before writing JSON: `components` by `id`, `boundaries` by `id`, `flows` by `id`, `threats` by `id` then `identity_key.component_id`, `findings` by `id` then `identity_key.component_id`
   - Extract metrics (totals, per-tier counts, per-STRIDE-category counts)
   - Include git metadata (commit SHA, branch, date) and analysis metadata (model, timestamps)
   - Reference: `output-formats.md` for the `threat-inventory.json` schema
   - This file is NOT linked in 0-assessment.md but is always present in the output folder
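The deterministic sort rule above maps directly onto Python's stable `sorted`. A minimal sketch, assuming the inventory has already been parsed into a dict with the documented array names:

```python
def sort_inventory(inv: dict) -> dict:
    """Sort threat-inventory.json arrays deterministically before writing."""
    for key in ("components", "boundaries", "flows"):
        inv[key] = sorted(inv[key], key=lambda e: e["id"])
    # Threats and findings sort by id, then identity_key.component_id.
    for key in ("threats", "findings"):
        inv[key] = sorted(
            inv[key],
            key=lambda e: (e["id"], e["identity_key"]["component_id"]),
        )
    return inv
```

Sorting before serialization keeps diffs between baseline and incremental runs noise-free, which is the whole point of the deterministic-identity fields.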
⛔ PRE-WRITE SIZE CHECK (MANDATORY — before calling `create_file` for JSON):
Before writing `threat-inventory.json`, count the data you plan to include:
- Count total threats from `2-stride-analysis.md` (grep `^\| T\d+\.`)
- Count total findings from `3-findings.md` (grep `### FIND-`)
- Count total components from `0.1-architecture.md`
- If threats > 50 OR findings > 15: DO NOT use a single `create_file` call. Instead, use one of: (a) delegate to sub-agent, (b) Python extraction script, (c) chunked write strategy.
- If threats ≤ 50 AND findings ≤ 15: single `create_file` is acceptable, but keep entries minimal (1-sentence description/mitigation fields).
⛔ POST-WRITE VALIDATION (MANDATORY — JSON Array Completeness):
After writing `threat-inventory.json`, immediately verify:
- `threats.length == metrics.total_threats` — if mismatch, the threats array was truncated during generation. Rebuild by re-reading `2-stride-analysis.md` and extracting every threat row.
- `findings.length == metrics.total_findings` — if mismatch, rebuild from `3-findings.md`.
- `components.length == metrics.total_components` — if mismatch, rebuild from architecture/element tables.
⛔ CROSS-FILE THREAT COUNT VERIFICATION (MANDATORY — catches dropped threats):
The JSON `threats.length` can match `metrics.total_threats` but BOTH can be wrong if threats were dropped during JSON generation. To catch this:
- Count threat rows in `2-stride-analysis.md`: grep for `^\| T\d+\.` and count unique threat IDs
- Compare this count to `threats.length` in the JSON
- If the markdown has MORE threats than the JSON → the JSON dropped threats. Rebuild the JSON by re-extracting ALL threats from `2-stride-analysis.md`.
- This is the #2 quality issue observed in testing (after truncation). Large repos (114+ threats) frequently have 1-3 threats dropped when sub-agents write the JSON from memory instead of re-reading the STRIDE file.
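The markdown side of this comparison is a one-liner over the STRIDE file. A sketch, assuming threat IDs follow the `T<n>.<m>` pattern used throughout this document (the helper name is hypothetical):

```python
import re

def count_markdown_threats(stride_md: str) -> int:
    """Count unique threat IDs in 2-stride-analysis.md table rows (`| T<n>.<m> ...`)."""
    ids = re.findall(r"^\| (T\d+\.\d+)\b", stride_md, flags=re.MULTILINE)
    # Deduplicate: a threat ID may appear in both the Summary table and a
    # per-component section.
    return len(set(ids))
```

The returned count is then compared against `len(inventory["threats"])`; any shortfall on the JSON side means threats were dropped and the array must be rebuilt from the markdown.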
⛔ FIELD NAME COMPLIANCE GATE (MANDATORY — run immediately after array check): Read the first component and first threat from the JSON just written and verify these EXACT field names:
- `components[0]` has key `"display"` (NOT `"display_name"`, NOT `"name"`) → if wrong, find-replace ALL occurrences
- `threats[0]` has key `"stride_category"` (NOT `"category"`) → if wrong, find-replace ALL occurrences
- `threats[0].identity_key` has key `"component_id"` (threat→component link must be INSIDE `identity_key`, NOT a top-level `component_id` field on the threat) → if wrong, restructure
- `threats[0]` has BOTH `"title"` (short name, e.g., "Information Disclosure — Redis unencrypted traffic") AND `"description"` (longer prose). If only `description` exists without `title`, create `title` from the first sentence of `description`. If `name` or `threat_name` exists instead of `title`, find-replace to `title`
- Why this matters: Downstream tooling depends on these exact field names. Wrong names cause zero-value heatmaps, broken component matching, and empty display labels in comparison reports.
- If ANY field name is wrong: fix it NOW with find-replace on the JSON file before proceeding. Do NOT leave it for verification.
- This is the #1 quality issue observed in testing. Large repos (20+ components, 80+ threats) frequently have truncated JSON arrays because the model runs out of output tokens. If ANY array is truncated, you MUST rebuild it before proceeding. Do NOT finalize with mismatched counts.
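The four field-name bullets above can be expressed as a direct spot-check on the parsed JSON. A sketch (function name hypothetical; it inspects only the first element of each array, exactly as the gate instructs):

```python
def field_name_gate(inv: dict) -> list:
    """Spot-check the exact field names downstream tooling depends on."""
    problems = []
    comp, threat = inv["components"][0], inv["threats"][0]
    if "display" not in comp:
        problems.append('components[*] must use "display"')
    if "stride_category" not in threat:
        problems.append('threats[*] must use "stride_category"')
    if "component_id" not in threat.get("identity_key", {}):
        problems.append('threat→component link must live inside "identity_key"')
    if "title" not in threat or "description" not in threat:
        problems.append('threats[*] need both "title" and "description"')
    return problems
```

Any returned problem means a find-replace or restructure pass over the whole file before proceeding, per the gate.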
⛔ HARD GATE — TRUNCATION RECOVERY (MANDATORY): If post-write validation detects ANY array mismatch:
- DELETE the truncated `threat-inventory.json` immediately
- DO NOT attempt to patch the truncated file — partial JSON is unreliable
- Regenerate using one of these strategies (in preference order):
  a. Delegate to a sub-agent — hand the sub-agent the output folder path and instruct it to read `2-stride-analysis.md` and `3-findings.md`, then write `threat-inventory.json`. The sub-agent has a fresh context window.
  b. Python extraction script — write a Python script that reads the markdown files, extracts threats/findings via regex, and writes the JSON. Run the script via terminal.
  c. Chunked write — use the Large Repo Strategy below.
- Re-validate after regeneration — if still mismatched, repeat with the next strategy
- NEVER proceed to Step 9 (assessment) or Step 10 (verification) with mismatched counts
⛔ LARGE REPO STRATEGY (MANDATORY for repos with >60 threats): For repos producing more than ~60 threats, the JSON file can exceed output token limits if generated in one pass. Use this chunked approach:
- Write metadata + components + boundaries + flows + metrics first — these are small arrays
- Append threats in batches — write the threats array with ~20 threats per append operation. Use `replace_string_in_file` to add batches to the existing file rather than writing the entire JSON in one `create_file` call.
- Append findings — similarly batch if >15 findings
- Final validation — read the completed file and verify all array lengths match metrics

Alternative approach: If chunked writing is not feasible, keep each threat/finding entry minimal:
- `description` field: max 1 sentence (not full prose paragraphs)
- `mitigation` field: max 1 sentence
- Remove redundant fields that duplicate markdown content
- The JSON is for MATCHING, not for reading — brevity is key
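The chunked-write strategy can be sketched in plain Python. This is an approximation only: an agent would append batches with `replace_string_in_file` on the raw text, whereas this sketch re-parses and rewrites the file per batch, which is simpler to show but not what the tool-based workflow does. The function name and batch size are assumptions.

```python
import json

def write_inventory_chunked(path: str, inv: dict, batch: int = 20) -> None:
    """Write threat-inventory.json in batches so no single write is huge."""
    # Small arrays (metadata, components, boundaries, flows, metrics) go first.
    small = {k: v for k, v in inv.items() if k not in ("threats", "findings")}
    small["threats"] = []
    small["findings"] = []
    with open(path, "w") as f:
        json.dump(small, f, indent=2)
    # Append threats and findings ~20 entries at a time.
    for key in ("threats", "findings"):
        for i in range(0, len(inv.get(key, [])), batch):
            with open(path) as f:
                doc = json.load(f)
            doc[key].extend(inv[key][i : i + batch])
            with open(path, "w") as f:
                json.dump(doc, f, indent=2)
```

After the loop, the final-validation step still applies: re-read the file and confirm array lengths match the metrics block.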
9. Write assessment (`0-assessment.md`)
   - Reference: `output-formats.md` for assessment template
   - Reference: `skeletons/skeleton-assessment.md` — read this skeleton, copy VERBATIM, fill in `[FILL]` placeholders
   - ⚠️ ALL 7 sections are MANDATORY: Report Files, Executive Summary, Action Summary (with Quick Wins), Analysis Context & Assumptions (with Needs Verification + Finding Overrides), References Consulted, Report Metadata, Classification Reference
   - Do NOT add extra sections like "Severity Distribution", "Architecture Risk Areas", "Methodology Notes", or "Deliverables" — these are NOT in the template

   ⛔ PRE-WRITE GATE — Verify before calling `create_file` for `0-assessment.md`:
   - Exactly 7 sections: Report Files, Executive Summary, Action Summary, Analysis Context & Assumptions (with `&`), References Consulted, Report Metadata, Classification Reference
   - `---` horizontal rules between EVERY pair of `##` sections (minimum 6)
   - `### Quick Wins`, `### Needs Verification`, `### Finding Overrides` all present
   - References: TWO subsections (`### Security Standards` + `### Component Documentation`) with 3-column tables and full URLs
   - ALL metadata values wrapped in backticks; ALL fields present (Model, Analysis Started, Analysis Completed, Duration)
   - Element/finding/threat counts match actual counts from other files

   ⛔ Fail-fast gate: Immediately after writing, run the Inline Quick-Checks for `0-assessment.md` from `verification-checklist.md`. Fix before proceeding.
10. Final verification — iterative correction loop

This step runs verification and fixes in a loop until all checks pass. Do NOT finalize with any failures remaining.

Pass 1 — Comprehensive verification:
- Delegate to a verification sub-agent with the content of `verification-checklist.md` + the output folder path
- Sub-agent runs ALL Phase 0–5 checks and reports PASS/FAIL with evidence
- If ANY check fails:
  - Fix the failed file(s) using the available file-edit tool
  - Re-run ONLY the failed checks against the fixed file(s)
  - Repeat until the failed checks pass

Pass 2 — Regression check (if Pass 1 had fixes):
- Re-run Phase 3 (cross-file consistency) to ensure fixes didn't break other files
- If new failures appear, fix and re-verify

Exit condition: ALL phases report 0 failures. Only then mark the analysis as complete.

Sub-agent context management:
- Include the relevant phase content from `verification-checklist.md` in the sub-agent prompt
- Include the output folder path so the sub-agent can read files
- Sub-agent output MUST include: phase name, total checks, passed, failed, and for each failure: check ID, file, evidence, exact fix instruction. Do not return "looks good" without counts.
Tool Usage
Progress Tracking (todo)
- Create todos at start for each major phase
- Mark in-progress before starting each phase
- Mark completed immediately after finishing each phase
Sub-task Delegation (agent)
Delegate NARROW, READ-ONLY tasks to sub-agents (see Sub-Agent Governance above). Allowed delegations:
- Context gathering: "Search for auth patterns in these directories and return a summary"
- Code analysis: "Read these files and identify security-relevant APIs, credentials, and trust boundaries"
- Verification: Hand the verification sub-agent the content of `verification-checklist.md` and the output folder path. It reads the files and returns PASS/FAIL results. The PARENT fixes any failures.
- JSON generation (exception): For large repos, delegate `threat-inventory.json` writing with exact file path and pre-computed data
NEVER delegate: "Write 0.1-architecture.md", "Generate the STRIDE analysis", "Perform the threat model analysis", or any prompt that would cause the sub-agent to independently produce report files.
Verification Checklist (Final Step)
The full verification checklist is in verification-checklist.md. It contains 9 phases:
Authority hierarchy:
- `orchestrator.md` defines the AUTHORING rules (what to do when writing reports).
- `verification-checklist.md` defines the CHECKING rules (what to verify after writing).
- Some rules appear in both files for visibility — if they ever conflict, `orchestrator.md` rules take precedence for authoring decisions, and `verification-checklist.md` takes precedence for pass/fail criteria.
- For the complete list of all structural, diagram, and consistency checks, always consult `verification-checklist.md` — it is the single source of truth for quality gates.
- Phase 0 — Common Deviation Scan: Known deviation patterns with WRONG→CORRECT examples
- Phase 1 — Per-File Structural Checks: Section order, required content, formatting
- Phase 2 — Diagram Rendering Checks: Mermaid init blocks, classDef, styles, syntax
- Phase 3 — Cross-File Consistency Checks: Component coverage, DF mapping, threat-to-finding traceability
- Phase 4 — Evidence Quality Checks: Evidence concreteness, verify-before-flagging compliance
- Phase 5 — JSON Schema Validation: Schema fields, array completeness, metrics consistency
- Phase 6 — Deterministic Identity: Component ID stability, boundary naming, flow ID consistency
- Phase 7 — Evidence-Based Prerequisites: Prerequisite deployment evidence, coverage completeness
- Phase 8 — Comparison HTML (incremental only): HTML structure, change annotations, CSS
Inline Quick-Checks: verification-checklist.md also contains Inline Quick-Checks that MUST be run immediately after writing each file (before Step 10). These catch errors while content is still in active context.
Two-pass usage:
- Before writing (Workflow pre-work): Scan Phase 1 and Phase 2 to internalize structural and diagram quality gates. This prevents rework.
- After writing (Step 10): Run ALL Phase 0–4 checks comprehensively against the completed output. Phase 0 is the most critical — it catches the deviations that persist across runs. Fix any failures before finalizing.
Delegation: Hand the verification sub-agent the content of verification-checklist.md and the output folder. It will run all checks and produce a PASS/FAIL summary. Fix any failures before finalizing.
Starting the Analysis
If no folder path is provided, analyze the entire repository from its root.