From afba5b86b8e8c565b78d213d2cbc9e87e6f32de3 Mon Sep 17 00:00:00 2001
From: Vijay Chegu <21251550+cheguv@users.noreply.github.com>
Date: Sun, 29 Mar 2026 20:58:56 +0000
Subject: [PATCH] Add threat-model-analyst skill: STRIDE-A threat modeling for
repositories (#1177)
* Add threat-model-analyst skill: STRIDE-A threat modeling for repositories
Add a comprehensive threat model analysis skill that performs security audits
using STRIDE-A (STRIDE + Abuse) threat modeling, Zero Trust principles, and
defense-in-depth analysis.
Supports two modes:
- Single analysis: full STRIDE-A threat model producing architecture overviews,
DFD diagrams, prioritized findings, and executive assessments
- Incremental analysis: security posture diff between baseline report and current
code, producing standalone reports with embedded comparison
Includes bundled reference assets:
- Orchestrator workflows (full and incremental)
- Analysis principles and verification checklists
- Output format specifications and skeleton templates
- DFD diagram conventions and TMT element taxonomy
* Address PR review comments from Copilot reviewer
- Fix SKILL.md description: use single-quoted scalar, rename mode (2) to
'Incremental analysis' with accurate description
- Replace 'Compare Mode (Deprecated)' sections with 'Comparing Commits or
Reports' redirect (no deprecated language for first release)
- Fix skeleton-findings.md: move Tier 1 table rows under header, add
CONDITIONAL-EMPTY block after END-REPEAT (matching Tier 2/3 structure)
- Fix skeleton-threatmodel.md and skeleton-architecture.md: use 4-backtick
outer fences to avoid nested fence conflicts with inner mermaid fences
- Fix skeleton-incremental-html.md: correct section count from 9 to 8
- Fix output-formats.md: change status 'open' to 'Open' in JSON example,
move stride_category warning outside JSON fence as blockquote
- Fix incremental-orchestrator.md: replace stale compare-output-formats.md
reference with inline color conventions
- Regenerate docs/README.skills.md with updated description
* Address second round of Copilot review comments
- Fix diagram-conventions.md: bidirectional flow notation now uses <-->
matching orchestrator.md and DFD templates
- Fix tmt-element-taxonomy.md: normalize SE.DF.SSH/LDAP/LDAPS to use
SE.DF.TMCore.* prefix consistent with all other data flow IDs
- Fix output-formats.md: correct TMT category example from SQLDatabase
to SQL matching taxonomy, fix component type from 'datastore' to
'data_store' matching canonical enum, remove DaprSidecar from
inbound_from per no-standalone-sidecar rule
- Fix 5 skeleton files: clarify VERBATIM instruction to 'copy the
template content below (excluding the outer code fence)' to prevent
agents from wrapping output in markdown fences
- Genericize product-specific names in examples: replace edgerag with
myapp, BitNetManager with TaskProcessor, AzureLocalMCP with MyApp.Core,
AzureLocalInfra with OnPremInfra, MilvusVectorDB with VectorDB
* Address third round of Copilot review comments
- Fix diagram-conventions.md: second bidirectional two-arrow pattern in
Quick Reference section now uses <-->
- Fix incremental-orchestrator.md: renumber HTML sections 5-9 to 4-8
matching skeleton-incremental-html.md 8-section structure
- Fix output-formats.md: add incremental-comparison.html to File List
as conditional output for incremental mode
- Fix skeleton-inventory.md: add tmt_type, sidecars, and boundary_kind
fields to match output-formats.md JSON schema example
---
docs/README.skills.md | 1 +
skills/threat-model-analyst/SKILL.md | 75 ++
.../references/analysis-principles.md | 421 +++++++
.../references/diagram-conventions.md | 491 ++++++++
.../references/incremental-orchestrator.md | 708 +++++++++++
.../references/orchestrator.md | 593 +++++++++
.../references/output-formats.md | 1062 +++++++++++++++++
.../skeletons/skeleton-architecture.md | 133 +++
.../skeletons/skeleton-assessment.md | 273 +++++
.../references/skeletons/skeleton-dfd.md | 68 ++
.../references/skeletons/skeleton-findings.md | 197 +++
.../skeletons/skeleton-incremental-html.md | 150 +++
.../skeletons/skeleton-inventory.md | 139 +++
.../skeletons/skeleton-stride-analysis.md | 106 ++
.../skeletons/skeleton-summary-dfd.md | 62 +
.../skeletons/skeleton-threatmodel.md | 65 +
.../references/tmt-element-taxonomy.md | 187 +++
.../references/verification-checklist.md | 639 ++++++++++
18 files changed, 5370 insertions(+)
create mode 100644 skills/threat-model-analyst/SKILL.md
create mode 100644 skills/threat-model-analyst/references/analysis-principles.md
create mode 100644 skills/threat-model-analyst/references/diagram-conventions.md
create mode 100644 skills/threat-model-analyst/references/incremental-orchestrator.md
create mode 100644 skills/threat-model-analyst/references/orchestrator.md
create mode 100644 skills/threat-model-analyst/references/output-formats.md
create mode 100644 skills/threat-model-analyst/references/skeletons/skeleton-architecture.md
create mode 100644 skills/threat-model-analyst/references/skeletons/skeleton-assessment.md
create mode 100644 skills/threat-model-analyst/references/skeletons/skeleton-dfd.md
create mode 100644 skills/threat-model-analyst/references/skeletons/skeleton-findings.md
create mode 100644 skills/threat-model-analyst/references/skeletons/skeleton-incremental-html.md
create mode 100644 skills/threat-model-analyst/references/skeletons/skeleton-inventory.md
create mode 100644 skills/threat-model-analyst/references/skeletons/skeleton-stride-analysis.md
create mode 100644 skills/threat-model-analyst/references/skeletons/skeleton-summary-dfd.md
create mode 100644 skills/threat-model-analyst/references/skeletons/skeleton-threatmodel.md
create mode 100644 skills/threat-model-analyst/references/tmt-element-taxonomy.md
create mode 100644 skills/threat-model-analyst/references/verification-checklist.md
diff --git a/docs/README.skills.md b/docs/README.skills.md
index 259ea5d8..ba71dbdd 100644
--- a/docs/README.skills.md
+++ b/docs/README.skills.md
@@ -257,6 +257,7 @@ See [CONTRIBUTING.md](../CONTRIBUTING.md#adding-skills) for guidelines on how to
| [swift-mcp-server-generator](../skills/swift-mcp-server-generator/SKILL.md) | Generate a complete Model Context Protocol server project in Swift using the official MCP Swift SDK package. | None |
| [technology-stack-blueprint-generator](../skills/technology-stack-blueprint-generator/SKILL.md) | Comprehensive technology stack blueprint generator that analyzes codebases to create detailed architectural documentation. Automatically detects technology stacks, programming languages, and implementation patterns across multiple platforms (.NET, Java, JavaScript, React, Python). Generates configurable blueprints with version information, licensing details, usage patterns, coding conventions, and visual diagrams. Provides implementation-ready templates and maintains architectural consistency for guided development. | None |
| [terraform-azurerm-set-diff-analyzer](../skills/terraform-azurerm-set-diff-analyzer/SKILL.md) | Analyze Terraform plan JSON output for AzureRM Provider to distinguish between false-positive diffs (order-only changes in Set-type attributes) and actual resource changes. Use when reviewing terraform plan output for Azure resources like Application Gateway, Load Balancer, Firewall, Front Door, NSG, and other resources with Set-type attributes that cause spurious diffs due to internal ordering changes. | `references/azurerm_set_attributes.json`
`references/azurerm_set_attributes.md`
`scripts/.gitignore`
`scripts/README.md`
`scripts/analyze_plan.py` |
+| [threat-model-analyst](../skills/threat-model-analyst/SKILL.md) | Full STRIDE-A threat model analysis and incremental update skill for repositories and systems. Supports two modes: (1) Single analysis — full STRIDE-A threat model of a repository, producing architecture overviews, DFD diagrams, STRIDE-A analysis, prioritized findings, and executive assessments. (2) Incremental analysis — takes a previous threat model report as baseline, compares the codebase at the latest (or a given commit), and produces an updated report with change tracking (new, resolved, still-present threats), STRIDE heatmap, findings diff, and an embedded HTML comparison. Only activate when the user explicitly requests a threat model analysis, incremental update, or invokes /threat-model-analyst directly. | `references/analysis-principles.md`
`references/diagram-conventions.md`
`references/incremental-orchestrator.md`
`references/orchestrator.md`
`references/output-formats.md`
`references/skeletons`
`references/tmt-element-taxonomy.md`
`references/verification-checklist.md` |
| [tldr-prompt](../skills/tldr-prompt/SKILL.md) | Create tldr summaries for GitHub Copilot files (prompts, agents, instructions, collections), MCP servers, or documentation from URLs and queries. | None |
| [transloadit-media-processing](../skills/transloadit-media-processing/SKILL.md) | Process media files (video, audio, images, documents) using Transloadit. Use when asked to encode video to HLS/MP4, generate thumbnails, resize or watermark images, extract audio, concatenate clips, add subtitles, OCR documents, or run any media processing pipeline. Covers 86+ processing robots for file transformation at scale. | None |
| [typescript-mcp-server-generator](../skills/typescript-mcp-server-generator/SKILL.md) | Generate a complete MCP server project in TypeScript with tools, resources, and proper configuration | None |
diff --git a/skills/threat-model-analyst/SKILL.md b/skills/threat-model-analyst/SKILL.md
new file mode 100644
index 00000000..9b38ea26
--- /dev/null
+++ b/skills/threat-model-analyst/SKILL.md
@@ -0,0 +1,75 @@
+---
+name: threat-model-analyst
+description: 'Full STRIDE-A threat model analysis and incremental update skill for repositories and systems. Supports two modes: (1) Single analysis — full STRIDE-A threat model of a repository, producing architecture overviews, DFD diagrams, STRIDE-A analysis, prioritized findings, and executive assessments. (2) Incremental analysis — takes a previous threat model report as baseline, compares the codebase at the latest (or a given commit), and produces an updated report with change tracking (new, resolved, still-present threats), STRIDE heatmap, findings diff, and an embedded HTML comparison. Only activate when the user explicitly requests a threat model analysis, incremental update, or invokes /threat-model-analyst directly.'
+---
+
+# Threat Model Analyst
+
+You are an expert **Threat Model Analyst**. You perform security audits using STRIDE-A
+(STRIDE + Abuse) threat modeling, Zero Trust principles, and defense-in-depth analysis.
+You flag secrets, insecure boundaries, and architectural risks.
+
+## Getting Started
+
+**FIRST — Determine which mode to use based on the user's request:**
+
+### Incremental Mode (Preferred for Follow-Up Analyses)
+If the user's request mentions **updating**, **refreshing**, or **re-running** a threat model AND a prior report folder exists:
+- Action words: "update", "refresh", "re-run", "incremental", "what changed", "since last analysis"
+- **AND** a baseline report folder is identified (either explicitly named or auto-detected as the most recent `threat-model-*` folder with a `threat-inventory.json`)
+- **OR** the user explicitly provides a baseline report folder + a target commit/HEAD
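The auto-detection rule above can be sketched as a small helper. This is illustrative only (the function name is not part of the skill); it assumes baseline folder names embed a sortable timestamp, as in `threat-model-20260309-174425`, so lexicographic order matches chronological order:

```python
import glob
import os

def find_baseline_report(root="."):
    """Return the most recent threat-model-* folder under root that
    contains a threat-inventory.json, or None if there is no candidate.
    Folder names embed YYYYMMDD-HHMMSS, so max() picks the newest."""
    candidates = [
        d for d in glob.glob(os.path.join(root, "threat-model-*"))
        if os.path.isdir(d)
        and os.path.isfile(os.path.join(d, "threat-inventory.json"))
    ]
    return max(candidates, default=None)
```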
+
+Examples that trigger incremental mode:
+- "Update the threat model using threat-model-20260309-174425 as the baseline"
+- "Run an incremental threat model analysis"
+- "Refresh the threat model for the latest commit"
+- "What changed security-wise since the last threat model?"
+
+→ Read [incremental-orchestrator.md](./references/incremental-orchestrator.md) and follow the **incremental workflow**.
+ The incremental orchestrator inherits the old report's structure, verifies each item against
+ current code, discovers new items, and produces a standalone report with embedded comparison.
+
+### Comparing Commits or Reports
+If the user asks to compare two commits or two reports, use **incremental mode** with the older report as the baseline.
+→ Read [incremental-orchestrator.md](./references/incremental-orchestrator.md) and follow the **incremental workflow**.
+
+### Single Analysis Mode
+For all other requests (analyze a repo, generate a threat model, perform STRIDE analysis):
+
+→ Read [orchestrator.md](./references/orchestrator.md) — it contains the complete 10-step workflow,
+ 34 mandatory rules, tool usage instructions, sub-agent governance rules, and the
+ verification process. Do not skip this step.
+
+## Reference Files
+
+Load the relevant file when performing each task:
+
+| File | Use When | Content |
+|------|----------|---------|
+| [Orchestrator](./references/orchestrator.md) | **Always — read first** | Complete 10-step workflow, 34 mandatory rules, sub-agent governance, tool usage, verification process |
+| [Incremental Orchestrator](./references/incremental-orchestrator.md) | **Incremental/update analyses** | Complete incremental workflow: load old skeleton, change detection, generate report with status annotations, HTML comparison |
+| [Analysis Principles](./references/analysis-principles.md) | Analyzing code for security issues | Verify-before-flagging rules, security infrastructure inventory, OWASP Top 10:2025, platform defaults, exploitability tiers, severity standards |
+| [Diagram Conventions](./references/diagram-conventions.md) | Creating ANY Mermaid diagram | Color palette, shapes, sidecar co-location rules, pre-render checklist, DFD vs architecture styles, sequence diagram styles |
+| [Output Formats](./references/output-formats.md) | Writing ANY output file | Templates for 0.1-architecture.md, 1-threatmodel.md, 2-stride-analysis.md, 3-findings.md, 0-assessment.md, common mistakes checklist |
+| [Skeletons](./references/skeletons/) | **Before writing EACH output file** | 9 verbatim fill-in skeletons (`skeleton-*.md`) — read the relevant skeleton, copy VERBATIM, fill `[FILL]` placeholders. One skeleton per output file. Loaded on-demand to minimize context usage. |
+| [Verification Checklist](./references/verification-checklist.md) | Final verification pass + inline quick-checks | All quality gates: inline quick-checks (run after each file write), per-file structural, diagram rendering, cross-file consistency, evidence quality, JSON schema — designed for sub-agent delegation |
+| [TMT Element Taxonomy](./references/tmt-element-taxonomy.md) | Identifying DFD elements from code | Complete TMT-compatible element type taxonomy, trust boundary detection, data flow patterns, code analysis checklist |
+
+## When to Activate
+
+**Incremental Mode** (read [incremental-orchestrator.md](./references/incremental-orchestrator.md) for workflow):
+- Update or refresh an existing threat model analysis
+- Generate a new analysis that builds on a prior report's structure
+- Track what threats/findings were fixed, introduced, or remain since a baseline
+- When a prior `threat-model-*` folder exists and the user wants a follow-up analysis
+
+**Single Analysis Mode:**
+- Perform full threat model analysis of a repository or system
+- Generate threat model diagrams (DFD) from code
+- Perform STRIDE-A analysis on components and data flows
+- Validate security control implementations
+- Identify trust boundary violations and architectural risks
+- Write prioritized security findings with CVSS 4.0 / CWE / OWASP mappings
+
+**Comparing commits or reports:**
+- To compare security posture between commits, use incremental mode with the older report as baseline
diff --git a/skills/threat-model-analyst/references/analysis-principles.md b/skills/threat-model-analyst/references/analysis-principles.md
new file mode 100644
index 00000000..be8a2dd3
--- /dev/null
+++ b/skills/threat-model-analyst/references/analysis-principles.md
@@ -0,0 +1,421 @@
+# Analysis Principles — Security Analysis Methodology
+
+This file contains ALL rules for how to analyze code for security threats. It is self-contained — everything needed to perform correct, evidence-based security analysis is here.
+
+---
+
+## ⛔ CRITICAL: Verify Before Flagging
+
+**NEVER flag a security gap without confirming it exists.** Many platforms have secure defaults.
+
+### Three-Step Verification
+
+1. **Check for security infrastructure components** before claiming security is missing:
+ - Certificate authorities (Dapr Sentry, cert-manager, Vault)
+ - Service mesh control planes (Istio, Linkerd, Dapr)
+ - Policy engines (OPA, Kyverno, Gatekeeper)
+ - Secret managers (Vault, Azure Key Vault, AWS Secrets Manager)
+ - Identity providers (MISE, OAuth proxies, OIDC)
+
+2. **Understand platform defaults** — research before assuming:
+ - Dapr: mTLS enabled by default when Sentry is deployed
+ - Kubernetes: RBAC enabled by default since v1.6
+ - Istio: mTLS in PERMISSIVE mode by default, STRICT available
+ - Azure: Many services encrypted at rest by default
+
+3. **Distinguish configuration states**:
+ - **Explicitly disabled**: `enabled: false` → Flag as finding
+ - **Not configured**: No setting present → Check platform default first
+ - **Implicitly enabled**: Default behavior is secure → Document as control, not gap
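The three states reduce to a small decision helper. A minimal sketch (the function name and the `True`/`False`/`None` encoding are illustrative, not part of the skill):

```python
def classify_security_setting(configured_value, platform_default_secure):
    """Triage one security setting per the three configuration states.

    configured_value: True/False if explicitly set, None if absent.
    platform_default_secure: whether the platform default is secure.
    Returns 'finding' (flag as a gap) or 'control' (document it).
    """
    if configured_value is False:        # explicitly disabled -> flag
        return "finding"
    if configured_value is None:         # not configured -> check default
        return "control" if platform_default_secure else "finding"
    return "control"                     # explicitly enabled -> control
```

For example, an absent mTLS setting on a platform whose default is secure (Dapr with Sentry deployed) is a control, while an absent auth setting on Redis, whose default is insecure, is a finding.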
+
+### Evidence Quality Requirements
+
+For every finding:
+- Show the specific config/code that proves the gap (not just absence of config)
+- For "missing security" claims, prove the default is insecure
+- Cross-reference with platform documentation when uncertain
+
+---
+
+## Security Infrastructure Inventory
+
+Before STRIDE-A analysis, identify ALL security-enabling components present in the codebase:
+
+| Category | Components to Look For | Security They Provide |
+|----------|----------------------|----------------------|
+| Service Mesh | Dapr, Istio, Linkerd, Consul Connect | mTLS, traffic policies, observability |
+| Certificate Management | Sentry, cert-manager, Vault PKI | Automatic cert issuance/rotation |
+| Authentication | MISE, OAuth2-proxy, Dex, Keycloak | Token validation, SSO |
+| Authorization | OPA, Kyverno, Gatekeeper, RBAC | Policy enforcement |
+| Secrets | Vault, External Secrets, CSI drivers | Secret injection, rotation |
+| Network | NetworkPolicy, Calico, Cilium | Microsegmentation |
+
+**If these components exist, their security features are likely active unless explicitly disabled.**
+
+---
+
+## Security Analysis Lenses
+
+Apply these frameworks during analysis:
+
+- **Zero Trust**: Verify explicitly, least privilege, assume breach
+- **Defense in Depth**: Identify missing security layers
+- **Abuse Cases**: Business logic abuse, workflow manipulation, feature misuse
+
+---
+
+## Comprehensive Coverage Requirements
+
+**Do NOT truncate analysis for larger codebases.** All components must receive equal analytical depth.
+
+### Sidecar Security Analysis
+
+⚠️ **Sidecars (Dapr, MISE, Envoy, etc.) are NOT separate components in the DFD** — they are co-located in the same pod as the primary container (see diagram-conventions.md Rule 2). However, sidecar communication MUST still be analyzed for security vulnerabilities.
+
+**How to analyze sidecar threats:**
+- Sidecars with distinct threat surfaces (e.g., MISE auth bypass, Dapr mTLS) get their own `## Component` section in `2-stride-analysis.md` — but are NOT separate DFD nodes (see diagram-conventions.md Rule 2)
+- Use the format: threat title includes the sidecar name, e.g., "Dapr Sidecar Plaintext Communication"
+- Common sidecar threats:
+ - **Information Disclosure (I):** Dapr/MISE sidecar communicating with main container over plaintext HTTP within the pod
+ - **Tampering (T):** Dapr pub/sub messages not signed or encrypted
+ - **Spoofing (S):** MISE token validation bypass if sidecar is compromised
+ - **Elevation of Privilege (E):** Sidecar running with elevated privileges that the main container doesn't need
+- CWE mapping: CWE-319 (Cleartext Transmission), CWE-311 (Missing Encryption), CWE-250 (Unnecessary Privileges)
+- These threats appear in the sidecar's own STRIDE section (if it has a distinct threat surface) or under the primary component's table (if the sidecar is a simple infrastructure proxy)
+- If the sidecar vulnerability warrants a finding, list it under the sidecar component with a note: "Affects [Dapr/MISE] sidecar communication"
+
+1. **Minimum coverage:** Every component in `0.1-architecture.md` MUST have a corresponding section in `2-stride-analysis.md` with actual threat enumeration (not just "no threats found").
+2. **Finding density check:** As a guideline, expect roughly 1 finding per 2-3 significant components. If a repo has 15+ components and you have fewer than 8 findings, re-examine under-analyzed components.
+3. **Use sub-agents for scale:** For repos with 10+ components, delegate component-specific STRIDE analysis to sub-agents to maintain depth. Each sub-agent should analyze 3-5 components.
+4. **OWASP checklist sweep:** After component-level STRIDE, do a cross-cutting pass using the OWASP Top 10:2025 checklist below. This catches systemic issues (missing auth, no audit logging, no rate limiting, unsigned images) that component-level analysis may miss.
+5. **Infrastructure-layer check:** Explicitly check for: container security contexts, network policies, resource limits, image signing, secrets management, backup/DR controls, and monitoring/alerting gaps.
+6. **Exhaustive findings consolidation:** After STRIDE analysis is complete, scan the STRIDE output for ALL identified threats. Every threat MUST map to either:
+ - A finding in `3-findings.md` (consolidated with related threats)
+ - A `🔄 Mitigated by Platform` entry in the Threat Coverage Verification table (for platform-handled threats only)
+
+ **⛔ EVERY `Open` THREAT MUST HAVE A FINDING.** The tool does NOT have authority to accept risks, defer threats, or decide that a threat is "acceptable." That is the engineering team's decision. The tool's job is to identify ALL threats and create findings for them. The Coverage table should show `✅ Covered (FIND-XX)` for every Open threat — NEVER `⚠️ Accepted Risk`.
+
+ If you have 40+ threats in STRIDE but only 10 findings, you are under-consolidating. Check for missed data store auth, operational controls, credential management, and supply chain issues.
+
+ **⛔ "ACCEPTED RISK" IS FORBIDDEN (MANDATORY):**
+ - **NEVER use `⚠️ Accepted Risk` as a Coverage table status.** This label implies the tool has accepted a risk on behalf of the engineering team. It has not. It cannot.
+ - **NEVER use `Accepted` as a STRIDE Status value.** Use `Open`, `Mitigated`, or `Platform` only.
+ - If you are tempted to write "Accepted Risk" → create a finding instead. The finding's remediation section tells the team what to do. The team decides whether to accept, fix, or defer.
+
+ **⛔ NEEDS REVIEW RESTRICTIONS (MANDATORY):**
+ - **Tier 1 threats (prerequisites = `None`) MUST NEVER be classified as "⚠️ Needs Review."** A threat exploitable by an unauthenticated external attacker cannot be deferred — it MUST become a finding.
+ - **If a threat has a mitigation listed in the STRIDE analysis, it SHOULD become a finding.** The mitigation text is the remediation — use it to write the finding. Only defer to "Needs Review" if the mitigation is genuinely not actionable.
+ - **DoS threats with `None` prerequisites are Tier 1 findings**, not hardening opportunities. An unauthenticated attacker flooding an API with no rate limiting is a directly exploitable vulnerability (CWE-770, CWE-400).
+ - **Do NOT batch-classify entire STRIDE categories as Needs Review.** Each threat must be evaluated individually based on its prerequisites and exploitability.
+ - **"⚠️ Needs Review" is reserved for:** Tier 2/3 threats where no technical mitigation is possible (e.g., social engineering), or threats requiring business context the tool doesn't have.
+ - **The automated analysis does NOT have authority to accept risks** — it only identifies them. "Needs Review" signals that a human must decide.
+ - **Maximum Needs Review ratio:** If more than 30% of threats are classified as "Needs Review", re-examine — you are likely under-reporting findings. Typical ratio: 10-20% for a well-analyzed codebase.
+7. **Minimum finding thresholds by repo size:**
+ - Small repo (< 20 source files): 8+ findings expected
+ - Medium repo (20-100 source files): 12+ findings expected
+ - Large repo (100+ source files): 18+ findings expected
+
+ If below threshold, systematically review: auth per component, secrets in code, container security, network segmentation, logging/monitoring, input validation.
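   The size bands above can be sketched as a lookup (illustrative name; the text's "20-100" and "100+" bands overlap at exactly 100 files, and this sketch resolves that boundary toward the larger threshold):

   ```python
   def expected_min_findings(source_file_count):
       """Minimum expected findings by repo size, per the guideline above."""
       if source_file_count < 20:
           return 8     # small repo
       if source_file_count < 100:
           return 12    # medium repo
       return 18        # large repo (100 files treated as large here)
   ```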
+
+8. **Context-aware Platform ratio limits (MANDATORY):**
+
+ After completing the security infrastructure inventory (Step 1), detect the deployment pattern:
+
+ | Pattern | Detection Signal | Platform Limit |
+ |---------|-----------------|----------------|
+ | **K8s Operator** | `controller-runtime`, `kubebuilder`, or `operator-sdk` in go.mod/go.sum; `Reconcile()` functions in source | **≤35%** |
+ | **Standalone Application** | All other repos (web apps, CLI tools, services) | **≤20%** |
+
+ **Why K8s operators have higher Platform ratios:** Operators delegate security to the K8s platform (RBAC for CR access, etcd encryption, API server TLS, webhook cert validation, Azure AD token validation). The operator code CANNOT implement these controls — they are the platform's responsibility. Classifying them as Platform is correct.
+
+ **Action when Platform exceeds limit:**
+ - Review each Platform-classified threat
+ - If the operator CAN take action (e.g., add input validation, add RBAC checks at startup) → reclassify as `Open` with a finding
+ - If the operator genuinely cannot act (e.g., etcd encryption is a cluster admin concern) → Platform is correct
+ - Document the detected pattern and ratio in `0-assessment.md` → Analysis Context & Assumptions
+
+---
+
+## Technology-Specific Security Checklist
+
+**After completing STRIDE analysis**, scan the codebase for each technology below. For every technology found, verify the corresponding security checks are covered in findings or documented as mitigated. This catches specific vulnerabilities that component-level STRIDE often misses.
+
+| Technology Found | MUST Check For | Common Finding |
+|-----------------|---------------|----------------|
+| **Redis** | `requirepass` disabled, no TLS, no ACL | Auth disabled by default → finding |
+| **Milvus** | `authorizationEnabled: false`, no TLS, public gRPC port | Auth disabled by default → finding |
+| **PostgreSQL/SQL DB** | Superuser usage, `ssl=false`, SQL injection, connection string credentials | Input validation + auth |
+| **MongoDB** | Auth disabled, no TLS, `--noauth` flag | Auth disabled by default |
+| **NGINX/Ingress** | Missing TLS, server_info headers, snippet injection, rate limiting | Config hardening |
+| **Docker/Containers** | Running as root, no `USER` directive, host mounts, no seccomp/AppArmor, unsigned images | Container hardening |
+| **ML/AI Models** | Unauthenticated inference endpoint, model poisoning, prompt injection, no input validation | Endpoint auth + input validation |
+| **LLM/Cloud AI** | PII/secrets sent to external LLM, no content filtering, prompt injection, data exfiltration | Data exposure to cloud |
+| **Kubernetes** | No NetworkPolicy, no PodSecurityPolicy/Standards, no resource limits, RBAC gaps | Network segmentation + resource limits |
+| **Helm Charts** | Hardcoded secrets in values.yaml, no image tag pinning, no security contexts | Config + supply chain |
+| **Key Management** | Hardcoded RSA/HMAC keys, weak key generation, no rotation, keys in source | Cryptographic failures |
+| **CI/CD Pipelines** | Secrets in logs, no artifact signing, mutable dependencies, script injection | Supply chain |
+| **REST APIs** | Missing auth, no rate limiting, verbose errors, no input validation | Auth + injection |
+| **gRPC Services** | No TLS, no auth interceptor, reflection enabled in production | Auth + encryption |
+| **Message Queues** | No auth on pub/sub, no encryption, no message signing | Auth + integrity |
+| **NFS/File Shares** | Path traversal, no access control, world-readable mounts | Access control |
+| **Audit/Logging** | No security event logging, log injection, no tamper protection | Monitoring gaps |
+
+**Process:** After writing 3-findings.md, scan this table for technologies present in the repo. For each technology found, evaluate its common threat patterns in the context of how the technology is actually used, and ensure any relevant risks are accounted for in the assessment. Add a finding only if an actual threat or meaningful mitigation gap is identified.
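A filename-based sweep is one way to seed this scan. The marker map below is a sketch covering only a few rows of the table, and the markers themselves are assumptions, not an official detection list:

```python
import os

# Illustrative markers: filenames that suggest a technology is present.
TECH_MARKERS = {
    "Redis": ["redis.conf", "redis"],
    "Helm Charts": ["values.yaml", "Chart.yaml"],
    "Docker/Containers": ["Dockerfile", "docker-compose"],
    "Kubernetes": ["deployment.yaml", "kustomization.yaml"],
}

def detect_technologies(repo_root):
    """Return checklist rows to sweep, based on filenames under repo_root."""
    found = set()
    for _, _, files in os.walk(repo_root):
        for f in files:
            for tech, markers in TECH_MARKERS.items():
                if any(m.lower() in f.lower() for m in markers):
                    found.add(tech)
    return sorted(found)
```

In practice this only narrows the sweep; content-level signals (go.mod entries, connection strings, chart dependencies) should confirm each hit.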
+
+---
+
+## OWASP Top 10:2025 Checklist
+
+Check for these vulnerability categories during analysis:
+
+| ID | Category | Check For |
+|----|----------|----------|
+| A01 | Broken Access Control | Missing authZ, privilege escalation, IDOR, CORS misconfig |
+| A02 | Security Misconfiguration | Default creds, verbose errors, unnecessary features, missing hardening |
+| A03 | Software Supply Chain Failures | Vulnerable dependencies, malicious packages, compromised CI/CD |
+| A04 | Cryptographic Failures | Weak algorithms, exposed secrets, improper key management, plaintext data |
+| A05 | Injection | SQL, NoSQL, OS command, LDAP, XSS, template injection |
+| A06 | Insecure Design | Missing security controls at architecture level, threat modeling gaps |
+| A07 | Authentication Failures | Broken auth, weak sessions, credential stuffing, missing MFA |
+| A08 | Software/Data Integrity Failures | Insecure deserialization, unsigned updates, CI/CD tampering |
+| A09 | Security Logging & Alerting Failures | Missing audit logs, no alerting, log injection, insufficient monitoring |
+| A10 | Mishandling of Exceptional Conditions | Poor error handling, race conditions, resource exhaustion |
+
+Reference: https://owasp.org/Top10/2025/
+
+---
+
+## Platform Security Defaults Reference
+
+Before flagging missing security, check these common secure-by-default behaviors:
+
+| Platform | Feature | Default Behavior | How to Verify |
+|----------|---------|------------------|---------------|
+| **Dapr** | mTLS | Enabled when Sentry deployed | Check for `dapr_sentry` or `sentry` component |
+| **Dapr** | Access Control | Deny if policies defined | Look for `accessControl` in Configuration |
+| **Kubernetes** | RBAC | Enabled since v1.6 | Check `--authorization-mode` includes RBAC |
+| **Kubernetes** | Secrets | Base64 encoded (not encrypted) | Check for encryption provider config |
+| **Istio** | mTLS | PERMISSIVE by default | Check PeerAuthentication resources |
+| **Azure Storage** | Encryption at rest | Enabled by default | Always encrypted, check key management |
+| **Azure SQL** | TDE | Enabled by default | Transparent data encryption on |
+| **PostgreSQL** | SSL | Often disabled by default | Check `ssl` parameter |
+| **Redis** | Auth | Disabled by default | Check `requirepass` configuration |
+| **Milvus** | Auth | Disabled by default | Check `authorizationEnabled` |
+| **NGINX Ingress** | TLS | Not enabled by default | Check for TLS secret in Ingress |
+| **Docker** | User | Root by default | Check `USER` in Dockerfile |
+
+**Key insight**: Service meshes (Dapr, Istio, Linkerd) typically enable mTLS automatically. Databases (Redis, Milvus, MongoDB) typically have auth disabled by default.
+
+---
+
+## Exploitability Tiers
+
+Threats are classified into three exploitability tiers based on prerequisites:
+
+| Tier | Label | Prerequisites | Assignment Rule |
+|------|-------|---------------|----------------|
+| **Tier 1** | Direct Exposure | `None` | Exploitable by an unauthenticated external attacker with NO prior access. |
+| **Tier 2** | Conditional Risk | Single prerequisite | Requires exactly ONE form of access: `Authenticated User`, `Privileged User`, `Internal Network`, `Local Process Access`, or a single `{Boundary} Access`. |
+| **Tier 3** | Defense-in-Depth | Multiple prerequisites or infrastructure access | Requires `Host/OS Access`, `Admin Credentials`, `{Component} Compromise`, `Physical Access`, or multiple prerequisites with `+`. |
+
+### Tier Assignment Rules
+
+**⛔ CANONICAL PREREQUISITE → TIER MAPPING (deterministic, no exceptions):**
+
+Prerequisites MUST use only these values (closed enum). The tier follows mechanically:
+
+| Prerequisite | Tier | Rationale |
+|-------------|------|----------|
+| `None` | **Tier 1** | Unauthenticated external attacker, no prior access |
+| `Authenticated User` | **Tier 2** | Requires valid credentials |
+| `Privileged User` | **Tier 2** | Requires admin/operator role |
+| `Internal Network` | **Tier 2** | Requires position on internal network |
+| `Local Process Access` | **Tier 2** | Requires code execution on same host (localhost listener, IPC) |
+| `Host/OS Access` | **Tier 3** | Requires filesystem, console, or debug access to the host |
+| `Admin Credentials` | **Tier 3** | Requires admin credentials + host access |
+| `Physical Access` | **Tier 3** | Requires physical presence (USB, serial) |
+| `{Component} Compromise` | **Tier 3** | Requires prior compromise of another component |
+| Any `A + B` combination | **Tier 3** | Multiple prerequisites = always Tier 3 |
+
+**⛔ FORBIDDEN prerequisite values:** `Application Access`, `Host Access` (ambiguous — use `Local Process Access` or `Host/OS Access`).
+
+**Deployment context overrides:** If Deployment Classification is `LOCALHOST_DESKTOP` or `LOCALHOST_SERVICE`, the prerequisite `None` is FORBIDDEN for all components — use `Local Process Access` or `Host/OS Access` instead. The tier then follows from the corrected prerequisite.
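+Because the mapping is a closed enum, it is mechanical enough to express as a lookup. The sketch below is illustrative only (it is not a bundled asset of this skill); it encodes the table plus the combination, compromise, and localhost rules:
+
+```python
+# Illustrative sketch of the canonical prerequisite -> tier mapping.
+PREREQ_TIER = {
+    "None": 1,
+    "Authenticated User": 2,
+    "Privileged User": 2,
+    "Internal Network": 2,
+    "Local Process Access": 2,
+    "Host/OS Access": 3,
+    "Admin Credentials": 3,
+    "Physical Access": 3,
+}
+FORBIDDEN = {"Application Access", "Host Access"}  # ambiguous values
+
+def tier_for(prerequisite: str, deployment: str = "NETWORK_SERVICE") -> int:
+    if prerequisite in FORBIDDEN:
+        raise ValueError(f"forbidden prerequisite: {prerequisite!r}")
+    if prerequisite == "None" and deployment.startswith("LOCALHOST"):
+        raise ValueError("`None` is forbidden for localhost deployments")
+    if " + " in prerequisite:                # any `A + B` combination
+        return 3
+    if prerequisite.endswith("Compromise"):  # `{Component} Compromise`
+        return 3
+    return PREREQ_TIER[prerequisite]
+```
+
+Two runs that agree on the prerequisite string necessarily agree on the tier, which is the point of making the mapping deterministic.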
+
+### ⛔ Prerequisite Determination (MANDATORY — Evidence-Based, Not Judgment-Based)
+
+**Prerequisites MUST be determined from deployment configuration evidence, not from general knowledge or assumptions.** Two independent analysis runs on the same code MUST assign the same prerequisites because they are objective facts about the deployment.
+
+**Generic Decision Procedure (applies to ALL environments):**
+
+1. **Network Exposure Check — Is the component reachable from outside?**
+ - Look for evidence of external exposure in the codebase:
+ - API gateway / reverse proxy routes pointing to the component
+ - Firewall rules or security group configurations
+ - Load balancer configurations
+ - DNS records or public endpoint definitions
+ - If ANY external route exists → prerequisites = `None` for network-based threats
+ - If NO external route exists AND the component is on an internal-only network → prerequisites = `Internal Network`
+
+2. **Authentication Check — Does the endpoint require credentials?**
+ - Look for authentication middleware, decorators, or filters in the component's code:
+ - `@require_auth`, `[Authorize]`, `@login_required`, auth middleware in Express/FastAPI
+ - API key validation in request handlers
+ - OAuth/OIDC token validation
+ - mTLS certificate requirements
+ - If auth is ENFORCED on all endpoints → prerequisite = `Authenticated User`
+ - If auth is OPTIONAL or DISABLED by config flag → prerequisite = `None` (disabled auth = no barrier)
+ - If auth exists but has bypass routes (e.g., `/health`, `/metrics` without auth) → those specific routes have prerequisite = `None`
+
+3. **Authorization Check — What level of access is required?**
+ - If no RBAC/role check beyond authentication → prerequisite stays `Authenticated User`
+ - If admin/operator role required → prerequisite = `Privileged User`
+ - If specific permissions required → prerequisite names the permission (e.g., `ClusterAdmin Role`)
+
+4. **Physical/Local Access Check:**
+ - If the component only listens on `localhost`/`127.0.0.1` → prerequisite = `Local Process Access` (T2)
+ - If access requires console/SSH/filesystem → prerequisite = `Host/OS Access` (T3)
+ - If access requires physical presence (USB, serial port) → prerequisite = `Physical Access` (T3)
+ - If component has no listener (console app, library, outbound-only) → prerequisite = `Host/OS Access` (T3)
+
+5. **Default Rule:** If you cannot determine exposure from config → look up the component's `Min Prerequisite` in the Component Exposure Table. If the table is not yet filled, assume `Local Process Access` (T2) as a safe default for unknown components. **NEVER assume `None` without positive evidence of external reachability.** **NEVER assume `Internal Network` without evidence of network restriction.**
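+Under stated assumptions, steps 1-5 can be sketched as a decision function. The boolean parameters are hypothetical stand-ins for evidence that the real procedure must read from deployment config, never assume:
+
+```python
+from typing import Optional
+
+# Illustrative sketch of steps 1-5 above; parameter names are hypothetical.
+def determine_prerequisite(
+    external_route: Optional[bool],  # step 1: ingress/LB/gateway route? None = unknown
+    auth_enforced: bool,             # step 2: auth enforced on all endpoints?
+    admin_required: bool,            # step 3: admin/operator role required?
+    localhost_only: bool,            # step 4: localhost-only listener or no listener?
+) -> str:
+    if localhost_only:
+        return "Local Process Access"      # step 4 (T2)
+    if external_route is None:
+        return "Local Process Access"      # step 5: safe default, never `None`
+    if not external_route:
+        return "Internal Network"          # step 1: no external route
+    if not auth_enforced:
+        return "None"                      # step 2: disabled auth = no barrier
+    if admin_required:
+        return "Privileged User"           # step 3
+    return "Authenticated User"
+```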
+
+**Platform-Specific Evidence Sources:**
+
+| Platform | Where to check exposure | Internal indicator | External indicator |
+|----------|------------------------|--------------------|--------------------|
+| **Kubernetes** | Service type, Ingress rules, values.yaml | `ClusterIP` service, no Ingress | `LoadBalancer`/`NodePort`, Ingress path exists |
+| **Docker Compose** | `ports:` mapping, network config | No `ports:` mapping, internal network only | `ports: "8080:8080"` maps to host |
+| **Azure App Service** | App settings, access restrictions | VNet integration, private endpoint | Public URL, no IP restrictions |
+| **VM / Bare Metal** | Firewall rules, NSG, iptables | Port blocked in firewall/NSG | Port open, public IP bound |
+| **Serverless (Functions)** | Function auth level, API Management | `authLevel: function/admin` | `authLevel: anonymous` |
+| **.NET / Java / Node** | Startup config, middleware pipeline | `app.UseAuthentication()` enforced | No auth middleware, or auth disabled |
+| **Python (FastAPI/Flask)** | Middleware, dependency injection | `Depends(get_current_user)` on routes | No auth dependency, open routes |
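+As a concrete illustration of the Kubernetes row (resource names are hypothetical), the Service `type` is the exposure evidence:
+
+```yaml
+# Internal indicator: ClusterIP, reachable only from inside the cluster.
+apiVersion: v1
+kind: Service
+metadata:
+  name: worker            # hypothetical
+spec:
+  type: ClusterIP         # -> prerequisite floor: Internal Network
+  ports: [{port: 8080}]
+---
+# External indicator: LoadBalancer (or an Ingress route to the Service).
+apiVersion: v1
+kind: Service
+metadata:
+  name: api               # hypothetical
+spec:
+  type: LoadBalancer      # -> `None` is possible if auth is not enforced
+  ports: [{port: 443}]
+```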
+
+**⛔ NEVER assign prerequisites based on "what seems reasonable" or architecture assumptions.** Check the actual deployment config. The same component MUST get the same prerequisite across runs because the config doesn't change between runs.
+
+**Common violations:**
+- Assigning `Internal Network` to a component that has an ingress route → hides real external exposure
+- Assuming databases are "internal only" without checking if they have a public endpoint or ingress route
+- Assuming ML model servers are "internal" when they may be exposed for direct inference requests
+
+### CVSS-to-Tier Consistency Check (MANDATORY)
+
+**After assigning CVSS vectors AND tiers, cross-check for contradictions:**
+
+| CVSS Metric | Value | Tier Implication |
+|-------------|-------|------------------|
+| `AV:L` (Attack Vector: Local) | Requires local access | **Cannot be Tier 1** — must be T2 or T3 |
+| `AV:A` (Attack Vector: Adjacent) | Requires adjacent network | **Cannot be Tier 1** — must be T2 or T3 |
+| `AV:P` (Attack Vector: Physical) | Requires physical access | **Must be Tier 3** |
+| `PR:H` (Privileges Required: High) | Requires admin/privileged access | **Cannot be Tier 1** — must be T2 or T3 |
+| `PR:L` (Privileges Required: Low) | Requires authenticated user | **Cannot be Tier 1** — must be T2 or T3 |
+| `PR:N` + `AV:N` | No privileges, network accessible | Tier 1 candidate (confirm no deployment override) |
+
+⚠️ **If a finding has `AV:L` and `Tier 1`, this is ALWAYS an error.** Fix by either:
+- Changing the tier to T2/T3 (correct approach for localhost-only services), OR
+- Changing the CVSS AV to `AV:N` if the service is actually network-accessible (rare)
+
+⚠️ **If a finding has `PR:H` and `Tier 1`, this is ALWAYS an error.** Admin-required findings are T2 minimum.
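+This cross-check is mechanical. A hedged sketch (not a bundled asset of this skill) that derives the minimum tier from the AV/PR metrics of a CVSS v4.0 vector string:
+
+```python
+# Illustrative sketch: minimum tier implied by a CVSS vector's AV/PR metrics.
+def min_tier_from_cvss(vector: str) -> int:
+    metrics = dict(part.split(":", 1) for part in vector.split("/") if ":" in part)
+    av = metrics.get("AV", "N")
+    pr = metrics.get("PR", "N")
+    if av == "P":
+        return 3          # physical access -> must be Tier 3
+    if av in ("L", "A") or pr in ("L", "H"):
+        return 2          # local/adjacent vector or privileges required -> T2 minimum
+    return 1              # AV:N + PR:N -> Tier 1 candidate
+
+def tier_is_consistent(vector: str, assigned_tier: int) -> bool:
+    # e.g. AV:L with Tier 1 is always flagged as an error
+    return assigned_tier >= min_tier_from_cvss(vector)
+```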
+
+### Deployment Context Affects Tier Classification
+
+**CRITICAL: This section OVERRIDES the default tier rules above when specific deployment conditions apply.**
+
+Before assigning tiers, determine the system's deployment model from code, docs, and architecture. Record the **Deployment Classification** and **Component Exposure Table** in `0.1-architecture.md` (see `skeleton-architecture.md`).
+
+**Deployment Classifications and their tier implications:**
+
+| Classification | Description | T1 Allowed? | Min Prerequisite |
+|----------------|-------------|-------------|------------------|
+| `LOCALHOST_DESKTOP` | Console/GUI app, no network listeners (or localhost-only), single-user workstation | ❌ **NO** — all findings T2+ | `Host/OS Access` (T3) or `Local Process Access` (T2) |
+| `LOCALHOST_SERVICE` | Daemon/service binding to 127.0.0.1 only | ❌ **NO** — all findings T2+ | `Local Process Access` (T2) |
+| `AIRGAPPED` | No internet connectivity | ❌ for network-originated attacks | `Internal Network` |
+| `K8S_SERVICE` | Kubernetes Deployment with ClusterIP/LoadBalancer | ✅ YES | Depends on Service type |
+| `NETWORK_SERVICE` | Public API, cloud endpoint, internet-facing | ✅ YES | `None` (if no auth) |
+
+**The Component Exposure Table in `0.1-architecture.md` sets the prerequisite floor per component.** No threat or finding may have a lower prerequisite than the table permits. This table is filled in Step 1 and is binding on all subsequent analysis steps.
+
+**Legacy override table (still applies as fallback):**
+
+| Deployment Indicator | Tier Override Rule |
+|---------------------|-------------------|
+| Binds to `localhost`/`127.0.0.1` only | Cannot be T1 — requires local access (T2 minimum) |
+| Air-gapped / no internet | Downgrade network-based attacks by one tier |
+| Single-admin workstation tool | Cannot be T1 unless exploitable by a non-admin local user |
+| Docker/container on single machine | Docker socket access = T2 (local admin required) |
+| Named pipe / Unix socket | Cannot be T1 — requires local process access |
+
+**How to apply:**
+1. In Step 1 (context gathering), identify deployment model and record in 0.1-architecture.md
+2. In Step 6/7 (finding verification), check each T1 candidate against the table above
+3. If ANY override applies, downgrade to T2 (or T3 if multiple)
+4. Document the override rationale in the finding’s Description
+
+**Example:** Kusto container on air-gapped workstation, listening on port 80 without auth:
+- Default classification: T1 (unauthenticated, port 80)
+- Override: localhost-only + single-admin → **T2** (attacker needs local access to an admin workstation)
+
+**Do NOT override** for:
+- Kubernetes services (any pod can reach them → lateral movement is realistic → keep T1)
+- Network-exposed APIs (any network user can reach them → keep T1): an unauthenticated API on a listening port IS Tier 1
+- Cloud endpoints (public internet → keep T1)
+
+The prerequisite for Tier 1 is `None` — meaning an **unauthenticated external attacker** with no prior access. If exploiting a vulnerability requires local admin access, OS-level access, or physical presence, it cannot be Tier 1.
+
+---
+
+## Finding Classification
+
+Before documenting each finding, verify:
+
+- [ ] **Positive evidence exists**: Can you show config/code that proves the vulnerability?
+- [ ] **Not a secure default**: Have you checked if the platform enables security by default?
+- [ ] **Security infrastructure checked**: Did you look for Sentry/cert-manager/Vault/etc.?
+- [ ] **Explicit vs implicit**: Is security explicitly disabled, or just not explicitly enabled?
+- [ ] **Platform documentation consulted**: When uncertain, verify against official docs
+
+**Classification outcomes:**
+- **Confirmed**: Positive evidence of vulnerability → Document as finding in `3-findings.md`
+- **Needs Verification**: Unable to confirm but potential risk → Add to "Needs Verification" in `0-assessment.md`
+- **Not a Finding**: Confirmed secure by default or explicitly enabled → Do not document
+
+---
+
+## Severity Standards
+
+### SDL Bugbar Severity
+Classify each finding per: https://www.microsoft.com/en-us/msrc/sdlbugbar
+
+### CVSS 4.0 Score
+Use CVSS v4.0 Base score (0.0-10.0) with vector string.
+Reference: https://www.first.org/cvss/v4.0/specification-document
+
+### CWE
+Assign Common Weakness Enumeration ID and name.
+Reference: https://cwe.mitre.org/
+
+### OWASP
+Map to OWASP Top 10:2025 category if applicable (A01-A10).
+**ALWAYS use `:2025` suffix** (e.g., `A01:2025`), never `:2021`.
+Reference: https://owasp.org/Top10/2025/
+
+### Remediation Effort
+- **Low**: Configuration change, flag toggle, or single-file fix
+- **Medium**: Multi-file code change, new validation logic, or dependency update
+- **High**: Architecture change, new component, or cross-team coordination
+
+**⚠️ DO NOT include time estimates.** Never add "(hours)", "(days)", "(weeks)", "~1 hour", "~2 hours", or any duration/effort-to-fix estimates anywhere in the output. The effort level (Low/Medium/High) is sufficient.
+
+### STRIDE Scope Rule
+- **External services** (AzureOpenAI, AzureAD, Redis, PostgreSQL) **DO get** STRIDE sections — they are attack surfaces from your system's perspective
+- **External actors** (Operator, EndUser) **do NOT get** STRIDE sections — they are threat sources, not targets
+- If you have 20 elements and 2 are external actors, you write 18 STRIDE sections
+
+### Mitigation Type (OWASP-aligned)
+- **Redesign**: Eliminate the threat by changing architecture (OWASP: Avoid)
+- **Standard Mitigation**: Apply well-known, proven security controls (OWASP: Mitigate)
+- **Custom Mitigation**: Implement a bespoke code fix specific to this system (OWASP: Mitigate)
+- **Existing Control**: Team already built a control that addresses this threat — document it (OWASP: Fix)
+- **Accept Risk**: Acknowledge and document the residual risk (requires justification) (OWASP: Accept)
+- **Transfer Risk**: Shift responsibility to user/operator/third-party (e.g., configuration choice, SLA) (OWASP: Transfer)
diff --git a/skills/threat-model-analyst/references/diagram-conventions.md b/skills/threat-model-analyst/references/diagram-conventions.md
new file mode 100644
index 00000000..c02565b0
--- /dev/null
+++ b/skills/threat-model-analyst/references/diagram-conventions.md
@@ -0,0 +1,491 @@
+# Diagram Conventions — Mermaid Diagrams for Threat Models & Architecture
+
+This file contains ALL rules for creating Mermaid diagrams in threat model reports. It is self-contained — everything needed to produce correct diagrams is here.
+
+---
+
+## ⛔ CRITICAL RULES — READ BEFORE DRAWING ANY DIAGRAM
+
+These rules are the most frequently violated. Read them first, and re-check after every diagram.
+
+### Rule 1: Kubernetes Sidecar Co-location (MANDATORY)
+
+When the target system runs on Kubernetes, **containers that share a Pod must be represented together** — never as independent standalone components.
+
+**DO THIS — annotate the primary container's label:**
+```
+InferencingFlow(("Inferencing Flow
+ MISE, Dapr")):::process
+IngestionFlow(("Ingestion Flow
+ MISE, Dapr")):::process
+VectorDbApi(("VectorDB API
+ Dapr")):::process
+```
+
+**DO NOT DO THIS — never create standalone sidecar nodes:**
+```
+❌ MISE(("MISE Sidecar")):::process
+❌ DaprSidecar(("Dapr Sidecar")):::process
+❌ InferencingFlow -->|"localhost"| MISE
+```
+
+**Why:** Sidecars (Dapr, MISE/auth proxy, Envoy, Istio proxy, log collectors) share the Pod's network namespace, lifecycle, and security context with their primary container. They are NOT independent services.
+
+**This rule applies to ALL diagram types:** architecture, threat model, summary.
+
+### Rule 2: No Intra-Pod Flows (MANDATORY)
+
+**DO NOT draw data flows between a primary container and its sidecars.** These are implicit from the co-location annotation.
+
+```
+❌ InferencingFlow -->|"localhost:3500"| DaprSidecar
+❌ InferencingFlow -->|"localhost:8080"| MISE
+```
+
+Intra-pod communication happens on localhost — it has no security boundary and should not appear in the diagram.
+
+### Rule 3: Cross-Boundary Sidecar Flows Originate from Host Container
+
+When a sidecar makes a call that crosses a trust boundary (e.g., MISE → Azure AD, Dapr → Redis), draw the arrow **from the host container node** — never from a standalone sidecar node.
+
+```
+✅ InferencingFlow -->|"HTTPS (MISE auth)"| AzureAD
+✅ IngestionAPI -->|"HTTPS (MISE auth)"| AzureAD
+✅ InferencingFlow -->|"TCP (Dapr)"| Redis
+
+❌ MISESidecar -->|"HTTPS"| AzureAD
+❌ DaprSidecar -->|"TCP"| Redis
+```
+
+If multiple pods have the same sidecar calling the same external target, draw one arrow per host container. Multiple arrows to the same target is correct.
+
+### Rule 4: Element Table — No Separate Sidecar Rows
+
+Do NOT add separate Element Table rows for sidecars. Describe them in the host container's description column:
+
+```
+✅ | Inferencing Flow | Process | API service + MISE auth proxy + Dapr sidecar | Backend Services |
+❌ | MISE Sidecar | Process | Auth proxy for Inferencing Flow | Backend Services |
+```
+
+If a sidecar class has its own threat surface (e.g., MISE auth bypass), it gets a `## Component` section in STRIDE analysis — but it is still NOT a separate diagram node.
+
+---
+
+## Pre-Render Checklist (VERIFY BEFORE FINALIZING)
+
+After drawing ANY diagram, verify:
+
+- [ ] **Every K8s service node annotated with sidecars?** — Each pod's process node includes `
+ SidecarName` for all co-located containers
+- [ ] **Zero standalone sidecar nodes?** — Search diagram for any node named `MISE`, `Dapr`, `Envoy`, `Istio`, `Sidecar` — these must NOT exist as separate nodes
+- [ ] **Zero intra-pod localhost flows?** — No arrows between a container and its sidecars on localhost
+- [ ] **Cross-boundary sidecar flows from host?** — All arrows to external targets (Azure AD, Redis, etc.) originate from the host container node
+- [ ] **Background forced to white?** — `%%{init}%%` block includes `'background': '#ffffff'`
+- [ ] **All classDef include `color:#000000`?** — Black text on every element
+- [ ] **`linkStyle default` present?** — `stroke:#666666,stroke-width:2px`
+- [ ] **All labels quoted?** — `["Name"]`, `(("Name"))`, `-->|"Label"|`
+- [ ] **Subgraph/end pairs matched?** — Every `subgraph` has a closing `end`
+- [ ] **Trust boundary styles applied?** — `stroke:#e31a1c,stroke-width:3px,stroke-dasharray: 5 5`
+
+---
+
+## Color Palette
+
+> **⛔ CRITICAL: Use ONLY these exact hex codes. Do NOT invent colors, use Chakra UI colors (#4299E1, #48BB78, #E53E3E), Tailwind colors, or any other palette. The colors below are from ColorBrewer qualitative palettes for colorblind accessibility. COPY the classDef lines VERBATIM from this file.**
+
+These colors are shared across ALL Mermaid diagrams. Colors are from ColorBrewer qualitative palettes — designed for colorblind accessibility.
+
+| Color Role | Fill | Stroke | Used For |
+|------------|------|--------|----------|
+| Blue | `#6baed6` | `#2171b5` | Services/Processes |
+| Amber | `#fdae61` | `#d94701` | External Interactors |
+| Green | `#74c476` | `#238b45` | Data Stores |
+| Red | n/a | `#e31a1c` | Trust boundaries (threat model only) |
+| Dark gray | n/a | `#666666` | Arrows/links |
+| Text | all: `color:#000000` | | Black text on every element |
+
+### Design Rationale
+
+| Element | Fill | Stroke | Text | Why |
+|---------|------|--------|------|-----|
+| Process | `#6baed6` | `#2171b5` | `#000000` | Medium blue — visible on both themes |
+| External Interactor | `#fdae61` | `#d94701` | `#000000` | Warm amber — distinct from blue/green |
+| Data Store | `#74c476` | `#238b45` | `#000000` | Medium green — natural for storage |
+| Trust Boundary | none | `#e31a1c` | n/a | Red dashed — 3px for visibility |
+| Arrows/Links | n/a | `#666666` | n/a | Dark gray on white background |
+| Background | `#ffffff` | n/a | n/a | Forced white for dark theme safety |
+
+---
+
+## Forced White Background (REQUIRED)
+
+Every Mermaid diagram — flowchart and sequence — MUST include an `%%{init}%%` block that forces a white background. This ensures diagrams render correctly in dark themes.
+
+> **⛔ CRITICAL: Do NOT add `primaryColor`, `secondaryColor`, `tertiaryColor`, or ANY custom color keys to themeVariables. The init block controls ONLY the background and line color. ALL element colors come from classDef lines — never from themeVariables. If you add color overrides to themeVariables, they will BREAK the classDef palette.**
+
+### Flowchart Init Block
+
+Add as the **first line** of every `.mmd` file or ` ```mermaid ` flowchart:
+
+```
+%%{init: {'theme': 'base', 'themeVariables': { 'background': '#ffffff', 'primaryColor': '#ffffff', 'lineColor': '#666666' }}}%%
+```
+
+**THE ABOVE IS THE ONLY ALLOWED INIT BLOCK FOR FLOWCHARTS.** Do not modify it. Do not add keys. Copy it verbatim.
+
+### Arrow / Link Default Styling
+
+Add after classDef lines:
+
+```
+linkStyle default stroke:#666666,stroke-width:2px
+```
+
+### Sequence Diagram Init Block
+
+Sequence diagrams cannot use `classDef`. Use this init block:
+
+```
+%%{init: {'theme': 'base', 'themeVariables': {
+ 'background': '#ffffff',
+ 'actorBkg': '#6baed6', 'actorBorder': '#2171b5', 'actorTextColor': '#000000',
+ 'signalColor': '#666666', 'signalTextColor': '#666666',
+ 'noteBkgColor': '#fdae61', 'noteBorderColor': '#d94701', 'noteTextColor': '#000000',
+ 'activationBkgColor': '#ddeeff', 'activationBorderColor': '#2171b5',
+ 'sequenceNumberColor': '#767676',
+ 'labelBoxBkgColor': '#f0f0f0', 'labelBoxBorderColor': '#666666', 'labelTextColor': '#000000',
+ 'loopTextColor': '#000000'
+}}}%%
+```
+
+---
+
+## Diagram Type: Threat Model (DFD)
+
+Used in: `1-threatmodel.md`, `1.1-threatmodel.mmd`, `1.2-threatmodel-summary.mmd`
+
+### `.mmd` File Format — CRITICAL
+
+The `.mmd` file contains **raw Mermaid source only** — no markdown, no code fences. The file must start on line 1 with:
+```
+%%{init: {'theme': 'base', 'themeVariables': { 'background': '#ffffff', 'primaryColor': '#ffffff', 'lineColor': '#666666' }}}%%
+```
+Followed by `flowchart LR` on line 2. NEVER use `flowchart TB`.
+
+**WRONG**: File starts with ` ```plaintext ` or ` ```mermaid ` — these are code fences and corrupt the `.mmd` file.
+
+### ClassDef & Shapes
+
+```
+classDef process fill:#6baed6,stroke:#2171b5,stroke-width:2px,color:#000000
+classDef external fill:#fdae61,stroke:#d94701,stroke-width:2px,color:#000000
+classDef datastore fill:#74c476,stroke:#238b45,stroke-width:2px,color:#000000
+```
+
+| Element Type | Shape Syntax | Example |
+|-------------|-------------|---------|
+| Process | `(("Name"))` circle | `WebApi(("Web API")):::process` |
+| External Interactor | `["Name"]` rectangle | `User["User/Browser"]:::external` |
+| Data Store | `[("Name")]` cylinder | `Database[("PostgreSQL")]:::datastore` |
+
+### Trust Boundary Styling
+
+```
+subgraph BoundaryId["Display Name"]
+ %% elements inside
+end
+style BoundaryId fill:none,stroke:#e31a1c,stroke-width:3px,stroke-dasharray: 5 5
+```
+
+### Flow Labels
+
+```
+Unidirectional: A -->|"Label"| B
+Bidirectional: A <-->|"Label"| B
+```
+
+### Data Flow IDs
+
+- Detailed flows: `DF01`, `DF02`, `DF03`...
+- Summary flows: `SDF01`, `SDF02`, `SDF03`...
+
+### Complete DFD Template
+
+```mermaid
+%%{init: {'theme': 'base', 'themeVariables': { 'background': '#ffffff', 'primaryColor': '#ffffff', 'lineColor': '#666666' }}}%%
+flowchart LR
+ classDef process fill:#6baed6,stroke:#2171b5,stroke-width:2px,color:#000000
+ classDef external fill:#fdae61,stroke:#d94701,stroke-width:2px,color:#000000
+ classDef datastore fill:#74c476,stroke:#238b45,stroke-width:2px,color:#000000
+ linkStyle default stroke:#666666,stroke-width:2px
+
+ User["User/Browser"]:::external
+
+ subgraph Internal["Internal Network"]
+ WebApi(("Web API")):::process
+ Database[("PostgreSQL")]:::datastore
+ end
+
+ User <-->|"HTTPS"| WebApi
+ WebApi <-->|"SQL/TLS"| Database
+
+ style Internal fill:none,stroke:#e31a1c,stroke-width:3px,stroke-dasharray: 5 5
+```
+
+### Kubernetes DFD Template (With Sidecars)
+
+```mermaid
+%%{init: {'theme': 'base', 'themeVariables': { 'background': '#ffffff', 'primaryColor': '#ffffff', 'lineColor': '#666666' }}}%%
+flowchart LR
+ classDef process fill:#6baed6,stroke:#2171b5,stroke-width:2px,color:#000000
+ classDef external fill:#fdae61,stroke:#d94701,stroke-width:2px,color:#000000
+ classDef datastore fill:#74c476,stroke:#238b45,stroke-width:2px,color:#000000
+ linkStyle default stroke:#666666,stroke-width:2px
+
+ User["User/Browser"]:::external
+ IdP["Identity Provider"]:::external
+
+ subgraph K8s["Kubernetes Cluster"]
+ subgraph Backend["Backend Services"]
+ ApiService(("API Service
+ AuthProxy, Dapr")):::process
+ Worker(("Worker
+ Dapr")):::process
+ end
+ Redis[("Redis")]:::datastore
+ Database[("PostgreSQL")]:::datastore
+ end
+
+ User -->|"HTTPS"| ApiService
+ ApiService -->|"HTTPS"| User
+ ApiService -->|"HTTPS"| IdP
+ ApiService -->|"SQL/TLS"| Database
+ ApiService -->|"Dapr HTTP"| Worker
+ ApiService -->|"TCP"| Redis
+ Worker -->|"SQL/TLS"| Database
+
+ style K8s fill:none,stroke:#e31a1c,stroke-width:3px,stroke-dasharray: 5 5
+ style Backend fill:none,stroke:#e31a1c,stroke-width:3px,stroke-dasharray: 5 5
+```
+
+**Key points:**
+- AuthProxy and Dapr are annotated on the host node (`+ AuthProxy, Dapr`), not as separate nodes
+- `ApiService -->|"HTTPS"| IdP` = auth proxy's cross-boundary call, drawn from host container
+- `ApiService -->|"TCP"| Redis` = Dapr's cross-boundary call, drawn from host container
+- No intra-pod flows drawn
+
+---
+
+## Diagram Type: Architecture
+
+Used in: `0.1-architecture.md` only
+
+### ClassDef & Shapes
+
+```
+classDef service fill:#6baed6,stroke:#2171b5,stroke-width:2px,color:#000000
+classDef external fill:#fdae61,stroke:#d94701,stroke-width:2px,color:#000000
+classDef datastore fill:#74c476,stroke:#238b45,stroke-width:2px,color:#000000
+```
+
+| Element Type | Shape Syntax | Notes |
+|-------------|-------------|-------|
+| Services/Processes | `["Name"]` or `(["Name"])` | Rounded rectangles or stadium |
+| External Actors | `(["Name"])` with `external` class | Amber distinguishes them |
+| Data Stores | `[("Name")]` cylinder | Same as DFD |
+| **DO NOT** use circles `(("Name"))` | | Reserved for DFD threat model diagrams |
+
+### Layer Grouping Styling (NOT trust boundaries)
+
+```
+style LayerId fill:#f0f4ff,stroke:#2171b5,stroke-width:2px,stroke-dasharray: 5 5
+```
+
+Layer colors:
+- Backend: `fill:#f0f4ff,stroke:#2171b5` (light blue)
+- Data: `fill:#f0fff0,stroke:#238b45` (light green)
+- External: `fill:#fff8f0,stroke:#d94701` (light amber)
+- Infrastructure: `fill:#f5f5f5,stroke:#666666` (light gray)
+
+### Flow Conventions
+
+- Label with **what is communicated**: `"User queries"`, `"Auth tokens"`, `"Log data"`
+- Protocol can be parenthetical: `"Queries (gRPC)"`
+- Simpler arrows than DFD — use `-->` without requiring bidirectional flows
+
+### Kubernetes Pods in Architecture Diagrams
+
+Show pods with their full container composition:
+```
+inf["Inferencing Flow
+ MISE + Dapr"]:::service
+ing["Ingestion Flow
+ MISE + Dapr"]:::service
+```
+
+### Key Difference from DFD
+
+The architecture diagram shows **what the system does** (logical components and interactions). The threat model DFD shows **what could be attacked** (trust boundaries, data flows with protocols, element types). They share many components but serve different purposes.
+
+### Complete Architecture Diagram Template
+
+```mermaid
+%%{init: {'theme': 'base', 'themeVariables': { 'background': '#ffffff', 'primaryColor': '#ffffff', 'lineColor': '#666666' }}}%%
+flowchart LR
+ classDef service fill:#6baed6,stroke:#2171b5,stroke-width:2px,color:#000000
+ classDef external fill:#fdae61,stroke:#d94701,stroke-width:2px,color:#000000
+ classDef datastore fill:#74c476,stroke:#238b45,stroke-width:2px,color:#000000
+ linkStyle default stroke:#666666,stroke-width:2px
+
+ User(["User"]):::external
+
+ subgraph Backend["Backend Services"]
+ Api["API Service"]:::service
+ Worker["Worker"]:::service
+ end
+
+ subgraph Data["Data Layer"]
+ Db[("Database")]:::datastore
+ Cache[("Cache")]:::datastore
+ end
+
+ User -->|"HTTPS"| Api
+ Api --> Worker
+ Worker --> Db
+ Api --> Cache
+
+ style Backend fill:#f0f4ff,stroke:#2171b5,stroke-width:2px,stroke-dasharray: 5 5
+ style Data fill:#f0fff0,stroke:#238b45,stroke-width:2px,stroke-dasharray: 5 5
+```
+
+### Kubernetes Architecture Template
+
+```mermaid
+%%{init: {'theme': 'base', 'themeVariables': { 'background': '#ffffff', 'primaryColor': '#ffffff', 'lineColor': '#666666' }}}%%
+flowchart LR
+ classDef service fill:#6baed6,stroke:#2171b5,stroke-width:2px,color:#000000
+ classDef external fill:#fdae61,stroke:#d94701,stroke-width:2px,color:#000000
+ classDef datastore fill:#74c476,stroke:#238b45,stroke-width:2px,color:#000000
+ linkStyle default stroke:#666666,stroke-width:2px
+
+ User(["User"]):::external
+ IdP(["Azure AD"]):::external
+
+ subgraph K8s["Kubernetes Cluster"]
+ Inf["Inferencing Flow
+ MISE + Dapr"]:::service
+ Ing["Ingestion Flow
+ MISE + Dapr"]:::service
+ Redis[("Redis")]:::datastore
+ end
+
+ User -->|"HTTPS"| Inf
+ Inf -->|"Auth (MISE)"| IdP
+ Ing -->|"Auth (MISE)"| IdP
+ Inf -->|"State (Dapr)"| Redis
+
+ style K8s fill:#f0f4ff,stroke:#2171b5,stroke-width:2px,stroke-dasharray: 5 5
+```
+
+---
+
+## Sequence Diagram Rules
+
+Used in: `0.1-architecture.md` top scenarios
+
+- The **first 3 scenarios MUST** each include a Mermaid `sequenceDiagram`
+- Scenarios 4-5 may optionally include one
+- Use the **Sequence Diagram Init Block** above at the top of each
+- Use `participant` aliases matching the Key Components table
+- Show activations (`activate`/`deactivate`) for request-response patterns
+- Include `Note` blocks for security-relevant steps (e.g., "Validates JWT token")
+- Keep diagrams focused — core workflow, not every error path
+
+### Complete Sequence Diagram Example
+
+```mermaid
+%%{init: {'theme': 'base', 'themeVariables': {
+ 'background': '#ffffff',
+ 'actorBkg': '#6baed6', 'actorBorder': '#2171b5', 'actorTextColor': '#000000',
+ 'signalColor': '#666666', 'signalTextColor': '#666666',
+ 'noteBkgColor': '#fdae61', 'noteBorderColor': '#d94701', 'noteTextColor': '#000000',
+ 'activationBkgColor': '#ddeeff', 'activationBorderColor': '#2171b5'
+}}}%%
+sequenceDiagram
+ actor User
+ participant Api as API Service
+ participant Db as Database
+
+ User->>Api: POST /resource
+ activate Api
+ Note over Api: Validates JWT token
+ Api->>Db: INSERT query
+ Db-->>Api: Result
+ Api-->>User: 201 Created
+ deactivate Api
+```
+
+---
+
+## Summary Diagram Rules
+
+Used in: `1.2-threatmodel-summary.mmd` (generated only when detailed diagram has >15 elements or >4 trust boundaries)
+
+1. **All trust boundaries must be preserved** — never combine or omit
+2. **Only combine components that are NOT**: entry points, core flow components, security-critical services, primary data stores
+3. **Candidates for aggregation**: supporting infrastructure, secondary caches, multiple externals at same trust level
+4. **Combined element labels must list contents:**
+ ```
+   DataLayer[("Data Layer
+     (UserDB, OrderDB, Redis)")]
+ SupportServices(("Supporting
(Logging, Monitoring)"))
+ ```
+5. Use `SDF` prefix for summary data flows: `SDF01`, `SDF02`, ...
+6. Include mapping table in `1-threatmodel.md`:
+ ```
+ | Summary Element | Contains | Summary Flows | Maps to Detailed Flows |
+ ```
+
+---
+
+## Naming Conventions
+
+| Item | Convention | Example |
+|------|-----------|---------|
+| Element ID | PascalCase, no spaces | `WebApi`, `UserDb` |
+| Display Name | Human readable in quotes | `"Web API"`, `"User Database"` |
+| Flow Label | Protocol or action in quotes | `"HTTPS"`, `"SQL"`, `"gRPC"` |
+| Flow ID | Unique short identifier | `DF01`, `DF02` |
+| Boundary ID | PascalCase | `InternalNetwork`, `PublicDMZ` |
+
+**CRITICAL: Always quote ALL text in Mermaid diagrams:**
+- Element labels: `["Name"]`, `(("Name"))`, `[("Name")]`
+- Flow labels: `-->|"Label"|`
+- Subgraph titles: `subgraph ID["Title"]`
+
+---
+
+## Quick Reference - Shapes
+
+```
+External Interactor: ["Name"] → Rectangle
+Process: (("Name")) → Circle (double parentheses)
+Data Store: [("Name")] → Cylinder
+```
+
+## Quick Reference - Flows
+
+```
+Unidirectional: A -->|"Label"| B
+Bidirectional: A <-->|"Label"| B
+```
+
+## Quick Reference - Boundaries
+
+```
+subgraph BoundaryId["Display Name"]
+ %% elements inside
+end
+style BoundaryId fill:none,stroke:#e31a1c,stroke-width:3px,stroke-dasharray: 5 5
+```
+
+---
+
+## STRIDE Analysis — Sidecar Implications
+
+Although sidecars are NOT separate diagram nodes, they DO appear in STRIDE analysis:
+
+- Sidecars with distinct threat surfaces (e.g., MISE auth bypass, Dapr mTLS) get their own `## Component` section in `2-stride-analysis.md`
+- The component heading notes which pods they are co-located in
+- Threats related to intra-pod communication (localhost bypass, shared namespace) go under the **primary container's** component section
+- **Pod Co-location** line in STRIDE template: list co-located sidecars (e.g., "MISE Sidecar, Dapr Sidecar")
diff --git a/skills/threat-model-analyst/references/incremental-orchestrator.md b/skills/threat-model-analyst/references/incremental-orchestrator.md
new file mode 100644
index 00000000..d11082b4
--- /dev/null
+++ b/skills/threat-model-analyst/references/incremental-orchestrator.md
@@ -0,0 +1,708 @@
+# Incremental Orchestrator — Threat Model Update Workflow
+
+This file contains the complete orchestration logic for performing an **incremental threat model analysis** — generating a new threat model report that builds on an existing baseline report. It is invoked when the user requests an updated analysis and a prior `threat-model-*` folder exists.
+
+**Key difference from single analysis (`orchestrator.md`):** Instead of discovering components from scratch, this workflow inherits the old report's component inventory, IDs, and conventions. It then verifies each item against the current code and discovers new items.
+
+## ⚡ Context Budget — Read Files Selectively
+
+**Phase 1 (setup + change detection):** Read this file (`incremental-orchestrator.md`) only. The old `threat-inventory.json` provides the structural skeleton — no need to read other skill files yet.
+**Phase 2 (report generation):** Read `orchestrator.md` (for mandatory rules 1–34), `output-formats.md`, `diagram-conventions.md` — plus the relevant skeleton from `skeletons/` before writing each file. See the incremental-specific rules below.
+**Phase 3 (verification):** Delegate to a sub-agent with `verification-checklist.md` (all 9 phases, including Phase 8 for comparison HTML).
+
+---
+
+## When to Use This Workflow
+
+Use incremental analysis when ALL of these conditions are met:
+1. The user's request involves updating, re-running, or refreshing a threat model
+2. A prior `threat-model-*` folder exists in the repository with a valid `threat-inventory.json`
+3. The user provides or implies both: a baseline report folder AND a target commit (defaults to HEAD)
+
+**Trigger examples:**
+- "Update the threat model using threat-model-20260309-174425 as the baseline"
+- "Run an incremental threat model analysis against the previous report"
+- "What changed security-wise since the last threat model?"
+- "Refresh the threat model for the latest commit"
+
+**NOT this workflow:**
+- First-time analysis (no baseline) → use `orchestrator.md`
+- "Analyze the security of this repo" with no mention of a prior report → use `orchestrator.md`
+
+---
+
+## Inputs
+
+| Input | Source | Required? |
+|-------|--------|-----------|
+| Baseline report folder | Path to `threat-model-*` directory | Yes |
+| Baseline `threat-inventory.json` | `{baseline_folder}/threat-inventory.json` | Yes |
+| Baseline commit SHA | From `{baseline_folder}/0-assessment.md` Report Metadata | Yes |
+| Target commit | User-provided SHA or defaults to HEAD | Yes (default: HEAD) |
+
+---
+
+**⛔ Sub-Agent Governance applies to ALL phases.** See `orchestrator.md` Sub-Agent Governance section. Sub-agents are READ-ONLY helpers — they NEVER call `create_file` for report files.
+
+## Phase 0: Setup & Validation
+
+1. **Record start time:**
+ ```
+ Get-Date -Format "yyyy-MM-dd HH:mm:ss" -AsUTC
+ ```
+ Store as `START_TIME`.
+
+2. **Gather git info:**
+ ```
+ git remote get-url origin
+ git branch --show-current
+ git rev-parse --short HEAD
+ hostname
+ ```
+
+3. **Validate inputs:**
+ - Confirm baseline folder exists: `Test-Path {baseline_folder}/threat-inventory.json`
+ - Read baseline commit SHA from `0-assessment.md`: search for `| Git Commit |` row
+ - Confirm target commit is resolvable: `git rev-parse {target_sha}`
+ - **Get commit dates:** `git log -1 --format="%ai" {baseline_sha}` and `git log -1 --format="%ai" {target_sha}` — NOT today's date
+ - **Get code change counts** (for HTML metrics bar):
+ ```
+ git rev-list --count {baseline_sha}..{target_sha}
+ git log --oneline --merges --grep="Merged PR" {baseline_sha}..{target_sha} | wc -l
+ ```
+ Store as `COMMIT_COUNT` and `PR_COUNT`.
+
+4. **Baseline code access — reuse or create worktree:**
+ ```
+ # Check for existing worktree
+ git worktree list
+
+ # If a worktree for baseline_sha exists → reuse it
+ # Verify: git -C {worktree_path} rev-parse HEAD
+
+ # If not → create one:
+ git worktree add ../baseline-{baseline_sha_short} {baseline_sha}
+ ```
+ Store the worktree path as `BASELINE_WORKTREE` for old-code verification in later phases.
+
+5. **Create output folder:**
+ ```
+ threat-model-{YYYYMMDD-HHmmss}/
+ ```
+
+---
+
+## Phase 1: Load Old Report Skeleton
+
+Read the baseline `threat-inventory.json` and extract the structural skeleton:
+
+```
+From threat-inventory.json, load:
+ - components[] → all component IDs, types, boundaries, source_files, fingerprints
+ - flows[] → all flow IDs, from/to, protocols
+ - boundaries[] → all boundary IDs, contains lists
+ - threats[] → all threat IDs, component mappings, stride categories, tiers
+ - findings[] → all finding IDs, titles, severities, CWEs, component mappings
+ - metrics → totals for validation
+
+Store as the "inherited inventory" — the structural foundation.
+```
+
+**Do NOT read the full prose** from the old report's markdown files yet. Only load structured data. Read old report prose on-demand when:
+- Verifying if a specific code pattern was previously analyzed
+- Resolving ambiguity about a component's role or classification
+- Historical context needed for a finding status decision
+
+---
+
+## Phase 2: Per-Component Change Detection
+
+For each component in the inherited inventory, determine its change status:
+
+```
+For EACH component in inherited inventory:
+
+ 1. Check source_files existence at target commit:
+ git ls-tree {target_sha} -- {each source_file}
+
+ 2. If ALL source files missing:
+ → change_status = "removed"
+ → Mark all linked threats as "removed_with_component"
+ → Mark all linked findings as "removed_with_component"
+
+ 3. If source files exist, check for changes:
+ git diff --stat {baseline_sha} {target_sha} -- {source_files}
+
+ If NO changes → change_status = "unchanged"
+
+ If changes exist, check if security-relevant:
+ Read the diff: git diff {baseline_sha} {target_sha} -- {source_files}
+ Look for changes in:
+ - Auth/credential patterns (tokens, passwords, certificates)
+ - Network/API surface (new endpoints, changed listeners, port bindings)
+ - Input validation (sanitization, parsing, deserialization)
+ - Command execution patterns (shell exec, process spawn)
+ - Config values (TLS settings, CORS, security headers)
+ - Dependencies (new packages, version changes)
+
+ If security-relevant → change_status = "modified"
+ If cosmetic only (whitespace, comments, logging, docs) → change_status = "unchanged"
+
+ 4. If files moved or renamed:
+ git log --follow --diff-filter=R {baseline_sha}..{target_sha} -- {source_files}
+ → change_status = "restructured"
+ → Update source_file references to new paths
+```
+
+**Record the classification for every component** — this drives all downstream decisions.
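
The decision tree above can be sketched as a small git-driven check. This is a hedged illustration, not part of the workflow itself: the throwaway repo, file paths, and `classify` helper are invented for demonstration; real usage substitutes the baseline/target SHAs and each component's `source_files`, and the security-relevant vs. cosmetic triage of a "modified" diff remains a manual step.

```shell
# Throwaway demo repo so the sketch is self-contained (paths are invented).
set -e
repo=$(mktemp -d); cd "$repo"
git init -q
git config user.email demo@example.com
git config user.name demo
mkdir src
echo 'listen_port = 8080' > src/server.cfg
echo 'keep' > src/stable.cfg
echo 'old' > src/legacy.cfg
git add -A; git commit -qm baseline
baseline=$(git rev-parse HEAD)
echo 'listen_port = 443' > src/server.cfg   # changed between commits
git rm -q src/legacy.cfg                    # removed between commits
git add -A; git commit -qm target
target=$(git rev-parse HEAD)

# classify <path>: removed | unchanged | modified.
# (Whether a "modified" diff is security-relevant or cosmetic is the
# manual triage step described in the workflow above.)
classify() {
  if [ -z "$(git ls-tree -r --name-only "$target" -- "$1")" ]; then
    echo removed
  elif git diff --quiet "$baseline" "$target" -- "$1"; then
    echo unchanged
  else
    echo modified
  fi
}

classify src/server.cfg   # prints: modified
classify src/stable.cfg   # prints: unchanged
classify src/legacy.cfg   # prints: removed
```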
+
+---
+
+## Phase 3: Scan for New Components
+
+```
+1. Enumerate source directories/files at {target_sha} that are NOT referenced
+ by any existing component's source_files or source_directories.
+ Focus on: new top-level directories, new *Service.cs/*Agent.cs/*Server.cs classes,
+ new Helm deployments, new API controllers.
+
+2. Apply the same component discovery rules from orchestrator.md:
+ - Class-anchored naming (PascalCase from actual class names)
+ - Component eligibility criteria (crosses trust boundary or handles security data)
+ - Same naming procedure (primary class → script → config → directory → technology)
+
+3. For each candidate new component:
+ - Verify it didn't exist at baseline: git ls-tree {baseline_sha} -- {path}
+ - If it existed at baseline → this is a "missed component" from the old analysis
+ → Add to Needs Verification section with note: "Component existed at baseline
+ but was not in the previous analysis. May indicate an analysis gap."
+ - If genuinely new (files didn't exist at baseline):
+ → change_status = "new"
+ → Assign a new component ID following the same PascalCase naming rules
+ → Full STRIDE analysis will be performed in Phase 4
+```
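
Steps 1 and 3 above can be approximated with two git queries — `git diff --diff-filter=A` to enumerate candidate new paths, and `git ls-tree` at the baseline to confirm a candidate is genuinely new. A hedged sketch with an invented demo repo (component names like `NewAgentService.cs` are illustrative):

```shell
# Throwaway demo repo so the sketch is self-contained.
set -e
repo=$(mktemp -d); cd "$repo"
git init -q
git config user.email demo@example.com
git config user.name demo
echo old > ExistingService.cs
git add -A; git commit -qm baseline
baseline=$(git rev-parse HEAD)
mkdir NewAgent
echo new > NewAgent/NewAgentService.cs
git add -A; git commit -qm target
target=$(git rev-parse HEAD)

# Paths added since baseline -> candidate new components
git diff --diff-filter=A --name-only "$baseline" "$target"

# Baseline-existence check: empty output means the file is genuinely new;
# non-empty output means it existed at baseline (a "missed component").
git ls-tree "$baseline" -- NewAgent/NewAgentService.cs
```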
+
+---
+
+## Phase 4: Generate Report Files
+
+Now generate all report files. **Read the relevant skill files before starting:**
+- `orchestrator.md` — mandatory rules 1–34 apply to all report files
+- `output-formats.md` — templates and format rules
+- `diagram-conventions.md` — diagram colors and styles
+- **Before writing EACH file, read the corresponding skeleton from `skeletons/skeleton-*.md`** — copy VERBATIM and fill `[FILL]` placeholders
+
+**⛔ SUB-AGENT GOVERNANCE (MANDATORY — prevents the dual-folder bug):** The parent agent owns ALL file creation. Sub-agents are READ-ONLY helpers that search code, gather context, and run verification — they NEVER call `create_file` for report files. See the full Sub-Agent Governance rules in `orchestrator.md`. The ONLY exception is `threat-inventory.json` delegation for large repos — and even then, the sub-agent prompt must include the exact output file path and explicit instruction to write ONLY that one file.
+
+**⛔ CRITICAL: The incremental report is a STANDALONE report.** Someone reading it without the old report must understand the complete security posture. Status annotations ([STILL PRESENT], [FIXED], [NEW CODE], etc.) are additions on top of complete content — not replacements for it.
+
+### 4a. 0.1-architecture.md
+
+- **Read `skeletons/skeleton-architecture.md` first** — use as structural template
+- Copy the old report's component structure as your starting template
+- **Unchanged components:** Regenerate description using the current code (not copy-paste from old report). Same ID, same conventions.
+- **Modified components:** Update description to reflect code changes. Add annotation: `[MODIFIED — security-relevant changes detected]`
+- **New components:** Add with annotation: `[NEW]`
+- **Removed components:** Add with annotation: `[REMOVED]` and brief note
+- Tech stack, deployment model: update if changed, otherwise carry forward
+
+ ⛔ **DEPLOYMENT CLASSIFICATION IS MANDATORY (even in incremental mode):**
+ The `0.1-architecture.md` MUST contain:
+ 1. `**Deployment Classification:** \`[VALUE]\`` line (e.g., `K8S_SERVICE`, `LOCALHOST_DESKTOP`)
+ 2. `### Component Exposure Table` with columns: Component, Listens On, Auth Required, Reachability, Min Prerequisite, Derived Tier
+ If the baseline had these, carry them forward and update for new/modified components.
+ If the baseline did NOT have these, **derive them from code NOW** — they are required for all subsequent steps.
+ **DO NOT proceed to Step 4b without these two elements in place.**
+
+- Scenarios: keep old scenarios, add new ones for new functionality
+- All standard `0.1-architecture.md` rules from `output-formats.md` apply
+
+### 4b. 1.1-threatmodel.mmd (DFD)
+
+- **Read `skeletons/skeleton-dfd.md` and `skeletons/skeleton-summary-dfd.md` first**
+- Start from the old DFD's logical layout
+- **Same node IDs** for carried-forward components (critical for ID stability)
+- **New components:** Add with distinctive styling — use `classDef newComponent fill:#d4edda,stroke:#28a745,stroke-width:3px`
+- **Removed components:** Show as dashed with gray fill — use `classDef removedComponent fill:#e9ecef,stroke:#6c757d,stroke-width:1px,stroke-dasharray:5`
+- **Same flow IDs** for unchanged flows
+- **New flows:** New IDs continuing the sequence
+- All standard DFD rules from `diagram-conventions.md` apply (flowchart LR, color palette, etc.)
+
+ ⛔ **POST-DFD GATE:** After creating `1.1-threatmodel.mmd`, count elements and boundaries. If elements > 15 OR boundaries > 4 → create `1.2-threatmodel-summary.mmd` using `skeleton-summary-dfd.md` NOW. Do NOT proceed to Step 4c until the decision is made.
+
+### 4c. 1-threatmodel.md
+
+- **Read `skeletons/skeleton-threatmodel.md` first** — use table structure
+- Element table: all old elements + new elements, with an added `Status` column
+ - Values: `Unchanged`, `Modified`, `New`, `Removed`, `Restructured`
+- Flow table: all old flows + new flows, with `Status` column
+- Boundary table: inherited boundaries + any new ones
+- If `1.2-threatmodel-summary.mmd` was generated, include `## Summary View` section with the summary diagram and mapping table
+- All standard table rules from `output-formats.md` apply
+
+### 4d. 2-stride-analysis.md
+
+- **Read `skeletons/skeleton-stride-analysis.md` first** — use Summary table and per-component structure
+
+**⛔ CRITICAL REMINDERS FOR INCREMENTAL STRIDE (these rules from `orchestrator.md` apply identically here):**
+1. **The "A" in STRIDE-A is ALWAYS "Abuse"** (business logic abuse, workflow manipulation, feature misuse). NEVER use "Authorization" as the STRIDE-A category name. This applies to threat ID suffixes (T01.A), N/A justification labels, and all prose. Authorization issues fall under Elevation of Privilege (E), not the A category.
+2. **The `## Summary` table MUST appear at the TOP of the file**, immediately after `## Exploitability Tiers`, BEFORE any individual component sections. Use this EXACT structure at the top:
+
+```markdown
+# STRIDE-A Threat Analysis
+
+## Exploitability Tiers
+| Tier | Label | Prerequisites | Assignment Rule |
+|------|-------|---------------|----------------|
+| **Tier 1** | Direct Exposure | `None` | Exploitable by unauthenticated external attacker with NO prior access. |
+| **Tier 2** | Conditional Risk | Single prerequisite | Requires exactly ONE form of access. |
+| **Tier 3** | Defense-in-Depth | Multiple prerequisites or infrastructure access | Requires significant prior breach or multiple combined prerequisites. |
+
+## Summary
+| Component | Link | S | T | R | I | D | E | A | Total | T1 | T2 | T3 | Risk |
+|-----------|------|---|---|---|---|---|---|---|-------|----|----|----|------|
+
+
+---
+## [First Component Name]
+```
+
+3. **STRIDE categories may produce 0, 1, 2, 3+ threats** per component. Do NOT cap at 1 threat per category. Components with rich security surfaces should typically have 2-4 threats per relevant category. If every STRIDE cell in the Summary table is 0 or 1, the analysis is too shallow — go back and identify additional threat vectors. The Summary table columns reflect actual threat counts.
+4. **⛔ PREREQUISITE FLOOR CHECK (per threat):** Before assigning a prerequisite to any threat, look up the component's `Min Prerequisite` and `Derived Tier` in the Component Exposure Table (`0.1-architecture.md`). The threat's prerequisite MUST be ≥ the component's floor. The threat's tier MUST be ≥ the component's derived tier. Use the canonical prerequisite→tier mapping from `analysis-principles.md`. Prerequisites MUST use only canonical values: `None`, `Authenticated User`, `Privileged User`, `Internal Network`, `Local Process Access`, `Host/OS Access`, `Admin Credentials`, `Physical Access`, `{Component} Compromise`. ⛔ `Application Access` and `Host Access` are FORBIDDEN.
+
+**⛔ HEADING ANCHOR RULE (applies to ALL output files):** ALL `##` and `###` headings in every output file must be PLAIN text — NO status tags (`[Existing]`, `[Fixed]`, `[Partial]`, `[New]`, `[Removed]`, or any old-style tags) in heading text. Tags break markdown anchor links and pollute table-of-contents. Place status annotations on the FIRST LINE of the section/finding body instead:
+- ✅ `## KmsPluginProvider` with first line `> **[New]** Component added in this release.`
+- ✅ `### FIND-01: Missing Auth Check` with first line `> **[Existing]**`
+- ❌ `## KmsPluginProvider [New]` (breaks `#kmspluginprovider` anchor)
+- ❌ `### FIND-01: Missing Auth Check [Existing]` (pollutes heading)
+
+This rule applies to: `0.1-architecture.md`, `2-stride-analysis.md`, `3-findings.md`, `1-threatmodel.md`.
+
+For each component, the STRIDE analysis approach depends on its change status:
+
+| Component Status | STRIDE Approach |
+|-----------------|-----------------|
+| **Unchanged** | Carry forward all threat entries from old report with `[STILL PRESENT]` annotation. Re-verify each threat's mitigation status against current code. |
+| **Modified** | Re-analyze the component with access to the diff. For each old threat: determine if `still_present`, `fixed`, `mitigated`, or `modified`. Discover new threats from the code changes → classify as `new_in_modified`. |
+| **New** | Full fresh STRIDE-A analysis (same as single-analysis mode). All threats classified as `new_code`. |
+| **Removed** | Section header with note: "Component removed — all threats resolved with `removed_with_component` status." |
+
+**Threat ID continuity:**
+- Old threats keep their original IDs (e.g., T01.S, T02.T)
+- New threats continue the sequence from the old report's highest threat number
+- NEVER reassign or reuse an old threat ID
+
+**N/A categories (from §3.7 of PRD):**
+- Each component gets all 7 STRIDE-A categories addressed
+- Non-applicable categories: `N/A — {1-sentence justification}`
+- N/A entries do NOT count toward threat totals
+
+**Status annotation format in STRIDE tables:**
+Add a `Change` column to each threat table row with one of:
+- `Existing` — threat exists in current code, same as before (includes threats with minor detail changes)
+- `Fixed` — vulnerability was remediated (cite the specific code change)
+- `New` — threat from a new component, code change, or previously unidentified
+- `Removed` — component was removed
+
+
+
+⛔ POST-STEP CHECK: After writing the Change column for ALL threats, verify:
+ 1. Every threat row has exactly one of: Existing, Fixed, New, Removed
+ 2. No old-style tags: Still Present, New (Code), New (Modified), Previously Unidentified
+ 3. Fixed threats cite the specific code change
+
+### 4e. 3-findings.md
+
+⛔ **BEFORE WRITING ANY FINDING — Re-read `skeletons/skeleton-findings.md` NOW.**
+The skeleton defines the EXACT structure for each finding block, including the mandatory `**Prerequisite basis:**` line in the `#### Evidence` section. Every finding — whether [Existing], [New], [Fixed], or [Partial] — MUST follow this skeleton structure.
+
+⛔ **DEPLOYMENT CONTEXT GATE (FAIL-CLOSED) — applies to ALL findings (new and carried-forward):**
+Read `0.1-architecture.md` Deployment Classification and Component Exposure Table.
+If classification is `LOCALHOST_DESKTOP` or `LOCALHOST_SERVICE`:
+- ZERO findings may have `Exploitation Prerequisites` = `None` → fix to `Local Process Access` or `Host/OS Access`
+- ZERO findings may be in `## Tier 1` → downgrade to T2/T3
+- ZERO CVSS vectors may use `AV:N` unless component has `Reachability = External`
+For ALL classifications:
+- Each finding's prerequisite MUST be ≥ its component's `Min Prerequisite` from the exposure table
+- Each finding's tier MUST be ≥ its component's `Derived Tier`
+- **EVERY finding's `#### Evidence` section MUST start with a `**Prerequisite basis:**` line** citing the specific code/config that determines the prerequisite (e.g., "ClusterIP service, no Ingress — Internal Only per Exposure Table"). This applies to [Existing] findings too — re-derive from current code.
+- Prerequisites MUST use only canonical values. ⛔ `Application Access` and `Host Access` are FORBIDDEN.
+
+For each old finding, verify against the current code:
+
+| Situation | change_status | Action |
+|-----------|---------------|--------|
+| Code unchanged, vulnerability intact | `still_present` | Carry forward with `> **[Existing]**` on first line of body |
+| Code changed to fix the vulnerability | `fixed` | Mark with `> **[Fixed]**`, cite the specific code change |
+| Code changed partially | `partially_mitigated` | Mark with `> **[Partial]**`, explain what changed and what remains |
+| Component removed entirely | `removed_with_component` | Mark with `> **[Removed]**` |
+
+For new findings:
+
+| Situation | change_status | Label |
+|-----------|---------------|-------|
+| New component, new vulnerability | `new_code` | `> **[New]**` |
+| Existing component, vulnerability introduced by code change | `new_in_modified` | `> **[New]**` — cite the specific change |
+| Existing component, vulnerability was in old code but missed | `previously_unidentified` | `> **[New]**` — verify against baseline worktree |
+
+
+
+**Finding ID continuity:**
+- Old findings keep their original IDs (FIND-01 through FIND-N)
+- New findings continue the sequence: FIND-N+1, FIND-N+2, ...
+- No gaps, no duplicates
+- Fixed findings are retained but annotated — they are NOT removed from the report
+- **Document order**: Findings are sorted by Tier (1→2→3), then by severity (Critical→Important→Moderate→Low), then by CVSS descending — same as standalone analysis. Because old IDs are preserved, the ID numbers may NOT be numerically ascending in the document. This is acceptable in incremental mode — ID stability for cross-report tracing takes precedence over sequential ordering. The `### FIND-XX:` headings will appear in tier/severity order, not ID order.
+
+**Previously-unidentified verification procedure:**
+1. Identify the finding's component and evidence files
+2. Read the same files at the baseline commit: `cat {BASELINE_WORKTREE}/{file_path}`
+3. If the vulnerability pattern exists in the old code → `previously_unidentified`
+4. If the vulnerability pattern does NOT exist in the old code → `new_in_modified`
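
When the baseline worktree is unavailable, `git show {baseline_sha}:{path}` reads the baseline blob directly and supports the same decision. A hedged sketch (the demo repo, `runner.py`, and the `os.system` pattern are invented for illustration):

```shell
# Throwaway demo repo so the sketch is self-contained.
set -e
repo=$(mktemp -d); cd "$repo"
git init -q
git config user.email demo@example.com
git config user.name demo
printf 'import os\nos.system(cmd)\n' > runner.py   # pattern present at baseline
git add -A; git commit -qm baseline
baseline=$(git rev-parse HEAD)

# Pattern exists in the baseline blob -> previously_unidentified;
# otherwise the change introduced it -> new_in_modified.
if git show "$baseline:runner.py" | grep -q 'os\.system'; then
  echo previously_unidentified
else
  echo new_in_modified
fi
```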
+
+### 4f. threat-inventory.json
+
+- **Read `skeletons/skeleton-inventory.md` first** — use exact field names and schema structure
+
+Same schema as single analysis, with additional fields:
+
+```json
+{
+ "schema_version": "1.1",
+ "incremental": true,
+ "baseline_report": "threat-model-20260309-174425",
+ "baseline_commit": "2dd84ab",
+ "target_commit": "abc1234",
+
+ "components": [
+ {
+ "id": "McpHost",
+ "change_status": "unchanged",
+ ...existing fields...
+ }
+ ],
+
+ "threats": [
+ {
+ "id": "T01.S",
+ "change_status": "still_present",
+ ...existing fields...
+ }
+ ],
+
+ "findings": [
+ {
+ "id": "FIND-01",
+ "change_status": "still_present",
+ ...existing fields...
+ }
+ ],
+
+ "metrics": {
+ ...existing fields...,
+ "status_summary": {
+ "components": {
+ "unchanged": 15,
+ "modified": 2,
+ "new": 1,
+ "removed": 1,
+ "restructured": 0
+ },
+ "threats": {
+ "still_present": 80,
+ "fixed": 5,
+ "mitigated": 3,
+ "new_code": 10,
+ "new_in_modified": 4,
+ "previously_unidentified": 2,
+ "removed_with_component": 8
+ },
+ "findings": {
+ "still_present": 12,
+ "fixed": 2,
+ "partially_mitigated": 1,
+ "new_code": 3,
+ "new_in_modified": 2,
+ "previously_unidentified": 1,
+ "removed_with_component": 1
+ }
+ }
+ }
+}
+```
+
+### 4g. 0-assessment.md
+
+- **Read `skeletons/skeleton-assessment.md` first** — use section order and table structures
+
+Standard assessment sections (all 7 mandatory) plus incremental-specific sections:
+
+**Standard sections (same as single analysis):**
+1. Report Files
+2. Executive Summary (with `> **Note on threat counts:**` blockquote)
+3. Action Summary (with `### Quick Wins`)
+4. Analysis Context & Assumptions (with `### Needs Verification` and `### Finding Overrides`)
+5. References Consulted
+6. Report Metadata
+7. Classification Reference (static table copied from skeleton)
+
+**Additional incremental sections (insert between Action Summary and Analysis Context):**
+
+```markdown
+## Change Summary
+
+### Component Changes
+| Status | Count | Components |
+|--------|-------|------------|
+| Unchanged | X | ComponentA, ComponentB, ... |
+| Modified | Y | ComponentC, ... |
+| New | Z | ComponentD, ... |
+| Removed | W | ComponentE, ... |
+
+### Threat Status
+| Status | Count |
+|--------|-------|
+| Still Present | X |
+| Fixed | Y |
+| New (Code) | Z |
+| New (Modified) | M |
+| Previously Unidentified | W |
+| Removed with Component | V |
+
+### Finding Status
+| Status | Count |
+|--------|-------|
+| Still Present | X |
+| Fixed | Y |
+| Partially Mitigated | P |
+| New (Code) | Z |
+| New (Modified) | M |
+| Previously Unidentified | W |
+| Removed with Component | V |
+
+### Risk Direction
+[Improving / Worsening / Stable] — [1-2 sentence justification based on status distribution]
+
+---
+
+## Previously Unidentified Issues
+
+These vulnerabilities were present in the baseline code at commit `{baseline_sha}` but were not identified in the prior analysis:
+
+| Finding | Title | Component | Evidence |
+|---------|-------|-----------|----------|
+| FIND-XX | [title] | [component] | Baseline code at `{file}:{line}` |
+```
+
+**Report Metadata additions:**
+```markdown
+| Baseline Report | `{baseline_folder}` |
+| Baseline Commit | `{baseline_sha}` (`{baseline_commit_date}` — run `git log -1 --format="%cs" {baseline_sha}`) |
+| Target Commit | `{target_sha}` (`{target_commit_date}` — run `git log -1 --format="%cs" {target_sha}`) |
+| Baseline Worktree | `{worktree_path}` |
+| Analysis Mode | `Incremental` |
+```
+
+### 4h. incremental-comparison.html
+
+- **Read `skeletons/skeleton-incremental-html.md` first** — use 8-section structure and CSS variables
+
+Generate a self-contained HTML file that visualizes the comparison. All data comes from the `change_status` fields already computed in `threat-inventory.json`.
+
+**Structure:**
+
+```html
+
+
+| Component | S | T | R | I | D | E | A | Total | | T1 | T2 | T3 |
+|---|---|---|---|---|---|---|---|---|---|---|---|---|