mirror of
https://github.com/github/awesome-copilot.git
synced 2026-04-11 02:35:55 +00:00
Add azure-architecture-autopilot skill 🤖🤖🤖 (#1158)
* Add azure-architecture-autopilot skill

  E2E Azure infrastructure automation skill:
  - Natural language → Architecture diagram → Bicep → Deploy
  - 70+ service types with 605+ official Azure icons
  - Interactive HTML diagrams (drag, zoom, click, PNG export)
  - Scans existing resources or designs new architecture
  - Modular Bicep with RBAC, Private Endpoints, DNS
  - Multi-language support (auto-detects user language)
  - Zero dependencies (diagram engine embedded)

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix generator.py import for flat scripts/ structure + sync README

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: whoniiii <whoniiii@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
30
skills/azure-architecture-autopilot/.gitignore
vendored
Normal file
@@ -0,0 +1,30 @@
# Temporary files
*.pyc
__pycache__/
*.egg-info/
.DS_Store
Thumbs.db

# Test/eval outputs (not included in repository)
evals/outputs/
workspace/

# Generated artifacts (not included in repository)
output/
*.png
*.svg
!assets/*.png
!assets/*.svg

# Sample diagrams (contain hardcoded example values — prevent model context contamination)
sample_*.html

# Environment configuration
.env
*.local

# Package files (build artifacts)
*.skill

# Development-only folder (not included in public distribution)
dev/
188
skills/azure-architecture-autopilot/README.md
Normal file
@@ -0,0 +1,188 @@
<h1 align="center">Azure Architecture Autopilot</h1>

<p align="center">
  <strong>Design → Diagram → Bicep → Deploy — all from natural language</strong>
</p>

<p align="center">
  <img src="https://img.shields.io/badge/GitHub_Copilot-Skill-8957e5?logo=github" alt="Copilot Skill">
  <img src="https://img.shields.io/badge/Azure-All_Services-0078D4?logo=microsoftazure&logoColor=white" alt="Azure">
  <img src="https://img.shields.io/badge/Bicep-IaC-ff6f00" alt="Bicep">
  <img src="https://img.shields.io/badge/70+-Service_Types-00bcf2" alt="Service Types">
  <img src="https://img.shields.io/badge/License-MIT-green" alt="License">
</p>

<p align="center">
  <b>Azure Architecture Autopilot</b> designs Azure infrastructure from natural language,<br>
  generates interactive diagrams, produces modular Bicep templates, and deploys — all through conversation.<br>
  It also scans existing resources, visualizes them as architecture diagrams, and refines them on the fly.
</p>

<!-- Hero image: interactive architecture diagram with 605+ Azure icons -->
<p align="center">
  <img src="assets/06-architecture-diagram.png" width="100%" alt="Interactive Azure architecture diagram with 605+ official icons">
</p>

<p align="center">
  <em>↑ Auto-generated interactive diagram — drag, zoom, click for details, export to PNG</em>
</p>

<p align="center">
  <img src="assets/08-deployment-succeeded.png" width="80%" alt="Deployment succeeded">

  <img src="assets/07-azure-portal-resources.png" width="80%" alt="Azure Portal — deployed resources">
</p>

<p align="center">
  <em>↑ Real Azure resources deployed from the generated Bicep templates</em>
</p>

<p align="center">
  <a href="#-how-it-works">How It Works</a> •
  <a href="#-features">Features</a> •
  <a href="#%EF%B8%8F-prerequisites">Prerequisites</a> •
  <a href="#-usage">Usage</a> •
  <a href="#-architecture">Architecture</a>
</p>

---

## 🔄 How It Works

```
Path A: "Build me a RAG chatbot on Azure"
         ↓
  🎨 Design → 🔧 Bicep → ✅ Review → 🚀 Deploy

Path B: "Analyze my current Azure resources"
         ↓
  🔍 Scan → 🎨 Modify → 🔧 Bicep → ✅ Review → 🚀 Deploy
```

| Phase | Role | What Happens |
|:-----:|------|--------------|
| **0** | 🔍 Scanner | Scans existing Azure resources via `az` CLI → auto-generates architecture diagram |
| **1** | 🎨 Advisor | Interactive design through conversation — asks targeted questions with smart defaults |
| **2** | 🔧 Generator | Produces modular Bicep: `main.bicep` + `modules/*.bicep` + `.bicepparam` |
| **3** | ✅ Reviewer | Compiles with `az bicep build`, checks security & best practices |
| **4** | 🚀 Deployer | `validate` → `what-if` → preview diagram → `create` (5-step mandatory sequence) |
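The two paths in the table can be sketched as ordered phase lists. This is only an illustration: the `Phase` enum and the `phases_for` helper are hypothetical names and are not part of the skill's bundled scripts.

```python
from enum import IntEnum


class Phase(IntEnum):
    """Phase numbers as listed in the table above (member names are illustrative)."""
    SCANNER = 0
    ADVISOR = 1
    GENERATOR = 2
    REVIEWER = 3
    DEPLOYER = 4


def phases_for(path: str) -> list:
    """Return the ordered phases for Path "A" (new build) or Path "B" (scan first)."""
    build = [Phase.ADVISOR, Phase.GENERATOR, Phase.REVIEWER, Phase.DEPLOYER]
    if path == "A":
        return build
    if path == "B":
        # Path B prepends the scan; the modify step is handled in the Advisor phase.
        return [Phase.SCANNER, *build]
    raise ValueError(f"unknown path: {path!r}")
```

Calling `phases_for("B")` yields the same list as Path A with the Scanner phase in front, which mirrors the two flows drawn above.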

---

## ✨ Features

| | Feature | Description |
|---|---------|-------------|
| 📦 | **Zero Dependencies** | 605+ Azure icons bundled — no `pip install`, works offline |
| 🎨 | **Interactive Diagrams** | Drag-and-drop HTML with zoom, click details, PNG export |
| 🔍 | **Resource Scanning** | Analyze existing Azure infra → auto-generate architecture diagrams |
| 💬 | **Natural Language** | *"It's slow"*, *"reduce costs"*, *"add security"* → guided resolution |
| 📊 | **Live Verification** | API versions, SKUs, model availability fetched from MS Docs in real time |
| 🔒 | **Secure by Default** | Private Endpoints, RBAC, managed identity — no secrets in files |
| ⚡ | **Parallel Preload** | Next-phase info loaded while waiting for user input |
| 🌐 | **Multi-Language** | Auto-detects user language — responds in English, Korean, or any language |

---

## ⚙️ Prerequisites

| Tool | Required | Install |
|------|:--------:|---------|
| **GitHub Copilot CLI** | ✅ | [Install guide](https://docs.github.com/copilot/concepts/agents/about-copilot-cli) |
| **Azure CLI** | ✅ | `winget install Microsoft.AzureCLI` / `brew install azure-cli` |
| **Python 3.10+** | ✅ | `winget install Python.Python.3.12` / `brew install python` |

> No additional packages required — the diagram engine is bundled in `scripts/`.

### 🤖 Recommended Models

| | Models | Notes |
|---|--------|-------|
| 🏆 **Best** | Claude Opus 4.5 / 4.6 | Most reliable for all 5 phases |
| ✅ **Recommended** | Claude Sonnet 4.5 / 4.6 | Best cost-performance balance |
| ⚠️ **Minimum** | Claude Sonnet 4, GPT-5.1+ | May skip steps in complex architectures |

---

## 🚀 Usage

### Path A — Build new infrastructure

```
"Build a RAG chatbot with Foundry and AI Search"
"Create a data platform with Databricks and ADLS Gen2"
"Deploy Fabric + ADF pipeline with private endpoints"
"Set up a microservices architecture with AKS and Cosmos DB"
```

### Path B — Analyze & modify existing resources

```
"Analyze my current Azure infrastructure"
"Scan rg-production and show me the architecture"
"What resources are in my subscription?"
```

Then modify through conversation:
```
"Add 3 VMs to this architecture"
"The Foundry endpoint is slow — what can I do?"
"Reduce costs — downgrade AI Search to Basic"
"Add private endpoints to all services"
```

### 📂 Output Structure

```
<project-name>/
├── 00_arch_current.html          ← Scanned architecture (Path B)
├── 01_arch_diagram_draft.html    ← Design diagram
├── 02_arch_diagram_preview.html  ← What-if preview
├── 03_arch_diagram_result.html   ← Deployment result
├── main.bicep                    ← Orchestration
├── main.bicepparam               ← Parameter values
└── modules/
    └── *.bicep                   ← Per-service modules
```
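As a rough illustration of how the tree above could be checked after a run, the sketch below lists expected artifacts that are missing from a project folder. The file names come from the tree; the `missing_outputs` helper and its `scanned` flag are assumptions made for this example, not part of the skill.

```python
from pathlib import Path

# File names taken from the output tree above.
DIAGRAMS = [
    "00_arch_current.html",          # only produced on Path B (scan)
    "01_arch_diagram_draft.html",
    "02_arch_diagram_preview.html",
    "03_arch_diagram_result.html",
]
BICEP_FILES = ["main.bicep", "main.bicepparam"]


def missing_outputs(project_dir: Path, scanned: bool = False) -> list:
    """List expected artifacts that are absent from project_dir."""
    expected = (DIAGRAMS if scanned else DIAGRAMS[1:]) + BICEP_FILES
    missing = [name for name in expected if not (project_dir / name).is_file()]
    # At least one per-service module is expected under modules/.
    if not any((project_dir / "modules").glob("*.bicep")):
        missing.append("modules/*.bicep")
    return missing
```

An empty return value means every artifact from the tree is present; for a Path B run, pass `scanned=True` so the scanned diagram is checked as well.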

---

## 📁 Architecture

```
SKILL.md                                  ← Lightweight router (~170 lines)
│
├── scripts/                              ← Embedded diagram engine
│   ├── generator.py                      ← Interactive HTML generator
│   ├── icons.py                          ← 605+ Azure icons (Base64 SVG)
│   └── cli.py                            ← CLI entry point
│
└── references/                           ← Phase instructions + patterns
    ├── phase0-scanner.md                 ← 🔍 Resource scanning
    ├── phase1-advisor.md                 ← 🎨 Architecture design
    ├── bicep-generator.md                ← 🔧 Bicep generation
    ├── bicep-reviewer.md                 ← ✅ Code review
    ├── phase4-deployer.md                ← 🚀 Deployment pipeline
    ├── service-gotchas.md                ← Required properties & PE mappings
    ├── azure-common-patterns.md          ← Security & naming patterns
    ├── azure-dynamic-sources.md          ← MS Docs URL registry
    ├── architecture-guidance-sources.md
    └── ai-data.md                        ← AI/Data service domain pack
```

> **Self-contained** — `SKILL.md` is a lightweight router. All phase logic lives in `references/`. The diagram engine is embedded in `scripts/` with no external dependencies.

---

## 📊 Supported Services (70+ types)

All Azure services are supported. AI/Data services have optimized templates; other services are looked up automatically from MS Docs.

**Key types:** `ai_foundry` · `openai` · `ai_search` · `storage` · `adls` · `keyvault` · `fabric` · `databricks` · `aks` · `vm` · `app_service` · `function_app` · `cosmos_db` · `sql_server` · `postgresql` · `mysql` · `synapse` · `adf` · `apim` · `service_bus` · `logic_apps` · `event_grid` · `event_hub` · `container_apps` · `app_insights` · `log_analytics` · `firewall` · `front_door` · `load_balancer` · `expressroute` · `sentinel` · `redis` · `iot_hub` · `digital_twins` · `signalr` · `acr` · `bastion` · `vpn_gateway` · `data_explorer` · `document_intelligence` ...


---

## 📄 License

MIT © [Jeonghoon Lee](https://github.com/whoniiii)
170
skills/azure-architecture-autopilot/SKILL.md
Normal file
@@ -0,0 +1,170 @@
---
name: azure-architecture-autopilot
description: >
  Design Azure infrastructure using natural language, or analyze existing Azure resources
  to auto-generate architecture diagrams, refine them through conversation, and deploy with Bicep.

  When to use this skill:
  - "Create X on Azure", "Set up a RAG architecture" (new design)
  - "Analyze my current Azure infrastructure", "Draw a diagram for rg-xxx" (existing analysis)
  - "Foundry is slow", "I want to reduce costs", "Strengthen security" (natural language modification)
  - Azure resource deployment, Bicep template generation, IaC code generation
  - Microsoft Foundry, AI Search, OpenAI, Fabric, ADLS Gen2, Databricks, and all Azure services
---

# Azure Architecture Builder

A pipeline that designs Azure infrastructure from natural language, or analyzes existing resources to visualize the architecture, and then proceeds through modification and deployment.

The diagram engine is **embedded within the skill** (`scripts/` folder).
No `pip install` needed — it directly uses the bundled Python scripts
to generate interactive HTML diagrams with 605+ official Azure icons.
Ready to use immediately without network access or package installation.

## Automatic User Language Detection

**🚨 Detect the language of the user's first message and provide all subsequent responses in that language. This is the highest-priority principle.**

- If the user writes in Korean → respond in Korean
- If the user writes in English → **respond in English** (ask_user, progress updates, reports, Bicep comments — all in English)
- The instructions and examples in this document are written in English, but **all user-facing output must match the user's language**

**⚠️ Do not copy examples from this document verbatim to the user.**
Use only the structure as reference, and adapt text to the user's language.

## Tool Usage Guide (GHCP Environment)

| Feature | Tool Name | Notes |
|---------|-----------|-------|
| Fetch URL content | `web_fetch` | For MS Docs lookups, etc. |
| Web search | `web_search` | URL discovery |
| Ask user | `ask_user` | `choices` must be a string array |
| Sub-agents | `task` | explore/task/general-purpose |
| Shell command execution | `powershell` | Windows PowerShell |

> Sub-agents (explore/task/general-purpose) cannot use `web_fetch` or `web_search`.
> Fact-checking that requires MS Docs lookups must be performed **directly by the main agent**.

## External Tool Path Discovery

`az`, `python`, `bicep`, etc. are often not on PATH.
**Discover once before starting a Phase and cache the result. Do not re-discover every time.**

> **⚠️ Do not use `Get-Command python`** — risk of Windows Store alias.
> Direct filesystem discovery (`$env:LOCALAPPDATA\Programs\Python`) takes priority.

az CLI path:
```powershell
# Prefer az from PATH; otherwise fall back to the default install directories.
$azCmd = $null
if (Get-Command az -ErrorAction SilentlyContinue) { $azCmd = 'az' }
if (-not $azCmd) {
    $azExe = Get-ChildItem -Path "$env:ProgramFiles\Microsoft SDKs\Azure\CLI2\wbin", "$env:LOCALAPPDATA\Programs\Azure CLI\wbin" -Filter "az.cmd" -ErrorAction SilentlyContinue | Select-Object -First 1 -ExpandProperty FullName
    if ($azExe) { $azCmd = $azExe }
}
```

Python path + embedded diagram engine: refer to the diagram generation section in `references/phase1-advisor.md`.
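The "discover once, then cache" rule can also be sketched in Python for illustration. The `find_az` helper below is a hypothetical name; it mirrors the PowerShell lookup above using only the standard library, and on non-Windows systems it effectively reduces to a PATH lookup because the two environment variables are absent.

```python
import os
import shutil
from functools import lru_cache
from pathlib import Path
from typing import Optional


@lru_cache(maxsize=None)
def find_az() -> Optional[str]:
    """Locate the Azure CLI once and cache the answer for later phases."""
    on_path = shutil.which("az")
    if on_path:
        return on_path
    # Same fallback directories as the PowerShell snippet above (Windows only).
    candidates = []
    if os.environ.get("ProgramFiles"):
        candidates.append(Path(os.environ["ProgramFiles"]) / "Microsoft SDKs" / "Azure" / "CLI2" / "wbin")
    if os.environ.get("LOCALAPPDATA"):
        candidates.append(Path(os.environ["LOCALAPPDATA"]) / "Programs" / "Azure CLI" / "wbin")
    for folder in candidates:
        az_cmd = folder / "az.cmd"
        if az_cmd.is_file():
            return str(az_cmd)
    return None
```

Because of `lru_cache`, the filesystem is only touched on the first call; later calls return the stored path (or `None`) immediately.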

## Progress Updates Required

Use blockquote + emoji + bold format:
```markdown
> **⏳ [Action]** — [Reason]
> **✅ [Complete]** — [Result]
> **⚠️ [Warning]** — [Details]
> **❌ [Failed]** — [Cause]
```
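A small formatter makes the four line shapes above easy to produce consistently. This is a sketch only: `progress_line` and the status keys are hypothetical names chosen for the example, and the dash is written as the escape `\u2014` so that it matches the format block exactly.

```python
# Emoji per status, as listed in the format block above.
STATUS_EMOJI = {
    "action": "⏳",
    "complete": "✅",
    "warning": "⚠️",
    "failed": "❌",
}


def progress_line(status: str, title: str, detail: str) -> str:
    """Render one blockquote progress update in the documented format."""
    emoji = STATUS_EMOJI[status]  # raises KeyError for an unknown status
    return f"> **{emoji} {title}** \u2014 {detail}"
```

For example, `progress_line("action", "Compiling Bicep", "checking for build errors")` produces a line in the first of the four shapes.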

## Parallel Preload Principle

While waiting for user input via `ask_user`, preload information needed for the next step in parallel.

| ask_user Question | Preload Simultaneously |
|---|---|
| Project name / scan scope | Reference files, MS Docs, Python path discovery, **diagram module path verification** |
| Model/SKU selection | MS Docs for next question choices |
| Architecture confirmation | `az account show/list`, `az group list` |
| Subscription selection | `az group list` |
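The preload idea can be illustrated with `asyncio`: start the background jobs first, then wait for the user, so both overlap. Everything here is a stand-in written for the example; the `ask_user` and `preload` coroutines only simulate delays and do not call the real tools.

```python
import asyncio


async def ask_user(question: str) -> str:
    """Stand-in for the real ask_user tool; it only simulates a slow reply."""
    await asyncio.sleep(0.05)
    return "my-project"


async def preload(name: str) -> str:
    """Stand-in for one preload job (reading a reference file, finding Python, ...)."""
    await asyncio.sleep(0.01)
    return f"{name}: ready"


async def ask_with_preload(question: str, jobs: list) -> tuple:
    """Start the preload jobs, then wait for the user; both run concurrently."""
    tasks = [asyncio.create_task(preload(job)) for job in jobs]
    answer = await ask_user(question)
    results = await asyncio.gather(*tasks)
    return answer, list(results)
```

A call such as `asyncio.run(ask_with_preload("Project name?", ["reference files", "python path"]))` returns the answer together with the preloaded results, which are already finished by the time the simulated user replies.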

---

## Path Branching — Automatically Determined by User Request

### Path A: New Design (New Build)

**Trigger**: "create", "set up", "deploy", "build", etc.
```
Phase 1 (references/phase1-advisor.md)  — Interactive architecture design + diagram
    ↓
Phase 2 (references/bicep-generator.md) — Bicep code generation
    ↓
Phase 3 (references/bicep-reviewer.md)  — Code review + compilation verification
    ↓
Phase 4 (references/phase4-deployer.md) — validate → what-if → deploy
```

### Path B: Existing Analysis + Modification (Analyze & Modify)

**Trigger**: "analyze", "current resources", "scan", "draw a diagram", "show my infrastructure", etc.
```
Phase 0 (references/phase0-scanner.md)  — Existing resource scan + diagram
    ↓
Modification conversation — "What would you like to change here?" (natural language modification request → follow-up questions)
    ↓
Phase 1 (references/phase1-advisor.md)  — Confirm modifications + update diagram
    ↓
Phases 2-4 — Same as Path A
```

### When Path Determination Is Ambiguous

Ask the user directly:
```
ask_user({
  question: "What would you like to do?",
  choices: [
    "Design a new Azure architecture (Recommended)",
    "Analyze + modify existing Azure resources"
  ]
})
```
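The branching rule above amounts to keyword matching with a fallback question. The sketch below shows one naive way to express it; `choose_path` is a hypothetical helper, the trigger words are copied from the two Trigger lines, and plain substring matching is an assumption that a real router would likely refine.

```python
from typing import Optional

# Trigger words quoted from the two "Trigger" lines above.
PATH_A_TRIGGERS = ("create", "set up", "deploy", "build")
PATH_B_TRIGGERS = ("analyze", "current resources", "scan", "draw a diagram", "show my infrastructure")


def choose_path(request: str) -> Optional[str]:
    """Return "A", "B", or None when the request is ambiguous and the user must be asked."""
    text = request.lower()
    hits_a = any(word in text for word in PATH_A_TRIGGERS)
    hits_b = any(word in text for word in PATH_B_TRIGGERS)
    if hits_a == hits_b:
        # Neither or both matched: fall back to the ask_user question.
        return None
    return "A" if hits_a else "B"
```

A `None` result corresponds to the ambiguous case, where the `ask_user` question shown above would be sent instead of guessing.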

---

## Phase Transition Rules

- Each Phase reads and follows the instructions in its corresponding `references/*.md` file
- When transitioning between Phases, always inform the user about the next step
- Do not skip Phases or steps (especially the what-if step that sits between the Phase 3 review and the Phase 4 deployment)
- **🚨 Required condition for Phase 1 → Phase 2 transition**: `01_arch_diagram_draft.html` must have been generated using the embedded diagram engine and shown to the user. **Do not proceed to Bicep generation without a diagram.** Completing spec collection alone does not mean Phase 1 is done — Phase 1 includes diagram generation + user confirmation.
- Modification request after deployment → return to Phase 1, not Phase 0 (Delta Confirmation Rule)
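The Phase 1 → Phase 2 condition can be read as a simple gate. The following sketch assumes a hypothetical `can_start_bicep` helper and a `user_confirmed` flag that the caller sets once the diagram has been shown and accepted; only the file name comes from the rule above.

```python
from pathlib import Path

DRAFT_DIAGRAM = "01_arch_diagram_draft.html"


def can_start_bicep(project_dir: Path, user_confirmed: bool) -> bool:
    """Phase 1 -> Phase 2 gate: the draft diagram must exist and be confirmed."""
    diagram = project_dir / DRAFT_DIAGRAM
    return diagram.is_file() and user_confirmed
```

Collecting the spec alone leaves both conditions unmet, so the gate stays closed until the diagram file exists and the user has confirmed it.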

## Service Coverage & Fallback

### Optimized Services
Microsoft Foundry, Azure OpenAI, AI Search, ADLS Gen2, Key Vault, Microsoft Fabric, Azure Data Factory, VNet/Private Endpoint, AML/AI Hub

### Other Azure Services
All supported — MS Docs are consulted automatically so that the generated output meets the same quality standard.
**Do not send messages that cause user anxiety such as "out of scope" or "best-effort".**

### Stable vs Dynamic Information Handling

| Category | Handling Method | Examples |
|----------|----------------|---------|
| **Stable** | Reference files first | `isHnsEnabled: true`, PE triple set |
| **Dynamic** | **Always fetch MS Docs** | API version, model availability, SKU, region |

## Quick Reference

| File | Role |
|------|------|
| `references/phase0-scanner.md` | Existing resource scan + relationship inference + diagram |
| `references/phase1-advisor.md` | Interactive architecture design + fact checking |
| `references/bicep-generator.md` | Bicep code generation rules |
| `references/bicep-reviewer.md` | Code review checklist |
| `references/phase4-deployer.md` | validate → what-if → deploy |
| `references/service-gotchas.md` | Required properties, PE mappings |
| `references/azure-dynamic-sources.md` | MS Docs URL registry |
| `references/azure-common-patterns.md` | PE/security/naming patterns |
| `references/ai-data.md` | AI/Data service guide |
Binary file not shown. Size after: 327 KiB
Binary file not shown. Size after: 171 KiB
Binary file not shown. Size after: 41 KiB
254
skills/azure-architecture-autopilot/references/ai-data.md
Normal file
@@ -0,0 +1,254 @@
# Domain Pack: AI/Data (v1)

Service configuration guide specialized for Azure AI/Data workloads.
v1 scope: Foundry, AI Search, ADLS Gen2, Key Vault, Fabric, ADF, VNet/PE.

> Required properties/common mistakes → `service-gotchas.md`
> Dynamic information (API version, SKU, region) → `azure-dynamic-sources.md`
> Common patterns (PE, security, naming) → `azure-common-patterns.md`

---

## 1. Microsoft Foundry (CognitiveServices)

### Resource Hierarchy

```
Microsoft.CognitiveServices/accounts (kind: 'AIServices')
├── /projects     — Foundry Project (required for portal access)
└── /deployments  — Model deployments (GPT-4o, embedding, etc.)
```

### Bicep Core Structure

```bicep
// Foundry resource
resource foundry 'Microsoft.CognitiveServices/accounts@<fetch>' = {
  name: foundryName
  location: location
  kind: 'AIServices'
  sku: { name: '<confirm with user>' }  // ← SKU confirmed after MS Docs check in Phase 1
  identity: { type: 'SystemAssigned' }
  properties: {
    customSubDomainName: foundryName  // ← Required, globally unique. Cannot change after creation — must delete and recreate if omitted
    allowProjectManagement: true
    publicNetworkAccess: 'Disabled'
    networkAcls: { defaultAction: 'Deny' }
  }
}

// Foundry Project — Must be created as a set with Foundry
resource project 'Microsoft.CognitiveServices/accounts/projects@<fetch>' = {
  parent: foundry
  name: '${foundryName}-project'
  location: location
  sku: { name: '<same as parent>' }
  kind: 'AIServices'
  identity: { type: 'SystemAssigned' }
  properties: {}
}

// Model deployment — At Foundry resource level
resource deployment 'Microsoft.CognitiveServices/accounts/deployments@<fetch>' = {
  parent: foundry
  name: '<model-name>'  // ← Confirmed with user in Phase 1
  sku: {
    name: '<deployment-type>'      // ← GlobalStandard, Standard, etc. — MS Docs fetch
    capacity: <confirm with user>  // ← Capacity units — verify available range from MS Docs
  }
  properties: {
    model: {
      format: 'OpenAI'
      name: '<model-name>'  // ← Must verify availability (fetch)
      version: '<fetch>'    // ← Version also fetched
    }
  }
}
```

> `@<fetch>`: Verify API version from the URLs in `azure-dynamic-sources.md`.
> Model name/version/deployment type/capacity: All Dynamic — Confirmed with user after MS Docs fetch in Phase 1.
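One way to make sure no `<fetch>` or `<confirm with user>` placeholder leaks into generated Bicep is to substitute verified values and then scan for leftovers. The sketch below is illustrative only: both helper names are hypothetical, the version check assumes date-style API versions with an optional `-preview` suffix, and the placeholder regex is deliberately simple and could misfire on other angle-bracket text.

```python
import re

# Matches any remaining <...> placeholder such as <fetch> or <confirm with user>.
PLACEHOLDER = re.compile(r"<[^<>\n]+>")


def resolve_api_version(template: str, api_version: str) -> str:
    """Replace every '@<fetch>' with an API version verified from MS Docs."""
    if not re.fullmatch(r"\d{4}-\d{2}-\d{2}(-preview)?", api_version):
        raise ValueError(f"unexpected API version format: {api_version!r}")
    return template.replace("@<fetch>", f"@{api_version}")


def unresolved_placeholders(template: str) -> list:
    """List placeholders that still need a fetched or user-confirmed value."""
    return PLACEHOLDER.findall(template)
```

In the test-style usage, the version string `2099-01-01` is a dummy value, not a real API version; the point is that generation should stop while `unresolved_placeholders` still returns anything.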

---

## 2. Azure AI Search

### Bicep Core Structure

```bicep
resource search 'Microsoft.Search/searchServices@<fetch>' = {
  name: searchName
  location: location
  sku: { name: '<confirm with user>' }
  identity: { type: 'SystemAssigned' }
  properties: {
    hostingMode: 'default'
    publicNetworkAccess: 'disabled'
    semanticSearch: '<confirm with user>'  // disabled | free | standard — verify in MS Docs
  }
}
```

### Design Notes

- PE support: Basic SKU or higher (verify latest constraints in MS Docs)
- Semantic Ranker: Activated via `semanticSearch` property (`disabled` | `free` | `standard`) — verify per-SKU support in MS Docs
- Vector search: Supported on paid SKUs (verify in MS Docs)
- Commonly used together with Foundry for RAG configurations

---

## 3. ADLS Gen2 (Storage Account)

### Bicep Core Structure

```bicep
resource storage 'Microsoft.Storage/storageAccounts@<fetch>' = {
  name: storageName  // Lowercase+numbers only, no hyphens
  location: location
  kind: 'StorageV2'
  sku: { name: 'Standard_LRS' }
  properties: {
    isHnsEnabled: true  // ← Never omit this
    accessTier: 'Hot'
    allowBlobPublicAccess: false
    minimumTlsVersion: 'TLS1_2'
    publicNetworkAccess: 'Disabled'
    networkAcls: { defaultAction: 'Deny' }
  }
}

// Container
resource container 'Microsoft.Storage/storageAccounts/blobServices/containers@<fetch>' = {
  name: '${storage.name}/default/raw'
}
```

### Design Notes

- `isHnsEnabled` cannot be changed after creation → Resource must be recreated if omitted
- PE: May need both `blob` and `dfs` PEs depending on use case
- Common containers: `raw`, `processed`, `curated`
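The naming comment in the storage block ("lowercase+numbers only, no hyphens") can be checked before generating Bicep. The sketch below uses the Azure storage account naming rule of 3 to 24 lowercase letters and digits; the two helper names are hypothetical, and global uniqueness is not checked here.

```python
import re

# Azure storage account names: 3-24 characters, lowercase letters and digits only.
STORAGE_NAME = re.compile(r"[a-z0-9]{3,24}")


def is_valid_storage_name(name: str) -> bool:
    """Check the character set and length rule for a storage account name."""
    return STORAGE_NAME.fullmatch(name) is not None


def to_storage_name(project: str) -> str:
    """Derive a candidate name by lowercasing and dropping disallowed characters."""
    cleaned = re.sub(r"[^a-z0-9]", "", project.lower())[:24]
    if not is_valid_storage_name(cleaned):
        raise ValueError(f"cannot derive a storage name from {project!r}")
    return cleaned
```

A project name such as `RAG-Demo-01` would be reduced to a hyphen-free lowercase candidate, which still has to be confirmed as available before deployment.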

---

## 4. Microsoft Fabric

### Bicep Core Structure

```bicep
resource fabric 'Microsoft.Fabric/capacities@<fetch>' = {
  name: fabricName
  location: location
  sku: { name: '<confirm with user>', tier: 'Fabric' }
  properties: {
    administration: {
      members: [ '<admin-email>' ]  // ← Required, deployment fails without it
    }
  }
}
```

### Design Notes

- Only Capacity can be provisioned via Bicep
- Workspace, Lakehouse, Warehouse, etc. must be created manually in the portal
- Confirm admin email with the user (`ask_user`)

### Required Confirmation Items When Adding in Phase 1

When Fabric is added during conversation, the following items must be confirmed via ask_user before updating the diagram:

- [ ] **SKU/Capacity**: F2, F4, F8, ... — Provide choices after fetching available SKUs from MS Docs
- [ ] **administration.members**: Admin email — Deployment fails without it

> Do not arbitrarily include sub-workloads (OneLake, data pipelines, Warehouse, etc.) that the user did not specify. Only Capacity can be provisioned via Bicep.
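The two checklist items can be tracked with a very small helper so that the diagram is only updated once both are confirmed. This is a sketch under assumptions: the `confirmed` dictionary and its `sku` and `admin_members` keys are hypothetical names for values gathered through `ask_user`.

```python
def missing_fabric_inputs(confirmed: dict) -> list:
    """Return the checklist items that still have to be asked via ask_user."""
    missing = []
    if not confirmed.get("sku"):
        missing.append("sku")
    # administration.members must contain at least one admin email.
    if not confirmed.get("admin_members"):
        missing.append("admin_members")
    return missing
```

An empty list means both items are confirmed; an empty `admin_members` list counts as missing because the deployment would fail without an admin.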

---

## 5. Azure Data Factory

### Bicep Core Structure

```bicep
resource adf 'Microsoft.DataFactory/factories@<fetch>' = {
  name: adfName
  location: location
  identity: { type: 'SystemAssigned' }
  properties: {
    publicNetworkAccess: 'Disabled'
  }
}
```

### Design Notes

- Self-hosted Integration Runtime requires manual setup outside Bicep
- Primarily used for on-premises data ingestion scenarios
- PE groupId: `dataFactory`

---

## 6. AML / AI Hub (MachineLearningServices)

### When to Use

```
Decision Rule:
├─ General AI/RAG → Use Foundry (AIServices)
└─ ML training, open-source models needed → Consider AI Hub
   └─ Only when the user explicitly requests it
```
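Read literally, the decision rule defaults to Foundry and considers AI Hub only when the workload needs it and the user has asked for it. The sketch below encodes that reading; `choose_ai_resource` and its two boolean inputs are hypothetical, and combining the two conditions with "and" is an interpretation of the tree rather than something the source states outright.

```python
def choose_ai_resource(needs_ml_training: bool, user_requested_hub: bool) -> str:
    """Foundry is the default; AI Hub only when needed and explicitly requested."""
    if needs_ml_training and user_requested_hub:
        return "AI Hub (MachineLearningServices, kind: 'Hub')"
    return "Foundry (CognitiveServices, kind: 'AIServices')"
```

Under this reading, a general RAG request always resolves to Foundry, and an ML training workload still resolves to Foundry until the user explicitly asks for a Hub.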

### Bicep Core Structure

```bicep
resource hub 'Microsoft.MachineLearningServices/workspaces@<fetch>' = {
  name: hubName
  location: location
  kind: 'Hub'
  sku: { name: '<confirm with user>', tier: '<confirm with user>' }  // e.g., Basic/Basic — verify available SKUs in MS Docs
  identity: { type: 'SystemAssigned' }
  properties: {
    friendlyName: hubName
    storageAccount: storage.id
    keyVault: keyVault.id
    applicationInsights: appInsights.id  // Required for Hub
    publicNetworkAccess: 'Disabled'
  }
}
```

### AI Hub Dependencies

Additional resources needed when using Hub:
- Storage Account
- Key Vault
- Application Insights + Log Analytics Workspace
- Container Registry (optional)

---

## 7. Common AI/Data Architecture Combinations

### RAG Chatbot

```
Foundry (AIServices) + Project
├── <chat-model> (chat) — Confirmed after availability check in Phase 1
├── <embedding-model> (embedding) — Confirmed after availability check in Phase 1
├── AI Search (vector + semantic)
├── ADLS Gen2 (document store)
└── Key Vault (secrets)
+ Full VNet/PE configuration
```

### Data Platform

```
Fabric Capacity (analytics)
├── ADLS Gen2 (data lake)
├── ADF (ingestion)
└── Key Vault (secrets)
+ VNet/PE configuration
```
117
skills/azure-architecture-autopilot/references/architecture-guidance-sources.md
Normal file
@@ -0,0 +1,117 @@
# Architecture Guidance Sources (For Design Direction Decisions)

A source registry for using Azure official architecture guidance **only for design direction decisions**.

> **The URLs in this document are a list of sources for "where to look".**
> Do not hardcode the contents of these URLs as fixed facts.
> Do not use for SKU, API version, region, model availability, or PE mapping decisions — those are handled exclusively via `azure-dynamic-sources.md`.

---

## Purpose Separation

| Purpose | Document to Use | Decidable Items |
|---------|----------------|-----------------|
| **Design direction decisions** | This document (architecture-guidance-sources) | Architecture patterns, best practices, service combination direction, security boundary design |
| **Deployment spec verification** | `azure-dynamic-sources.md` | API version, SKU, region, model availability, PE groupId, actual property values |

**What must NOT be decided using this document:**
- API version
- SKU names/pricing
- Region availability
- Model names/versions/deployment types
- PE groupId / DNS Zone mapping
- Specific values for resource properties

---

## Primary Sources

Documents to fetch selectively when making design direction decisions.

| ID | Document | URL | Purpose |
|----|----------|-----|---------|
| A1 | Azure Architecture Center | https://learn.microsoft.com/en-us/azure/architecture/ | Hub — Entry point for finding domain-specific documents |
| A2 | Well-Architected Framework | https://learn.microsoft.com/en-us/azure/architecture/framework/ | Security/reliability/performance/cost/operations principles |
| A3 | Cloud Adoption Framework / Landing Zone | https://learn.microsoft.com/en-us/azure/cloud-adoption-framework/ready/landing-zone/ | Enterprise governance, network topology, subscription structure |
| A4 | Azure AI/ML Architecture | https://learn.microsoft.com/en-us/azure/architecture/ai-ml/ | AI/ML workload reference architecture hub |
| A5 | Basic Foundry Chat Reference Architecture | https://learn.microsoft.com/en-us/azure/architecture/ai-ml/architecture/basic-azure-ai-foundry-chat | Basic Foundry-based chatbot structure |
| A6 | Baseline AI Foundry Chat Reference Architecture | https://learn.microsoft.com/en-us/azure/architecture/ai-ml/architecture/baseline-openai-e2e-chat | Foundry chatbot enterprise baseline (including network isolation) |
| A7 | RAG Solution Design Guide | https://learn.microsoft.com/en-us/azure/architecture/ai-ml/guide/rag/rag-solution-design-and-evaluation-guide | RAG pattern design guide |
| A8 | Microsoft Fabric Overview | https://learn.microsoft.com/en-us/fabric/get-started/microsoft-fabric-overview | Fabric platform overview and workload understanding |
| A9 | Fabric Governance / Adoption | https://learn.microsoft.com/en-us/power-bi/guidance/fabric-adoption-roadmap-governance | Fabric governance, adoption roadmap |

## Secondary Sources (awareness only)

Not fetched directly; referenced only to stay aware of changes.

| Document | URL | Notes |
|----------|-----|-------|
| Azure Updates | https://azure.microsoft.com/en-us/updates/ | Service changes/new feature announcements. Not fetched as part of a targeted lookup |
|
||||
|
||||
---
|
||||
|
||||
## Fetch Trigger — When to Query
|
||||
|
||||
Architecture guidance documents are **not queried on every request.** Only perform targeted fetch when the following triggers apply.
|
||||
|
||||
### Trigger Conditions
|
||||
|
||||
0. **When the user's workload type is identified in Phase 1 (automatic)**
|
||||
- Pre-query the relevant workload's reference architecture to adjust question depth
|
||||
- Triggers automatically even if the user doesn't mention "best practice" etc.
|
||||
- Purpose: Reflect official architecture-based design decision points in questions, beyond SKU/region spec questions
|
||||
1. **When the user requests design direction justification**
|
||||
- Keywords such as "best practice", "reference architecture", "recommended structure", "baseline", "well-architected", "landing zone", "enterprise pattern"
|
||||
2. **When architecture boundaries for a new service combination are ambiguous**
|
||||
- Inter-service relationships that cannot be determined from existing reference files/service-gotchas
|
||||
3. **When enterprise-level security/governance design is needed**
|
||||
- Subscription structure, network topology, landing zone patterns
|
||||
|
||||
### When Triggers Do Not Apply
|
||||
|
||||
- Simple resource creation (SKU/API version/region questions) → Use only `azure-dynamic-sources.md`
|
||||
- Service combinations already covered in domain-packs → Prioritize reference files
|
||||
- Bicep property value verification → `service-gotchas.md` or MS Docs Bicep reference
|
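The trigger conditions can be read as a small predicate. The sketch below is illustrative only: the function name and its boolean parameters are assumptions standing in for whatever signals the agent derives in Phase 1, and plain substring matching is a simplification. Only the keyword list is taken from trigger 1.

```python
# Keywords copied from trigger 1 (design direction justification).
JUSTIFICATION_KEYWORDS = (
    "best practice", "reference architecture", "recommended structure",
    "baseline", "well-architected", "landing zone", "enterprise pattern",
)


def should_fetch_guidance(text, workload_identified=False,
                          ambiguous_combination=False,
                          enterprise_governance=False,
                          simple_spec_question=False):
    """Return True when at least one trigger condition applies.

    All parameter names are hypothetical and exist only for this sketch.
    """
    if simple_spec_question:
        # SKU / API version / region lookups never trigger a guidance fetch.
        return False
    lowered = text.lower()
    keyword_hit = any(keyword in lowered for keyword in JUSTIFICATION_KEYWORDS)
    return (workload_identified or keyword_hit
            or ambiguous_combination or enterprise_governance)
```

A request that matches none of the triggers returns `False`, which corresponds to using only the reference files and `azure-dynamic-sources.md`.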
||||
|
||||
---
|
||||
|
||||
## Fetch Budget
|
||||
|
||||
| Scenario | Max Fetch Count |
|----------|----------------|
| Default (when trigger fires) | Up to **2** architecture guidance documents |
| Additional fetches (beyond the default) | Allowed only when documents conflict, core design uncertainty remains, or the user explicitly requests deeper justification |
| Simple deployment spec questions | **0** (no architecture guidance queries) |
|
||||
|
||||
---
|
||||
|
||||
## Decision Rule by Question Type
|
||||
|
||||
| Question Type | Documents to Query | Design Decision Points to Extract | Documents NOT to Query |
|--------------|-------------------|----------------------------------|----------------------|
| RAG / chatbot / Foundry app | A5 or A6 + A7 | Network isolation level, authentication method (managed identity vs key), indexing strategy (push vs pull), monitoring scope | Do not traverse entire Architecture Center |
| Enterprise security / governance / landing zone | A2 + A3 | Subscription structure, network topology (hub-spoke etc.), identity/governance model, security boundary | AI/ML domain documents not needed |
| Fabric data platform | A8 + A9 | Capacity model (SKU selection criteria), governance level, data boundary (workspace separation etc.) | AI-related documents not needed |
| Ambiguous service combination (unclear pattern) | A1 (find closest domain document from hub) + that document | Key design decision points identified from the document | Do not traverse all sub-documents |
| Simple resource creation values (SKU/API/region) | No query | — | All architecture guidance |
| General AI/ML architecture | A4 (hub) + closest reference architecture | Compute isolation, data boundary, model serving approach | Do not crawl the entire AI/ML hub |
|
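The decision rule is effectively a lookup from question type to at most two documents. A minimal sketch, assuming hypothetical dictionary keys and an empty result for unrecognized types (neither is defined by the skill); the document IDs are transcribed from the table:

```python
# Question type -> documents to fetch, transcribed from the table above.
# The keys are labels invented for this sketch.
DECISION_RULES = {
    "rag_chatbot_foundry": ["A5 or A6", "A7"],
    "enterprise_governance": ["A2", "A3"],
    "fabric_data_platform": ["A8", "A9"],
    "ambiguous_combination": ["A1", "closest domain document"],
    "simple_resource_values": [],
    "general_ai_ml": ["A4", "closest reference architecture"],
}

DEFAULT_FETCH_BUDGET = 2  # default maximum when a trigger fires


def documents_to_query(question_type):
    """Return the documents to fetch, capped at the default budget.

    Unknown question types yield an empty list (an assumption of this sketch).
    """
    docs = DECISION_RULES.get(question_type, [])
    return docs[:DEFAULT_FETCH_BUDGET]
```

Keeping the cap inside the lookup means the Fetch Budget default holds even if a rule is later extended with more documents.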
||||
|
||||
---
|
||||
|
||||
## URL Fallback Rule
|
||||
|
||||
1. Use `en-us` Learn URLs by default
|
||||
2. If a specific URL returns 404, redirects, or is marked deprecated → Fall back to the parent hub page
|
||||
- Example: If A5 fails → Search for "foundry chat" keyword on A4 (AI/ML hub)
|
||||
3. If not found on the parent hub either → Search by title keyword on A1 (Architecture Center main)
|
||||
4. **Do not treat a page's contents as fixed rules merely because its URL is listed here**
|
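Rules 2 and 3 form a three-step fallback chain. The sketch below shows only the ordering; `resolve_with_fallback` and the caller-supplied `is_available` check are hypothetical names, and the keyword search on the hub page is left to the caller.

```python
def resolve_with_fallback(url, parent_hub, architecture_center, is_available):
    """Try the specific URL, then the parent hub, then Architecture Center main.

    `is_available` is a caller-supplied callable (hypothetical) that returns
    True when a page loads without a 404, redirect, or deprecation notice.
    Returns the first usable URL, or None if all three fail.
    """
    for candidate in (url, parent_hub, architecture_center):
        if is_available(candidate):
            return candidate
    return None
```

Returning `None` rather than raising leaves the decision to the caller, which fits rule 4: a page that happens to load is still only input to the design, not a fixed rule.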
||||
|
||||
---
|
||||
|
||||
## Full Traversal Prohibited
|
||||
|
||||
- Do not broadly traverse (crawl) Architecture Center sub-documents
|
||||
- Fetch only the 1–2 related documents selected by the decision rule by question type
|
||||
- Even within fetched documents, only reference relevant sections; do not read the entire document
|
||||
- Unlimited fetching, recursive link following, and sub-page enumeration are prohibited
|
||||
@@ -0,0 +1,170 @@
|
||||
# Azure Common Patterns (Stable)
|
||||
|
||||
This file contains only **near-immutable patterns** that are repeated across Azure services.
|
||||
Dynamic information such as API version, SKU, and region is not included here → See `azure-dynamic-sources.md`.
|
||||
|
||||
---
|
||||
|
||||
## 1. Network Isolation Patterns
|
||||
|
||||
### Private Endpoint 3-Component Set
|
||||
|
||||
All services using PE must have the 3-component set configured:
|
||||
|
||||
1. **Private Endpoint** — Placed in pe-subnet
|
||||
2. **Private DNS Zone** + **VNet Link** (`registrationEnabled: false`)
|
||||
3. **DNS Zone Group** — Linked to PE
|
||||
|
||||
> If any one of the three is missing, DNS resolution fails even though the PE exists, and the connection fails.
|
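A reviewer-side check for the 3-component set could look like the sketch below. The function name and the idea of passing in a flat list of declared resource types are assumptions; the resource type strings are the ones used in the PE Bicep Common Template in this file.

```python
# Resource types that make up the Private Endpoint 3-component set.
# Component 2 (DNS zone + VNet link) spans two resource types.
REQUIRED_PE_COMPONENTS = {
    "Microsoft.Network/privateEndpoints",
    "Microsoft.Network/privateDnsZones",
    "Microsoft.Network/privateDnsZones/virtualNetworkLinks",
    "Microsoft.Network/privateEndpoints/privateDnsZoneGroups",
}


def missing_pe_components(declared_types):
    """Return the required resource types absent from `declared_types`, sorted."""
    return sorted(REQUIRED_PE_COMPONENTS - set(declared_types))
```

An empty result means all pieces are declared; it does not prove they are wired to each other correctly.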
||||
|
||||
### PE Subnet Required Settings
|
||||
|
||||
```bicep
resource peSubnet 'Microsoft.Network/virtualNetworks/subnets' = {
  properties: {
    addressPrefix: peSubnetPrefix               // ← CIDR as parameter — prevent existing network conflicts
    privateEndpointNetworkPolicies: 'Disabled'  // ← Required. PE deployment fails without it
  }
}
```
|
||||
|
||||
### publicNetworkAccess Pattern
|
||||
|
||||
Services using PE must include:
|
||||
```bicep
properties: {
  publicNetworkAccess: 'Disabled'
  networkAcls: {
    defaultAction: 'Deny'
  }
}
```
|
||||
|
||||
---
|
||||
|
||||
## 2. Security Patterns
|
||||
|
||||
### Key Vault
|
||||
|
||||
```bicep
properties: {
  enableRbacAuthorization: true  // Do not use Access Policy method
  enableSoftDelete: true
  softDeleteRetentionInDays: 90
  enablePurgeProtection: true
}
```
|
||||
|
||||
### Managed Identity
|
||||
|
||||
When AI services access other resources:
|
||||
```bicep
identity: {
  type: 'SystemAssigned'  // or 'UserAssigned'
}
```
|
||||
|
||||
### Sensitive Information
|
||||
|
||||
- Use `@secure()` decorator
|
||||
- Do not store plaintext in `.bicepparam` files
|
||||
- Use Key Vault references
|
||||
|
||||
---
|
||||
|
||||
## 3. Naming Conventions (CAF-based)
|
||||
|
||||
```
rg-{project}-{env}         Resource Group
vnet-{project}-{env}       Virtual Network
st{project}{env}           Storage Account (no special characters, lowercase+numbers only)
kv-{project}-{env}         Key Vault
srch-{project}-{env}       AI Search
foundry-{project}-{env}    Cognitive Services (Foundry)
```
|
||||
|
||||
> Name collision prevention: Recommend using `uniqueString(resourceGroup().id)`
> ```bicep
> param storageName string = 'st${uniqueString(resourceGroup().id)}'
> ```
|
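For tooling outside Bicep (for example, a script that predicts names before deployment), the patterns can be sketched as follows. `resource_names` and its dictionary keys are hypothetical; the 24-character cap reflects Azure's storage account name limit. This sketch does not reproduce `uniqueString()`, which exists only inside Bicep/ARM.

```python
import re


def resource_names(project, env):
    """Build names following the CAF-style patterns listed above."""
    # Storage account names allow only lowercase letters and digits,
    # so hyphens and other characters are stripped.
    compact = re.sub(r"[^a-z0-9]", "", f"{project}{env}".lower())
    return {
        "resource_group": f"rg-{project}-{env}",
        "virtual_network": f"vnet-{project}-{env}",
        # Storage account names are limited to 24 characters.
        "storage_account": f"st{compact}"[:24],
        "key_vault": f"kv-{project}-{env}",
        "ai_search": f"srch-{project}-{env}",
        "foundry": f"foundry-{project}-{env}",
    }
```

Truncation can make two long project names collide, which is one more reason to prefer the `uniqueString()` default shown above when generating Bicep.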
||||
|
||||
---
|
||||
|
||||
## 4. Bicep Module Structure
|
||||
|
||||
```
<project>/
├── main.bicep              # Orchestration — module calls + parameter passing
├── main.bicepparam         # Environment-specific values (excluding sensitive info)
└── modules/
    ├── network.bicep            # VNet, Subnet
    ├── <service>.bicep          # Per-service modules
    ├── keyvault.bicep           # Key Vault
    └── private-endpoints.bicep  # All PE + DNS Zone + VNet Link
```
|
||||
|
||||
### Dependency Management
|
||||
|
||||
```bicep
// ✅ Correct: Implicit dependency via resource reference
resource project '...' = {
  properties: {
    parentId: foundry.id  // foundry reference → automatically deploys foundry first
  }
}

// ❌ Avoid: Explicit dependsOn (use only when necessary)
```
|
||||
|
||||
---
|
||||
|
||||
## 5. PE Bicep Common Template
|
||||
|
||||
```bicep
// ── Private Endpoint ──
resource pe 'Microsoft.Network/privateEndpoints@<fetch>' = {
  name: 'pe-${serviceName}'
  location: location
  properties: {
    subnet: { id: peSubnetId }
    privateLinkServiceConnections: [{
      name: 'pls-${serviceName}'
      properties: {
        privateLinkServiceId: serviceId
        groupIds: ['<groupId>']  // ← Varies by service. See service-gotchas.md
      }
    }]
  }
}

// ── Private DNS Zone ──
resource dnsZone 'Microsoft.Network/privateDnsZones@<fetch>' = {
  name: '<dnsZoneName>'  // ← Varies by service
  location: 'global'
}

// ── VNet Link ──
resource vnetLink 'Microsoft.Network/privateDnsZones/virtualNetworkLinks@<fetch>' = {
  parent: dnsZone
  name: '${dnsZone.name}-link'
  location: 'global'
  properties: {
    virtualNetwork: { id: vnetId }
    registrationEnabled: false  // ← Must be false
  }
}

// ── DNS Zone Group ──
resource dnsGroup 'Microsoft.Network/privateEndpoints/privateDnsZoneGroups@<fetch>' = {
  parent: pe
  name: 'default'
  properties: {
    privateDnsZoneConfigs: [{
      name: 'config'
      properties: { privateDnsZoneId: dnsZone.id }
    }]
  }
}
```
|
||||
|
||||
> `@<fetch>`: Always verify the latest stable API version from MS Docs before deployment.
|
||||
@@ -0,0 +1,93 @@
|
||||
# Azure Dynamic Sources Registry
|
||||
|
||||
This file manages **only the sources (URLs) for frequently changing information**.
|
||||
Actual values (API version, SKU, region, etc.) are not recorded here.
|
||||
Always fetch the URLs below to verify the latest information before generating Bicep.
|
||||
|
||||
---
|
||||
|
||||
## 1. Bicep API Version (Always Fetch)
|
||||
|
||||
Per-service MS Docs Bicep reference. Verify the latest stable apiVersion from these URLs before use.
|
||||
|
||||
| Service | MS Docs URL |
|---------|-------------|
| CognitiveServices (Foundry/OpenAI) | https://learn.microsoft.com/en-us/azure/templates/microsoft.cognitiveservices/accounts |
| AI Search | https://learn.microsoft.com/en-us/azure/templates/microsoft.search/searchservices |
| Storage Account | https://learn.microsoft.com/en-us/azure/templates/microsoft.storage/storageaccounts |
| Key Vault | https://learn.microsoft.com/en-us/azure/templates/microsoft.keyvault/vaults |
| Virtual Network | https://learn.microsoft.com/en-us/azure/templates/microsoft.network/virtualnetworks |
| Private Endpoints | https://learn.microsoft.com/en-us/azure/templates/microsoft.network/privateendpoints |
| Private DNS Zones | https://learn.microsoft.com/en-us/azure/templates/microsoft.network/privatednszones |
| Fabric | https://learn.microsoft.com/en-us/azure/templates/microsoft.fabric/capacities |
| Data Factory | https://learn.microsoft.com/en-us/azure/templates/microsoft.datafactory/factories |
| Application Insights | https://learn.microsoft.com/en-us/azure/templates/microsoft.insights/components |
| ML Workspace (Hub) | https://learn.microsoft.com/en-us/azure/templates/microsoft.machinelearningservices/workspaces |
|
||||
|
||||
> **Always verify child resources as well**: Child resources such as `accounts/projects`, `accounts/deployments`, `privateDnsZones/virtualNetworkLinks` may have different API versions from their parent. Follow child resource links from the parent page to verify.
|
||||
|
||||
### Services Not in the Table Above
|
||||
|
||||
The table above includes only v1 scope services. For other services, construct the URL in this format and fetch:
|
||||
```
https://learn.microsoft.com/en-us/azure/templates/microsoft.{provider}/{resourceType}
```
|
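The URL format can be applied mechanically. A minimal sketch, assuming a hypothetical helper name and that lowercasing is appropriate because every URL in the table above is lowercase:

```python
TEMPLATE_BASE = "https://learn.microsoft.com/en-us/azure/templates"


def bicep_reference_url(resource_type):
    """Map 'Microsoft.<Provider>/<type>' to its MS Docs Bicep reference URL.

    Lowercasing mirrors the URLs listed in the table above.
    """
    provider, _, type_path = resource_type.partition("/")
    if not provider.lower().startswith("microsoft.") or not type_path:
        raise ValueError(
            f"expected 'Microsoft.<provider>/<type>', got {resource_type!r}"
        )
    return f"{TEMPLATE_BASE}/{provider.lower()}/{type_path.lower()}"
```

The constructed URL is only a candidate; if it does not resolve, the page has to be located manually rather than guessed.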
||||
|
||||
---
|
||||
|
||||
## 2. Model Availability (Required When Using Foundry/OpenAI Models)
|
||||
|
||||
Verify whether the model name is deployable in the target region. Do not rely on static knowledge.
|
||||
|
||||
| Verification Method | URL / Command |
|--------------------|---------------|
| MS Docs model availability | https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/models |
| Azure CLI (existing resources) | `az cognitiveservices account list-models --name "<NAME>" --resource-group "<RG>" -o table` |
|
||||
|
||||
> If the model is unavailable in the target region → Notify the user and suggest available regions/alternative models. Do not substitute without user approval.
|
||||
|
||||
---
|
||||
|
||||
## 3. Private Endpoint Mapping (When Adding New Services)
|
||||
|
||||
Azure can change PE groupId and DNS Zone mappings. When adding a new service, or whenever verification is needed:
|
||||
|
||||
| Verification Method | URL |
|--------------------|-----|
| PE DNS integration official docs | https://learn.microsoft.com/en-us/azure/private-link/private-endpoint-dns |
|
||||
|
||||
> Key service mappings in `service-gotchas.md` are stable, but always re-verify from the URL above when adding new services.
|
||||
|
||||
---
|
||||
|
||||
## 4. Service Region Availability
|
||||
|
||||
Verify whether a specific service is available in a specific region:
|
||||
|
||||
| Verification Method | URL |
|--------------------|-----|
| Azure service-by-region availability | https://azure.microsoft.com/en-us/explore/global-infrastructure/products-by-region/ |
|
||||
|
||||
---
|
||||
|
||||
## 5. Azure Updates (Secondary Awareness)
|
||||
|
||||
The sources below are for **reference only**. The primary source is always MS Docs official documentation.
|
||||
|
||||
| Source | URL | Purpose |
|--------|-----|---------|
| Azure Updates | https://azure.microsoft.com/en-us/updates/ | Service change awareness |
| What's New in Azure | Per-service What's New pages in Docs | Feature change verification |
|
||||
|
||||
---
|
||||
|
||||
## Decision Rule: When to Fetch?
|
||||
|
||||
| Information Type | Must Fetch? | Rationale |
|-----------------|-------------|-----------|
| API version | **Always fetch** | Changes frequently; incorrect values cause deployment failure |
| Model availability (name, region) | **Always fetch** | Varies by region and changes frequently |
| SKU list | **Always fetch** | Can change per service |
| Region availability | **Always fetch** | Per-service region support changes frequently. Always verify that the user-specified region is available for the service |
| PE groupId & DNS Zone | Can reference `service-gotchas.md` for v1 key services; **must fetch for new services or complex configurations (Monitor, etc.)** | Key service mappings are stable, but new/complex services are risky |
| Required property patterns | Reference files first | Near-immutable (isHnsEnabled, etc.) |
|
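The table reduces to a small policy lookup. In the sketch below the keys and the `new_or_complex_service` flag are illustrative names, not identifiers used by the skill:

```python
# Information type -> fetch policy, transcribed from the table above.
FETCH_POLICY = {
    "api_version": "always",
    "model_availability": "always",
    "sku_list": "always",
    "region_availability": "always",
    "pe_mapping": "reference_first",
    "required_properties": "reference_first",
}


def must_fetch(info_type, new_or_complex_service=False):
    """Return True when the information must be fetched before use."""
    policy = FETCH_POLICY[info_type]
    if policy == "always":
        return True
    # PE mappings require a fetch only for new or complex services;
    # required property patterns come from the reference files first.
    return info_type == "pe_mapping" and new_or_complex_service
```

An unknown `info_type` raises `KeyError` on purpose, so that a new category is classified explicitly instead of silently defaulting to "no fetch".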
||||
@@ -0,0 +1,421 @@
|
||||
# Bicep Generator Agent
|
||||
|
||||
Receives the finalized architecture spec from Phase 1 and generates deployable Bicep templates.
|
||||
|
||||
## Step 0: Verify Latest Specs (Required Before Bicep Generation)
|
||||
|
||||
Do not hardcode API versions in Bicep code.
|
||||
Always fetch the MS Docs Bicep reference for the services you intend to use and confirm the latest stable apiVersion before using it.
|
||||
|
||||
### Verification Steps
|
||||
1. Identify the list of services to be used
|
||||
2. Fetch the MS Docs URL for each service (using the web_fetch tool)
|
||||
3. Confirm the latest stable API version from the page
|
||||
4. Write Bicep using that version
|
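Step 3 usually means choosing the newest version that carries no preview suffix. A minimal sketch, assuming the versions have already been extracted from the fetched page into a list of strings (the extraction itself is not shown, and `latest_stable_api_version` is a hypothetical name):

```python
import re

# Stable versions are plain dates; previews carry a suffix such as "-preview".
_STABLE = re.compile(r"^\d{4}-\d{2}-\d{2}$")


def latest_stable_api_version(versions):
    """Return the newest version string without a suffix."""
    stable = [v for v in versions if _STABLE.match(v)]
    if not stable:
        raise ValueError("no stable API version in the fetched list")
    # YYYY-MM-DD strings sort chronologically as plain text.
    return max(stable)
```

If the list contains only preview versions the function raises instead of falling back to a preview, leaving that decision to the user.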
||||
|
||||
### Model Deployment Availability Check (Required When Using Foundry/OpenAI Models)
|
||||
|
||||
Verify that the model name specified by the user is actually deployable in the target region **before generating Bicep**.
|
||||
Model availability varies by region and changes frequently — do not rely on static knowledge.
|
||||
|
||||
**Verification Methods (in priority order):**
|
||||
1. Check the MS Docs model availability page: https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/models
|
||||
2. Or query directly via Azure CLI:
|
||||
```powershell
az cognitiveservices account list-models --name "<FOUNDRY_NAME>" --resource-group "<RG_NAME>" -o table
```
|
||||
(When the Foundry resource already exists)
|
||||
|
||||
**If the model is not available in the target region:**
|
||||
- Inform the user and suggest available regions or alternative models
|
||||
- Do not substitute a different model or region without user approval
|
||||
|
||||
### Per-Service MS Docs URLs
|
||||
|
||||
The full URL registry is in `references/azure-dynamic-sources.md`. Refer to this file when fetching.
|
||||
Reference files are located under the `.github/skills/azure-architecture-autopilot/` path.
|
||||
|
||||
> **Important**: Fetch directly from the URL using web_fetch to confirm the latest stable apiVersion. Do not blindly use hardcoded versions from reference files or previous conversations.
|
||||
|
||||
> **Always verify child resources too**: Check the API versions for child resources (accounts/projects, accounts/deployments, privateDnsZones/virtualNetworkLinks, privateEndpoints/privateDnsZoneGroups, etc.) from the parent resource page. Parent and child API versions may differ.
|
||||
|
||||
> **Same principle applies when errors/warnings occur**: If an API version–related error occurs during what-if or deployment, do not assume that the version named in the error message is the latest one and apply it directly. Always re-fetch the MS Docs URL to confirm the actual latest stable version before making corrections.
|
||||
|
||||
---
|
||||
|
||||
## Information Reference Principles (Stable vs Dynamic)
|
||||
|
||||
### Always Fetch (Dynamic)
|
||||
- API version → Fetch from URLs in `azure-dynamic-sources.md`
|
||||
- Model availability (name, version, region) → Fetch
|
||||
- SKU list/pricing → Fetch
|
||||
- Region availability → Fetch
|
||||
|
||||
### Reference First (Stable)
|
||||
- Required property patterns (`isHnsEnabled`, `allowProjectManagement`, etc.) → `service-gotchas.md`
|
||||
- PE groupId & DNS Zone mappings (major services) → `service-gotchas.md`
|
||||
- PE/security/naming common patterns → `azure-common-patterns.md`
|
||||
- AI/Data service configuration guide → `ai-data.md`
|
||||
|
||||
> If unsure about stable information, re-verify with MS Docs; there is no need to fetch it every time.
|
||||
|
||||
---
|
||||
|
||||
## Unknown Service Fallback Workflow
|
||||
|
||||
When the user requests a service not covered by the v1 scope (`ai-data.md`):
|
||||
|
||||
1. **Notify the user**: "This service is outside the v1 default scope. It will be generated on a best-effort basis by referencing MS Docs."
|
||||
2. **Fetch API version**: Construct the URL in the format `https://learn.microsoft.com/en-us/azure/templates/microsoft.{provider}/{resourceType}` and fetch
|
||||
3. **Identify resource type/required properties**: Confirm the resource type and required properties from the fetched Docs
|
||||
4. **Verify PE mapping**: Fetch `https://learn.microsoft.com/en-us/azure/private-link/private-endpoint-dns` to confirm groupId/DNS Zone
|
||||
5. **Apply common patterns**: Apply security/network/naming patterns from `azure-common-patterns.md`
|
||||
6. **Write Bicep**: Generate the module based on the above information
|
||||
7. **Hand off to reviewer**: Validate compilation with `az bicep build`
|
||||
|
||||
## Input Information
|
||||
|
||||
The following information must be finalized upon completion of Phase 1:
|
||||
|
||||
```
- services: [Service list + SKU]
- networking: Whether private_endpoint is used
- resource_group: Resource group name
- location: Deployment location (confirmed with user in Phase 1)
- subscription_id: Azure subscription ID
```
|
||||
|
||||
## Output File Structure
|
||||
|
||||
```
<project-name>/
├── main.bicep              # Main orchestration — module calls and parameter passing
├── main.bicepparam         # Parameter file — environment-specific values, excluding sensitive info
└── modules/
    ├── network.bicep            # VNet, Subnet (including pe-subnet)
    ├── ai.bicep                 # AI services (configured per user requirements)
    ├── storage.bicep            # ADLS Gen2 (isHnsEnabled: true required)
    ├── fabric.bicep             # Microsoft Fabric Capacity (only when needed)
    ├── keyvault.bicep           # Key Vault
    ├── monitoring.bicep         # Application Insights, Log Analytics (only needed for Hub-based configurations)
    └── private-endpoints.bicep  # All PEs + Private DNS Zones + VNet Links + DNS Zone Groups
```
|
||||
|
||||
## Module Responsibilities
|
||||
|
||||
### `network.bicep`
|
||||
- VNet — CIDR received as a parameter (to avoid conflicts with existing address spaces in the customer environment)
|
||||
- pe-subnet — `privateEndpointNetworkPolicies: 'Disabled'` required
|
||||
- Additional subnets handled via parameters as needed
|
||||
|
||||
### `ai.bicep`
|
||||
- **Microsoft Foundry resource** (`Microsoft.CognitiveServices/accounts`, `kind: 'AIServices'`) — Top-level AI resource
|
||||
- `customSubDomainName: foundryName` required — **Cannot be changed after creation. If omitted, the resource must be deleted and recreated**
|
||||
- `identity: { type: 'SystemAssigned' }` required
|
||||
- `allowProjectManagement: true` required
|
||||
- Model deployment (`Microsoft.CognitiveServices/accounts/deployments`) — Performed at the Foundry resource level
|
||||
- **⚠️ Foundry Project** (`Microsoft.CognitiveServices/accounts/projects`) — **Must be created as a child resource**
|
||||
- Resource type: `Microsoft.CognitiveServices/accounts/projects` (never create as a standalone `accounts` resource)
|
||||
- Use `parent: foundryAccount` in Bicep
|
||||
- Incorrect example: Creating a Project as a separate `kind: 'AIServices'` account → Not recognized in the portal
|
||||
- Correct example:
|
||||
```bicep
resource foundryProject 'Microsoft.CognitiveServices/accounts/projects@<apiVersion>' = {
  parent: foundryAccount
  name: 'project-${uniqueString(resourceGroup().id)}'
  location: location
  kind: 'AIServices'
  properties: {}
}
```
|
||||
- **Azure AI Search** — Semantic Ranking, vector search configuration
|
||||
- Hub-based (`Microsoft.MachineLearningServices/workspaces`) should only be considered when the user explicitly requests it or when ML training/open-source models are needed. For standard AI/RAG workloads, Foundry (AIServices) is the default choice
|
||||
|
||||
**⛔ CognitiveServices Prohibited Properties:**
|
||||
- `apiProperties.statisticsEnabled` — This property does not exist. Never use it. Causes `ApiPropertiesInvalid` error during deployment
|
||||
- `apiProperties.qnaAzureSearchEndpointId` — QnA Maker only. Do not use with Foundry
|
||||
- Do not arbitrarily add unvalidated properties to `properties.apiProperties`
|
||||
|
||||
### `storage.bicep`
|
||||
- ADLS Gen2: `isHnsEnabled: true` ← **Never omit this**
|
||||
- Containers: raw, processed, curated (or as per requirements)
|
||||
- `allowBlobPublicAccess: false`, `minimumTlsVersion: 'TLS1_2'`
|
||||
|
||||
### `keyvault.bicep`
|
||||
- `enableRbacAuthorization: true` (do not use access policy model)
|
||||
- `enableSoftDelete: true`, `softDeleteRetentionInDays: 90`
|
||||
- `enablePurgeProtection: true`
|
||||
|
||||
### `monitoring.bicep`
|
||||
- Log Analytics Workspace
|
||||
- Application Insights (only needed for Hub-based configurations — not required for Foundry AIServices)
|
||||
|
||||
### `private-endpoints.bicep`
|
||||
- 3-component set for each service:
|
||||
1. `Microsoft.Network/privateEndpoints` (placed in pe-subnet)
|
||||
2. `Microsoft.Network/privateDnsZones` + VNet Link (`registrationEnabled: false`)
|
||||
3. `Microsoft.Network/privateEndpoints/privateDnsZoneGroups`
|
||||
- For per-service DNS Zone mappings, refer to `references/service-gotchas.md`
|
||||
|
||||
**⚠️ Foundry/AIServices PE DNS Rules:**
|
||||
- PE groupId: `account`
|
||||
- DNS Zone Group must include **2 zones**:
|
||||
1. `privatelink.cognitiveservices.azure.com`
|
||||
2. `privatelink.openai.azure.com`
|
||||
- Including only one causes DNS resolution failure for OpenAI API calls → connection error
|
||||
|
||||
**⚠️ ADLS Gen2 (isHnsEnabled: true) PE Rules:**
|
||||
- 2 PEs required:
|
||||
1. `blob` → `privatelink.blob.core.windows.net`
|
||||
2. `dfs` → `privatelink.dfs.core.windows.net`
|
||||
- Without the DFS PE, Data Lake operations (file system creation, directory manipulation) will fail
|
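The two rule blocks above can be expressed as a requirements table plus a gap check. The groupIds and zone names are transcribed from this section; the service labels (`foundry`, `adls_gen2`) and the shape of the `configured` argument are assumptions made for this sketch.

```python
# groupId -> Private DNS zones, transcribed from the two rule blocks above.
PE_REQUIREMENTS = {
    "foundry": {
        "account": [
            "privatelink.cognitiveservices.azure.com",
            "privatelink.openai.azure.com",
        ],
    },
    "adls_gen2": {
        "blob": ["privatelink.blob.core.windows.net"],
        "dfs": ["privatelink.dfs.core.windows.net"],
    },
}


def missing_pe_config(service, configured):
    """List (groupId, zone) pairs that `configured` lacks for `service`.

    `configured` maps groupId -> iterable of zone names already declared.
    """
    missing = []
    for group_id, zones in PE_REQUIREMENTS[service].items():
        have = set(configured.get(group_id, ()))
        missing.extend((group_id, zone) for zone in zones if zone not in have)
    return missing
```

For services outside this table, the mapping has to come from `references/service-gotchas.md` or the PE DNS documentation rather than from this sketch.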
||||
|
||||
### `rbac.bicep` (or inline in main.bicep)
|
||||
|
||||
**⚠️ RBAC Role Assignment — Never Omit**
|
||||
|
||||
**Any service with a Managed Identity (`identity.type: 'SystemAssigned'`) must have RBAC role assignments created.**
|
||||
Having an identity without role assignments causes inter-service authentication failures.
|
||||
This is not optional — it is a **mandatory item**.
|
||||
Omission will be reported as CRITICAL in Phase 3 review.
|
||||
|
||||
- Required RBAC mappings:
|
||||
|
||||
| Source Service | Target Service | Role | Role Definition ID |
|------------|-----------|------|-------------------|
| Foundry | Storage | `Storage Blob Data Contributor` | `ba92f5b4-2d11-453d-a403-e96b0029c9fe` |
| Foundry | AI Search | `Search Index Data Contributor` | `8ebe5a00-799e-43f5-93ac-243d3dce84a7` |
| Foundry | AI Search | `Search Service Contributor` | `7ca78c08-252a-4471-8644-bb5ff32d4ba0` |
| App Service | Key Vault | `Key Vault Secrets User` | `4633458b-17de-408a-b874-0445c86b69e6` |
| AKS (kubeletIdentity) | ACR | `AcrPull` | `7f951dda-4ed3-4680-a7ca-43fe172d538d` |
| Data Factory | Storage | `Storage Blob Data Contributor` | `ba92f5b4-2d11-453d-a403-e96b0029c9fe` |
| Data Factory | Key Vault | `Key Vault Secrets User` | `4633458b-17de-408a-b874-0445c86b69e6` |
| Databricks | Storage | `Storage Blob Data Contributor` | `ba92f5b4-2d11-453d-a403-e96b0029c9fe` |
|
||||
|
||||
> **AKS Special Rule**: AKS uses `identityProfile.kubeletidentity.objectId`, not `identity.principalId`.
|
||||
|
||||
```bicep
// RBAC Example — Foundry → Storage Blob Data Contributor
resource foundryStorageRole 'Microsoft.Authorization/roleAssignments@2022-04-01' = {
  name: guid(storageAccount.id, foundry.id, 'ba92f5b4-2d11-453d-a403-e96b0029c9fe')
  scope: storageAccount
  properties: {
    roleDefinitionId: subscriptionResourceId('Microsoft.Authorization/roleDefinitions', 'ba92f5b4-2d11-453d-a403-e96b0029c9fe')
    principalId: foundry.identity.principalId
    principalType: 'ServicePrincipal'
  }
}
```
|
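Because a missing role assignment is reported as CRITICAL in Phase 3, it can help to compute the gap before review. The sketch below transcribes the source/target/role pairs from the mapping table; the function name, the service labels, and the triple format for `assigned` are assumptions.

```python
# (source, target) -> required role names, transcribed from the table above.
REQUIRED_ROLES = {
    ("Foundry", "Storage"): ["Storage Blob Data Contributor"],
    ("Foundry", "AI Search"): ["Search Index Data Contributor",
                               "Search Service Contributor"],
    ("App Service", "Key Vault"): ["Key Vault Secrets User"],
    ("AKS", "ACR"): ["AcrPull"],
    ("Data Factory", "Storage"): ["Storage Blob Data Contributor"],
    ("Data Factory", "Key Vault"): ["Key Vault Secrets User"],
    ("Databricks", "Storage"): ["Storage Blob Data Contributor"],
}


def missing_role_assignments(services, assigned):
    """Return (source, target, role) triples required but not yet assigned.

    `services` is the set of deployed service labels; `assigned` is a set of
    (source, target, role) triples already declared in the Bicep output.
    """
    missing = []
    for (source, target), roles in REQUIRED_ROLES.items():
        if source in services and target in services:
            missing.extend(
                (source, target, role) for role in roles
                if (source, target, role) not in assigned
            )
    return missing
```

Only pairs where both services are actually deployed are checked, so an architecture without AKS is not flagged for a missing `AcrPull` assignment.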
||||
|
||||
### SQL Server Rules
|
||||
- **Password management**: Declare `@secure() param sqlAdminPassword string` in main.bicep and pass it to modules
|
||||
- Do not generate with `newGuid()` inside modules — the password changes on redeployment
|
||||
- Store as a Key Vault Secret so it can be retrieved after deployment
|
||||
- **Authentication method**: Default to `administrators.azureADOnlyAuthentication: true`
|
||||
- Many organizational policies (MCAPS, etc.) block standalone SQL authentication
|
||||
- AAD-only authentication + Managed Identity is the most secure configuration
|
||||
|
||||
### Network Secret Handling
|
||||
- **VPN Gateway shared key**: `@secure() param vpnSharedKey string` — `@secure()` is mandatory
|
||||
- Never include plaintext VPN keys in `.bicepparam` — provide at deployment time or use Key Vault reference
|
||||
- The same rule applies to SQL passwords
|
||||
- **Applies to**: VPN shared key, ExpressRoute authorization key, Wi-Fi PSK, and all other network secrets
|
||||
- Module params must also include the `@secure()` decorator
|
||||
|
||||
### ⚠️ Network Isolation Consistency Rules
|
||||
- When setting `publicNetworkAccess: 'Disabled'`, you **must** also create the corresponding PE for that service
|
||||
- Setting publicNetworkAccess to Disabled without a PE makes the service unreachable → unusable after deployment
|
||||
- The Phase 3 reviewer must report this inconsistency as **CRITICAL**
|
||||
- When an inconsistency is found: either add a PE module or revert publicNetworkAccess to Enabled
|
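The consistency rule is easy to check mechanically once each service's settings are known. In this sketch the input is a list of dictionaries with hypothetical keys (`name`, `publicNetworkAccess`, `has_private_endpoint`); how those values are collected from the generated Bicep is out of scope.

```python
def isolation_inconsistencies(services):
    """Return names of services that disable public access but have no PE.

    Each item in `services` is a dict with the hypothetical keys
    'name', 'publicNetworkAccess', and 'has_private_endpoint'.
    """
    return [
        service["name"] for service in services
        if service.get("publicNetworkAccess") == "Disabled"
        and not service.get("has_private_endpoint", False)
    ]
```

Every name returned corresponds to one CRITICAL finding, resolved either by adding a PE module or by reverting `publicNetworkAccess` to `Enabled`.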
||||
|
||||
## Mandatory Coding Principles
|
||||
|
||||
### Naming Conventions
|
||||
```bicep
// Use uniqueString to prevent naming collisions — always required
param foundryName string = 'foundry-${uniqueString(resourceGroup().id)}'
param searchName string = 'srch-${uniqueString(resourceGroup().id)}'
param storageName string = 'st${uniqueString(resourceGroup().id)}'  // No special characters allowed
param keyVaultName string = 'kv-${uniqueString(resourceGroup().id)}'
```
|
||||
> **⚠️ Resources requiring `customSubDomainName` (Foundry, Cognitive Services, etc.) must include `uniqueString()`.**
|
||||
> Static strings (e.g., `'my-rag-chatbot'`) may already be in use by another tenant, causing deployment failures.
|
||||
> The same applies to Foundry Project names — `'project-${uniqueString(resourceGroup().id)}'`
|
||||
|
||||
### Network Isolation
|
||||
```bicep
// Required for all services when using Private Endpoints
publicNetworkAccess: 'Disabled'
networkAcls: {
  defaultAction: 'Deny'
  ipRules: []
  virtualNetworkRules: []
}
```
|
||||
|
||||
### Dependency Management
|
||||
```bicep
// Use implicit dependencies via resource references instead of explicit dependsOn
resource aiProject '...' = {
  properties: {
    hubResourceId: aiHub.id  // Reference to aiHub → aiHub is automatically deployed first
  }
}
```
|
||||
|
||||
### Security
|
||||
```bicep
// Use Key Vault references for sensitive values — never store plaintext in parameter files
@secure()
param adminPassword string  // Do not put plaintext values in main.bicepparam
```
|
||||
|
||||
### Code Comments
|
||||
```bicep
// Microsoft Foundry resource — kind: 'AIServices'
// customSubDomainName: Required, globally unique. Cannot be changed after creation — if omitted, resource must be deleted and recreated
// allowProjectManagement: true is required or Foundry Project creation will fail
// Replace apiVersion with the latest version fetched in Step 0
resource foundry 'Microsoft.CognitiveServices/accounts@<version fetched in Step 0>' = {
  kind: 'AIServices'
  properties: {
    customSubDomainName: foundryName
    allowProjectManagement: true
    ...
  }
}
```
|
||||
|
||||
### ⚠️ Bicep Code Quality Validation (Required After Generation)
|
||||
|
||||
**Module Declaration Validation:**
|
||||
- Verify that the `name:` property in each module block is not duplicated
|
||||
- Correct example: `name: 'deploy-sql'`
|
||||
- Incorrect example: `name: 'name: 'deploy-sql'` (duplicated name: → compilation error)
|
||||
|
||||
**Duplicate Property Prevention:**
|
||||
- If the same property name appears more than once within a single resource block, it causes a compilation error
|
||||
- Especially common in complex resources like VPN Gateway (`gatewayType`), Firewall, AKS, etc.
|
||||
- Check for `BCP025: The property "xxx" is declared multiple times` in the `az bicep build` output
|
||||
|
||||
**`az bicep build` Must Be Run:**
|
||||
- After generating all Bicep files, always run `az bicep build --file main.bicep`
|
||||
- Fix errors and recompile
|
||||
- Warnings (BCP081, etc.) can be ignored after verifying the API version in MS Docs
|
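Separating errors from warnings in the build output makes the "fix errors, review warnings" loop explicit. The sketch below assumes each diagnostic line contains the level followed by the code, as in `Error BCP025:` or `Warning BCP081:`; that line format is an assumption about the CLI output, and `split_diagnostics` is a hypothetical helper that works on already-captured text.

```python
import re

# Matches the "<Level> <Code>:" portion of a diagnostic line.
_DIAGNOSTIC = re.compile(r"\b(Error|Warning) (BCP\d+):")


def split_diagnostics(build_output):
    """Return (error_codes, warning_codes) found in captured build output."""
    errors, warnings = [], []
    for level, code in _DIAGNOSTIC.findall(build_output):
        (errors if level == "Error" else warnings).append(code)
    return errors, warnings
```

A non-empty error list means regenerate and rebuild; warnings such as BCP081 still need the API version checked against MS Docs before they are dismissed.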
||||
|
||||
## main.bicep Base Structure
|
||||
|
||||
```bicep
// ============================================================
// Azure [Project Name] Infrastructure — main.bicep
// Generated: [Date]
// ============================================================

targetScope = 'resourceGroup'

// ── Common Parameters ─────────────────────────────────────
param location string           // Location confirmed in Phase 1 — do not hardcode
param projectPrefix string
param vnetAddressPrefix string  // ← Confirm with user. Prevent conflicts with existing networks
param peSubnetPrefix string     // ← PE-dedicated subnet CIDR within the VNet

// ── Network ───────────────────────────────────────────────
module network './modules/network.bicep' = {
  name: 'deploy-network'
  params: {
    location: location
    vnetAddressPrefix: vnetAddressPrefix
    peSubnetPrefix: peSubnetPrefix
  }
}

// ── AI/Data Services ──────────────────────────────────────
module ai './modules/ai.bicep' = {
  name: 'deploy-ai'
  params: {
    location: location
    // Add separate params if regions differ per service — verify available regions in MS Docs
  }
  dependsOn: [network]  // Explicit: this module consumes no network outputs, so no implicit dependency exists
}

// ── Storage ───────────────────────────────────────────────
module storage './modules/storage.bicep' = {
  name: 'deploy-storage'
  params: {
    location: location
  }
}

// ── Key Vault ─────────────────────────────────────────────
module keyVault './modules/keyvault.bicep' = {
  name: 'deploy-keyvault'
  params: {
    location: location
  }
}

// ── Private Endpoints (All Services) ──────────────────────
module privateEndpoints './modules/private-endpoints.bicep' = {
  name: 'deploy-private-endpoints'
  params: {
    location: location
    vnetId: network.outputs.vnetId
    peSubnetId: network.outputs.peSubnetId
    foundryId: ai.outputs.foundryId
    searchId: ai.outputs.searchId
    storageId: storage.outputs.storageId
    keyVaultId: keyVault.outputs.keyVaultId
  }
}

// ── Outputs ───────────────────────────────────────────────
output vnetId string = network.outputs.vnetId
output foundryEndpoint string = ai.outputs.foundryEndpoint
output searchEndpoint string = ai.outputs.searchEndpoint
```
|
||||
|
||||
## main.bicepparam Base Structure
|
||||
|
||||
```bicep
|
||||
using './main.bicep'
|
||||
|
||||
param location = '<Location confirmed in Phase 1>'
|
||||
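param projectPrefix = '<Project prefix>'
param vnetAddressPrefix = '<VNet CIDR confirmed with user>'
param peSubnetPrefix = '<PE subnet CIDR within the VNet>'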
|
||||
// Do not put sensitive values here — use Key Vault references
|
||||
// Set regions after verifying per-service availability in MS Docs
|
||||
```
|
||||
|
||||
### @secure() Parameter Handling
|
||||
|
||||
When a `.bicepparam` file contains a `using` directive, additional `--parameters` flags cannot be used with `az deployment`.
|
||||
Therefore, `@secure()` parameters must follow these rules:
|
||||
|
||||
- **Set a default value when possible**: `@secure() param password string = newGuid()`
|
||||
- **If user input is required for @secure() parameters**: Generate a JSON parameter file (`main.parameters.json`) alongside instead of using `.bicepparam`
|
||||
- **Never do this**: Generate a command that uses `.bicepparam` and `--parameters key=value` simultaneously
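
The rules above can be sketched as a small command builder. This is only an illustration under assumptions: `build_deploy_command` and its arguments are hypothetical names, not part of the skill, and the resource group is a placeholder.

```python
def build_deploy_command(resource_group, secure_overrides=None):
    """Return an `az deployment group create` argument list.

    Hypothetical helper: uses main.bicepparam when no secure user input is
    needed, otherwise switches to main.parameters.json plus key=value pairs.
    """
    cmd = ["az", "deployment", "group", "create",
           "--resource-group", resource_group]
    if not secure_overrides:
        # The .bicepparam file names its template through `using`
        return cmd + ["--parameters", "main.bicepparam"]
    cmd += ["--template-file", "main.bicep",
            "--parameters", "@main.parameters.json"]
    for key, value in secure_overrides.items():
        cmd += ["--parameters", f"{key}={value}"]
    return cmd
```

Either branch may be taken, but the two are never combined, which mirrors the "never do this" rule.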
|
||||
|
||||
## Common Mistake Checklist
|
||||
|
||||
The full checklist is in `references/service-gotchas.md`. Key summary:
|
||||
|
||||
| Item | ❌ Incorrect | ✅ Correct |
|
||||
|------|--------|----------|
|
||||
| ADLS Gen2 | `isHnsEnabled` omitted | `isHnsEnabled: true` |
|
||||
| PE Subnet | Policy not set | `privateEndpointNetworkPolicies: 'Disabled'` |
|
||||
| PE Configuration | PE only created | PE + DNS Zone + VNet Link + DNS Zone Group |
|
||||
| Foundry | `kind: 'OpenAI'` | `kind: 'AIServices'` + `allowProjectManagement: true` |
|
||||
| Foundry | `customSubDomainName` omitted | `customSubDomainName: foundryName` — cannot be changed after creation |
|
||||
| Foundry Project | Not created | Must always be created as a set with the Foundry resource |
|
||||
| Hub Usage | Used for standard AI | Only when explicitly requested by user or ML/open-source models needed |
|
||||
| Public Network | Not configured | `publicNetworkAccess: 'Disabled'` |
|
||||
| Storage Name | Contains hyphens | Lowercase + digits only, `uniqueString()` recommended |
|
||||
| API version | Copied from previous value | Fetch from MS Docs (Dynamic) |
|
||||
| Region | Hardcoded | Parameter + verify availability in MS Docs (Dynamic) |
|
||||
|
||||
## After Generation Is Complete
|
||||
|
||||
When Bicep generation is complete:
|
||||
1. Provide the user with a summary report of the generated file list and each file's role
|
||||
2. Immediately transition to Phase 3 (Bicep Reviewer)
|
||||
3. The reviewer proceeds with automated review and corrections following the `references/bicep-reviewer.md` guidelines
|
||||
144
skills/azure-architecture-autopilot/references/bicep-reviewer.md
Normal file
@@ -0,0 +1,144 @@
|
||||
# Bicep Reviewer Agent
|
||||
|
||||
Reviews generated Bicep code and automatically fixes any issues found.
|
||||
|
||||
## Review Order
|
||||
|
||||
### Step 1: Bicep Compilation (Run First)
|
||||
|
||||
Run actual Bicep compilation **before** the checklist. Do not declare "pass" based on visual inspection alone.
|
||||
|
||||
```powershell
|
||||
az bicep build --file main.bicep 2>&1
|
||||
```
|
||||
|
||||
Collect all WARNINGs and ERRORs from the compilation results. This is the foundational data for the review.
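
Collecting the diagnostics could look like the sketch below. It assumes each diagnostic line has the shape `<file>(<line>,<col>) : <Level> <code>: <message>`; verify that against the actual output of your Bicep CLI version. `collect_diagnostics` is a hypothetical helper name.

```python
import re

# Assumed diagnostic shape: "<file>(<line>,<col>) : <Level> <code>: <message>"
DIAGNOSTIC = re.compile(
    r"(?P<file>[^\s(]+)\((?P<line>\d+),(?P<col>\d+)\)\s*:\s*"
    r"(?P<level>Error|Warning)\s+(?P<code>[\w-]+):\s*(?P<message>.+)"
)


def collect_diagnostics(build_output):
    """Split captured compiler output into errors and warnings."""
    found = {"Error": [], "Warning": []}
    for raw in build_output.splitlines():
        match = DIAGNOSTIC.search(raw)
        if match:
            found[match["level"]].append(match.groupdict())
    return found
```

Lines that do not match the assumed shape are ignored, so the result should be cross-checked against the raw output before declaring a pass.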
|
||||
|
||||
### Step 2: Fix Compilation Errors/Warnings
|
||||
|
||||
Fix issues found in compilation results:
|
||||
- **ERROR** → Must fix and recompile
|
||||
- **WARNING** → Handle according to the criteria below
|
||||
|
||||
**🚨 WARNING Handling Criteria — Do Not Force Unnecessary Fixes:**
|
||||
|
||||
WARNINGs do not block deployment. Attempting to resolve warnings often introduces deployment errors, so use the following criteria:
|
||||
|
||||
| WARNING Type | Action | Reason |
|
||||
|---|---|---|
|
||||
| BCP081 (type not defined) | **Leave as-is** (if API version is the latest confirmed from MS Docs) | Local Bicep CLI type definitions are not yet updated. No impact on deployment |
|
||||
| BCP035 (missing property) | **Judge carefully** — Check MS Docs to verify if the property is actually required; if not, leave as-is | Adding properties can cause deployment failures due to compatibility issues (e.g., computeMode) |
|
||||
| BCP187 (sku/kind type unverified) | **Leave as-is** | Values confirmed from MS Docs will work correctly at deployment |
|
||||
| no-hardcoded-env-urls | **Leave as-is** | DNS Zone names inevitably require hardcoding |
|
||||
|
||||
**Never do the following:**
|
||||
- Downgrade API versions to resolve WARNINGs (maintain latest stable)
|
||||
- Add properties not confirmed in MS Docs to resolve WARNINGs
|
||||
- Force fixes targeting "zero warnings"
|
||||
|
||||
**Principle: Document WARNINGs in review results, but do not fix them if they don't block deployment.**
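
The handling criteria can be expressed as a lookup, sketched here. The action strings and the `"review"` fallback for codes not listed in the table are assumptions made for illustration.

```python
# Actions taken from the warning-handling table above.
# "leave" for BCP081 still presumes the API version was confirmed in MS Docs.
WARNING_ACTIONS = {
    "BCP081": "leave",
    "BCP035": "judge",
    "BCP187": "leave",
    "no-hardcoded-env-urls": "leave",
}


def triage(level, code):
    """Return the action for one diagnostic: errors are always fixed."""
    if level == "Error":
        return "fix"
    # Codes missing from the table fall back to manual review (assumption)
    return WARNING_ACTIONS.get(code, "review")
```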
|
||||
|
||||
Common issues and responses:
|
||||
- BCP081 (type not defined) → Verify the API version in MS Docs first. If it is wrong, update to the actual latest stable version; if it is already the confirmed latest, leave the warning as-is per the table above
|
||||
- BCP036 (type mismatch) → Check property value casing and type, then fix
|
||||
- BCP037 (property not allowed) → Check MS Docs to verify if the property is supported in that API version
|
||||
- no-hardcoded-env-urls → Hardcoded URLs in DNS Zone names etc. are sometimes unavoidable in Bicep. Note in review results
|
||||
|
||||
### Step 3: Checklist Review
|
||||
|
||||
Review the following items after compilation passes. See `references/service-gotchas.md` for full gotchas.
|
||||
|
||||
#### Critical (Must Fix)
|
||||
- [ ] Microsoft Foundry `customSubDomainName` setting exists — **Cannot be changed after creation; if missing, resource must be deleted and recreated**
|
||||
- [ ] When using Microsoft Foundry, **Foundry Project (`accounts/projects`) must exist** — Portal access unavailable without it
|
||||
- [ ] Microsoft Foundry `identity: { type: 'SystemAssigned' }` — Project creation fails without it
|
||||
- [ ] `publicNetworkAccess: 'Disabled'` — All services using PE
|
||||
- [ ] ADLS Gen2 `isHnsEnabled: true` — Without it, becomes regular Blob Storage
|
||||
- [ ] pe-subnet `privateEndpointNetworkPolicies: 'Disabled'` — PE creation fails without it
|
||||
- [ ] Private DNS Zone Group — Must exist for every PE
|
||||
- [ ] Key Vault `enablePurgeProtection: true`
|
||||
|
||||
#### High (Recommended Fix)
|
||||
- [ ] Storage `allowBlobPublicAccess: false`, `minimumTlsVersion: 'TLS1_2'`
|
||||
- [ ] Private DNS Zone VNet Link `registrationEnabled: false`
|
||||
- [ ] Resource types and kind values per service match `references/ai-data.md` or MS Docs
|
||||
- [ ] Model deployments: Order guaranteed (`dependsOn`)
|
||||
- [ ] No sensitive values in parameter files — **Remove immediately if found**
|
||||
|
||||
#### Medium (Recommended)
|
||||
- [ ] Resource name collision prevention using `uniqueString()`
|
||||
- [ ] Leverage implicit dependencies through resource references
|
||||
|
||||
### Step 4: Hardcoding Regression Check (Prevent Dynamic Information Leakage)
|
||||
|
||||
Verify the following items are not hardcoded as literal values in the Bicep code:
|
||||
|
||||
#### Must Be Parameterized (No Hardcoding)
|
||||
- [ ] `location` — Literal region names (`'eastus'`, `'koreacentral'`, etc.) are not used directly; passed via `param location`
|
||||
- [ ] Model name/version — Not literals; use values confirmed in Phase 1 and validated for availability in Step 0
|
||||
- [ ] SKU — Use values confirmed with the user
|
||||
|
||||
#### Verify Dynamic Values Have Not Regressed Into References
|
||||
This is not directly within this review's scope, but if specific API versions, SKU lists, or region lists are hardcoded in code comments or parameter descriptions, remove them and replace with "Check MS Docs" guidance.
|
||||
|
||||
#### Decision Rule Violation Check
|
||||
- [ ] If `kind: 'OpenAI'` is used instead of Foundry → Change to `kind: 'AIServices'` unless the user explicitly requested it
|
||||
- [ ] If Hub (`MachineLearningServices`) is used for general AI/RAG → Change to Foundry unless the user explicitly requested it
|
||||
- [ ] If a standalone Azure OpenAI resource is used → Suggest reviewing Foundry usage unless the user explicitly requested it or Docs indicate it's necessary
|
||||
|
||||
### Step 5: Recompile After Fixes
|
||||
|
||||
If any changes were made in Steps 2–4, run `az bicep build` again to verify no new errors were introduced.
|
||||
|
||||
### Limitations of `az bicep build`
|
||||
|
||||
Compilation only validates syntax and types. The following items cannot be caught by compilation and are finally verified in Phase 4's `az deployment group what-if`:
|
||||
- Retired/unavailable SKU
|
||||
- Per-region service availability
|
||||
- Model name validity
|
||||
- Preview-only properties
|
||||
- Service policy changes (quota, capacity, etc.)
|
||||
|
||||
State these limitations in the review results so the user understands the importance of the what-if step.
|
||||
|
||||
### Step 6: Report Results
|
||||
|
||||
```markdown
|
||||
## Bicep Code Review Results
|
||||
|
||||
**Compilation Result**: [PASS/WARNING N items]
|
||||
**Checklist**: ✅ Passed X items / ⚠️ Warnings X items
|
||||
**Hardcoding Check**: [PASS / N violations]
|
||||
**Auto-fixed**: X items
|
||||
|
||||
### Compilation Warnings (Remaining)
|
||||
- [Warning content — including reason why it cannot be fixed]
|
||||
|
||||
### Auto-fix Details
|
||||
- [File:line number] Before → After (reason)
|
||||
|
||||
### Hardcoding Violations (If Any)
|
||||
- [File:line number] [Violation details] → [Fix method]
|
||||
|
||||
**Conclusion**: [Ready for deployment / Manual review required]
|
||||
```
|
||||
|
||||
### Step 7: Phase 4 Transition — Reassurance Message Required
|
||||
|
||||
When asking whether to proceed to Phase 4 after passing code review, **always include a message to reassure the user**.
|
||||
Users may feel uneasy about the word "deployment", so clearly communicate that what-if is a safe validation step.
|
||||
|
||||
```
|
||||
ask_user({
|
||||
question: "Code review passed! Ready to proceed to the next step?\n\n⚡ This does NOT deploy immediately:\n 1️⃣ What-if validation — Simulates what will be created (not a deployment, safe)\n 2️⃣ Preview diagram — Review the architecture to be deployed as a diagram\n 3️⃣ Final confirmation — Actual deployment only after you review the diagram and approve\n\nNothing will be deployed without your approval.",
|
||||
choices: [
|
||||
"Proceed to next step (what-if validation + preview diagram) (Recommended)",
|
||||
"Just give me the code, I'll deploy later"
|
||||
]
|
||||
})
|
||||
```
|
||||
|
||||
**Key points:**
|
||||
- Always state "This does NOT deploy immediately"
|
||||
- Explain the 3-step process: what-if → preview diagram → final confirmation
|
||||
- Reassure with "Nothing will be deployed without your approval"
|
||||
475
skills/azure-architecture-autopilot/references/phase0-scanner.md
Normal file
@@ -0,0 +1,475 @@
|
||||
# Phase 0: Existing Resource Scanner
|
||||
|
||||
This file contains the detailed instructions for Phase 0. When the user requests analysis of existing Azure resources (Path B), read and follow this file.
|
||||
|
||||
Scan results are visualized as an architecture diagram, and subsequent natural-language modification requests from the user are routed to Phase 1.
|
||||
|
||||
> **🚨 Output Storage Path Rule**: All outputs (scan JSON, diagram HTML, Bicep code) must be saved in **a project folder under the current working directory (cwd)**. NEVER save them inside `~/.copilot/session-state/`. The session-state directory is a temporary space and may be deleted when the session ends.
|
||||
|
||||
---
|
||||
|
||||
## Step 1: Azure Login + Scan Scope Selection
|
||||
|
||||
### 1-A: Verify Azure Login
|
||||
|
||||
```powershell
|
||||
az account show 2>&1
|
||||
```
|
||||
|
||||
- If logged in → Proceed to Step 1-B
|
||||
- If not logged in → Ask the user to run `az login`
|
||||
|
||||
### 1-B: Subscription Selection (Multiple Selection Supported)
|
||||
|
||||
```powershell
|
||||
az account list --output json
|
||||
```
|
||||
|
||||
Present the subscription list as `ask_user` choices. **Multiple subscriptions can be selected:**
|
||||
```
|
||||
ask_user({
|
||||
question: "Please select the Azure subscription(s) to analyze. (To select several, add them one at a time.)",
|
||||
choices: [
|
||||
"sub-002 (Current default subscription) (Recommended)",
|
||||
"sub-001",
|
||||
"Analyze all subscriptions above"
|
||||
]
|
||||
})
|
||||
```
|
||||
|
||||
- Single subscription selected → Scan only that subscription
|
||||
- "Analyze all" selected → Scan all subscriptions
|
||||
- If the user wants additional subscriptions → Use ask_user again to add more
|
||||
|
||||
### 1-C: Scan Scope Selection (Multiple RG Selection Supported)
|
||||
|
||||
```
|
||||
ask_user({
|
||||
question: "What scope of Azure resources would you like to analyze?",
|
||||
choices: [
|
||||
"Specify a particular resource group (Recommended)",
|
||||
"Select multiple resource groups",
|
||||
"All resource groups in the current subscription"
|
||||
]
|
||||
})
|
||||
```
|
||||
|
||||
- **Specific RG** → Select from the RG list or enter manually
|
||||
- **Multiple RGs** → Repeat ask_user to add RGs one at a time. Stop when the user says "that's enough."
|
||||
Alternatively, the user can enter multiple RGs separated by commas (e.g., `rg-prod, rg-dev, rg-network`)
|
||||
- **Entire subscription** → `az group list` → Scan all RGs (if there are many resources, warn the user that the scan may take time)
|
||||
|
||||
**Combining multiple subscriptions + multiple RGs is supported:**
|
||||
- rg-prod from subscription A + rg-network from subscription B → Scan both and display in a single diagram
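
For the comma-separated input form, a minimal parsing sketch follows. `parse_resource_groups` is a hypothetical name; dropping duplicates while keeping the user's order is a design assumption.

```python
def parse_resource_groups(answer):
    """Split a comma-separated answer into unique RG names, keeping order."""
    names = []
    for part in answer.split(","):
        name = part.strip()
        if name and name not in names:
            names.append(name)
    return names
```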
|
||||
|
||||
---
|
||||
|
||||
## Diagram Hierarchy — Displaying Multiple Subscriptions/RGs
|
||||
|
||||
**Single subscription + single RG**: Same as before (VNet boundary only)
|
||||
**Multiple RGs (same subscription)**: Dashed boundary per RG
|
||||
**Multiple subscriptions**: Two-level boundary of Subscription > RG
|
||||
|
||||
Pass hierarchy information in the diagram JSON:
|
||||
|
||||
**Add `subscription` and `resourceGroup` fields to the services JSON:**
|
||||
```json
|
||||
{
|
||||
"id": "foundry",
|
||||
"name": "foundry-xxx",
|
||||
"type": "ai_foundry",
|
||||
"subscription": "sub-002",
|
||||
"resourceGroup": "rg-prod",
|
||||
"details": [...]
|
||||
}
|
||||
```
|
||||
|
||||
**Pass hierarchy information via the `--hierarchy` parameter:**
|
||||
```
|
||||
--hierarchy '[{"subscription":"sub-002","resourceGroups":["rg-prod","rg-dev"]},{"subscription":"sub-001","resourceGroups":["rg-network"]}]'
|
||||
```
|
||||
|
||||
Based on this information, the diagram script will:
|
||||
- Multiple RGs → Represent each RG as a cluster with a dashed boundary (label: RG name)
|
||||
- Multiple subscriptions → Nest RG boundaries inside larger subscription boundaries
|
||||
- VNet boundaries are displayed inside the RG to which the VNet belongs
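
One way to derive the `--hierarchy` value from the services list is sketched below. `build_hierarchy` is a hypothetical helper, and the sample services are illustrative values only.

```python
import json


def build_hierarchy(services):
    """Group services by subscription and list each subscription's RGs once."""
    grouped = {}
    for svc in services:
        rgs = grouped.setdefault(svc["subscription"], [])
        if svc["resourceGroup"] not in rgs:
            rgs.append(svc["resourceGroup"])
    return [{"subscription": sub, "resourceGroups": rgs}
            for sub, rgs in grouped.items()]


# Illustrative input: only the two fields used for grouping matter here
services = [
    {"id": "foundry", "subscription": "sub-002", "resourceGroup": "rg-prod"},
    {"id": "vm", "subscription": "sub-002", "resourceGroup": "rg-dev"},
    {"id": "hubvnet", "subscription": "sub-001", "resourceGroup": "rg-network"},
]
# Compact separators reproduce the single-line form shown above
hierarchy_arg = json.dumps(build_hierarchy(services), separators=(",", ":"))
```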
|
||||
|
||||
---
|
||||
|
||||
## Step 2: Resource Scan
|
||||
|
||||
**🚨 az CLI Output Principles:**
|
||||
- az CLI output must **always be saved to a file** and then read with `view`. Direct terminal output may be truncated.
|
||||
- Bundle **no more than 3 az commands** per PowerShell call. Bundling too many may cause timeouts.
|
||||
- Use `--query` JMESPath to extract only the required fields and reduce output size.
|
||||
|
||||
```powershell
|
||||
# ✅ Correct approach — Save to file then read
|
||||
az resource list -g "<RG>" --query "[].{name:name,type:type,kind:kind,location:location}" -o json | Set-Content -Path "$outDir/resources.json"
|
||||
|
||||
# ❌ Wrong approach — Direct terminal output (may be truncated)
|
||||
az resource list -g "<RG>" -o json
|
||||
```
|
||||
|
||||
### 2-A: List All Resources + Display to User
|
||||
|
||||
```powershell
|
||||
$outDir = "<project-name>/azure-scan"
|
||||
New-Item -ItemType Directory -Path $outDir -Force | Out-Null
|
||||
|
||||
# Step 1: Basic resource list (name, type, kind, location)
|
||||
az resource list -g "<RG>" --query "[].{name:name,type:type,kind:kind,location:location,id:id}" -o json | Set-Content "$outDir/resources.json"
|
||||
```
|
||||
|
||||
**🚨 Immediately after reading resources.json, you MUST display the full resource list table to the user:**
|
||||
|
||||
```
|
||||
📋 rg-<RG> Resource List (N resources)
|
||||
|
||||
┌─────────────────────────┬──────────────────────────────────────────────┬─────────────────┐
|
||||
│ Name │ Type │ Location │
|
||||
├─────────────────────────┼──────────────────────────────────────────────┼─────────────────┤
|
||||
│ my-storage │ Microsoft.Storage/storageAccounts │ koreacentral │
|
||||
│ my-keyvault │ Microsoft.KeyVault/vaults │ koreacentral │
|
||||
│ ... │ ... │ ... │
|
||||
└─────────────────────────┴──────────────────────────────────────────────┴─────────────────┘
|
||||
|
||||
⏳ Retrieving detailed information...
|
||||
```
|
||||
|
||||
Display this table **first** before proceeding to detailed queries. Do not make the user wait without knowing what resources exist.
|
||||
|
||||
### 2-B: Dynamic Detailed Query — Based on resources.json
|
||||
|
||||
**Dynamically determine detailed query commands based on the resource types found in resources.json.**
|
||||
|
||||
Do not use a hardcoded command list. Only execute commands for types that exist in resources.json, selected from the mapping table below.
|
||||
|
||||
**Type → Detailed Query Command Mapping:**
|
||||
|
||||
| Type in resources.json | Detailed Query Command | Output File |
|
||||
|---|---|---|
|
||||
| `Microsoft.Network/virtualNetworks` | `az network vnet list -g "<RG>" --query "[].{name:name,addressSpace:addressSpace.addressPrefixes,subnets:subnets[].{name:name,prefix:addressPrefix,pePolicy:privateEndpointNetworkPolicies}}" -o json` | `vnets.json` |
|
||||
| `Microsoft.Network/privateEndpoints` | `az network private-endpoint list -g "<RG>" --query "[].{name:name,subnetId:subnet.id,targetId:privateLinkServiceConnections[0].privateLinkServiceId,groupIds:privateLinkServiceConnections[0].groupIds,state:provisioningState}" -o json` | `pe.json` |
|
||||
| `Microsoft.Network/networkSecurityGroups` | `az network nsg list -g "<RG>" --query "[].{name:name,location:location,subnets:subnets[].id,nics:networkInterfaces[].id}" -o json` | `nsg.json` |
|
||||
| `Microsoft.CognitiveServices/accounts` | `az cognitiveservices account list -g "<RG>" --query "[].{name:name,kind:kind,sku:sku.name,endpoint:properties.endpoint,publicAccess:properties.publicNetworkAccess,location:location}" -o json` | `cognitive.json` |
|
||||
| `Microsoft.Search/searchServices` | `az search service list -g "<RG>" --query "[].{name:name,sku:sku.name,publicAccess:properties.publicNetworkAccess,semanticSearch:properties.semanticSearch,location:location}" -o json 2>$null` | `search.json` |
|
||||
| `Microsoft.Compute/virtualMachines` | `az vm list -g "<RG>" --query "[].{name:name,size:hardwareProfile.vmSize,os:storageProfile.osDisk.osType,location:location,nicIds:networkProfile.networkInterfaces[].id}" -o json` | `vms.json` |
|
||||
| `Microsoft.Storage/storageAccounts` | `az storage account list -g "<RG>" --query "[].{name:name,sku:sku.name,kind:kind,hns:properties.isHnsEnabled,publicAccess:properties.publicNetworkAccess,location:location}" -o json` | `storage.json` |
|
||||
| `Microsoft.KeyVault/vaults` | `az keyvault list -g "<RG>" --query "[].{name:name,location:location}" -o json 2>$null` | `keyvault.json` |
|
||||
| `Microsoft.ContainerService/managedClusters` | `az aks list -g "<RG>" --query "[].{name:name,kubernetesVersion:kubernetesVersion,sku:sku,agentPoolProfiles:agentPoolProfiles[].{name:name,count:count,vmSize:vmSize},networkProfile:networkProfile.networkPlugin,location:location}" -o json` | `aks.json` |
|
||||
| `Microsoft.Web/sites` | `az webapp list -g "<RG>" --query "[].{name:name,kind:kind,sku:appServicePlan,state:state,defaultHostName:defaultHostName,httpsOnly:httpsOnly,location:location}" -o json` | `webapps.json` |
|
||||
| `Microsoft.Web/serverFarms` | `az appservice plan list -g "<RG>" --query "[].{name:name,sku:sku.name,tier:sku.tier,kind:kind,location:location}" -o json` | `appservice-plans.json` |
|
||||
| `Microsoft.DocumentDB/databaseAccounts` | `az cosmosdb list -g "<RG>" --query "[].{name:name,kind:kind,databaseAccountOfferType:databaseAccountOfferType,locations:locations[].locationName,publicAccess:publicNetworkAccess}" -o json` | `cosmosdb.json` |
|
||||
| `Microsoft.Sql/servers` | `az sql server list -g "<RG>" --query "[].{name:name,fullyQualifiedDomainName:fullyQualifiedDomainName,publicAccess:publicNetworkAccess,location:location}" -o json` | `sql-servers.json` |
|
||||
| `Microsoft.Databricks/workspaces` | `az databricks workspace list -g "<RG>" --query "[].{name:name,sku:sku.name,url:workspaceUrl,publicAccess:parameters.enableNoPublicIp.value,location:location}" -o json 2>$null` | `databricks.json` |
|
||||
| `Microsoft.Synapse/workspaces` | `az synapse workspace list -g "<RG>" --query "[].{name:name,sqlAdminLogin:sqlAdministratorLogin,publicAccess:publicNetworkAccess,location:location}" -o json 2>$null` | `synapse.json` |
|
||||
| `Microsoft.DataFactory/factories` | `az datafactory list -g "<RG>" --query "[].{name:name,publicAccess:publicNetworkAccess,location:location}" -o json 2>$null` | `adf.json` |
|
||||
| `Microsoft.EventHub/namespaces` | `az eventhubs namespace list -g "<RG>" --query "[].{name:name,sku:sku.name,location:location}" -o json` | `eventhub.json` |
|
||||
| `Microsoft.Cache/redis` | `az redis list -g "<RG>" --query "[].{name:name,sku:sku.name,port:port,sslPort:sslPort,publicAccess:publicNetworkAccess,location:location}" -o json` | `redis.json` |
|
||||
| `Microsoft.ContainerRegistry/registries` | `az acr list -g "<RG>" --query "[].{name:name,sku:sku.name,adminUserEnabled:adminUserEnabled,publicAccess:publicNetworkAccess,location:location}" -o json` | `acr.json` |
|
||||
| `Microsoft.MachineLearningServices/workspaces` | `az resource show --ids "<ID>" --query "{name:name,sku:sku,kind:kind,location:location,publicAccess:properties.publicNetworkAccess,hbiWorkspace:properties.hbiWorkspace,managedNetwork:properties.managedNetwork.isolationMode}" -o json` | `mlworkspace.json` |
|
||||
| `Microsoft.Insights/components` | `az monitor app-insights component show -g "<RG>" --app "<NAME>" --query "{name:name,kind:kind,instrumentationKey:instrumentationKey,workspaceResourceId:workspaceResourceId,location:location}" -o json 2>$null` | `appinsights-<NAME>.json` |
|
||||
| `Microsoft.OperationalInsights/workspaces` | `az monitor log-analytics workspace show -g "<RG>" -n "<NAME>" --query "{name:name,sku:sku.name,retentionInDays:retentionInDays,location:location}" -o json` | `log-analytics-<NAME>.json` |
|
||||
| `Microsoft.Network/applicationGateways` | `az network application-gateway list -g "<RG>" --query "[].{name:name,sku:sku,location:location}" -o json` | `appgateway.json` |
|
||||
| `Microsoft.Cdn/profiles` / `Microsoft.Network/frontDoors` | `az afd profile list -g "<RG>" --query "[].{name:name,sku:sku.name,location:location}" -o json 2>$null` | `frontdoor.json` |
|
||||
| `Microsoft.Network/azureFirewalls` | `az network firewall list -g "<RG>" --query "[].{name:name,sku:sku,threatIntelMode:threatIntelMode,location:location}" -o json` | `firewall.json` |
|
||||
| `Microsoft.Network/bastionHosts` | `az network bastion list -g "<RG>" --query "[].{name:name,sku:sku.name,location:location}" -o json` | `bastion.json` |
|
||||
|
||||
**Dynamic Query Process:**
|
||||
|
||||
1. Read `resources.json`
|
||||
2. Extract the distinct values of the `type` field
|
||||
3. Execute **only the commands for matching types** from the mapping table above (skip types not present)
|
||||
4. If a type not in the mapping table is found → Use generic query: `az resource show --ids "<ID>" --query "{name:name,sku:sku,kind:kind,location:location,properties:properties}" -o json`
|
||||
5. Execute commands in batches of 2-3 (do not run all at once)
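
The process above can be sketched as a planning function. Only two table rows are reproduced, the commands are shortened to their base form, and `plan_queries` is a hypothetical name; the sketch decides what to run and in which batches, it does not run anything.

```python
# Two entries copied from the mapping table; the real table has many more rows
TYPE_COMMANDS = {
    "Microsoft.Storage/storageAccounts": "az storage account list",
    "Microsoft.KeyVault/vaults": "az keyvault list",
}
GENERIC = "az resource show"


def plan_queries(resources, batch_size=3):
    """Return batches of (command, target) pairs for the types present."""
    planned = []
    seen_types = set()
    for res in resources:
        command = TYPE_COMMANDS.get(res["type"])
        if command is None:
            # Unmapped type: one generic query per resource ID
            planned.append((GENERIC, res["id"]))
        elif res["type"] not in seen_types:
            # Mapped type: one list query covers every resource of that type
            seen_types.add(res["type"])
            planned.append((command, res["type"]))
    return [planned[i:i + batch_size]
            for i in range(0, len(planned), batch_size)]
```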
|
||||
|
||||
### 2-C: Model Deployment Query (When Cognitive Services Exist)
|
||||
|
||||
```powershell
|
||||
# Query model deployments for each Cognitive Services resource
|
||||
az cognitiveservices account deployment list --name "<NAME>" -g "<RG>" --query "[].{name:name,model:properties.model.name,version:properties.model.version,sku:sku.name}" -o json | Set-Content "$outDir/<NAME>-deployments.json"
|
||||
```
|
||||
|
||||
### 2-D: NIC + Public IP Query (When VMs Exist)
|
||||
|
||||
```powershell
|
||||
az network nic list -g "<RG>" --query "[].{name:name,subnetId:ipConfigurations[0].subnet.id,privateIp:ipConfigurations[0].privateIPAddress,publicIpId:ipConfigurations[0].publicIPAddress.id}" -o json | Set-Content "$outDir/nics.json"
|
||||
az network public-ip list -g "<RG>" --query "[].{name:name,ip:ipAddress,sku:sku.name}" -o json | Set-Content "$outDir/public-ips.json"
|
||||
```
|
||||
|
||||
From the VNet query results (`vnets.json`), extract:
|
||||
- `addressSpace.addressPrefixes` → CIDR
|
||||
- `subnets[].name`, `subnets[].addressPrefix` → Subnet information
|
||||
- `subnets[].privateEndpointNetworkPolicies` → PE policies
|
||||
|
||||
---
|
||||
|
||||
## Step 3: Inferring Relationships Between Resources
|
||||
|
||||
Automatically infer **relationships (connections)** between scanned resources to construct the connections JSON for the diagram.
|
||||
|
||||
### Relationship Inference Rules
|
||||
|
||||
**🚨 If there are insufficient connection lines, the diagram becomes meaningless. Infer as many relationships as possible.**
|
||||
|
||||
#### Confirmed Inference (Directly verifiable from resource IDs/properties)
|
||||
|
||||
| Relationship Type | Inference Method | connection type |
|
||||
|---|---|---|
|
||||
| PE → Service | Extract service ID from PE's `privateLinkServiceId` | `private` |
|
||||
| PE → VNet | Extract VNet from PE's `subnet.id` | (Represented as VNet boundary) |
|
||||
| Foundry → Project | Parent resource of `accounts/projects` | `api` |
|
||||
| VM → NIC → Subnet | Infer VNet/Subnet from NIC's `subnet.id` | (VNet boundary) |
|
||||
| NSG → Subnet | Check connected subnets from NSG's `subnets[].id` | `network` |
|
||||
| NSG → NIC | Check connected VMs from NSG's `networkInterfaces[].id` | `network` |
|
||||
| NIC → Public IP | Check PIP from NIC's `publicIPAddress.id` | (Included in details) |
|
||||
| Databricks → VNet | Workspace's VNet injection configuration | (VNet boundary) |
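
Most confirmed inferences come down to reading segments out of ARM resource IDs. The sketch below resolves a PE record (fields as produced by the `pe.json` query) to its target service and VNet. The helper names and the returned dictionary keys are hypothetical.

```python
def parse_resource_id(resource_id):
    """Break an ARM resource ID into subscription, RG, and type/name segments."""
    parts = resource_id.strip("/").split("/")
    keys = [p.lower() for p in parts]
    info = {
        "subscription": parts[keys.index("subscriptions") + 1],
        "resourceGroup": parts[keys.index("resourcegroups") + 1],
    }
    # Skip "providers" and the namespace; the rest alternates type, name
    tail = parts[keys.index("providers") + 2:]
    info["segments"] = dict(zip(tail[0::2], tail[1::2]))
    info["name"] = tail[-1]
    return info


def resolve_pe(pe):
    """Resolve one PE record to its target service, VNet, and subnet."""
    target = parse_resource_id(pe["targetId"])
    subnet = parse_resource_id(pe["subnetId"])
    return {
        "pe": pe["name"],
        "service": target["name"],
        "vnet": subnet["segments"]["virtualNetworks"],
        "subnet": subnet["name"],
    }
```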
|
||||
|
||||
#### Reasonable Inference (Common patterns between services within the same RG)
|
||||
|
||||
| Relationship Type | Inference Condition | connection type |
|
||||
|---|---|---|
|
||||
| Foundry → AI Search | Both exist in the same RG → Infer RAG connection | `api` (label: "RAG Search") |
|
||||
| Foundry → Storage | Both exist in the same RG → Infer data connection | `data` (label: "Data") |
|
||||
| AI Search → Storage | Both exist in the same RG → Infer indexing connection | `data` (label: "Indexing") |
|
||||
| Service → Key Vault | Key Vault exists in the same RG → Infer secret management | `security` (label: "Secrets") |
|
||||
| VM → Foundry/Search | VM + AI services exist in the same RG → Infer API calls | `api` (label: "API") |
|
||||
| DI → Foundry | Document Intelligence + Foundry exist in the same RG → Infer OCR/extraction connection | `api` (label: "OCR/Extract") |
|
||||
| ADF → Storage | ADF + Storage exist in the same RG → Infer data pipeline | `data` (label: "Pipeline") |
|
||||
| ADF → SQL | ADF + SQL exist in the same RG → Infer data source | `data` (label: "Source") |
|
||||
| Databricks → Storage | Both exist in the same RG → Infer data lake connection | `data` (label: "Data Lake") |
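
The co-location rules can be sketched as a rule list applied to every pair of services in the same RG. The rule subset, the `from`/`to` field names of a connection, and `infer_connections` itself are assumptions for illustration; the type names are the diagram types used by this skill.

```python
# (source type, target type, connection type, label), following the table above
RULES = [
    ("ai_foundry", "search", "api", "RAG Search"),
    ("ai_foundry", "storage", "data", "Data"),
    ("search", "storage", "data", "Indexing"),
    ("document_intelligence", "ai_foundry", "api", "OCR/Extract"),
    ("adf", "storage", "data", "Pipeline"),
    ("adf", "sql_server", "data", "Source"),
    ("databricks", "storage", "data", "Data Lake"),
]


def infer_connections(services):
    """Propose connections between services that share a resource group."""
    proposals = []
    for src in services:
        for dst in services:
            if src["resourceGroup"] != dst["resourceGroup"]:
                continue
            for src_type, dst_type, conn_type, label in RULES:
                if src["type"] == src_type and dst["type"] == dst_type:
                    proposals.append({"from": src["id"], "to": dst["id"],
                                      "type": conn_type, "label": label})
    return proposals
```

The output is a list of proposals only; the user still confirms or edits them in the next step.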
|
||||
|
||||
#### User Confirmation After Inference
|
||||
|
||||
Show the inferred connection list to the user and request confirmation:
|
||||
```
|
||||
> **⏳ Relationships between resources have been inferred** — Please verify if the following are correct.
|
||||
|
||||
Inferred connections:
|
||||
- Foundry → AI Search (RAG Search)
|
||||
- Foundry → Storage (Data)
|
||||
- VM → Foundry (API Call)
|
||||
- Document Intelligence → Foundry (OCR/Extract)
|
||||
|
||||
Does this look correct? Let me know if you'd like to add or remove any connections.
|
||||
```
|
||||
|
||||
#### Relationships That Cannot Be Inferred
|
||||
|
||||
There may be connections that cannot be inferred using the rules above. The user can freely add additional connections.
|
||||
|
||||
### Model Deployment Query (When Foundry Resources Exist)
|
||||
|
||||
```powershell
|
||||
az cognitiveservices account deployment list --name "<FOUNDRY_NAME>" -g "<RG>" --query "[].{name:name,model:properties.model.name,version:properties.model.version,sku:sku.name}" -o json
|
||||
```
|
||||
|
||||
Add each deployment's model name, version, and SKU to the Foundry node's details.
|
||||
|
||||
---
|
||||
|
||||
## Step 4: services/connections JSON Conversion
|
||||
|
||||
Convert scan results into the input format for the built-in diagram engine.
|
||||
|
||||
### Resource Type → Diagram type Mapping
|
||||
|
||||
| Azure Resource Type | Diagram type |
|
||||
|---|---|
|
||||
| `Microsoft.CognitiveServices/accounts` (kind: AIServices) | `ai_foundry` |
|
||||
| `Microsoft.CognitiveServices/accounts` (kind: OpenAI) | `openai` |
|
||||
| `Microsoft.CognitiveServices/accounts` (kind: FormRecognizer) | `document_intelligence` |
|
||||
| `Microsoft.CognitiveServices/accounts` (kind: TextAnalytics, etc.) | `ai_foundry` (default) |
|
||||
| `Microsoft.CognitiveServices/accounts/projects` | `ai_foundry` |
|
||||
| `Microsoft.Search/searchServices` | `search` |
|
||||
| `Microsoft.Storage/storageAccounts` | `storage` |
|
||||
| `Microsoft.KeyVault/vaults` | `keyvault` |
|
||||
| `Microsoft.Databricks/workspaces` | `databricks` |
|
||||
| `Microsoft.Sql/servers` | `sql_server` |
|
||||
| `Microsoft.Sql/servers/databases` | `sql_database` |
|
||||
| `Microsoft.DocumentDB/databaseAccounts` | `cosmos_db` |
|
||||
| `Microsoft.Web/sites` (kind other than functionapp) | `app_service` |
|
||||
| `Microsoft.ContainerService/managedClusters` | `aks` |
|
||||
| `Microsoft.Web/sites` (kind: functionapp) | `function_app` |
|
||||
| `Microsoft.Synapse/workspaces` | `synapse` |
|
||||
| `Microsoft.Fabric/capacities` | `fabric` |
|
||||
| `Microsoft.DataFactory/factories` | `adf` |
|
||||
| `Microsoft.Compute/virtualMachines` | `vm` |
|
||||
| `Microsoft.Network/privateEndpoints` | `pe` |
|
||||
| `Microsoft.Network/virtualNetworks` | (Represented as VNet boundary — not included in services) |
|
||||
| `Microsoft.Network/networkSecurityGroups` | `nsg` |
|
||||
| `Microsoft.Network/bastionHosts` | `bastion` |
|
||||
| `Microsoft.OperationalInsights/workspaces` | `log_analytics` |
|
||||
| `Microsoft.Insights/components` | `app_insights` |
|
||||
| Other | `default` |
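
The lookup in this table could be sketched as follows. This is an illustration under assumptions: the function name `diagram_type`, the case-insensitive comparison, and the substring check on `kind` (to tolerate composite values such as `functionapp,linux`) are not specified by the skill, and only a subset of rows is included:

```python
from typing import Optional

# Subset of the mapping table; extend with the remaining rows as needed.
TYPE_MAP = {
    "microsoft.search/searchservices": "search",
    "microsoft.storage/storageaccounts": "storage",
    "microsoft.keyvault/vaults": "keyvault",
    "microsoft.web/sites": "app_service",
    "microsoft.network/privateendpoints": "pe",
}

# CognitiveServices accounts are distinguished by their kind.
COGNITIVE_KINDS = {
    "aiservices": "ai_foundry",
    "openai": "openai",
    "formrecognizer": "document_intelligence",
}


def diagram_type(resource_type: str, kind: str = "") -> Optional[str]:
    """Return the diagram type, or None for resources that are not service nodes."""
    rtype = resource_type.lower()
    kind_l = (kind or "").lower()
    if rtype == "microsoft.network/virtualnetworks":
        return None  # drawn as the VNet boundary, not included in services
    if rtype == "microsoft.cognitiveservices/accounts":
        return COGNITIVE_KINDS.get(kind_l, "ai_foundry")  # other kinds fall back to ai_foundry
    if rtype == "microsoft.web/sites" and "functionapp" in kind_l:
        return "function_app"
    return TYPE_MAP.get(rtype, "default")


print(diagram_type("Microsoft.Web/sites", "functionapp,linux"))
```

Checking `kind` before the plain type lookup matters for `Microsoft.Web/sites`, which maps to two different diagram types.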
|
||||
|
||||
### services JSON Construction Rules
|
||||
|
||||
```json
{
  "id": "resource name (lowercase, special characters removed)",
  "name": "actual resource name",
  "type": "determined from the mapping table above",
  "sku": "actual SKU (if available)",
  "private": true/false,  // true if a PE is connected
  "details": ["property1", "property2", ...]
}
```
|
||||
|
||||
**Information to include in details:**
|
||||
- Endpoint URL
|
||||
- SKU/tier details
|
||||
- kind (AIServices, OpenAI, etc.)
|
||||
- Model deployment list (Foundry)
|
||||
- Key properties (isHnsEnabled, semanticSearch, etc.)
|
||||
- Region
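
The construction rules above can be sketched as a small helper. The function name `make_service_node` is hypothetical, and the input shape (a resource dict with `name`, `kind`, `location`, and `sku.name`, as commonly returned by `az resource show`) is an assumption; endpoint URLs and service-specific properties would be appended the same way:

```python
import re


def make_service_node(resource: dict, diagram_type: str, has_private_endpoint: bool) -> dict:
    """Build one services entry from a scanned resource (sketch, not the skill's own code)."""
    name = resource["name"]
    node = {
        # id rule: lowercase, special characters removed
        "id": re.sub(r"[^a-z0-9]", "", name.lower()),
        "name": name,
        "type": diagram_type,
        "private": has_private_endpoint,  # true if a PE is connected
        "details": [],
    }
    sku = (resource.get("sku") or {}).get("name")
    if sku:
        node["sku"] = sku
    if resource.get("kind"):
        node["details"].append(f"kind: {resource['kind']}")
    if resource.get("location"):
        node["details"].append(f"Region: {resource['location']}")
    return node


# Hypothetical resource, for illustration only.
example = {"name": "My-Search_01", "location": "koreacentral", "sku": {"name": "standard"}}
print(make_service_node(example, "search", True))
```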
|
||||
|
||||
### VNet Information → `--vnet-info` Parameter
|
||||
|
||||
If a VNet is found, display it in the boundary label via `--vnet-info`:
|
||||
```
|
||||
--vnet-info "10.0.0.0/16 | pe-subnet: 10.0.1.0/24 | <region>"
|
||||
```
|
||||
|
||||
### PE Node Generation
|
||||
|
||||
If PEs are found, add each PE as a separate node and connect it to the corresponding service with a `private` type:
|
||||
```json
|
||||
{"id": "pe_<serviceId>", "name": "PE: <serviceName>", "type": "pe", "details": ["groupId: <groupId>", "<status>"]}
|
||||
```
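
A sketch of generating the PE node together with its connection. The helper name is hypothetical; the connection keys (`from`, `to`, `label`, `type`) follow the connection format shown in the Phase 1 reference (phase1-advisor.md):

```python
def pe_node_and_connection(service_id: str, service_name: str, group_id: str, status: str):
    """Return (pe_node, connection) for one service that has a Private Endpoint."""
    pe_id = f"pe_{service_id}"
    node = {
        "id": pe_id,
        "name": f"PE: {service_name}",
        "type": "pe",
        "details": [f"groupId: {group_id}", status],
    }
    # The service is linked to its PE with a connection of type "private".
    connection = {"from": service_id, "to": pe_id, "label": "", "type": "private"}
    return node, connection


# Hypothetical values, for illustration only.
node, conn = pe_node_and_connection("mystorage", "mystorage", "blob", "Approved")
print(node)
print(conn)
```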
|
||||
|
||||
---
|
||||
|
||||
## Step 5: Diagram Generation + Presentation to User
|
||||
|
||||
Diagram filename: `<project-name>/00_arch_current.html`
|
||||
|
||||
Use the scanned RG name as the default project name:
|
||||
```
|
||||
ask_user({
|
||||
question: "Please choose a project name. (This will be the folder name for scan results)",
|
||||
choices: ["<RG-name>", "azure-analysis"]
|
||||
})
|
||||
```
|
||||
|
||||
After generating the diagram, report:
|
||||
```
|
||||
## Current Azure Architecture
|
||||
|
||||
[Interactive Diagram — 00_arch_current.html]
|
||||
|
||||
Scanned Resources (N total):
|
||||
[Summary table by resource type]
|
||||
|
||||
What would you like to change here?
|
||||
- 🔧 Performance improvement ("it's slow", "increase throughput")
|
||||
- 💰 Cost optimization ("reduce costs", "make it cheaper")
|
||||
- 🔒 Security hardening ("add PE", "block public access")
|
||||
- 🌐 Network changes ("separate VNet", "add Bastion")
|
||||
- ➕ Add/remove resources ("add a VM", "delete this")
|
||||
- 📊 Monitoring ("set up logs", "add alerts")
|
||||
- 🤔 Diagnostics ("is this architecture OK?", "what's wrong?")
|
||||
- Or just take the diagram and stop here
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Step 6: Modification Conversation → Transition to Phase 1
|
||||
|
||||
When the user requests modifications, transition to Phase 1 (phase1-advisor.md).
|
||||
This is the **Path B entry point**, using the existing scan results as the baseline.
|
||||
|
||||
### Natural Language Modification Request Handling — Clarifying Question Patterns
|
||||
|
||||
Ask clarifying questions to make the user's vague requests more specific:
|
||||
|
||||
**🔧 Performance**
|
||||
|
||||
| User Request | Clarifying Question Example |
|---|---|
| "It's slow" / "Response takes too long" | "Which service is slow? Should we upgrade the SKU or change the region?" |
| "I want to increase throughput" | "Which service's throughput should we increase? Scale out? Increase DTU/RU?" |
| "AI Search indexing is slow" | "Should we add partitions? Upgrade the SKU to S2?" |
|
||||
|
||||
**💰 Cost**
|
||||
|
||||
| User Request | Clarifying Question Example |
|---|---|
| "I want to reduce costs" | "Which service's cost should we reduce? SKU downgrade? Clean up unused resources?" |
| "How much does this cost?" | Look up pricing info from MS Docs and provide estimated cost based on current SKUs |
| "It's a dev environment, so make it cheap" | "Should we switch to Free/Basic tiers? Which services?" |
|
||||
|
||||
**🔒 Security**
|
||||
|
||||
| User Request | Clarifying Question Example |
|---|---|
| "Harden the security" | "Should we add PEs to services that don't have them? Check RBAC? Disable publicNetworkAccess?" |
| "Block public access" | "Should we apply PE + publicNetworkAccess: Disabled to all services?" |
| "Manage the keys" | "Should we add Key Vault and connect it with Managed Identity?" |
|
||||
|
||||
**🌐 Network**
|
||||
|
||||
| User Request | Clarifying Question Example |
|---|---|
| "Add PE" | "To which service? Should we add them to all services at once?" |
| "Separate the VNet" | "Which subnets should we separate? Should we also add NSGs?" |
| "Add Bastion" | "Adding Azure Bastion for VM access. Please specify the subnet CIDR." |
|
||||
|
||||
**➕ Add/Remove Resources**
|
||||
|
||||
| User Request | Clarifying Question Example |
|---|---|
| "Add a VM" | "How many? What SKU? Same VNet? What OS?" |
| "Add Fabric" | "What SKU? What's the admin email?" |
| "Delete this" | "Are you sure you want to remove [resource name]? Connected PEs will also be removed." |
|
||||
|
||||
**📊 Monitoring/Operations**
|
||||
|
||||
| User Request | Clarifying Question Example |
|---|---|
| "I want to see logs" | "Should we add a Log Analytics Workspace and connect Diagnostic Settings?" |
| "Set up alerts" | "For which metrics? CPU? Error rate? Response time?" |
| "Attach Application Insights" | "To which service? App Service? Function App?" |
|
||||
|
||||
**🔄 Migration/Changes**
|
||||
|
||||
| User Request | Clarifying Question Example |
|---|---|
| "Change the region" | "To which region? I'll verify that all services are available in that region." |
| "Switch SQL to Cosmos" | "What Cosmos DB API type? (NoSQL/MongoDB/Cassandra) I can also provide a data migration guide." |
| "Switch Foundry to Hub" | "Hub is suitable only when ML training/open-source models are needed. Let me verify the use case." |
|
||||
|
||||
**🤔 Diagnostics/Questions**
|
||||
|
||||
| User Request | Clarifying Question Example |
|---|---|
| "What's wrong?" | Analyze current configuration (publicNetworkAccess open, PE not connected, inappropriate SKU, etc.) and suggest improvements |
| "Is this architecture OK?" | Review against the Well-Architected Framework (security, reliability, performance, cost, operations) |
| "Is the PE connected properly?" | Check connection status with `az network private-endpoint show` and report |
| "Just give me the diagram" | Do not transition to Phase 1; provide the 00_arch_current.html path and finish |
|
||||
|
||||
Once modifications are finalized:
|
||||
1. Apply Phase 1's Delta Confirmation Rule
|
||||
2. Fact-check (cross-verify with MS Docs)
|
||||
3. Generate updated diagram (01_arch_diagram_draft.html)
|
||||
4. User confirmation → Proceed to Phases 2–4
|
||||
|
||||
---
|
||||
|
||||
## Scan Performance Optimization
|
||||
|
||||
- If there are 50+ resources, warn the user: "There are many resources, so the scan may take some time."
|
||||
- Run `az resource list` first to determine the resource count, then proceed with detailed queries
|
||||
- Query key services first (Foundry, Search, Storage, KeyVault, VNet, PE), then collect only basic information for the rest via `az resource show`
|
||||
- Keep the user informed of progress:
|
||||
> **⏳ Scanning resources** — M of N resources completed
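
The ordering and warning rules above could be sketched like this, assuming the `az resource list` output has already been parsed into a list of dicts with a `type` field. The names `plan_scan`, `KEY_TYPES`, and `LARGE_SCAN_THRESHOLD` are illustrative assumptions:

```python
# Resource types treated as "key services" (Foundry, Search, Storage, KeyVault, VNet, PE).
KEY_TYPES = (
    "microsoft.cognitiveservices/accounts",
    "microsoft.search/searchservices",
    "microsoft.storage/storageaccounts",
    "microsoft.keyvault/vaults",
    "microsoft.network/virtualnetworks",
    "microsoft.network/privateendpoints",
)
LARGE_SCAN_THRESHOLD = 50


def plan_scan(resources: list):
    """Return (warning_or_None, resources ordered with key services first)."""
    warning = None
    if len(resources) >= LARGE_SCAN_THRESHOLD:
        warning = "There are many resources, so the scan may take some time."
    # sorted() is stable, so the original order is kept within each group.
    ordered = sorted(resources, key=lambda r: r["type"].lower() not in KEY_TYPES)
    return warning, ordered


# Hypothetical scan input, for illustration only.
found = [
    {"name": "vm1", "type": "Microsoft.Compute/virtualMachines"},
    {"name": "st1", "type": "Microsoft.Storage/storageAccounts"},
]
warning, ordered = plan_scan(found)
print(warning, [r["name"] for r in ordered])
```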
|
||||
|
||||
---
|
||||
|
||||
## Handling Unsupported Resources
|
||||
|
||||
For resource types not in the diagram type mapping:
|
||||
- Display with `default` type (question mark icon)
|
||||
- Include the resource name and type in details
|
||||
- Show to the user, but do not attempt relationship inference
|
||||
879
skills/azure-architecture-autopilot/references/phase1-advisor.md
Normal file
@@ -0,0 +1,879 @@
|
||||
# Phase 1: Architecture Advisor
|
||||
|
||||
This file contains the detailed instructions for Phase 1. When entering Phase 1 from SKILL.md, read and follow this file.
|
||||
Used in both Path A (new design) and Path B (modification after Phase 0 scan).
|
||||
|
||||
---
|
||||
|
||||
## When Entering from Path B (After Existing Resource Analysis)
|
||||
|
||||
The current architecture diagram (00_arch_current.html) scanned in Phase 0 already exists.
|
||||
In this case, skip the project name/service list confirmation in 1-1 and enter the modification conversation directly:
|
||||
|
||||
1. "What would you like to change here?" — User's natural language request
|
||||
2. Apply Delta Confirmation Rule — Confirm undecided required fields for the changes
|
||||
3. Fact check — Cross-verify with MS Docs
|
||||
4. Generate updated diagram (01_arch_diagram_draft.html)
|
||||
5. Proceed to Phase 2 after confirmation
|
||||
|
||||
---
|
||||
|
||||
**Goal of this Phase**: Accurately identify what the user wants and finalize the architecture together.
|
||||
|
||||
### 1-1. Diagram Preparation — Gathering Required Information
|
||||
|
||||
Before drawing the diagram, ask the user questions until all items below are confirmed.
|
||||
**Generate the diagram only after all items are confirmed.**
|
||||
|
||||
**First, confirm the project name:**
|
||||
|
||||
Provide a default value as a choice via `ask_user`. If the user just presses Enter, the default is applied; they can also type a custom name.
|
||||
The default is inferred from the user's request (e.g., RAG chatbot → `rag-chatbot`, data platform → `data-platform`).
|
||||
|
||||
```
|
||||
ask_user({
|
||||
question: "Please choose a project name. It will be used for the Bicep folder name, diagram path, and deployment name.",
|
||||
choices: ["<inferred-default>", "azure-project"]
|
||||
})
|
||||
```
|
||||
The project name is used for the Bicep output folder name, diagram save path, deployment name, etc.
|
||||
|
||||
**🔹 Parallel Preload Along with Project Name Question (Required):**
|
||||
|
||||
When asking the project name via `ask_user`, there is idle time while waiting for the user to respond.
|
||||
Utilize this time to **preload information needed for subsequent questions and Bicep generation in parallel**.
|
||||
|
||||
**Tools to call simultaneously with ask_user:**
|
||||
|
||||
```
|
||||
// Call ask_user + the tools below simultaneously in a single response
|
||||
[1] ask_user — Project name question
|
||||
|
||||
[2] view — Load reference files (pre-acquire Stable information)
|
||||
- references/service-gotchas.md
|
||||
- references/ai-data.md
|
||||
- references/azure-dynamic-sources.md
|
||||
- references/architecture-guidance-sources.md
|
||||
|
||||
[3] web_fetch — Pre-fetch architecture guidance (when workload type is identified)
|
||||
- Up to 2 targeted fetches based on decision rules in architecture-guidance-sources.md
|
||||
|
||||
[4] web_fetch — Fetch MS Docs for services mentioned by the user (pre-acquire Dynamic information)
|
||||
- e.g., Foundry → API version, model availability page
|
||||
- e.g., AI Search → SKU list page
|
||||
- Use URL patterns from azure-dynamic-sources.md
|
||||
```
|
||||
|
||||
**Benefits**: While the user types the project name, all information is loaded,
|
||||
so SKU/region questions can be presented with accurate choices immediately after the project name is confirmed.
|
||||
Wait time is significantly reduced compared to sequential execution.
|
||||
|
||||
**Notes:**
|
||||
- Preload targets are only information independent of the project name (nothing depends on the name)
|
||||
- web_fetch is performed only for services mentioned in the user's initial request (no guessing)
|
||||
- Azure CLI check (`az account show`) is NOT done at this point — preload at architecture finalization
|
||||
|
||||
**🔹 Utilizing Architecture Guidance (Adjusting Question Depth):**
|
||||
|
||||
Extract **design decision points** from the architecture guidance documents fetched during preload,
|
||||
and naturally incorporate them into subsequent user questions.
|
||||
|
||||
**Purpose**: Go beyond spec questions such as SKU/region, and reflect the **design decision points** recommended by official architecture guidance in the questions.
|
||||
|
||||
**Example — When "RAG chatbot" is requested:**
|
||||
- Fetch Baseline Foundry Chat Architecture (A6)
|
||||
- Extract recommended design decision points from the document:
|
||||
→ Network isolation level (full private vs hybrid?)
|
||||
→ Authentication method (managed identity vs API key?)
|
||||
→ Data ingestion strategy (push vs pull indexing?)
|
||||
→ Monitoring scope (Application Insights needed?)
|
||||
- Naturally include these points in user questions
|
||||
|
||||
**Notes:**
|
||||
- What is extracted from architecture guidance is **"points to ask about"**, not "answers"
|
||||
- Deployment specs like SKU/API version/region are still determined only via `azure-dynamic-sources.md`
|
||||
- Fetch budget: maximum 2 documents. No full traversal
|
||||
|
||||
**Required confirmation items:**
|
||||
- [ ] Project name (default: `azure-project`)
|
||||
- [ ] Service list (which Azure services to use)
|
||||
- [ ] SKU/tier for each service
|
||||
- [ ] Networking method (Private Endpoint usage)
|
||||
- [ ] Deployment location (region)
|
||||
|
||||
**Questioning principles:**
|
||||
- Do not ask again for information the user has already mentioned
|
||||
- Do not ask about detailed implementation specifics not directly represented in the diagram (indexing method, query volume, etc.)
|
||||
- Do not ask too many questions at once; ask only key undecided items concisely
|
||||
- For items with obvious defaults (e.g., PE enabled), assume and just confirm. However, location MUST always be confirmed with the user
|
||||
- **When asking about SKUs, models, or service options, show ALL available choices verified from MS Docs, and provide the MS Docs URL as well.** This allows the user to reference and make their own judgment. Do not show only partial options or arbitrarily filter them out
|
||||
|
||||
**🔹 VM/Resource SKU Selection — Region Availability Pre-check Required:**
|
||||
|
||||
**Before** asking the user about VM or other resource SKUs, you MUST first query which SKUs are actually available in the target region.
|
||||
If a SKU is blocked due to capacity restrictions in a specific region, the deployment will fail.
|
||||
|
||||
**VM SKU verification method:**
|
||||
```powershell
# Query only VM SKUs available without restrictions in the target region
az vm list-skus --location "<LOCATION>" --size Standard_D2 --resource-type virtualMachines `
    --query "[?restrictions==``[]``].name" -o tsv
```
|
||||
|
||||
**Principles:**
|
||||
- Do not include unverified SKUs in the choices
|
||||
- Do not recommend "commonly used SKUs" from memory — MUST verify via az cli or MS Docs
|
||||
- Include only verified SKUs in `ask_user` choices
|
||||
- Even for user-provided SKUs, verify availability before proceeding
|
||||
|
||||
**This principle applies equally not just to VMs, but to ALL resources subject to capacity restrictions (Fabric Capacity, etc.).**
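
As a rough sketch of the same check done on unfiltered output: assuming `az vm list-skus ... -o json` has been captured as a string, the helpers below keep only SKUs whose `restrictions` list is empty and then drop unverified candidates from the choices. The function names and the sample data are hypothetical:

```python
import json


def unrestricted_sku_names(list_skus_json: str) -> list:
    """Names of SKUs that carry no restrictions in the queried region."""
    skus = json.loads(list_skus_json)
    return [sku["name"] for sku in skus if not sku.get("restrictions")]


def verified_choices(candidates: list, available: list) -> list:
    """Keep only candidate SKUs that were verified as available."""
    allowed = set(available)
    return [c for c in candidates if c in allowed]


# Hypothetical sample shaped like the CLI output (not real availability data).
sample = json.dumps([
    {"name": "Standard_D2s_v5", "restrictions": []},
    {"name": "Standard_D2_v2", "restrictions": [{"reasonCode": "NotAvailableForSubscription"}]},
])
available = unrestricted_sku_names(sample)
print(verified_choices(["Standard_D2s_v5", "Standard_D2_v2"], available))
```

Only the filtered list should be passed on as `ask_user` choices; a SKU named by the user goes through the same filter before proceeding.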
|
||||
|
||||
**🔹 Service Option Exploration Principle — "Listing from Memory" is Prohibited:**
|
||||
|
||||
When the user asks about a service category ("What Spark options are there?", "What are the message queue options?"), or when you need to explore services for a specific capability:
|
||||
|
||||
**NEVER do this:**
|
||||
- Directly fetch URLs for only 2-3 services from your memory and list them
|
||||
- State definitively "In Azure, X has A and B"
|
||||
|
||||
**MUST do this:**
|
||||
1. **Explore the full category via web_search** — Search at the category level like `"Azure managed Spark options site:learn.microsoft.com"` to first discover what services exist
|
||||
2. **Cross-check with v1 scope** — Regardless of search results, check whether v1 scope services (Foundry, Fabric, AI Search, ADLS Gen2, etc.) fall under the relevant category. e.g.: "Spark" → Microsoft Fabric's Data Engineering workload also provides Spark
|
||||
3. **Targeted fetch of discovered options** — Fetch MS Docs for the services found via search to collect accurate comparison information
|
||||
4. **Present all options to the user** — Present all discovered options in a comprehensive comparison without omitting any
|
||||
|
||||
**Example — When asked "What Spark instances are available?":**
|
||||
```
|
||||
Wrong approach: Fetch only Databricks URL + Synapse URL → Compare only 2
|
||||
Correct approach: web_search("Azure managed Spark options") → Discover Databricks, Synapse, Fabric Spark, HDInsight
|
||||
→ v1 scope check: Fabric is v1 scope and provides Spark → MUST include
|
||||
→ Targeted fetch of each service's MS Docs → Present full comparison table
|
||||
```
|
||||
|
||||
This principle applies not only to service category exploration, but to all situations where the user requests "alternatives", "other options", "comparison", etc.
|
||||
|
||||
**🔹 ask_user Tool — Mandatory Usage:**
|
||||
|
||||
For questions with choices, you MUST use the `ask_user` tool. It allows users to select with arrow keys for convenience, and they can also type a custom input.
|
||||
|
||||
**ask_user usage rules:**
|
||||
- Questions with 2 or more choices **MUST** use ask_user (do not list them as text)
|
||||
- **`choices` MUST be passed as a string array (`["A", "B"]`)** — passing as a string (`"A, B"`) will cause an error
|
||||
- If there is a recommended option, place it first and append `(Recommended)` at the end
|
||||
- Include reference information in choices — e.g., `"Standard S1 - Recommended for production. Ref: https://..."`
|
||||
- **Only 1 question per call** — if multiple items need to be asked, call ask_user sequentially for each
|
||||
- Choices are limited to a maximum of 4. If there are 5 or more, include only the 3-4 most common ones (users can also type a custom input)
|
||||
- If multiple selections are needed, split them into separate questions
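
Some of these rules can be checked mechanically before calling the tool. The following is a minimal sketch; `check_choices` is a hypothetical helper, not part of the `ask_user` tool, and it covers only the array, size, and ordering rules:

```python
MAX_CHOICES = 4


def check_choices(choices) -> list:
    """Return a list of rule violations for an ask_user choices value."""
    if not isinstance(choices, list) or not all(isinstance(c, str) for c in choices):
        return ["choices must be a string array, not a single string"]
    problems = []
    if len(choices) > MAX_CHOICES:
        problems.append(f"at most {MAX_CHOICES} choices per question")
    recommended = [i for i, c in enumerate(choices) if c.endswith("(Recommended)")]
    if recommended and recommended != [0]:
        problems.append("the recommended option must be the first choice")
    return problems


print(check_choices("A, B"))
print(check_choices(["Standard S1 (Recommended)", "Basic"]))
```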
|
||||
|
||||
**Items requiring ask_user:**
|
||||
- Deployment location (region) selection
|
||||
- SKU/tier selection
|
||||
- Model selection (chat model, embedding model, etc.)
|
||||
- Networking method selection
|
||||
- Subscription selection (Phase 1 Step 2)
|
||||
- Resource group selection (Phase 1 Step 3)
|
||||
- Any other question requiring a user choice
|
||||
|
||||
**Usage examples:**
|
||||
```
|
||||
// Project name: ask_user with the inferred default as the first choice (custom input is still allowed)
|
||||
// SKU, region, etc. with defined choices use ask_user:
|
||||
|
||||
// 1. SKU question
|
||||
ask_user({
|
||||
question: "Please select the SKU for AI Search. Ref: https://learn.microsoft.com/en-us/azure/search/search-sku-tier",
|
||||
choices: [
|
||||
"Standard S1 - Recommended for production (Recommended)",
|
||||
"Basic - For dev/test, up to 15 indexes",
|
||||
"Standard S2 - High-traffic production",
|
||||
"Free - Free trial, 50MB storage"
|
||||
]
|
||||
})
|
||||
|
||||
// 2. Region question (separate call — only 1 question per call)
|
||||
ask_user({
|
||||
question: "Please select the Azure region for deployment. Ref: https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/models",
|
||||
choices: [
|
||||
"Korea Central - Korea region, supports most services (Recommended)",
|
||||
"East US - US East, supports all AI models",
|
||||
"Japan East - Japan East, close to Korea"
|
||||
]
|
||||
})
|
||||
```
|
||||
|
||||
> **Note**: The SKU and region values in the examples above are for illustration only. When actually asking, dynamically compose choices based on the latest information by querying MS Docs via web_fetch. Do not hardcode.
|
||||
|
||||
**Example — When user input is insufficient:**
|
||||
```
|
||||
User: "I want to build a RAG chatbot. Using a GPT model in Foundry and AI Search."
|
||||
|
||||
→ Confirmed: Microsoft Foundry, Azure AI Search
|
||||
→ Still undecided: Project name, specific model name, embedding model, networking (PE?), SKU, deployment location
|
||||
|
||||
The agent first confirms the project name via ask_user (default: rag-chatbot).
|
||||
Then provides choices for each undecided item via the ask_user tool.
|
||||
Include MS Docs URLs in the choices so the user can reference them directly.
|
||||
```
|
||||
|
||||
**🚨🚨🚨 [HARD GATE] Spec Collection Complete → Diagram Generation Required 🚨🚨🚨**
|
||||
|
||||
**Immediately after all confirmed items are filled in, you MUST perform the following steps IN ORDER. Skipping any step means Phase 1 is incomplete.**
|
||||
|
||||
1. Compose **services JSON + connections JSON** based on the confirmed service list
|
||||
2. Use the built-in diagram engine to generate **`<project-name>/01_arch_diagram_draft.html`**
|
||||
3. Automatically open it in the browser via `Start-Process`
|
||||
4. Show the diagram to the user in the **report format** below — this MUST include a **detailed configuration table**
|
||||
5. Ask the user: **"Would you like to change or add anything?"**
|
||||
6. If the user has no changes → proceed to Phase 2 transition (ask_user with next step guidance)
|
||||
|
||||
**NEVER do this:**
|
||||
- ❌ Not generating the diagram and asking "The architecture is confirmed. Shall we proceed to the next step?"
|
||||
- ❌ Deferring diagram generation to Phase 2 or later
|
||||
- ❌ Saying "I'll create the diagram later"
|
||||
- ❌ Declaring "architecture confirmed" based solely on spec collection completion
|
||||
- ❌ Generating the diagram but NOT showing the configuration table
|
||||
- ❌ Skipping the "anything to change?" question and jumping straight to Phase 2
|
||||
|
||||
**Validation condition**: Phase 2 entry is NOT allowed if the `01_arch_diagram_draft.html` file has not been generated.
|
||||
|
||||
**Report format after diagram completion (ALL sections are MANDATORY):**
|
||||
```
|
||||
## Architecture Diagram
|
||||
|
||||
[Interactive diagram link — auto-opened in browser]
|
||||
|
||||
### Confirmed Configuration
|
||||
|
||||
| Service | Type | SKU/Tier | Details |
|
||||
|---------|------|----------|---------|
|
||||
| [Service name] | [Azure resource type] | [SKU] | [Key config: model, capacity, etc.] |
|
||||
| ... | ... | ... | ... |
|
||||
|
||||
**Networking**: [VNet + Private Endpoint / Public / etc.]
|
||||
**Location**: [confirmed region]
|
||||
```
|
||||
|
||||
**After showing the report, immediately use `ask_user` with choices:**
|
||||
```
|
||||
ask_user({
|
||||
question: "The architecture diagram and configuration are ready. What would you like to do?",
|
||||
choices: [
|
||||
"Looks good — proceed to Bicep code generation (Recommended)",
|
||||
"I want to modify the architecture",
|
||||
"Add more services"
|
||||
]
|
||||
})
|
||||
```
|
||||
|
||||
- If "proceed" → move to Phase 2 transition (collect subscription/RG info)
|
||||
- If "modify" or "add" → apply changes, regenerate diagram, show report again
|
||||
|
||||
**🚨 The configuration table is NOT optional.** The user needs to visually verify what was confirmed before proceeding. Without the table, the user cannot validate the architecture.
|
||||
|
||||
### 1-2. Interactive HTML Diagram Generation
|
||||
|
||||
Use the built-in **diagram engine** (Python scripts included in the skill) to create an interactive HTML diagram.
|
||||
No `pip install` is needed as the scripts are directly available in the `scripts/` folder, requiring no network connection or package installation.
|
||||
605+ official Azure icons are built in.
|
||||
|
||||
**Diagram file naming convention:**
|
||||
|
||||
All diagrams are generated inside the Bicep project folder (`<project-name>/`).
|
||||
They are systematically managed with numbered prefixes per stage, and previous stage files are never overwritten.
|
||||
|
||||
| Stage | File Name | When Generated |
|-------|-----------|----------------|
| Phase 1 design draft | `01_arch_diagram_draft.html` | When architecture design is confirmed |
| Phase 4 What-if preview | `02_arch_diagram_preview.html` | After What-if validation |
| Phase 4 deployment result | `03_arch_diagram_result.html` | After actual deployment completes |
|
||||
|
||||
**Built-in module path discovery + Python path discovery:**
|
||||
|
||||
**🚨 The Python path + built-in module path are verified once during Phase 1 preload, and reused for all subsequent diagram generations. Do NOT re-discover every time.**
|
||||
|
||||
```powershell
# ─── Step 1: Python Path Discovery ───
# ⚠️ Get-Command python may pick up the Windows Store alias, so filesystem discovery is done first
$PythonCmd = $null

# Priority 1: Direct discovery of actual installation path (most reliable)
$PythonExe = Get-ChildItem -Path "$env:LOCALAPPDATA\Programs\Python" -Filter "python.exe" -Recurse -ErrorAction SilentlyContinue |
    Where-Object { $_.FullName -notlike '*WindowsApps*' } |
    Select-Object -First 1 -ExpandProperty FullName
if ($PythonExe) { $PythonCmd = $PythonExe }

# Priority 2: Program Files discovery
# Note: ${env:ProgramFiles(x86)} needs braces because the variable name contains parentheses
if (-not $PythonCmd) {
    $PythonExe = Get-ChildItem -Path "$env:ProgramFiles\Python*", "${env:ProgramFiles(x86)}\Python*" -Filter "python.exe" -Recurse -ErrorAction SilentlyContinue |
        Select-Object -First 1 -ExpandProperty FullName
    if ($PythonExe) { $PythonCmd = $PythonExe }
}

# Priority 3: Find in PATH (only if not a Windows Store alias)
if (-not $PythonCmd) {
    foreach ($cmd in @('python3', 'py')) {
        $found = Get-Command $cmd -ErrorAction SilentlyContinue
        if ($found -and $found.Source -notlike '*WindowsApps*') { $PythonCmd = $cmd; break }
    }
}

if (-not $PythonCmd) {
    Write-Host ""
    Write-Host "Python is not installed or not found in PATH." -ForegroundColor Red
    Write-Host ""
    Write-Host "Please install using one of the following methods:" -ForegroundColor Yellow
    Write-Host " 1. winget install Python.Python.3.12"
    Write-Host " 2. Download from https://www.python.org/downloads/"
    Write-Host " 3. Search for 'Python 3.12' in the Microsoft Store and install"
    Write-Host ""
    Write-Host "After installation, restart your terminal and try again."
    return
}

# ─── Step 2: Built-in Script Path Discovery (no pip install needed) ───
# Priority 1: Project local skill folder
$ScriptsDir = Get-ChildItem -Path ".github\skills\azure-architecture-autopilot" -Filter "cli.py" -Recurse -ErrorAction SilentlyContinue |
    Where-Object { $_.Directory.Name -eq 'scripts' } |
    Select-Object -First 1 -ExpandProperty DirectoryName
# Priority 2: Global skill folder
if (-not $ScriptsDir) {
    $ScriptsDir = Get-ChildItem -Path "$env:USERPROFILE\.copilot\skills\azure-architecture-autopilot" -Filter "cli.py" -Recurse -ErrorAction SilentlyContinue |
        Where-Object { $_.Directory.Name -eq 'scripts' } |
        Select-Object -First 1 -ExpandProperty DirectoryName
}

# ─── Step 3: Diagram Generation (CLI method — direct script execution) ───
$OutputFile = "<project-name>\01_arch_diagram_draft.html"

& $PythonCmd "$ScriptsDir\cli.py" `
    --services '<services_JSON>' `
    --connections '<connections_JSON>' `
    --title "Architecture Title" `
    --vnet-info "10.0.0.0/16 | pe-subnet: 10.0.1.0/24" `
    --output $OutputFile

# Automatically open in browser after generation
Start-Process $OutputFile
```
|
||||
|
||||
**Python API method is also available (alternative):**
|
||||
|
||||
When JSON is very large, you can directly call the Python API to avoid CLI argument length limitations.
|
||||
Add the scripts folder to `sys.path` to import the built-in module:
|
||||
|
||||
```python
import os
import sys

# Add scripts folder to Python path (use built-in module without pip install)
scripts_dir = r"<absolute path to scripts folder>"  # $ScriptsDir value found in Step 2
sys.path.insert(0, scripts_dir)

from generator import generate_diagram

services = [...]     # services JSON
connections = [...]  # connections JSON

html = generate_diagram(
    services=services,
    connections=connections,
    title="Architecture Title",
    vnet_info="10.0.0.0/16 | pe-subnet: 10.0.1.0/24",
    hierarchy=None  # Only used for multiple subscriptions/RGs
)

output_path = os.path.join("<project-name>", "01_arch_diagram_draft.html")
os.makedirs(os.path.dirname(output_path), exist_ok=True)  # make sure the project folder exists
with open(output_path, "w", encoding="utf-8") as f:
    f.write(html)
```
|
||||
|
||||
**🔹 CLI vs Python API Selection Criteria:**
|
||||
|
||||
| Scenario | Method | Reason |
|----------|--------|--------|
| 10 or fewer services | CLI (`python scripts/cli.py`) | Simple and fast |
| More than 10 services or using hierarchy | Python API (sys.path addition) | Avoids CLI argument length limits |
| Multi-subscription/RG diagrams | Python API + `hierarchy` parameter | Hierarchical structure representation |
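
The selection criteria in this table reduce to a small decision function. This is only a sketch; the name `choose_method` and the return values `"cli"` / `"python_api"` are illustrative assumptions:

```python
CLI_SERVICE_LIMIT = 10


def choose_method(service_count: int, uses_hierarchy: bool = False) -> str:
    """Pick the diagram generation method based on size and hierarchy usage."""
    if uses_hierarchy or service_count > CLI_SERVICE_LIMIT:
        return "python_api"  # avoids CLI argument length limits; supports hierarchy
    return "cli"  # simple and fast for small diagrams


print(choose_method(8))
print(choose_method(8, uses_hierarchy=True))
```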
|
||||
|
||||
**Full list of supported service types:**
|
||||
|
||||
Available in the skill's built-in reference files under `references/`.
|
||||
Supported service type values are listed below in the services JSON format section.
|
||||
|
||||
> **Diagram generation order**: (1) Verify Python path → (2) Verify built-in module path → (3) Compose services/connections JSON → (4) Execute. If Python is not installed, guide the user to install it before composing JSON. This prevents the waste of building JSON only to fail because Python is missing.
|
||||
|
||||
> **🚨 Automatic Diagram Open (No Exceptions)**: When an HTML file is generated with the built-in diagram engine, it **MUST always** be opened in the browser regardless of the situation. Without exception, whenever a diagram is (re)generated, execute the `Start-Process` command. Diagram generation and browser opening are always executed together in a single PowerShell command block.
|
||||
>
|
||||
> **When this applies (not just these, but ALL times an HTML diagram is generated):**
|
||||
> - Phase 1 design draft (`01_arch_diagram_draft.html`)
|
||||
> - Diagram regeneration after Delta Confirmation
|
||||
> - Phase 4 What-if preview (`02_arch_diagram_preview.html`)
|
||||
> - Phase 4 deployment result (`03_arch_diagram_result.html`)
|
||||
> - Architecture changes after deployment (`04_arch_diagram_update_draft.html`)
|
||||
> - Any other case where a diagram is regenerated for any reason
|
||||
|
||||
**services JSON format:**
|
||||
|
||||
Dynamically composed based on the user's confirmed service list. Below is the JSON structure description.
|
||||
|
||||
```json
|
||||
[
|
||||
{"id": "uniqueID", "name": "Service Display Name", "type": "iconType", "sku": "SKU", "private": true/false,
|
||||
"details": ["Detail line 1", "Detail line 2"]}
|
||||
]
|
||||
```
|
||||
|
||||
| Field | Required | Type | Description |
|-------|----------|------|-------------|
| `id` | Yes | string | Unique identifier (kebab-case) |
| `name` | Yes | string | Display name shown on diagram |
| `type` | Yes | string | Service type (select from list below) |
| `sku` | | string | SKU/tier information |
| `private` | | boolean | Private Endpoint connected (default: false) |
| `details` | | string[] | Additional info shown in sidebar |
| `subscription` | | string | Subscription name (required when using hierarchy) |
| `resourceGroup` | | string | Resource group name (required when using hierarchy) |
|
||||
|
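The field table above can be checked mechanically before the JSON is handed to the diagram engine. The following is a minimal sketch, not part of the built-in engine: `validate_services` is a hypothetical helper, and the rule that `subscription` and `resourceGroup` must appear together is an interpretation of "required when using hierarchy".

```python
from typing import Any

REQUIRED_FIELDS = ("id", "name", "type")  # the fields marked "Yes" in the table


def validate_services(services: list[dict[str, Any]]) -> list[str]:
    """Return human-readable problems; an empty list means nothing was flagged."""
    problems: list[str] = []
    seen_ids: set[str] = set()
    for index, svc in enumerate(services):
        svc_id = svc.get("id")
        has_id = isinstance(svc_id, str) and bool(svc_id)
        label = svc_id if has_id else f"#{index}"
        for field in REQUIRED_FIELDS:
            value = svc.get(field)
            if not isinstance(value, str) or not value:
                problems.append(f"{label}: missing required field '{field}'")
        if has_id:
            if svc_id in seen_ids:
                problems.append(f"{label}: duplicate id")
            seen_ids.add(svc_id)
        if "private" in svc and not isinstance(svc["private"], bool):
            problems.append(f"{label}: 'private' must be a boolean")
        details = svc.get("details", [])
        if not isinstance(details, list) or not all(isinstance(d, str) for d in details):
            problems.append(f"{label}: 'details' must be a list of strings")
        # Assumption: hierarchy mode needs both fields together.
        if ("subscription" in svc) != ("resourceGroup" in svc):
            problems.append(f"{label}: set both 'subscription' and 'resourceGroup', or neither")
    return problems
```

Running a check like this before step (4) of the generation order would surface a missing `type` or a duplicate `id` without invoking the engine.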
||||
**Service Type List (by category):**
|
||||
|
||||
| Category | Types |
|
||||
|----------|-------|
|
||||
| **AI** | `ai_foundry`, `ai_hub`, `openai`, `ai_search` / `search`, `document_intelligence` / `form_recognizer`, `aml` |
|
||||
| **Data** | `storage` / `adls`, `cosmos_db`, `sql_database`, `sql_server`, `databricks`, `data_factory` / `adf`, `fabric`, `redis`, `stream_analytics`, `synapse` |
|
||||
| **Security** | `keyvault` / `kv` |
|
||||
| **Compute** | `app_service` / `appservice`, `function_app`, `vm`, `aks`, `acr` / `container_registry` |
|
||||
| **Network** | `firewall`, `bastion`, `vpn_gateway` / `vpn`, `app_gateway`, `front_door`, `cdn`, `nsg`, `pe` |
|
||||
| **IoT** | `iot_hub` |
|
||||
| **Integration** | `event_hub` |
|
||||
| **Monitoring** | `log_analytics`, `app_insights` / `appinsights`, `monitor` |
|
||||
| **DevOps** | `devops` |
|
||||
| **Other** | `jumpbox`, `user`, etc. (unrecognized types use fuzzy matching + default icon) |
|
||||
|
||||
**When Using Private Endpoints — PE Node Addition Required:**
|
||||
|
||||
If Private Endpoints are included in the architecture, a PE node MUST be added to the services JSON for each service that has a Private Endpoint, and connections must also include the PE links for them to appear in the diagram.
|
||||
|
||||
```json
|
||||
// Add PE node corresponding to each service
|
||||
{"id": "pe_serviceID", "name": "PE: ServiceName", "type": "pe", "details": ["groupId: correspondingGroupID"]}
|
||||
|
||||
// Add service → PE connection in connections
|
||||
{"from": "serviceID", "to": "pe_serviceID", "label": "", "type": "private"}
|
||||
```
|
||||
|
||||
**🚨🚨🚨 PE Connections and Business Logic Connections Are Separate — BOTH MUST Be Included 🚨🚨🚨**
|
||||
|
||||
PE connections (`"type": "private"`) represent network isolation. But this alone does NOT show the actual **data flow/API calls** between services in the diagram.
|
||||
|
||||
**MUST include both types of connections:**
|
||||
|
||||
1. **Business logic connections** — Actual data flow between services (api, data, security types)
|
||||
2. **PE connections** — Network isolation between service ↔ PE (private type)
|
||||
|
||||
```json
|
||||
// ✅ Correct example — Function App → Foundry
|
||||
// 1) Business logic: Function App calls Foundry for chat/embedding
|
||||
{"from": "func_app", "to": "foundry", "label": "RAG Chat + Embedding", "type": "api"}
|
||||
// 2) PE connection: Foundry's Private Endpoint
|
||||
{"from": "foundry", "to": "pe_foundry", "label": "", "type": "private"}
|
||||
|
||||
// ❌ Wrong example — Only PE connection, no business logic connection
|
||||
{"from": "foundry", "to": "pe_foundry", "label": "", "type": "private"}
|
||||
// → No connection line between Function App and Foundry in the diagram, so the architecture flow is not visible
|
||||
```
|
||||
|
||||
**NEVER do this:**
|
||||
- Create only PE connections and omit business logic connections
|
||||
- Connect `from`/`to` of business logic connections to PE nodes (use the **actual service ID**, not the PE)
|
||||
- Assume "the PE is there so the connection line will show up"
|
||||
|
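The PE rules above lend themselves to an automated check before the diagram is generated. This is a rough sketch under stated assumptions: `check_connections` is a hypothetical helper (not a function of the diagram engine), and it treats `api`, `data`, and `security` as the business-logic types, as listed in the text.

```python
BUSINESS_TYPES = {"api", "data", "security"}


def check_connections(services, connections):
    """Return a list of rule violations; an empty list means no rule was broken."""
    pe_ids = {s["id"] for s in services if s.get("type") == "pe"}
    linked_pes = set()     # PE nodes that have a private connection
    pe_backed = set()      # services that sit behind a PE
    business_ends = set()  # nodes that take part in a business-logic connection
    problems = []

    for conn in connections:
        ends = {conn["from"], conn["to"]}
        if conn.get("type") == "private":
            linked_pes |= ends & pe_ids
            pe_backed |= ends - pe_ids
        elif conn.get("type") in BUSINESS_TYPES:
            business_ends |= ends
            # Business-logic connections must use the actual service ID.
            for pe in sorted(ends & pe_ids):
                problems.append(f"business connection uses PE node '{pe}'")

    for pe in sorted(pe_ids - linked_pes):
        problems.append(f"PE node '{pe}' has no private connection")
    for svc in sorted(pe_backed - business_ends):
        problems.append(f"service '{svc}' has only a PE connection")
    return problems
```

Applied to the "Wrong example" above (only the `foundry` to `pe_foundry` link), the sketch reports that `foundry` has only a PE connection.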
||||
The PE groupId differs by service. Refer to the PE groupId & DNS Zone mapping table in `references/service-gotchas.md`.
|
||||
|
||||
> **Service naming convention**: MUST use the latest official Azure names. If uncertain about the name, verify with MS Docs.
|
||||
> For resource types and key properties per service, refer to `references/ai-data.md`.
|
||||
|
||||
**connections JSON format:**
|
||||
```json
|
||||
[
|
||||
{"from": "serviceA_ID", "to": "serviceB_ID", "label": "Connection description", "type": "api|data|security|private|network|default"}
|
||||
]
|
||||
```
|
||||
|
||||
**Connection Types:**
|
||||
|
||||
| type | Color | Style | Use For |
|
||||
|------|-------|-------|---------|
|
||||
| `api` | Blue | Solid | API calls, queries |
|
||||
| `data` | Green | Solid | Data flow, indexing |
|
||||
| `security` | Orange | Dashed | Secrets, auth |
|
||||
| `private` | Purple | Dashed | Private Endpoint connections |
|
||||
| `network` | Gray | Solid | Network routing |
|
||||
| `default` | Gray | Solid | Other |
|
||||
|
||||
**🔹 Diagram Multilingual Principle:**
|
||||
- The `name`, `details` in services and `label` in connections are written in **the user's language**
|
||||
- Example: `"label": "RAG Search"`, `"label": "Data Ingestion"`
|
||||
- Official Azure service names (Microsoft Foundry, AI Search, etc.) are always in English regardless of language
|
||||
|
||||
**🔹 VNet Node — Do NOT add to services JSON:**
|
||||
- VNet is automatically displayed as a **purple dashed boundary** in the diagram (when PEs are present)
|
||||
- Adding a separate VNet node to services JSON duplicates the boundary line and confuses the diagram
|
||||
- VNet information (CIDR, subnets) is sufficiently conveyed through the sidebar VNet boundary label
|
||||
|
||||
Provide the full path of the generated HTML file to the user.
|
||||
|
||||
### 1-3. Finalizing Architecture Through Conversation
|
||||
|
||||
The architecture is finalized incrementally through conversation with the user. When the user requests changes, do NOT ask everything from scratch; instead, **reflect only the requested changes based on the current confirmed state** and regenerate the diagram.
|
||||
|
||||
**⚠️ Delta Confirmation Rule — Required Verification on Service Addition/Change:**
|
||||
|
||||
Service addition/change is not a "simple update" — it is an **event that reopens undecided required fields for that service**.
|
||||
|
||||
**Process:**
|
||||
1. Diff the current confirmed state against the new request
|
||||
2. Identify the required fields for newly added services (refer to `domain-packs` or MS Docs)
|
||||
3. Fetch the region availability/options for the service from MS Docs
|
||||
4. If any required fields are undecided, **ask the user via ask_user first**
|
||||
5. **Regenerate the diagram only after confirmation is complete**
|
||||
|
||||
**NEVER do this:**
|
||||
- Finalize diagram update while required fields remain undecided
|
||||
- Arbitrarily add sub-components/workloads the user did not mention (e.g., automatically adding OneLake and data pipeline to a Fabric request)
|
||||
- Vaguely assume SKU/model like "F SKU" without confirmation
|
||||
|
||||
**Do not re-ask settings for already confirmed services.** Only confirm undecided items for newly added/changed services.
|
||||
|
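The Delta Confirmation Rule can be sketched as a diff that only surfaces undecided fields for new or changed services. The code below is illustrative: `undecided_fields` and the shape of `required_by_type` are hypothetical, and the real required fields would come from `domain-packs` or MS Docs rather than a hardcoded mapping.

```python
def undecided_fields(confirmed, requested, required_by_type):
    """Map service id -> required fields that still need a user decision.

    confirmed / requested: {service_id: {"type": ..., other settings}}
    required_by_type: {service_type: [field names]}  (assumed input)
    """
    pending = {}
    for svc_id, settings in requested.items():
        if confirmed.get(svc_id) == settings:
            continue  # already confirmed and unchanged: do not re-ask
        required = required_by_type.get(settings.get("type"), [])
        missing = [f for f in required if settings.get(f) in (None, "")]
        if missing:
            pending[svc_id] = missing
    return pending
```

A non-empty result corresponds to step 4 (ask the user first); an empty result means the diagram can be regenerated.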
||||
---
|
||||
|
||||
**🚨🚨🚨 [Top Priority Principle] Immediate Fact Check During Design Phase 🚨🚨🚨**
|
||||
|
||||
**The purpose of Phase 1 is to confirm a "feasible architecture".**
|
||||
**No matter what the user requests, before reflecting it in the diagram, you MUST fact-check whether it is actually possible by directly querying MS Docs via web_fetch.**
|
||||
|
||||
**Design Direction vs Deployment Specs — Separate Information Paths:**
|
||||
|
||||
| Decision Type | Reference Path | Examples |
|
||||
|--------------|----------------|----------|
|
||||
| **Design direction** (architecture patterns, best practices, service combinations) | `references/architecture-guidance-sources.md` → targeted fetch | "What's the recommended RAG structure?", "Enterprise baseline?" |
|
||||
| **Deployment specs** (API version, SKU, region, model, PE mapping) | `references/azure-dynamic-sources.md` → MS Docs fetch | "What's the API version?", "Is this model available in Korea Central?" |
|
||||
|
||||
- **Design direction comes from architecture guidance, actual deployment values from dynamic sources.** Do not mix these two paths.
|
||||
- Do NOT use Architecture guidance document content to determine SKU/API version/region.
|
||||
- **Do NOT crawl through all Architecture Center sub-documents for every request.** Perform trigger-based targeted fetch of at most 2 relevant documents.
|
||||
- For trigger/fetch budget/decision rules by question type, refer to `architecture-guidance-sources.md`.
|
||||
|
||||
**This principle applies to ALL requests without exception:**
|
||||
- Model addition/change → Verify in MS Docs whether the model exists and can be deployed in the target region
|
||||
- Service addition/change → Verify in MS Docs whether the service is available in the target region
|
||||
- SKU change → Verify in MS Docs whether the SKU is valid and supports the desired features
|
||||
- Feature request → Verify in MS Docs whether the feature is actually supported
|
||||
- Service combination → Verify in MS Docs whether inter-service integration is possible
|
||||
- **Any other request** → Fact-check with MS Docs
|
||||
|
||||
**MS Docs verification results:**
|
||||
- **Possible** → Reflect in diagram
|
||||
- **Not possible** → Immediately explain the reason to the user and suggest available alternatives
|
||||
|
||||
**Fact Check Process — Cross-Verification Required:**
|
||||
|
||||
Do not settle a user request with a single query.
|
||||
**Cross-verification using other MS Docs pages/sources MUST always be performed.**
|
||||
|
||||
> **GHCP Environment Constraint**: Sub-agents (explore/task/general-purpose) do NOT have `web_fetch`/`web_search` tools.
|
||||
> Therefore, verification requiring MS Docs queries MUST be performed **directly by the main agent**.
|
||||
|
||||
```
|
||||
[1st Verification] Main agent directly queries MS Docs via web_fetch (primary page)
|
||||
↓
|
||||
[2nd Verification] Main agent additionally fetches other/related MS Docs pages via web_fetch for cross-checking
|
||||
- e.g., Model availability → 1st: models page / 2nd: regional availability or pricing page
|
||||
- e.g., API version → 1st: Bicep reference page / 2nd: REST API reference page
|
||||
- Compare 1st and 2nd results and flag any discrepancies
|
||||
↓
|
||||
[Consolidate Results] If both verifications match, respond to the user
|
||||
- On discrepancy: Resolve with additional queries, or honestly inform the user about the uncertainty
|
||||
```
|
||||
|
||||
**Fact Check Quality Standards — Be Thorough, Not Cursory:**
|
||||
- When a MS Docs page is fetched, **check ALL relevant sections, tabs, and conditions without omission**
|
||||
- When checking model availability: Check **ALL deployment types** including Global Standard, Standard, Provisioned, Data Zone, etc. Do NOT conclude "not supported" based on only one deployment type
|
||||
- When checking SKUs: **Fully** verify the feature list supported by that SKU
|
||||
- If the page is large, fetch relevant sections **multiple times** to ensure accuracy
|
||||
- If uncertain, query additional pages. **NEVER answer based on guesswork**
|
||||
|
||||
**NEVER do this:**
|
||||
- Add to the diagram without verification
|
||||
- Defer verification with "I'll check during Bicep generation" or "It will be validated during deployment"
|
||||
- Rely only on your memory and answer "it should work" — **MUST directly query MS Docs**
|
||||
- Fetch MS Docs but rush to conclusions after only partially reading
|
||||
- Finalize based on a single query — **MUST cross-verify with another source**
|
||||
|
||||
**🚫 Sub-Agent Usage Rules:**
|
||||
|
||||
**Sub-agents in GHCP = `task` tool:**
|
||||
- `agent_type: "explore"` — Read-only tasks like codebase exploration, file search (**web_fetch/web_search NOT available**)
|
||||
- `agent_type: "task"` — Command execution like az cli, bicep build
|
||||
- `agent_type: "general-purpose"` — High-level tasks like complex Bicep generation
|
||||
|
||||
> **⚠️ Sub-agent tool constraint**: ALL sub-agents (explore/task/general-purpose) CANNOT use `web_fetch` or `web_search`.
|
||||
> Fact checks requiring MS Docs queries, API version verification, model availability checks, etc. MUST be performed **directly by the main agent**.
|
||||
|
||||
**Foreground vs Background Decision Criteria:**
|
||||
- **If results are needed before proceeding to the next step → `mode: "sync"` (default)**
|
||||
- e.g., Query SKU list then provide choices to user, verify model availability then reflect in diagram
|
||||
- Running in background here would leave the user idle waiting for results
|
||||
- **If there is other independent work that can be done while waiting for results → `mode: "background"`**
|
||||
- e.g., Simultaneously web_fetch multiple MS Docs pages for cross-verification
|
||||
|
||||
**Most fact checks should be run in foreground (`mode: "sync"`)** because the next question cannot be asked without the results.
|
||||
|
||||
**How to run cross-verification in parallel:**
|
||||
```
|
||||
// Execute 1st and 2nd verification simultaneously (main agent performs directly)
|
||||
[Simultaneously] Directly query primary MS Docs page via web_fetch (1st)
|
||||
[Simultaneously] Additionally query related MS Docs page via web_fetch (2nd)
|
||||
// Compare both results to check for discrepancies
|
||||
// e.g., Model availability → parallel fetch of models page + regional availability page
|
||||
```
|
||||
|
||||
**NEVER do this:**
|
||||
- Run in background when results are needed, then sit idle doing nothing while waiting
|
||||
- Delegate tasks requiring web_fetch/web_search to sub-agents (main agent MUST perform directly)
|
||||
- Attempt to directly read files internal to sub-agents
|
||||
|
||||
---
|
||||
|
||||
**⚠️ Important: Do NOT execute any shell commands until the user explicitly approves proceeding to the next step.**
|
||||
However, MS Docs web_fetch for the above fact checks is exceptionally allowed.
|
||||
|
||||
Once the architecture is confirmed (user said no changes to the diagram), ask the user whether to proceed to the next step.
|
||||
|
||||
**🚨 Phase 2 Transition Prerequisites — ALL of the following must be met before asking this question:**
|
||||
|
||||
1. `01_arch_diagram_draft.html` has been **generated** using the built-in diagram engine
|
||||
2. The diagram has been **opened in the browser** and **displayed to the user** in the report format with the **configuration table**
|
||||
3. The user was asked **"Would you like to change or add anything?"** and responded with **no changes**, or modifications have been reflected and **final confirmation** is given
|
||||
|
||||
**If ANY of the above conditions are not met, do NOT proceed to Phase 2.**
|
||||
If the diagram does not exist yet, **generate it right now** — follow the procedure in section 1-2.
|
||||
If the configuration table was not shown, **show it right now** before asking about changes.
|
||||
|
||||
**Following the parallel preload principle, execute `az account show`, `az account list`, and `az group list` simultaneously with ask_user to pre-check login status and prepare subscription/RG choices in advance.**
|
||||
|
||||
```
|
||||
// Call simultaneously in the same response:
|
||||
[1] ask_user — "The architecture is confirmed! Shall we proceed to the next step?"
|
||||
[2] powershell — az account show 2>&1 (pre-check login status)
|
||||
[3] powershell — az account list --output json (pre-prepare subscription choices)
|
||||
[4] powershell — az group list --output json (pre-prepare resource group choices)
|
||||
```
|
||||
|
||||
ask_user display format:
|
||||
```
|
||||
The architecture is confirmed! Shall we proceed to the next step?
|
||||
|
||||
✅ Confirmed architecture: [summary]
|
||||
|
||||
The following steps will proceed:
|
||||
1. [Bicep Code Generation] — AI automatically writes IaC code
|
||||
2. [Code Review] — Automated security/best practice review
|
||||
3. [Azure Deployment] — Actual resource creation (optional)
|
||||
|
||||
Shall we proceed? (If you'd like just the code without deployment, let me know)
|
||||
```
|
||||
|
||||
Once the user approves, collect information in the following order.
|
||||
**Since `az account show` + `az account list` + `az group list` were already completed during preload, subscription/RG choices can be presented immediately.**
|
||||
|
||||
**Step 1: Azure Login Verification**
|
||||
|
||||
The `az account show` result is already available from preload. No additional call needed.
|
||||
|
||||
- If logged in → Move to Step 2
|
||||
- If not logged in → Guide the user:
|
||||
```
|
||||
Azure CLI login is required. Please run the following command in your terminal:
|
||||
az login
|
||||
Please let me know once completed.
|
||||
```
|
||||
|
||||
**Step 2: Subscription Selection**
|
||||
|
||||
The `az account list` result is already available from preload. No additional call needed.
|
||||
|
||||
Provide up to 4 subscriptions from the query results as `ask_user` choices.
|
||||
If there are 5 or more, include the 3-4 most frequently used subscriptions as choices (users can also type a custom input).
|
||||
Once the user selects, execute `az account set --subscription "<ID>"`.
|
||||
|
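One way to turn the preloaded `az account list --output json` result into at most four choices is sketched below. The CLI output does not carry usage frequency, so this sketch assumes the subscription flagged `isDefault` is the best first choice; `subscription_choices` is a hypothetical helper name.

```python
import json


def subscription_choices(account_list_json: str, limit: int = 4) -> list[str]:
    """Build up to `limit` display strings from `az account list` JSON output."""
    subs = json.loads(account_list_json)
    # Assumption: the CLI default goes first; the sort is stable, so the
    # remaining subscriptions keep the order the CLI returned them in.
    subs.sort(key=lambda s: not s.get("isDefault", False))
    return [f"{s['name']} ({s['id']})" for s in subs[:limit]]
```

The user can still type a subscription that is not among the listed choices, as described above.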
||||
**Step 3: Resource Group Confirmation**
|
||||
|
||||
The `az group list` result is already available from preload. No additional call needed.
|
||||
|
||||
Provide up to 4 existing resource groups from the list as `ask_user` choices.
|
||||
If the user selects an existing group, use it as-is; if they type a new name as custom input, create it during Phase 4 deployment.
|
||||
|
||||
**Required confirmed items:**
|
||||
- [ ] Service list and SKUs
|
||||
- [ ] Networking method (Private Endpoint usage)
|
||||
- [ ] Subscription ID (confirmed in Step 2)
|
||||
- [ ] Resource group name (confirmed in Step 3)
|
||||
- [ ] Location (confirmed with user — regional availability per service verified via MS Docs)
|
||||
|
||||
---
|
||||
|
||||
## 🚨 Phase 1 Completion Checklist — Required Verification Before Phase 2 Entry
|
||||
|
||||
Before leaving Phase 1, verify **ALL** items below. If any are incomplete, do NOT proceed to Phase 2.
|
||||
|
||||
| # | Item | Verification Method |
|
||||
|---|------|---------------------|
|
||||
| 1 | All required specs confirmed | Project name, services, SKUs, region, and networking method are all confirmed |
|
||||
| 2 | Fact check completed | MS Docs cross-verification has been performed |
|
||||
| 3 | **Diagram generated** | `01_arch_diagram_draft.html` file has been generated using the built-in diagram engine |
|
||||
| 4 | **Configuration table shown** | Detailed table with Service/Type/SKU/Details displayed to user in report format |
|
||||
| 5 | **User reviewed diagram** | Browser auto-open + report format + "anything to change?" question asked |
|
||||
| 6 | User final approval | User confirmed no changes, then selected "proceed to next step" |
|
||||
|
||||
**⚠️ Do NOT ask item 6 while items 3-5 are incomplete.** The flow must be: diagram → table → ask changes → confirm → next step.
|
||||
|
||||
---
|
||||
|
||||
## Phase 2 Handoff: Bicep Generation Agent
|
||||
|
||||
Once the user agrees to proceed, read the `references/bicep-generator.md` instructions and generate the Bicep template.
|
||||
Alternatively, this can be delegated to a separate sub-agent.
|
||||
|
||||
**Sensitive Information Handling Principle (NEVER violate):**
|
||||
- NEVER ask for VM passwords, API keys, or other sensitive values in chat, and NEVER store them in parameter files
|
||||
- During code review, if sensitive values are found in plaintext in `main.bicepparam`, remove them immediately
|
||||
|
||||
**🔹 User-Input Sensitive Values Like VM Passwords — Complexity Validation Required:**
|
||||
|
||||
When the user inputs a VM admin password or similar, validate complexity requirements **before** sending to Azure.
|
||||
An Azure VM admin password must satisfy ALL of the following conditions:
|
||||
- 12 characters or more
|
||||
- Contains at least 3 of: uppercase letters, lowercase letters, numbers, special characters
|
||||
|
||||
**On validation failure:** Do NOT attempt deployment; immediately ask the user to re-enter:
|
||||
> **⚠️ The password does not meet Azure complexity requirements.** It must be 12 characters or more and contain at least 3 of: uppercase + lowercase + numbers + special characters.
|
||||
|
||||
**NEVER do this:**
|
||||
- Warn "it may not meet requirements" but attempt deployment anyway — **MUST block**
|
||||
- Send to Azure without complexity validation, causing deployment failure
|
||||
|
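The two complexity conditions above can be expressed as a small pre-deployment check. This is a sketch only: the function name is hypothetical, and treating Python's `string.punctuation` as the set of "special characters" is an assumption rather than Azure's exact definition.

```python
import string


def meets_vm_password_rules(password: str) -> bool:
    """True if the password has 12+ characters and at least 3 of 4 character classes."""
    if len(password) < 12:
        return False
    classes = [
        any(c.isupper() for c in password),
        any(c.islower() for c in password),
        any(c.isdigit() for c in password),
        any(c in string.punctuation for c in password),  # assumed "special" set
    ]
    return sum(classes) >= 3
```

A `False` result corresponds to the blocking path: ask the user to re-enter the value instead of attempting deployment.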
||||
**🚨 `@secure()` Parameter and `.bicepparam` Compatibility Principle:**
|
||||
|
||||
When a `.bicepparam` file has a `using './main.bicep'` directive, additional `--parameters` flags CANNOT be used together with `az deployment group what-if/create`.
|
||||
Therefore, `@secure()` parameter handling follows these rules:
|
||||
|
||||
1. **`@secure()` parameters MUST have default values** — Use Bicep functions like `newGuid()`, `uniqueString()`
|
||||
```bicep
|
||||
@secure()
|
||||
param sqlAdminPassword string = newGuid() // Auto-generated at deployment, store in Key Vault if needed
|
||||
```
|
||||
2. **If there are `@secure()` parameters that require user-specified values:**
|
||||
- Do NOT use `.bicepparam` file; instead use `--template-file` + `--parameters` combination
|
||||
- Or generate a separate JSON parameter file (`main.parameters.json`)
|
||||
```powershell
|
||||
# When .bicepparam cannot be used — substitute with JSON parameter file
|
||||
az deployment group what-if `
--resource-group "<RG_NAME>" `
|
||||
--template-file main.bicep `
|
||||
--parameters main.parameters.json `
|
||||
--parameters sqlAdminPassword='user-input-value'
|
||||
```
|
||||
3. **Do NOT use `.bicepparam` and `--parameters` simultaneously in a deployment command**
|
||||
```
|
||||
❌ az deployment group create --parameters main.bicepparam --parameters key=value
|
||||
✅ az deployment group create --parameters main.bicepparam
|
||||
✅ az deployment group create --template-file main.bicep --parameters main.parameters.json --parameters key=value
|
||||
```
|
||||
|
||||
**Decision criteria:**
|
||||
- All `@secure()` parameters have default values (newGuid, etc.) → `.bicepparam` can be used
|
||||
- Any `@secure()` parameter requires user input → Use JSON parameter file instead of `.bicepparam`
|
||||
|
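The decision criteria above amount to a single branch when assembling the command line. The sketch below builds the argument list for either `what-if` or `create`; `build_deploy_args` is a hypothetical helper, and the file names follow the examples in this section.

```python
def build_deploy_args(action, resource_group, secure_overrides):
    """Build `az deployment group <action>` arguments.

    action: "what-if" or "create"
    secure_overrides: {param_name: value} for @secure() parameters that need
    user input; pass an empty dict when every @secure() parameter has a default.
    """
    args = ["az", "deployment", "group", action, "--resource-group", resource_group]
    if not secure_overrides:
        # All @secure() parameters have defaults: .bicepparam alone is enough.
        args += ["--parameters", "main.bicepparam"]
    else:
        # User-supplied secure values: switch to template file + JSON parameters.
        args += ["--template-file", "main.bicep", "--parameters", "main.parameters.json"]
        for name, value in secure_overrides.items():
            args += ["--parameters", f"{name}={value}"]
    return args
```

Because the secure values end up in the argument list, the list should not be logged or echoed back to the user.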
||||
**When MS Docs fetch fails:**
|
||||
- If web_fetch fails due to rate limiting, etc., MUST notify the user:
|
||||
```
|
||||
⚠️ MS Docs API version lookup failed. Generating with the last known stable version.
|
||||
Verifying the actual latest version before deployment is recommended.
|
||||
Shall we continue?
|
||||
```
|
||||
- Do NOT silently proceed with a hardcoded version without user approval
|
||||
|
||||
**Pre-Bicep generation reference files:**
|
||||
- `references/service-gotchas.md` — Required properties, common mistakes, PE groupId/DNS Zone mapping
|
||||
- `references/ai-data.md` — AI/Data service configuration guide (v1 domain)
|
||||
- `references/azure-common-patterns.md` — PE/security/naming common patterns
|
||||
- `references/azure-dynamic-sources.md` — MS Docs URL registry (for API version fetch)
|
||||
- For services not covered in the above files, directly fetch MS Docs to verify resource types, properties, and PE mappings
|
||||
|
||||
**Output structure:**
|
||||
```
|
||||
<project-name>/
|
||||
├── main.bicep # Main orchestration
|
||||
├── main.bicepparam # Parameters (environment-specific values)
|
||||
└── modules/
|
||||
├── network.bicep # VNet, Subnet (including private endpoint subnet)
|
||||
├── ai.bicep # AI services (configured per user requirements)
|
||||
├── storage.bicep # ADLS Gen2 (isHnsEnabled: true)
|
||||
├── fabric.bicep # Microsoft Fabric (if needed)
|
||||
├── keyvault.bicep # Key Vault
|
||||
└── private-endpoints.bicep # All PEs + DNS Zones
|
||||
```
|
||||
|
||||
**Bicep mandatory principles:**
|
||||
- Parameterize all resource names — `param openAiName string = 'oai-${uniqueString(resourceGroup().id)}'`
|
||||
- Private services MUST have `publicNetworkAccess: 'Disabled'`
|
||||
- Set `privateEndpointNetworkPolicies: 'Disabled'` on pe-subnet
|
||||
- Private DNS Zone + VNet Link + DNS Zone Group — all 3 required
|
||||
- When using Microsoft Foundry, **Foundry Project (`accounts/projects`) MUST be created alongside** — without it, the portal is unusable
|
||||
- ADLS Gen2 MUST have `isHnsEnabled: true` (omitting this creates a regular Blob Storage account)
|
||||
- Store secrets in Key Vault, reference via `@secure()` parameters
|
||||
- Add English comments explaining the purpose of each section
|
||||
|
||||
Immediately transition to Phase 3 after generation is complete.
|
||||
|
||||
---
|
||||
|
||||
## Phase 3 Handoff: Bicep Review Agent
|
||||
|
||||
Review according to `references/bicep-reviewer.md` instructions.
|
||||
|
||||
**⚠️ Key Point: Do NOT just visually inspect and say "pass". You MUST run `az bicep build` to verify actual compilation results.**
|
||||
|
||||
```powershell
|
||||
az bicep build --file main.bicep 2>&1
|
||||
```
|
||||
|
||||
1. Compilation errors/warnings → Fix
|
||||
2. Checklist review → Fix
|
||||
3. Re-compile to confirm
|
||||
4. Report results (including compilation results)
|
||||
|
||||
For detailed checklists and fix procedures, see `references/bicep-reviewer.md`.
|
||||
|
||||
After review is complete, show the user the results before transitioning to Phase 4, and **MUST guide the user on the next steps.**
|
||||
|
||||
**🚨 Required Report Format When Phase 3 Is Complete:**
|
||||
|
||||
```
|
||||
## Bicep Code Review Complete
|
||||
|
||||
[Review result summary — bicep-reviewer.md Step 6 format]
|
||||
|
||||
---
|
||||
|
||||
**Next Step: Phase 4 (Azure Deployment)**
|
||||
|
||||
The review is complete. The following steps will proceed:
|
||||
1. **What-if Validation** — Preview planned resources without making actual changes
|
||||
2. **Preview Diagram** — Architecture visualization based on What-if results (02_arch_diagram_preview.html)
|
||||
3. **Actual Deployment** — Create resources in Azure after user confirmation
|
||||
|
||||
Shall we proceed with deployment? (If you'd like just the code without deployment, let me know)
|
||||
```
|
||||
|
||||
**NEVER do this:**
|
||||
- Completing Phase 3 and just providing the `az deployment group create` command without further guidance
|
||||
- Deploying directly without What-if validation, or telling the user to run commands themselves
|
||||
- Skipping the Phase 4 steps (What-if → Preview Diagram → Deployment)
|
||||
@@ -0,0 +1,318 @@
|
||||
# Phase 4: Deployment Agent
|
||||
|
||||
This file contains detailed instructions for Phase 4. Read and follow this file when the user approves deployment after Phase 3 (code review) is complete.
|
||||
|
||||
---
|
||||
|
||||
**🚨🚨🚨 Phase 4 Mandatory Execution Order — Never Skip Any Step 🚨🚨🚨**
|
||||
|
||||
The following 5 steps must be executed **strictly in order**. No step may be omitted or skipped.
|
||||
Even if the user requests deployment with "deploy it", "go ahead", "do it", etc., always proceed from Step 1 in order.
|
||||
|
||||
```
|
||||
Step 1: Verify prerequisites (az login, subscription, resource group)
|
||||
↓
|
||||
Step 2: What-if validation (az deployment group what-if) ← Must execute
|
||||
↓
|
||||
Step 3: Generate preview diagram (02_arch_diagram_preview.html) ← Must generate
|
||||
↓
|
||||
Step 4: Actual deployment after user final confirmation (az deployment group create)
|
||||
↓
|
||||
Step 5: Generate deployment result diagram (03_arch_diagram_result.html)
|
||||
```
|
||||
|
||||
**Never do the following:**
|
||||
- Execute `az deployment group create` directly without What-if
|
||||
- Skip generating the preview diagram (`02_arch_diagram_preview.html`)
|
||||
- Proceed with deployment without showing What-if results to the user
|
||||
- Only provide `az` commands for the user to run manually
|
||||
|
||||
---
|
||||
|
||||
### Step 1: Verify Prerequisites
|
||||
|
||||
```powershell
|
||||
# Verify az CLI installation and login
|
||||
az account show 2>&1
|
||||
```
|
||||
|
||||
If not logged in, ask the user to run `az login`.
|
||||
The agent must never enter or store credentials directly.
|
||||
|
||||
Create resource group:
|
||||
```powershell
|
||||
az group create --name "<RG_NAME>" --location "<LOCATION>" # Location confirmed in Phase 1
|
||||
```
|
||||
→ Proceed to next step after confirming success
|
||||
|
||||
### Step 2: Validate → What-if Validation — 🚨 Mandatory
|
||||
|
||||
**Do not skip this step. Always execute it no matter how urgently the user requests deployment.**
|
||||
|
||||
**Step 2-A: Run Validate First (Quick Pre-validation)**
|
||||
|
||||
`what-if` can **hang indefinitely without error messages** when there are Azure policy violations, resource reference errors, etc.
|
||||
To prevent this, **always run `validate` first**. Validate returns errors quickly.
|
||||
|
||||
```powershell
|
||||
# validate — Quickly catches policy violations, schema errors, parameter issues
|
||||
az deployment group validate `
|
||||
--resource-group "<RG_NAME>" `
|
||||
--parameters main.bicepparam
|
||||
```
|
||||
|
||||
- **Validate succeeds** → Proceed to Step 2-B (what-if)
|
||||
- **Validate fails** → Analyze error messages, fix Bicep, recompile, re-validate
|
||||
- Azure Policy violation (`RequestDisallowedByPolicy`) → Reflect policy requirements in Bicep (e.g., `azureADOnlyAuthentication: true`)
|
||||
- Schema error → Fix API version/properties
|
||||
- Parameter error → Fix parameter file
|
||||
|
||||
**Step 2-B: Run What-if**
|
||||
|
||||
Run what-if after validate passes.
|
||||
|
||||
**Choose parameter passing method:**
|
||||
- If all `@secure()` parameters have default values → Use `.bicepparam`
|
||||
- If `@secure()` parameters require user input → Use `--template-file` + JSON parameter file
|
||||
|
||||
```powershell
|
||||
# Method 1: Use .bicepparam (when all @secure() parameters have defaults)
|
||||
az deployment group what-if `
|
||||
--resource-group "<RG_NAME>" `
|
||||
--parameters main.bicepparam
|
||||
|
||||
# Method 2: Use JSON parameter file (when @secure() parameters require user input)
|
||||
az deployment group what-if `
|
||||
--resource-group "<RG_NAME>" `
|
||||
--template-file main.bicep `
|
||||
--parameters main.parameters.json `
|
||||
--parameters secureParam='value'
|
||||
```
|
||||
→ Summarize the What-if results and present them to the user.
|
||||
|
||||
**⏱️ What-if Execution Method and Timeout Handling:**
|
||||
|
||||
What-if performs resource validation on the Azure server side, so it may take time depending on the service/region.
|
||||
**Always execute with `initial_wait: 300` (5 minutes).** If not completed within 5 minutes, it automatically times out.
|
||||
|
||||
```powershell
|
||||
# Always set initial_wait: 300 when calling the powershell tool
|
||||
# mode: "sync", initial_wait: 300
|
||||
az deployment group what-if `
|
||||
--resource-group "<RG_NAME>" `
|
||||
--parameters main.bicepparam
|
||||
```
|
||||
|
||||
**Completed within 5 minutes** → Proceed normally (summarize results → preview diagram → deployment confirmation)
|
||||
|
||||
**Not completed within 5 minutes (timeout)** → Immediately stop with `stop_powershell` and offer choices to the user:
|
||||
|
||||
```
|
||||
ask_user({
|
||||
question: "What-if validation did not complete within 5 minutes. The Azure server response is delayed. How would you like to proceed?",
|
||||
choices: [
|
||||
"Retry (Recommended)",
|
||||
"Skip What-if and deploy directly"
|
||||
]
|
||||
})
|
||||
```
|
||||
|
||||
**If "Retry" is selected:** Re-execute the same command with `initial_wait: 300`. Retry at most 2 times.
|
||||
**If "Skip What-if and deploy directly" is selected:**
|
||||
- Generate the preview diagram based on the Phase 1 draft
|
||||
- Inform the user of the risks:
|
||||
> **⚠️ Deploying without What-if validation.** Unexpected resource changes may occur. Please verify in the Azure Portal after deployment.
|
||||
|
||||
**Never do the following:**
|
||||
- Execute without setting `initial_wait`, causing indefinite waiting
|
||||
- Let the agent arbitrarily decide "what-if is optional" and skip it
|
||||
- Automatically switch to deployment without asking the user on timeout
|
||||
- Skip what-if for reasons like "deployment is faster"
|
||||
|
||||
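
The timeout rules above (5-minute limit, user decides on timeout, at most 2 retries) can be sketched as a small control loop. This is only an illustration in Python, not the skill's actual tool wiring: `run_what_if`, the `ask_user` callback signature, and the `"stopped"` outcome after retries run out are assumptions, since the source does not say what happens once both retries also time out.

```python
import subprocess

WHAT_IF_TIMEOUT_S = 300  # corresponds to initial_wait: 300 (5 minutes)
MAX_RETRIES = 2          # "Retry up to 2 times maximum"


def run_what_if(cmd, ask_user, runner=subprocess.run):
    """Run `cmd` with a hard timeout and return (status, output).

    status is "completed", "failed", "skipped" (user chose to deploy
    without What-if), or "stopped" (hypothetical outcome once retries
    are exhausted; the source does not define this case).
    """
    retries = 0
    while True:
        try:
            result = runner(cmd, capture_output=True, text=True,
                            timeout=WHAT_IF_TIMEOUT_S)
        except subprocess.TimeoutExpired:
            # Never switch to deployment silently: the user decides.
            choice = ask_user(can_retry=retries < MAX_RETRIES)
            if choice == "skip":
                return "skipped", None
            if choice == "retry" and retries < MAX_RETRIES:
                retries += 1
                continue
            return "stopped", None
        if result.returncode != 0:
            return "failed", result.stderr
        return "completed", result.stdout
```

Passing the runner as a parameter keeps the loop testable without calling Azure; in real use `cmd` would be the `az deployment group what-if` argument list.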
### Step 3: Preview Diagram Based on What-if Results — 🚨 Mandatory

**Do not skip this step. Always generate the preview diagram when What-if succeeds.**

Regenerate the diagram using the actual resources to be deployed (resource names, types, locations, counts) from the What-if results.
Keep the draft from Phase 1 (`01_arch_diagram_draft.html`) as-is, and generate the preview as `02_arch_diagram_preview.html`.
The draft can be reopened at any time.

```
## Architecture to Be Deployed (Based on What-if)

[Interactive diagram link — 02_arch_diagram_preview.html]
(Design draft: 01_arch_diagram_draft.html)

Resources to be created (N items):
[What-if results summary table]

Deploy these resources? (Yes/No)
```

Proceed to Step 4 when the user confirms. **Do not proceed to deployment without the preview diagram.**

### Step 4: Actual Deployment

Execute only when the user has reviewed the preview diagram and What-if results and approved the deployment.
**Use the same parameter passing method used in What-if.**

```powershell
$deployName = "deploy-$(Get-Date -Format 'yyyyMMdd-HHmmss')"

# Method 1: Use .bicepparam
az deployment group create `
  --resource-group "<RG_NAME>" `
  --parameters main.bicepparam `
  --name $deployName `
  2>&1 | Tee-Object -FilePath deployment.log

# Method 2: Use JSON parameter file
az deployment group create `
  --resource-group "<RG_NAME>" `
  --template-file main.bicep `
  --parameters main.parameters.json `
  --name $deployName `
  2>&1 | Tee-Object -FilePath deployment.log
```

Periodically monitor progress during deployment:
```powershell
az deployment group show `
  --resource-group "<RG_NAME>" `
  --name "<DEPLOYMENT_NAME>" `
  --query "{status:properties.provisioningState, duration:properties.duration}" `
  -o table
```

### Handling Deployment Failures

When deployment fails, some resources may remain in a 'Failed' state. Redeploying in this state causes errors like `AccountIsNotSucceeded`.

**⚠️ Resource deletion is a destructive command. Always explain the situation to the user and obtain approval before executing.**

```
[Resource name] failed during deployment.
To redeploy, the failed resources must be deleted first.

Delete and redeploy? (Yes/No)
```

Delete failed resources and redeploy once the user approves.

**🔹 Handling Soft-deleted Resources (Prevent Redeployment Blocking):**

When a resource group is deleted after a failed deployment, Cognitive Services (Foundry), Key Vault, etc. remain in a **soft-delete state**.
Redeploying with the same name then fails with errors such as `FlagMustBeSetForRestore` or `Conflict`.

**Always check before redeployment:**
```powershell
# Check soft-deleted Cognitive Services
az cognitiveservices account list-deleted -o table

# Check soft-deleted Key Vault
az keyvault list-deleted -o table
```

**Resolution options (provide choices to the user):**
```
ask_user({
  question: "Soft-deleted resources from a previous deployment were found. How would you like to handle this?",
  choices: [
    "Purge and redeploy (Recommended) - Clean delete then create new",
    "Redeploy in restore mode - Recover existing resources"
  ]
})
```

**Caution — Key Vault with `enablePurgeProtection: true`:**
- Cannot be purged (must wait until retention period expires)
- Cannot recreate with the same name
- **Solution: Change the Key Vault name** and redeploy (e.g., add timestamp to `uniqueString()` seed)
- Explain the situation to the user and guide them on the name change

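
To illustrate the name-change workaround, here is a rough Python sketch of deriving a fresh Key Vault name from a seed plus a timestamp. It is a stand-in only: it uses SHA-256 rather than Bicep's `uniqueString()`, and the function name `key_vault_name`, the `prefix` and `seed` parameters, and the 13-character suffix length are all assumptions made for this example. The regular expression encodes the Key Vault naming rules (3-24 characters, letters, digits and hyphens, starts with a letter, ends with a letter or digit, no consecutive hyphens).

```python
import hashlib
import re
from datetime import datetime, timezone

# Key Vault naming rules expressed as a single pattern.
KV_NAME_RE = re.compile(r"[a-zA-Z](?!.*--)[a-zA-Z0-9-]{1,22}[a-zA-Z0-9]")


def key_vault_name(prefix, seed, now=None):
    """Build a new vault name from `seed` plus a timestamp (hypothetical helper)."""
    now = now or datetime.now(timezone.utc)
    stamp = now.strftime("%Y%m%d%H%M%S")
    # Hash seed + timestamp so a redeploy gets a name that does not
    # collide with the soft-deleted, purge-protected vault.
    suffix = hashlib.sha256(f"{seed}-{stamp}".encode("utf-8")).hexdigest()[:13]
    name = f"{prefix}-{suffix}"
    if not KV_NAME_RE.fullmatch(name):
        raise ValueError(f"invalid Key Vault name: {name!r}")
    return name
```

In Bicep the same idea would be expressed by passing a timestamp parameter into the `uniqueString()` seed; the trade-off is that the name then changes on every deployment that supplies a new timestamp.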
### Step 5: Deployment Complete — Generate Diagram from Actual Resources and Report

Once deployment is complete, query the actually deployed resources and generate the final architecture diagram.

**Step 1: Query Deployed Resources**
```powershell
az resource list --resource-group "<RG_NAME>" --output json
```

**Step 2: Generate Diagram from Actual Resources**

Extract resource names, types, SKUs, and endpoints from the query results and generate the final diagram using the built-in diagram engine.
Be careful with file names to avoid overwriting previous diagrams:
- `01_arch_diagram_draft.html` — Design draft (keep)
- `02_arch_diagram_preview.html` — What-if preview (keep)
- `03_arch_diagram_result.html` — Deployment result final version

Populate the diagram's services JSON with actual deployed resource information:
- `name`: Actual resource name (e.g., `foundry-duru57kxgqzxs`)
- `sku`: Actual SKU
- `details`: Actual values such as endpoints, location, etc.

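
As a rough sketch of that mapping, the Python below turns parsed `az resource list --output json` output into entries carrying the three fields named above. The helper name `resources_to_services` is hypothetical, and the diagram engine's full services schema is not shown in this section, so any further fields it needs would have to be added. Endpoints are also not part of `az resource list` output and would need a per-resource query.

```python
def resources_to_services(resources):
    """Map `az resource list` entries to name/sku/details dicts (illustrative only)."""
    services = []
    for res in resources:
        # `sku` is an object (with a `name` key) for some resource types
        # and null for others, so guard against None.
        sku_name = (res.get("sku") or {}).get("name", "")
        details = [
            f"type: {res.get('type', '')}",
            f"location: {res.get('location', '')}",
        ]
        services.append({"name": res["name"], "sku": sku_name, "details": details})
    return services


# Sample input shaped like two entries of the CLI output (values are made up).
sample_resources = [
    {"name": "kv-demo", "type": "Microsoft.KeyVault/vaults",
     "location": "koreacentral", "sku": None},
    {"name": "stdemo", "type": "Microsoft.Storage/storageAccounts",
     "location": "koreacentral", "sku": {"name": "Standard_LRS"}},
]
sample_services = resources_to_services(sample_resources)
```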
**Step 3: Report**
```
## Deployment Complete!

[Interactive architecture diagram — 03_arch_diagram_result.html]
(Design draft: 01_arch_diagram_draft.html | What-if preview: 02_arch_diagram_preview.html)

Created resources (N items):
[Dynamically extracted resource names, types, and endpoints from actual deployment results]

## Next Steps
1. Verify resources in Azure Portal
2. Check Private Endpoint connection status
3. Additional configuration guidance if needed

## Cleanup Command (If Needed)
az group delete --name <RG_NAME> --yes --no-wait
```

---

### Handling Architecture Change Requests After Deployment

**When the user requests resource additions/changes/deletions after deployment is complete, do NOT go directly to Bicep/deployment.**
Always return to Phase 1 and update the architecture first.

**Process:**

1. **Confirm user intent** — Ask first whether they want to add to the existing deployed architecture:
   ```
   Would you like to add a VM to the currently deployed architecture?
   Current configuration: [Deployed services summary]
   ```

2. **Return to Phase 1 — Apply Delta Confirmation Rule**
   - Use the existing deployment result (`03_arch_diagram_result.html`) as the current state baseline
   - Verify required fields for new services (SKU, networking, region availability, etc.)
   - Confirm undecided items via ask_user
   - Fact-check (MS Docs fetch + cross-validation)

3. **Generate Updated Architecture Diagram**
   - Combine existing deployed resources + new resources into `04_arch_diagram_update_draft.html`
   - Show to the user and get confirmation:
   ```
   ## Updated Architecture

   [Interactive diagram — 04_arch_diagram_update_draft.html]
   (Previous deployment result: 03_arch_diagram_result.html)

   **Changes:**
   - Added: [New services list]
   - Removed: [Removed services list] (if any)

   Proceed with this configuration?
   ```

4. **After confirmation, proceed through Phase 2 → 3 → 4 in order**
   - Incrementally add new resource modules to existing Bicep
   - Review → What-if → Deploy (incremental deployment)

**Never do the following:**
- Jump directly to Bicep generation without updating the architecture diagram when a change is requested after deployment
- Ignore the existing deployment state and create new resources in isolation
- Proceed without confirming with the user whether to add to the existing architecture
@@ -0,0 +1,113 @@
# Service Gotchas (Stable)

Per-service summary of **non-intuitive required properties**, **common mistakes**, and **PE mappings**.
Only near-immutable patterns are included here. Dynamic values such as API version, SKU lists, and region are not included.

---

## 1. Required Properties (Deployment Failure or Functional Issues If Omitted)

| Service | Required Property | Result If Omitted | Notes |
|---------|------------------|-------------------|-------|
| ADLS Gen2 | `isHnsEnabled: true` | Becomes regular Blob Storage. Cannot be reversed | `kind: 'StorageV2'` required |
| Storage Account | No special characters/hyphens in name | Deployment failure | Lowercase+numbers only, 3-24 characters |
| Foundry (AIServices) | `customSubDomainName: foundryName` | Cannot create Project, cannot change after creation → Must delete and recreate resource | Globally unique value |
| Foundry (AIServices) | `allowProjectManagement: true` | Cannot create Foundry Project | `kind: 'AIServices'` |
| Foundry (AIServices) | `identity: { type: 'SystemAssigned' }` | Project creation fails | |
| Foundry Project | Must be created as a set with Foundry resource | Cannot use from portal | `accounts/projects` |
| Key Vault | `enableRbacAuthorization: true` | Risk of mixed Access Policy usage | |
| Key Vault | `enablePurgeProtection: true` | Required for production | |
| Fabric Capacity | `administration.members` required | Deployment failure | Admin email |
| PE Subnet | `privateEndpointNetworkPolicies: 'Disabled'` | PE deployment failure | |
| PE DNS Zone | `registrationEnabled: false` (VNet Link) | Possible DNS conflict | |
| PE Configuration | 4-component set (PE + DNS Zone + VNet Link + Zone Group) | DNS resolution fails even with PE present | |

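
The Storage Account naming rule in the table (lowercase letters and digits only, 3-24 characters) is easy to check before deployment. A minimal Python sketch follows; the helper name `is_valid_storage_account_name` is made up for illustration, and the check covers only the character and length rule, not global uniqueness.

```python
import re

# Lowercase letters and digits only, 3 to 24 characters.
STORAGE_NAME_RE = re.compile(r"[a-z0-9]{3,24}")


def is_valid_storage_account_name(name):
    """Return True when `name` satisfies the character and length rule."""
    return STORAGE_NAME_RE.fullmatch(name) is not None
```

For example, `st-my-storage` is rejected because of the hyphens, while `stmystorage` passes.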
---

## 2. PE groupId & DNS Zone Mapping (Key Services)

The mappings below are stable, but re-verify from the PE DNS integration document in `azure-dynamic-sources.md` when adding new services.

| Service | groupId | Private DNS Zone |
|---------|---------|-----------------|
| Azure OpenAI / CognitiveServices | `account` | `privatelink.cognitiveservices.azure.com` |
| ⚠️ (Foundry/AIServices additional) | `account` | `privatelink.openai.azure.com` ← **Both zones must be included in DNS Zone Group. OpenAI API DNS resolution fails if omitted** |
| Azure AI Search | `searchService` | `privatelink.search.windows.net` |
| Storage (Blob/ADLS) | `blob` | `privatelink.blob.core.windows.net` |
| Storage (DFS/ADLS Gen2) | `dfs` | `privatelink.dfs.core.windows.net` |
| Key Vault | `vault` | `privatelink.vaultcore.azure.net` |
| Azure ML / AI Hub | `amlworkspace` | `privatelink.api.azureml.ms` |
| Container Registry | `registry` | `privatelink.azurecr.io` |
| Cosmos DB (SQL) | `Sql` | `privatelink.documents.azure.com` |
| Azure Cache for Redis | `redisCache` | `privatelink.redis.cache.windows.net` |
| Data Factory | `dataFactory` | `privatelink.datafactory.azure.net` |
| API Management | `Gateway` | `privatelink.azure-api.net` |
| Event Hub | `namespace` | `privatelink.servicebus.windows.net` |
| Service Bus | `namespace` | `privatelink.servicebus.windows.net` |
| Monitor (AMPLS) | ⚠️ Complex configuration — see below | ⚠️ Multiple DNS Zones required — see below |

> **ADLS Gen2 Note**: When `isHnsEnabled: true`, **both `blob` and `dfs` PEs are required**.
> - With only the `blob` PE, Blob API works, but Data Lake operations (file system creation, directory manipulation, `abfss://` protocol) will fail.
> - DFS PE: groupId `dfs`, DNS Zone `privatelink.dfs.core.windows.net`
>
> **⚠️ Azure Monitor Private Link (AMPLS) Note**: Azure Monitor cannot be configured with a single PE + single DNS Zone. It connects through Azure Monitor Private Link Scope (AMPLS), and all **5 DNS Zones** are required:
> - `privatelink.monitor.azure.com`
> - `privatelink.oms.opinsights.azure.com`
> - `privatelink.ods.opinsights.azure.com`
> - `privatelink.agentsvc.azure-automation.net`
> - `privatelink.blob.core.windows.net` (for Log Analytics data ingestion)
>
> This mapping is complex and subject to change, so always fetch and verify MS Docs when configuring Monitor PE:
> https://learn.microsoft.com/en-us/azure/azure-monitor/logs/private-link-configure

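
A small lookup can make the two special cases in this mapping explicit: Foundry needs both DNS zones on its `account` PE, and ADLS Gen2 needs a `dfs` PE in addition to `blob`. The Python below is a sketch with values copied from the table above; the dictionary keys (`foundry`, `storage`, and so on) and the function name `required_private_endpoints` are hypothetical labels chosen for this example, and only a few services are included.

```python
# (groupId, [private DNS zones]) per service, copied from the mapping table.
PE_MAP = {
    "foundry": [("account", ["privatelink.cognitiveservices.azure.com",
                             "privatelink.openai.azure.com"])],
    "ai_search": [("searchService", ["privatelink.search.windows.net"])],
    "key_vault": [("vault", ["privatelink.vaultcore.azure.net"])],
    "storage": [("blob", ["privatelink.blob.core.windows.net"])],
}
DFS_ENDPOINT = ("dfs", ["privatelink.dfs.core.windows.net"])


def required_private_endpoints(service, is_hns_enabled=False):
    """Return the (groupId, zones) pairs a service needs."""
    endpoints = list(PE_MAP[service])
    # ADLS Gen2 (isHnsEnabled: true) needs the dfs PE as well as blob.
    if service == "storage" and is_hns_enabled:
        endpoints.append(DFS_ENDPOINT)
    return endpoints
```

Keeping the data in one table-shaped structure means a re-verified mapping only has to be updated in one place.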
---

## 3. Common Mistakes Checklist

| Item | ❌ Incorrect Example | ✅ Correct Example |
|------|---------------------|-------------------|
| ADLS Gen2 HNS | `isHnsEnabled` omitted or `false` | `isHnsEnabled: true` |
| PE Subnet | Policy not set | `privateEndpointNetworkPolicies: 'Disabled'` |
| DNS Zone Group | Only PE created | PE + DNS Zone + VNet Link + DNS Zone Group |
| Foundry resource | `kind: 'OpenAI'` | `kind: 'AIServices'` + `allowProjectManagement: true` |
| Foundry resource | `customSubDomainName` omitted | `customSubDomainName: foundryName` — Cannot change after creation |
| Foundry Project | Only Foundry exists without Project | Must create as a set |
| Key Vault auth | Access Policy | `enableRbacAuthorization: true` |
| Public network | Not configured | `publicNetworkAccess: 'Disabled'` |
| Storage name | `st-my-storage` | `stmystorage` or `st${uniqueString(...)}` |
| API version | Copied from previous conversation/error | Verify latest stable from MS Docs |
| Region | Hardcoded (`'eastus'`) | Pass as parameter (`param location`) |
| Sensitive values | Plaintext in `.bicepparam` | `@secure()` + Key Vault reference |

---

## 4. Service Relationship Decision Rules

Described as **default selection rules** rather than absolute determinations.

### Foundry vs Azure OpenAI vs AI Hub

```
Default rules:
├─ AI/RAG workloads → Use Microsoft Foundry (kind: 'AIServices')
│   ├─ Create Foundry resource + Foundry Project as a set
│   └─ Model deployment is performed at the Foundry resource level (accounts/deployments)
│
├─ ML/open-source model training needed → Consider AI Hub (MachineLearningServices)
│   └─ Only when the user explicitly requests it or features not supported in Foundry are needed
│
└─ Standalone Azure OpenAI resource →
    Consider only when the user explicitly requests it or
    official documentation requires a separate resource
```

> These rules are a **default selection guide** reflecting current MS recommendations.
> Azure product relationships can change, so check MS Docs when uncertain.

### Monitoring

```
Default rules:
├─ Foundry (AIServices) → Application Insights not required
└─ AI Hub (MachineLearningServices) → Application Insights + Log Analytics required
```
153
skills/azure-architecture-autopilot/scripts/cli.py
Normal file
@@ -0,0 +1,153 @@
#!/usr/bin/env python3
"""CLI for azure-architecture-autopilot diagram engine."""
import argparse
import json
import sys
import os
import subprocess
import shutil
from pathlib import Path

sys.path.insert(0, os.path.dirname(os.path.abspath(__file__)))
from generator import generate_diagram


def main():
    parser = argparse.ArgumentParser(
        description="Generate interactive Azure architecture diagrams",
        prog="azure-architecture-autopilot"
    )
    parser.add_argument("-s", "--services", help="Services JSON (string or file path)")
    parser.add_argument("-c", "--connections", help="Connections JSON (string or file path)")
    parser.add_argument("-t", "--title", default="Azure Architecture", help="Diagram title")
    parser.add_argument("-o", "--output", default="azure-architecture.html", help="Output file path")
    parser.add_argument("-f", "--format", choices=["html", "png", "both"], default="html",
                        help="Output format: html (default), png, or both (html+png)")
    parser.add_argument("--vnet-info", default="", help="VNet CIDR info")
    parser.add_argument("--hierarchy", default="", help="Subscription/RG hierarchy JSON")

    args = parser.parse_args()

    if not args.services or not args.connections:
        parser.error("-s/--services and -c/--connections are required")

    services = _load_json(args.services, "services")
    connections = _load_json(args.connections, "connections")
    hierarchy = None
    if args.hierarchy:
        hierarchy = _load_json(args.hierarchy, "hierarchy")

    services = _normalize_services(services)
    connections = _normalize_connections(connections)

    html = generate_diagram(
        services=services,
        connections=connections,
        title=args.title,
        vnet_info=args.vnet_info,
        hierarchy=hierarchy,
    )

    # Determine output paths
    out = Path(args.output)
    html_path = out.with_suffix(".html")
    png_path = out.with_suffix(".png")
    svg_path = out.with_suffix(".svg")

    if args.format in ("html", "both"):
        html_path.write_text(html, encoding="utf-8")
        print(f"HTML saved: {html_path}")

    if args.format in ("png", "both"):
        # Write a temp HTML file (unless one was already saved), then screenshot it with puppeteer
        tmp_html = html_path if args.format == "both" else Path(str(png_path) + ".tmp.html")
        if args.format != "both":
            tmp_html.write_text(html, encoding="utf-8")

        success = _html_to_png(tmp_html, png_path)

        if args.format != "both" and tmp_html.exists():
            tmp_html.unlink()

        if success:
            print(f"PNG saved: {png_path}")
        else:
            print("WARNING: PNG export failed. Install puppeteer (npm i puppeteer) for PNG support.", file=sys.stderr)
            print(f"HTML saved instead: {html_path}")
            if not html_path.exists():
                html_path.write_text(html, encoding="utf-8")


def _html_to_png(html_path, png_path, width=1920, height=1080):
    """Convert HTML to PNG using puppeteer (Node.js)."""
    node = shutil.which("node")
    if not node:
        return False

    # Serialize paths as JS string literals so quotes or backslashes cannot break the script
    url_js = json.dumps(html_path.resolve().as_uri())
    out_js = json.dumps(png_path.resolve().as_posix())

    # Try multiple puppeteer locations
    script = f"""
let puppeteer;
const paths = [
  'puppeteer',
  process.env.TEMP + '/node_modules/puppeteer',
  process.env.HOME + '/node_modules/puppeteer',
  './node_modules/puppeteer'
];
for (const p of paths) {{ try {{ puppeteer = require(p); break; }} catch(e) {{}} }}
if (!puppeteer) {{ console.error('puppeteer not found'); process.exit(1); }}
(async () => {{
  const browser = await puppeteer.launch({{headless: 'new'}});
  const page = await browser.newPage();
  await page.setViewport({{width: {width}, height: {height}}});
  await page.goto({url_js}, {{waitUntil: 'networkidle0'}});
  await new Promise(r => setTimeout(r, 2000));
  await page.screenshot({{path: {out_js}}});
  await browser.close();
}})();
"""
    try:
        result = subprocess.run([node, "-e", script], capture_output=True, text=True, timeout=30)
        return result.returncode == 0 and png_path.exists()
    except (subprocess.TimeoutExpired, FileNotFoundError):
        return False


def _load_json(value, name):
    """Load JSON from string or file path. Extracts named key from combined JSON if present."""
    data = None
    if os.path.isfile(value):
        with open(value, "r", encoding="utf-8") as f:
            data = json.load(f)
    else:
        try:
            data = json.loads(value)
        except json.JSONDecodeError as e:
            print(f"ERROR: Invalid JSON for --{name}: {e}", file=sys.stderr)
            sys.exit(1)

    # If data is a dict with the named key, extract it (combined JSON file support)
    if isinstance(data, dict) and name in data:
        return data[name]
    return data


def _normalize_services(services):
    """Normalize service fields for tolerance."""
    for svc in services:
        if isinstance(svc.get("details"), str):
            svc["details"] = [svc["details"]]
        if isinstance(svc.get("private"), str):
            # bool("false") is True, so parse the text instead of casting it
            svc["private"] = svc["private"].strip().lower() in ("true", "1", "yes")
    return services


def _normalize_connections(connections):
    """Normalize connection fields for tolerance."""
    for conn in connections:
        if "type" not in conn:
            conn["type"] = "default"
    return connections


if __name__ == "__main__":
    main()
1968
skills/azure-architecture-autopilot/scripts/generator.py
Normal file
File diff suppressed because it is too large
3190
skills/azure-architecture-autopilot/scripts/icons.py
Normal file
File diff suppressed because one or more lines are too long