mirror of https://github.com/github/awesome-copilot.git synced 2026-04-12 19:25:55 +00:00

Files

Jeonghoon Lee 3e01810473 Add azure-architecture-autopilot skill 🤖🤖🤖 (#1158 )

* Add azure-architecture-autopilot skill

E2E Azure infrastructure automation skill:
- Natural language → Architecture diagram → Bicep → Deploy
- 70+ service types with 605+ official Azure icons
- Interactive HTML diagrams (drag, zoom, click, PNG export)
- Scans existing resources or designs new architecture
- Modular Bicep with RBAC, Private Endpoints, DNS
- Multi-language support (auto-detects user language)
- Zero dependencies (diagram engine embedded)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix generator.py import for flat scripts/ structure + sync README

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: whoniiii <whoniiii@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

2026-03-25 10:20:38 +11:00

25 KiB

Raw Blame History

Phase 0: Existing Resource Scanner

This file contains the detailed instructions for Phase 0. When the user requests analysis of existing Azure resources (Path B), read and follow this file.

Scan results are visualized as an architecture diagram, and subsequent natural-language modification requests from the user are routed to Phase 1.

🚨 Output Storage Path Rule: All outputs (scan JSON, diagram HTML, Bicep code) must be saved in a project folder under the current working directory (cwd). NEVER save them inside ~/.copilot/session-state/. The session-state directory is a temporary space and may be deleted when the session ends.

az account show 2>&1

If logged in → Proceed to Step 1-B
If not logged in → Ask the user to run az login

1-B: Subscription Selection (Multiple Selection Supported)

az account list --output json

Present the subscription list as ask_user choices. Multiple subscriptions can be selected:

ask_user({
  question: "Please select the Azure subscription(s) to analyze. (You can add more one at a time for multiple selections)",
  choices: [
    "sub-002 (Current default subscription) (Recommended)",
    "sub-001",
    "Analyze all subscriptions above"
  ]
})

Single subscription selected → Scan only that subscription
"Analyze all" selected → Scan all subscriptions
If the user wants additional subscriptions → Use ask_user again to add more

1-C: Scan Scope Selection (Multiple RG Selection Supported)

ask_user({
  question: "What scope of Azure resources would you like to analyze?",
  choices: [
    "Specify a particular resource group (Recommended)",
    "Select multiple resource groups",
    "All resource groups in the current subscription"
  ]
})

Specific RG → Select from the RG list or enter manually
Multiple RGs → Repeat ask_user to add RGs one at a time. Stop when the user says "that's enough." Alternatively, the user can enter multiple RGs separated by commas (e.g., rg-prod, rg-dev, rg-network)
Entire subscription → az group list → Scan all RGs (warn if there are many resources that it may take time)

Combining multiple subscriptions + multiple RGs is supported:

rg-prod from subscription A + rg-network from subscription B → Scan both and display in a single diagram

Diagram Hierarchy — Displaying Multiple Subscriptions/RGs

Single subscription + single RG: Same as before (VNet boundary only) Multiple RGs (same subscription): Dashed boundary per RG Multiple subscriptions: Two-level boundary of Subscription > RG

Pass hierarchy information in the diagram JSON:

Add subscription and resourceGroup fields to the services JSON:

{
  "id": "foundry",
  "name": "foundry-xxx",
  "type": "ai_foundry",
  "subscription": "sub-002",
  "resourceGroup": "rg-prod",
  "details": [...]
}

Pass hierarchy information via the --hierarchy parameter:

--hierarchy '[{"subscription":"sub-002","resourceGroups":["rg-prod","rg-dev"]},{"subscription":"sub-001","resourceGroups":["rg-network"]}]'

Based on this information, the diagram script will:

Multiple RGs → Represent each RG as a cluster with a dashed boundary (label: RG name)
Multiple subscriptions → Nest RG boundaries inside larger subscription boundaries
VNet boundaries are displayed inside the RG to which the VNet belongs

Step 2: Resource Scan

🚨 az CLI Output Principles:

az CLI output must always be saved to a file and then read with view. Direct terminal output may be truncated.
Bundle no more than 3 az commands per PowerShell call. Bundling too many may cause timeouts.
Use --query JMESPath to extract only the required fields and reduce output size.

# ✅ Correct approach — Save to file then read
az resource list -g "<RG>" --query "[].{name:name,type:type,kind:kind,location:location}" -o json | Set-Content -Path "$outDir/resources.json"

# ❌ Wrong approach — Direct terminal output (may be truncated)
az resource list -g "<RG>" -o json

2-A: List All Resources + Display to User

$outDir = "<project-name>/azure-scan"
New-Item -ItemType Directory -Path $outDir -Force | Out-Null

# Step 1: Basic resource list (name, type, kind, location)
az resource list -g "<RG>" --query "[].{name:name,type:type,kind:kind,location:location,id:id}" -o json | Set-Content "$outDir/resources.json"

🚨 Immediately after reading resources.json, you MUST display the full resource list table to the user:

📋 rg-<RG> Resource List (N resources)

┌─────────────────────────┬──────────────────────────────────────────────┬─────────────────┐
│ Name                    │ Type                                         │ Location        │
├─────────────────────────┼──────────────────────────────────────────────┼─────────────────┤
│ my-storage              │ Microsoft.Storage/storageAccounts             │ koreacentral    │
│ my-keyvault             │ Microsoft.KeyVault/vaults                    │ koreacentral    │
│ ...                     │ ...                                          │ ...             │
└─────────────────────────┴──────────────────────────────────────────────┴─────────────────┘

⏳ Retrieving detailed information...

Display this table first before proceeding to detailed queries. Do not make the user wait without knowing what resources exist.

2-B: Dynamic Detailed Query — Based on resources.json

Dynamically determine detailed query commands based on the resource types found in resources.json.

Do not use a hardcoded command list. Only execute commands for types that exist in resources.json, selected from the mapping table below.

Type → Detailed Query Command Mapping:

Type in resources.json	Detailed Query Command	Output File
`Microsoft.Network/virtualNetworks`	`az network vnet list -g "<RG>" --query "[].{name:name,addressSpace:addressSpace.addressPrefixes,subnets:subnets[].{name:name,prefix:addressPrefix,pePolicy:privateEndpointNetworkPolicies}}" -o json`	`vnets.json`
`Microsoft.Network/privateEndpoints`	`az network private-endpoint list -g "<RG>" --query "[].{name:name,subnetId:subnet.id,targetId:privateLinkServiceConnections[0].privateLinkServiceId,groupIds:privateLinkServiceConnections[0].groupIds,state:provisioningState}" -o json`	`pe.json`
`Microsoft.Network/networkSecurityGroups`	`az network nsg list -g "<RG>" --query "[].{name:name,location:location,subnets:subnets[].id,nics:networkInterfaces[].id}" -o json`	`nsg.json`
`Microsoft.CognitiveServices/accounts`	`az cognitiveservices account list -g "<RG>" --query "[].{name:name,kind:kind,sku:sku.name,endpoint:properties.endpoint,publicAccess:properties.publicNetworkAccess,location:location}" -o json`	`cognitive.json`
`Microsoft.Search/searchServices`	`az search service list -g "<RG>" --query "[].{name:name,sku:sku.name,publicAccess:properties.publicNetworkAccess,semanticSearch:properties.semanticSearch,location:location}" -o json 2>$null`	`search.json`
`Microsoft.Compute/virtualMachines`	`az vm list -g "<RG>" --query "[].{name:name,size:hardwareProfile.vmSize,os:storageProfile.osDisk.osType,location:location,nicIds:networkProfile.networkInterfaces[].id}" -o json`	`vms.json`
`Microsoft.Storage/storageAccounts`	`az storage account list -g "<RG>" --query "[].{name:name,sku:sku.name,kind:kind,hns:properties.isHnsEnabled,publicAccess:properties.publicNetworkAccess,location:location}" -o json`	`storage.json`
`Microsoft.KeyVault/vaults`	`az keyvault list -g "<RG>" --query "[].{name:name,location:location}" -o json 2>$null`	`keyvault.json`
`Microsoft.ContainerService/managedClusters`	`az aks list -g "<RG>" --query "[].{name:name,kubernetesVersion:kubernetesVersion,sku:sku,agentPoolProfiles:agentPoolProfiles[].{name:name,count:count,vmSize:vmSize},networkProfile:networkProfile.networkPlugin,location:location}" -o json`	`aks.json`
`Microsoft.Web/sites`	`az webapp list -g "<RG>" --query "[].{name:name,kind:kind,sku:appServicePlan,state:state,defaultHostName:defaultHostName,httpsOnly:httpsOnly,location:location}" -o json`	`webapps.json`
`Microsoft.Web/serverFarms`	`az appservice plan list -g "<RG>" --query "[].{name:name,sku:sku.name,tier:sku.tier,kind:kind,location:location}" -o json`	`appservice-plans.json`
`Microsoft.DocumentDB/databaseAccounts`	`az cosmosdb list -g "<RG>" --query "[].{name:name,kind:kind,databaseAccountOfferType:databaseAccountOfferType,locations:locations[].locationName,publicAccess:publicNetworkAccess}" -o json`	`cosmosdb.json`
`Microsoft.Sql/servers`	`az sql server list -g "<RG>" --query "[].{name:name,fullyQualifiedDomainName:fullyQualifiedDomainName,publicAccess:publicNetworkAccess,location:location}" -o json`	`sql-servers.json`
`Microsoft.Databricks/workspaces`	`az databricks workspace list -g "<RG>" --query "[].{name:name,sku:sku.name,url:workspaceUrl,publicAccess:parameters.enableNoPublicIp.value,location:location}" -o json 2>$null`	`databricks.json`
`Microsoft.Synapse/workspaces`	`az synapse workspace list -g "<RG>" --query "[].{name:name,sqlAdminLogin:sqlAdministratorLogin,publicAccess:publicNetworkAccess,location:location}" -o json 2>$null`	`synapse.json`
`Microsoft.DataFactory/factories`	`az datafactory list -g "<RG>" --query "[].{name:name,publicAccess:publicNetworkAccess,location:location}" -o json 2>$null`	`adf.json`
`Microsoft.EventHub/namespaces`	`az eventhubs namespace list -g "<RG>" --query "[].{name:name,sku:sku.name,location:location}" -o json`	`eventhub.json`
`Microsoft.Cache/redis`	`az redis list -g "<RG>" --query "[].{name:name,sku:sku.name,port:port,sslPort:sslPort,publicAccess:publicNetworkAccess,location:location}" -o json`	`redis.json`
`Microsoft.ContainerRegistry/registries`	`az acr list -g "<RG>" --query "[].{name:name,sku:sku.name,adminUserEnabled:adminUserEnabled,publicAccess:publicNetworkAccess,location:location}" -o json`	`acr.json`
`Microsoft.MachineLearningServices/workspaces`	`az resource show --ids "<ID>" --query "{name:name,sku:sku,kind:kind,location:location,publicAccess:properties.publicNetworkAccess,hbiWorkspace:properties.hbiWorkspace,managedNetwork:properties.managedNetwork.isolationMode}" -o json`	`mlworkspace.json`
`Microsoft.Insights/components`	`az monitor app-insights component show -g "<RG>" --app "<NAME>" --query "{name:name,kind:kind,instrumentationKey:instrumentationKey,workspaceResourceId:workspaceResourceId,location:location}" -o json 2>$null`	`appinsights-<NAME>.json`
`Microsoft.OperationalInsights/workspaces`	`az monitor log-analytics workspace show -g "<RG>" -n "<NAME>" --query "{name:name,sku:sku.name,retentionInDays:retentionInDays,location:location}" -o json`	`log-analytics-<NAME>.json`
`Microsoft.Network/applicationGateways`	`az network application-gateway list -g "<RG>" --query "[].{name:name,sku:sku,location:location}" -o json`	`appgateway.json`
`Microsoft.Cdn/profiles` / `Microsoft.Network/frontDoors`	`az afd profile list -g "<RG>" --query "[].{name:name,sku:sku.name,location:location}" -o json 2>$null`	`frontdoor.json`
`Microsoft.Network/azureFirewalls`	`az network firewall list -g "<RG>" --query "[].{name:name,sku:sku,threatIntelMode:threatIntelMode,location:location}" -o json`	`firewall.json`
`Microsoft.Network/bastionHosts`	`az network bastion list -g "<RG>" --query "[].{name:name,sku:sku.name,location:location}" -o json`	`bastion.json`

Dynamic Query Process:

Read resources.json
Extract the distinct values of the type field
Execute only the commands for matching types from the mapping table above (skip types not present)
If a type not in the mapping table is found → Use generic query: az resource show --ids "<ID>" --query "{name:name,sku:sku,kind:kind,location:location,properties:properties}" -o json
Execute commands in batches of 2-3 (do not run all at once)

2-C: Model Deployment Query (When Cognitive Services Exist)

# Query model deployments for each Cognitive Services resource
az cognitiveservices account deployment list --name "<NAME>" -g "<RG>" --query "[].{name:name,model:properties.model.name,version:properties.model.version,sku:sku.name}" -o json | Set-Content "$outDir/<NAME>-deployments.json"

2-D: NIC + Public IP Query (When VMs Exist)

az network nic list -g "<RG>" --query "[].{name:name,subnetId:ipConfigurations[0].subnet.id,privateIp:ipConfigurations[0].privateIPAddress,publicIpId:ipConfigurations[0].publicIPAddress.id}" -o json | Set-Content "$outDir/nics.json"
az network public-ip list -g "<RG>" --query "[].{name:name,ip:ipAddress,sku:sku.name}" -o json | Set-Content "$outDir/public-ips.json"

From the VNet:

addressSpace.addressPrefixes → CIDR
subnets[].name, subnets[].addressPrefix → Subnet information
subnets[].privateEndpointNetworkPolicies → PE policies

Step 3: Inferring Relationships Between Resources

Automatically infer relationships (connections) between scanned resources to construct the connections JSON for the diagram.

Relationship Inference Rules

🚨 If there are insufficient connection lines, the diagram becomes meaningless. Infer as many relationships as possible.

Confirmed Inference (Directly verifiable from resource IDs/properties)

Relationship Type	Inference Method	connection type
PE → Service	Extract service ID from PE's `privateLinkServiceId`	`private`
PE → VNet	Extract VNet from PE's `subnet.id`	(Represented as VNet boundary)
Foundry → Project	Parent resource of `accounts/projects`	`api`
VM → NIC → Subnet	Infer VNet/Subnet from NIC's `subnet.id`	(VNet boundary)
NSG → Subnet	Check connected subnets from NSG's `subnets[].id`	`network`
NSG → NIC	Check connected VMs from NSG's `networkInterfaces[].id`	`network`
NIC → Public IP	Check PIP from NIC's `publicIPAddress.id`	(Included in details)
Databricks → VNet	Workspace's VNet injection configuration	(VNet boundary)

Reasonable Inference (Common patterns between services within the same RG)

Relationship Type	Inference Condition	connection type
Foundry → AI Search	Both exist in the same RG → Infer RAG connection	`api` (label: "RAG Search")
Foundry → Storage	Both exist in the same RG → Infer data connection	`data` (label: "Data")
AI Search → Storage	Both exist in the same RG → Infer indexing connection	`data` (label: "Indexing")
Service → Key Vault	Key Vault exists in the same RG → Infer secret management	`security` (label: "Secrets")
VM → Foundry/Search	VM + AI services exist in the same RG → Infer API calls	`api` (label: "API")
DI → Foundry	Document Intelligence + Foundry exist in the same RG → Infer OCR/extraction connection	`api` (label: "OCR/Extract")
ADF → Storage	ADF + Storage exist in the same RG → Infer data pipeline	`data` (label: "Pipeline")
ADF → SQL	ADF + SQL exist in the same RG → Infer data source	`data` (label: "Source")
Databricks → Storage	Both exist in the same RG → Infer data lake connection	`data` (label: "Data Lake")

User Confirmation After Inference

Show the inferred connection list to the user and request confirmation:

> **⏳ Relationships between resources have been inferred** — Please verify if the following are correct.

Inferred connections:
- Foundry → AI Search (RAG Search)
- Foundry → Storage (Data)
- VM → Foundry (API Call)
- Document Intelligence → Foundry (OCR/Extract)

Does this look correct? Let me know if you'd like to add or remove any connections.

Relationships That Cannot Be Inferred

There may be connections that cannot be inferred using the rules above. The user can freely add additional connections.

Model Deployment Query (When Foundry Resources Exist)

az cognitiveservices account deployment list --name "<FOUNDRY_NAME>" -g "<RG>" --query "[].{name:name,model:properties.model.name,version:properties.model.version,sku:sku.name}" -o json

Add each deployment's model name, version, and SKU to the Foundry node's details.

Step 4: services/connections JSON Conversion

Convert scan results into the input format for the built-in diagram engine.

Resource Type → Diagram type Mapping

Azure Resource Type	Diagram type
`Microsoft.CognitiveServices/accounts` (kind: AIServices)	`ai_foundry`
`Microsoft.CognitiveServices/accounts` (kind: OpenAI)	`openai`
`Microsoft.CognitiveServices/accounts` (kind: FormRecognizer)	`document_intelligence`
`Microsoft.CognitiveServices/accounts` (kind: TextAnalytics, etc.)	`ai_foundry` (default)
`Microsoft.CognitiveServices/accounts/projects`	`ai_foundry`
`Microsoft.Search/searchServices`	`search`
`Microsoft.Storage/storageAccounts`	`storage`
`Microsoft.KeyVault/vaults`	`keyvault`
`Microsoft.Databricks/workspaces`	`databricks`
`Microsoft.Sql/servers`	`sql_server`
`Microsoft.Sql/servers/databases`	`sql_database`
`Microsoft.DocumentDB/databaseAccounts`	`cosmos_db`
`Microsoft.Web/sites`	`app_service`
`Microsoft.ContainerService/managedClusters`	`aks`
`Microsoft.Web/sites` (kind: functionapp)	`function_app`
`Microsoft.Synapse/workspaces`	`synapse`
`Microsoft.Fabric/capacities`	`fabric`
`Microsoft.DataFactory/factories`	`adf`
`Microsoft.Compute/virtualMachines`	`vm`
`Microsoft.Network/privateEndpoints`	`pe`
`Microsoft.Network/virtualNetworks`	(Represented as VNet boundary — not included in services)
`Microsoft.Network/networkSecurityGroups`	`nsg`
`Microsoft.Network/bastionHosts`	`bastion`
`Microsoft.OperationalInsights/workspaces`	`log_analytics`
`Microsoft.Insights/components`	`app_insights`
Other	`default`

services JSON Construction Rules

{
  "id": "resource name (lowercase, special characters removed)",
  "name": "actual resource name",
  "type": "determined from the mapping table above",
  "sku": "actual SKU (if available)",
  "private": true/false,  // true if a PE is connected
  "details": ["property1", "property2", ...]
}

Information to include in details:

Endpoint URL
SKU/tier details
kind (AIServices, OpenAI, etc.)
Model deployment list (Foundry)
Key properties (isHnsEnabled, semanticSearch, etc.)
Region

VNet Information → `--vnet-info` Parameter

If a VNet is found, display it in the boundary label via --vnet-info:

--vnet-info "10.0.0.0/16 | pe-subnet: 10.0.1.0/24 | <region>"

PE Node Generation

If PEs are found, add each PE as a separate node and connect it to the corresponding service with a private type:

{"id": "pe_<serviceId>", "name": "PE: <serviceName>", "type": "pe", "details": ["groupId: <groupId>", "<status>"]}

Step 5: Diagram Generation + Presentation to User

Diagram filename: <project-name>/00_arch_current.html

Use the scanned RG name as the default project name:

ask_user({
  question: "Please choose a project name. (This will be the folder name for scan results)",
  choices: ["<RG-name>", "azure-analysis"]
})

After generating the diagram, report:

## Current Azure Architecture

[Interactive Diagram — 00_arch_current.html]

Scanned Resources (N total):
[Summary table by resource type]

What would you like to change here?
- 🔧 Performance improvement ("it's slow", "increase throughput")
- 💰 Cost optimization ("reduce costs", "make it cheaper")
- 🔒 Security hardening ("add PE", "block public access")
- 🌐 Network changes ("separate VNet", "add Bastion")
- ➕ Add/remove resources ("add a VM", "delete this")
- 📊 Monitoring ("set up logs", "add alerts")
- 🤔 Diagnostics ("is this architecture OK?", "what's wrong?")
- Or just take the diagram and stop here

Step 6: Modification Conversation → Transition to Phase 1

When the user requests modifications, transition to Phase 1 (phase1-advisor.md). This is the Path B entry point, using the existing scan results as the baseline.

Natural Language Modification Request Handling — Clarifying Question Patterns

Ask clarifying questions to make the user's vague requests more specific:

🔧 Performance

User Request	Clarifying Question Example
"It's slow" / "Response takes too long"	"Which service is slow? Should we upgrade the SKU or change the region?"
"I want to increase throughput"	"Which service's throughput should we increase? Scale out? Increase DTU/RU?"
"AI Search indexing is slow"	"Should we add partitions? Upgrade the SKU to S2?"

💰 Cost

User Request	Clarifying Question Example
"I want to reduce costs"	"Which service's cost should we reduce? SKU downgrade? Clean up unused resources?"
"How much does this cost?"	Look up pricing info from MS Docs and provide estimated cost based on current SKUs
"It's a dev environment, so make it cheap"	"Should we switch to Free/Basic tiers? Which services?"

🔒 Security

User Request	Clarifying Question Example
"Harden the security"	"Should we add PEs to services that don't have them? Check RBAC? Disable publicNetworkAccess?"
"Block public access"	"Should we apply PE + publicNetworkAccess: Disabled to all services?"
"Manage the keys"	"Should we add Key Vault and connect it with Managed Identity?"

🌐 Network

User Request	Clarifying Question Example
"Add PE"	"To which service? Should we add them to all services at once?"
"Separate the VNet"	"Which subnets should we separate? Should we also add NSGs?"
"Add Bastion"	"Adding Azure Bastion for VM access. Please specify the subnet CIDR."

➕ Add/Remove Resources

User Request	Clarifying Question Example
"Add a VM"	"How many? What SKU? Same VNet? What OS?"
"Add Fabric"	"What SKU? What's the admin email?"
"Delete this"	"Are you sure you want to remove [resource name]? Connected PEs will also be removed."

📊 Monitoring/Operations

User Request	Clarifying Question Example
"I want to see logs"	"Should we add a Log Analytics Workspace and connect Diagnostic Settings?"
"Set up alerts"	"For which metrics? CPU? Error rate? Response time?"
"Attach Application Insights"	"To which service? App Service? Function App?"

🔄 Migration/Changes

User Request	Clarifying Question Example
"Change the region"	"To which region? I'll verify that all services are available in that region."
"Switch SQL to Cosmos"	"What Cosmos DB API type? (SQL/MongoDB/Cassandra) I can also provide a data migration guide."
"Switch Foundry to Hub"	"Hub is suitable only when ML training/open-source models are needed. Let me verify the use case."

🤔 Diagnostics/Questions

User Request	Clarifying Question Example
"What's wrong?"	Analyze current configuration (publicNetworkAccess open, PE not connected, inappropriate SKU, etc.) and suggest improvements
"Is this architecture OK?"	Review against the Well-Architected Framework (security, reliability, performance, cost, operations)
"Is the PE connected properly?"	Check connection status with `az network private-endpoint show` and report
"Just give me the diagram"	Do not transition to Phase 1; provide the 00_arch_current.html path and finish

Once modifications are finalized:

Apply Phase 1's Delta Confirmation Rule
Fact-check (cross-verify with MS Docs)
Generate updated diagram (01_arch_diagram_draft.html)
User confirmation → Proceed to Phases 2–4

Scan Performance Optimization

If there are 50+ resources, warn the user: "There are many resources, so the scan may take some time."
Run az resource list first to determine the resource count, then proceed with detailed queries
Query key services first (Foundry, Search, Storage, KeyVault, VNet, PE), then collect only basic information for the rest via az resource show
Keep the user informed of progress:

⏳ Scanning resources — M of N resources completed

Handling Unsupported Resources

For resource types not in the diagram type mapping:

Display with default type (question mark icon)
Include the resource name and type in details
Show to the user, but do not attempt relationship inference

25 KiB Raw Blame History Unescape Escape