feat: add FlowStudio monitoring + governance skills, update debug + build + mcp (#1304)

- **New skill: flowstudio-power-automate-monitoring** — flow health, failure
  rates, maker inventory, Power Apps, environment/connection counts via
  FlowStudio MCP cached store tools.
- **New skill: flowstudio-power-automate-governance** — 10 CoE-aligned
  governance workflows: compliance review, orphan detection, archive scoring,
  connector audit, notification management, classification/tagging, maker
  offboarding, security review, environment governance, governance dashboard.
- **Updated flowstudio-power-automate-debug** — purely live API tools (no
  store dependencies), mandatory action output inspection step, resubmit
  clarified as working for ALL trigger types.
- **Updated flowstudio-power-automate-build** — Step 1 uses list_live_flows
  (not list_store_flows) for the duplicate check, resubmit-first testing.
- **Updated flowstudio-power-automate-mcp** — store tool catalog, response
  shapes verified against real API calls, set_store_flow_state shape fix.
- Plugin version bumped to 2.0.0, all 5 skills listed in plugin.json.
- Generated docs regenerated via npm start.

All response shapes verified against real FlowStudio MCP API calls.
All 10 governance workflows validated with real tenant data.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Catherine Han
2026-04-09 10:39:58 +10:00
committed by GitHub
parent 49fd3f3faf
commit 82c6b786ea
12 changed files with 1249 additions and 115 deletions


@@ -2,11 +2,20 @@
name: flowstudio-power-automate-debug
description: >-
  Debug failing Power Automate cloud flows using the FlowStudio MCP server.
  The Graph API only shows top-level status codes. This skill gives your agent
  action-level inputs and outputs to find the actual root cause.
  Load this skill when asked to: debug a flow, investigate a failed run, why is
  this flow failing, inspect action outputs, find the root cause of a flow error,
  fix a broken Power Automate flow, diagnose a timeout, trace a DynamicOperationRequestFailure,
  check connector auth errors, read error details from a run, or troubleshoot
  expression failures. Requires a FlowStudio MCP subscription — see https://mcp.flowstudio.app
metadata:
  openclaw:
    requires:
      env:
        - FLOWSTUDIO_MCP_TOKEN
    primaryEnv: FLOWSTUDIO_MCP_TOKEN
  homepage: https://mcp.flowstudio.app
---
# Power Automate Debugging with FlowStudio MCP
@@ -14,6 +23,10 @@ description: >-
A step-by-step diagnostic process for investigating failing Power Automate
cloud flows through the FlowStudio MCP server.
> **Real debugging examples**: [Expression error in child flow](https://github.com/ninihen1/power-automate-mcp-skills/blob/main/examples/fix-expression-error.md) |
> [Data entry, not a flow bug](https://github.com/ninihen1/power-automate-mcp-skills/blob/main/examples/data-not-flow.md) |
> [Null value crashes child flow](https://github.com/ninihen1/power-automate-mcp-skills/blob/main/examples/null-child-flow.md)
**Prerequisite**: A FlowStudio MCP server must be reachable with a valid JWT.
See the `flowstudio-power-automate-mcp` skill for connection setup.
Subscribe at https://mcp.flowstudio.app
@@ -59,46 +72,6 @@ ENV = "<environment-id>" # e.g. Default-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
---
## FlowStudio for Teams: Fast-Path Diagnosis (Skip Steps 2–4)
If you have a FlowStudio for Teams subscription, `get_store_flow_errors`
returns per-run failure data including action names and remediation hints
in a single call — no need to walk through live API steps.
```python
# Quick failure summary
summary = mcp("get_store_flow_summary", environmentName=ENV, flowName=FLOW_ID)
# {"totalRuns": 100, "failRuns": 10, "failRate": 0.1,
# "averageDurationSeconds": 29.4, "maxDurationSeconds": 158.9,
# "firstFailRunRemediation": "<hint or null>"}
print(f"Fail rate: {summary['failRate']:.0%} over {summary['totalRuns']} runs")
# Per-run error details (requires active monitoring to be configured)
errors = mcp("get_store_flow_errors", environmentName=ENV, flowName=FLOW_ID)
if errors:
    for r in errors[:3]:
        print(r["startTime"], "|", r.get("failedActions"), "|", r.get("remediationHint"))
    # If errors confirms the failing action → jump to Step 6 (apply fix)
else:
    # Store doesn't have run-level detail for this flow — use live tools (Steps 2–5)
    pass
```
For the full governance record (description, complexity, tier, connector list):
```python
record = mcp("get_store_flow", environmentName=ENV, flowName=FLOW_ID)
# {"displayName": "My Flow", "state": "Started",
# "runPeriodTotal": 100, "runPeriodFailRate": 0.1, "runPeriodFails": 10,
# "runPeriodDurationAverage": 29410.8, ← milliseconds
# "runError": "{\"code\": \"EACCES\", ...}", ← JSON string, parse it
# "description": "...", "tier": "Premium", "complexity": "{...}"}
if record.get("runError"):
    last_err = json.loads(record["runError"])
    print("Last run error:", last_err)
```
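The millisecond duration and JSON-string fields annotated above are easy to mishandle. A minimal normalizing sketch, using a stubbed `record` shaped like the sample response; `normalize_store_record` is a hypothetical helper, not an MCP tool:

```python
import json

def normalize_store_record(record: dict) -> dict:
    """Convert raw store-record fields into convenient Python types."""
    out = dict(record)
    # runPeriodDurationAverage is reported in milliseconds → seconds
    if "runPeriodDurationAverage" in out:
        out["runPeriodDurationSeconds"] = out["runPeriodDurationAverage"] / 1000.0
    # runError and complexity arrive as JSON strings → parse them
    for field in ("runError", "complexity"):
        value = out.get(field)
        if isinstance(value, str) and value:
            try:
                out[field] = json.loads(value)
            except json.JSONDecodeError:
                pass  # leave unparseable values untouched
    return out

# Stubbed record matching the sample shape above
record = {
    "displayName": "My Flow",
    "runPeriodDurationAverage": 29410.8,
    "runError": '{"code": "EACCES"}',
}
norm = normalize_store_record(record)
print(round(norm["runPeriodDurationSeconds"], 1))  # 29.4
print(norm["runError"]["code"])                    # EACCES
```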
---
## Step 1 — Locate the Flow
```python
@@ -134,6 +107,13 @@ RUN_ID = next(r["name"] for r in runs if r["status"] == "Failed")
## Step 3 — Get the Top-Level Error
> **CRITICAL**: `get_live_flow_run_error` tells you **which** action failed.
> `get_live_flow_run_action_outputs` tells you **why**. You must call BOTH.
> Never stop at the error alone — error codes like `ActionFailed`,
> `NotSpecified`, and `InternalServerError` are generic wrappers. The actual
> root cause (wrong field, null value, HTTP 500 body, stack trace) is only
> visible in the action's inputs and outputs.
```python
err = mcp("get_live_flow_run_error",
          environmentName=ENV, flowName=FLOW_ID, runName=RUN_ID)
@@ -164,7 +144,86 @@ print(f"Root action: {root['actionName']} → code: {root.get('code')}")
---
## Step 4 — Read the Flow Definition
## Step 4 — Inspect the Failing Action's Inputs and Outputs
> **This is the most important step.** `get_live_flow_run_error` only gives
> you a generic error code. The actual error detail — HTTP status codes,
> response bodies, stack traces, null values — lives in the action's runtime
> inputs and outputs. **Always inspect the failing action immediately after
> identifying it.**
```python
# Get the root failing action's full inputs and outputs
root_action = err["failedActions"][-1]["actionName"]
detail = mcp("get_live_flow_run_action_outputs",
             environmentName=ENV,
             flowName=FLOW_ID,
             runName=RUN_ID,
             actionName=root_action)
out = detail[0] if detail else {}
print(f"Action: {out.get('actionName')}")
print(f"Status: {out.get('status')}")

# For HTTP actions, the real error is in outputs.body
if isinstance(out.get("outputs"), dict):
    status_code = out["outputs"].get("statusCode")
    body = out["outputs"].get("body", {})
    print(f"HTTP {status_code}")
    print(json.dumps(body, indent=2)[:500])
    # Error bodies are often nested JSON strings — parse them
    if isinstance(body, dict) and "error" in body:
        err_detail = body["error"]
        if isinstance(err_detail, str):
            err_detail = json.loads(err_detail)
        print(f"Error: {err_detail.get('message', err_detail)}")

# For expression errors, the error is in the error field
if out.get("error"):
    print(f"Error: {out['error']}")

# Also check inputs — they show what expression/URL/body was used
if out.get("inputs"):
    print(f"Inputs: {json.dumps(out['inputs'], indent=2)[:500]}")
```
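The "nested JSON strings" pattern above generalizes: error payloads often bury the real message several layers deep, alternating dicts and JSON-encoded strings. A sketch of an unwrapping helper (`unwrap_error` is illustrative, not part of the MCP toolset), shown against a hypothetical doubly wrapped payload:

```python
import json

def unwrap_error(body, max_depth=5):
    """Follow nested 'error' keys, parsing JSON strings along the way,
    until reaching the innermost error detail."""
    for _ in range(max_depth):
        if isinstance(body, str):
            try:
                body = json.loads(body)
            except json.JSONDecodeError:
                return body  # plain-text error message
        elif isinstance(body, dict) and "error" in body:
            body = body["error"]
        else:
            return body
    return body

# Hypothetical doubly wrapped payload for illustration
payload = {"error": '{"error": {"message": "Cannot read properties of undefined"}}'}
detail = unwrap_error(payload)
print(detail.get("message"))  # Cannot read properties of undefined
```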
### What the action outputs reveal (that error codes don't)
| Error code from `get_live_flow_run_error` | What `get_live_flow_run_action_outputs` reveals |
|---|---|
| `ActionFailed` | Which nested action actually failed and its HTTP response |
| `NotSpecified` | The HTTP status code + response body with the real error |
| `InternalServerError` | The server's error message, stack trace, or API error JSON |
| `InvalidTemplate` | The exact expression that failed and the null/wrong-type value |
| `BadRequest` | The request body that was sent and why the server rejected it |
### Example: HTTP action returning 500
```
Error code: "InternalServerError" ← this tells you nothing
Action outputs reveal:
HTTP 500
body: {"error": "Cannot read properties of undefined (reading 'toLowerCase')
at getClientParamsFromConnectionString (storage.js:20)"}
← THIS tells you the Azure Function crashed because a connection string is undefined
```
### Example: Expression error on null
```
Error code: "BadRequest" ← generic
Action outputs reveal:
inputs: "body('HTTP_GetTokenFromStore')?['token']?['access_token']"
outputs: "" ← empty string, the path resolved to null
← THIS tells you the response shape changed — token is at body.access_token, not body.token.access_token
```
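The shape-change failure above comes down to how `?['key']` chains resolve against the actual response. A small Python analogue of Power Automate's null-safe access makes the difference concrete; the `body` value here is a hypothetical response after the shape change:

```python
def safe_path(data, *keys):
    """Emulate Power Automate's null-safe ?['key'] chaining:
    return None as soon as any segment is missing."""
    for key in keys:
        if not isinstance(data, dict):
            return None
        data = data.get(key)
    return data

# Hypothetical response body after the API's shape change
body = {"access_token": "eyJ..."}

print(safe_path(body, "token", "access_token"))  # None: old path no longer resolves
print(safe_path(body, "access_token"))           # eyJ...: token moved to the top level
```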
---
## Step 5 — Read the Flow Definition
```python
defn = mcp("get_live_flow", environmentName=ENV, flowName=FLOW_ID)
@@ -177,41 +236,48 @@ to understand what data it expects.
---
## Step 5 — Inspect Action Outputs (Walk Back from Failure)
## Step 6 — Walk Back from the Failure
For each action **leading up to** the failure, inspect its runtime output:
When the failing action's inputs reference upstream actions, inspect those
too. Walk backward through the chain until you find the source of the
bad data:
```python
for action_name in ["Compose_WeekEnd", "HTTP_Get_Data", "Parse_JSON"]:
# Inspect multiple actions leading up to the failure
for action_name in [root_action, "Compose_WeekEnd", "HTTP_Get_Data"]:
    result = mcp("get_live_flow_run_action_outputs",
                 environmentName=ENV,
                 flowName=FLOW_ID,
                 runName=RUN_ID,
                 actionName=action_name)
    # Returns an array — single-element when actionName is provided
    out = result[0] if result else {}
    print(action_name, out.get("status"))
    print(json.dumps(out.get("outputs", {}), indent=2)[:500])
    print(f"\n--- {action_name} ({out.get('status')}) ---")
    print(f"Inputs: {json.dumps(out.get('inputs', ''), indent=2)[:300]}")
    print(f"Outputs: {json.dumps(out.get('outputs', ''), indent=2)[:300]}")
```
> ⚠️ Output payloads from array-processing actions can be very large.
> Always slice (e.g. `[:500]`) before printing.
> **Tip**: Omit `actionName` to get ALL actions in a single call.
> This returns every action's inputs/outputs — useful when you're not sure
> which upstream action produced the bad data. Use a 120 s+ timeout, as the
> response can be very large.
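When you do pull every action in one call, a quick scan can point to the first action that emitted a null, assuming the response is a list of `{actionName, status, outputs}` objects in execution order. The action names below are illustrative stubs:

```python
def first_null_output(actions, field):
    """Return the name of the first action whose outputs carry a
    null or missing `field`, or None if no action does."""
    for action in actions:
        outputs = action.get("outputs") or {}
        if isinstance(outputs, dict) and outputs.get(field) is None:
            return action["actionName"]
    return None

# Stubbed all-actions payload for illustration
actions = [
    {"actionName": "HTTP_Get_Data", "status": "Succeeded", "outputs": {"Name": "Ada"}},
    {"actionName": "Compose_Names", "status": "Succeeded", "outputs": {"Name": None}},
    {"actionName": "Parse_JSON", "status": "Failed", "outputs": {}},
]
print(first_null_output(actions, "Name"))  # Compose_Names
```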
---
## Step 6 — Pinpoint the Root Cause
## Step 7 — Pinpoint the Root Cause
### Expression Errors (e.g. `split` on null)
If the error mentions `InvalidTemplate` or a function name:
1. Find the action in the definition
2. Check what upstream action/expression it reads
3. Inspect that upstream action's output for null / missing fields
3. **Inspect that upstream action's output** for null / missing fields
```python
# Example: action uses split(item()?['Name'], ' ')
# → null Name in the source data
result = mcp("get_live_flow_run_action_outputs", ..., actionName="Compose_Names")
# Returns a single-element array; index [0] to get the action object
if not result:
    print("No outputs returned for Compose_Names")
names = []
@@ -223,9 +289,20 @@ print(f"{len(nulls)} records with null Name")
### Wrong Field Path
Expression `triggerBody()?['fieldName']` returns null → `fieldName` is wrong.
Check the trigger output shape with:
**Inspect the trigger output** to see the actual field names:
```python
mcp("get_live_flow_run_action_outputs", ..., actionName="<trigger-action-name>")
result = mcp("get_live_flow_run_action_outputs", ..., actionName="<trigger-action-name>")
print(json.dumps(result[0].get("outputs"), indent=2)[:500])
```
### HTTP Actions Returning Errors
The error code says `InternalServerError` or `NotSpecified` — **always inspect
the action outputs** to get the actual HTTP status and response body:
```python
result = mcp("get_live_flow_run_action_outputs", ..., actionName="HTTP_Get_Data")
out = result[0]
print(f"HTTP {out['outputs']['statusCode']}")
print(json.dumps(out['outputs']['body'], indent=2)[:500])
```
### Connection / Auth Failures
@@ -234,7 +311,7 @@ service account running the flow. Cannot fix via API; fix in PA designer.
---
## Step 7 — Apply the Fix
## Step 8 — Apply the Fix
**For expression/data issues**:
```python
@@ -260,13 +337,23 @@ print(result.get("error")) # None = success
---
## Step 8 — Verify the Fix
## Step 9 — Verify the Fix
> **Use `resubmit_live_flow_run` to test ANY flow — not just HTTP triggers.**
> `resubmit_live_flow_run` replays a previous run using its original trigger
> payload. This works for **every trigger type**: Recurrence, SharePoint
> "When an item is created", connector webhooks, Button triggers, and HTTP
> triggers. You do NOT need to ask the user to manually trigger the flow or
> wait for the next scheduled run.
>
> The only case where `resubmit` is not available is a **brand-new flow that
> has never run** — it has no prior run to replay.
```python
# Resubmit the failed run
# Resubmit the failed run — works for ANY trigger type
resubmit = mcp("resubmit_live_flow_run",
               environmentName=ENV, flowName=FLOW_ID, runName=RUN_ID)
print(resubmit)
print(resubmit) # {"resubmitted": true, "triggerName": "..."}
# Wait ~30 s then check
import time; time.sleep(30)
@@ -274,16 +361,26 @@ new_runs = mcp("get_live_flow_runs", environmentName=ENV, flowName=FLOW_ID, top=
print(new_runs[0]["status"]) # Succeeded = done
```
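Instead of a fixed 30-second sleep, you can poll until the resubmitted run settles. A sketch with the MCP call injected as a callable so the loop can be exercised with stub data; in real use you would pass a lambda wrapping `get_live_flow_runs`:

```python
import time

def wait_for_run(fetch_runs, timeout=120, interval=5):
    """Poll fetch_runs() until the newest run leaves the Running state
    or the timeout elapses. Returns the final status string."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        runs = fetch_runs()
        if runs and runs[0]["status"] != "Running":
            return runs[0]["status"]
        time.sleep(interval)
    return "TimedOut"

# Stub standing in for: mcp("get_live_flow_runs", environmentName=ENV, flowName=FLOW_ID, top=1)
responses = iter([[{"status": "Running"}], [{"status": "Succeeded"}]])
status = wait_for_run(lambda: next(responses), timeout=10, interval=0)
print(status)  # Succeeded
```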
### Testing HTTP-Triggered Flows
### When to use resubmit vs trigger
For flows with a `Request` (HTTP) trigger, use `trigger_live_flow` instead
of `resubmit_live_flow_run` to test with custom payloads:
| Scenario | Use | Why |
|---|---|---|
| **Testing a fix** on any flow | `resubmit_live_flow_run` | Replays the exact trigger payload that caused the failure — best way to verify |
| Recurrence / scheduled flow | `resubmit_live_flow_run` | Cannot be triggered on demand any other way |
| SharePoint / connector trigger | `resubmit_live_flow_run` | Cannot be triggered without creating a real SP item |
| HTTP trigger with **custom** test payload | `trigger_live_flow` | When you need to send different data than the original run |
| Brand-new flow, never run | `trigger_live_flow` (HTTP only) | No prior run exists to resubmit |
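As an illustration only, the table above collapses into a small decision rule (trigger type names follow the `Request`/`Recurrence` convention used by Power Automate):

```python
def pick_test_tool(trigger_type, has_prior_run, custom_payload=False):
    """Illustrative decision rule distilled from the scenarios above."""
    if not has_prior_run:
        # Brand-new flow: only an HTTP (Request) trigger can be fired on demand
        return "trigger_live_flow" if trigger_type == "Request" else "wait for first run"
    if custom_payload and trigger_type == "Request":
        # Only use trigger when the test needs different data than the original run
        return "trigger_live_flow"
    return "resubmit_live_flow_run"

print(pick_test_tool("Recurrence", has_prior_run=True))       # resubmit_live_flow_run
print(pick_test_tool("Request", True, custom_payload=True))   # trigger_live_flow
print(pick_test_tool("Request", has_prior_run=False))         # trigger_live_flow
```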
### Testing HTTP-Triggered Flows with custom payloads
For flows with a `Request` (HTTP) trigger, use `trigger_live_flow` when you
need to send a **different** payload than the original run:
```python
# First inspect what the trigger expects
schema = mcp("get_live_flow_http_schema",
             environmentName=ENV, flowName=FLOW_ID)
print("Expected body schema:", schema.get("triggerSchema"))
print("Expected body schema:", schema.get("requestSchema"))
print("Response schemas:", schema.get("responseSchemas"))
# Trigger with a test payload
@@ -291,7 +388,7 @@ result = mcp("trigger_live_flow",
             environmentName=ENV,
             flowName=FLOW_ID,
             body={"name": "Test User", "value": 42})
print(f"Status: {result['status']}, Body: {result.get('body')}")
print(f"Status: {result['responseStatus']}, Body: {result.get('responseBody')}")
```
> `trigger_live_flow` handles AAD-authenticated triggers automatically.
@@ -301,13 +398,19 @@ print(f"Status: {result['status']}, Body: {result.get('body')}")
## Quick-Reference Diagnostic Decision Tree
| Symptom | First Tool to Call | What to Look For |
|---|---|---|
| Flow shows as Failed | `get_live_flow_run_error` | `failedActions[-1]["actionName"]` = root cause |
| Expression crash | `get_live_flow_run_action_outputs` on prior action | null / wrong-type fields in output body |
| Flow never starts | `get_live_flow` | check `properties.state` = "Started" |
| Action returns wrong data | `get_live_flow_run_action_outputs` | actual output body vs expected |
| Fix applied but still fails | `get_live_flow_runs` after resubmit | new run `status` field |
| Symptom | First Tool | Then ALWAYS Call | What to Look For |
|---|---|---|---|
| Flow shows as Failed | `get_live_flow_run_error` | `get_live_flow_run_action_outputs` on the failing action | HTTP status + response body in `outputs` |
| Error code is generic (`ActionFailed`, `NotSpecified`) | — | `get_live_flow_run_action_outputs` | The `outputs.body` contains the real error message, stack trace, or API error |
| HTTP action returns 500 | — | `get_live_flow_run_action_outputs` | `outputs.statusCode` + `outputs.body` with server error detail |
| Expression crash | — | `get_live_flow_run_action_outputs` on prior action | null / wrong-type fields in output body |
| Flow never starts | `get_live_flow` | — | check `properties.state` = "Started" |
| Action returns wrong data | `get_live_flow_run_action_outputs` | — | actual output body vs expected |
| Fix applied but still fails | `get_live_flow_runs` after resubmit | — | new run `status` field |
> **Rule: never diagnose from error codes alone.** `get_live_flow_run_error`
> identifies the failing action. `get_live_flow_run_action_outputs` reveals
> the actual cause. Always call both.
---