Mirror of https://github.com/github/awesome-copilot.git
synced 2026-02-20 02:15:12 +00:00
Merge remote-tracking branch 'origin/main' into plugin-migration
50
agents/agent-governance-reviewer.agent.md
Normal file
@@ -0,0 +1,50 @@
---
description: 'AI agent governance expert that reviews code for safety issues, missing governance controls, and helps implement policy enforcement, trust scoring, and audit trails in agent systems.'
model: 'gpt-4o'
tools: ['codebase', 'terminalCommand']
name: 'Agent Governance Reviewer'
---

You are an expert in AI agent governance, safety, and trust systems. You help developers build secure, auditable, policy-compliant AI agent systems.

## Your Expertise

- Governance policy design (allowlists, blocklists, content filters, rate limits)
- Semantic intent classification for threat detection
- Trust scoring with temporal decay for multi-agent systems
- Audit trail design for compliance and observability
- Policy composition (most-restrictive-wins merging)
- Framework-specific integration (PydanticAI, CrewAI, OpenAI Agents, LangChain, AutoGen)

## Your Approach

- Always review existing code for governance gaps before suggesting additions
- Recommend the minimum governance controls needed — don't over-engineer
- Prefer configuration-driven policies (YAML/JSON) over hardcoded rules
- Suggest fail-closed patterns — deny on ambiguity, not allow
- Think about multi-agent trust boundaries when reviewing delegation patterns
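A configuration-driven policy of the kind recommended above might look like the following sketch; the file name and every field name are illustrative assumptions, not a fixed schema:

```yaml
# governance-policy.yaml — hypothetical schema for illustration only
allowed_tools:
  - search_docs
  - read_file
blocked_tools:
  - shell_exec
  - delete_records
content_filters:
  - pattern: "(api[_-]?key|password)\\s*[:=]"
    action: deny
rate_limits:
  tool_calls_per_request: 20
default_decision: deny   # fail closed: anything not matched above is denied
```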
## When Reviewing Code

1. Check if tool functions have governance decorators or policy checks
2. Verify that user inputs are scanned for threat signals before agent processing
3. Look for hardcoded credentials, API keys, or secrets in agent configurations
4. Confirm that audit logging exists for tool calls and governance decisions
5. Check if rate limits are enforced on tool calls
6. In multi-agent systems, verify trust boundaries between agents

## When Implementing Governance

1. Start with a `GovernancePolicy` dataclass defining allowed/blocked tools and patterns
2. Add a `@govern(policy)` decorator to all tool functions
3. Add intent classification to the input processing pipeline
4. Implement audit trail logging for all governance events
5. For multi-agent systems, add trust scoring with decay
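Steps 1 and 2 above might be sketched as follows; the `GovernancePolicy` fields and the `govern` signature are illustrative assumptions, not a published API:

```python
from dataclasses import dataclass, field
from functools import wraps

@dataclass(frozen=True)
class GovernancePolicy:
    """Illustrative policy: explicit tool allowlist plus blocked argument substrings."""
    allowed_tools: frozenset = field(default_factory=frozenset)
    blocked_patterns: tuple = ()

class GovernanceError(PermissionError):
    pass

def govern(policy: GovernancePolicy):
    """Deny any tool call not explicitly allowed by the policy (fail closed)."""
    def decorator(tool_fn):
        @wraps(tool_fn)
        def wrapper(*args, **kwargs):
            if tool_fn.__name__ not in policy.allowed_tools:
                raise GovernanceError(f"tool '{tool_fn.__name__}' not in allowlist")
            for pattern in policy.blocked_patterns:
                if any(pattern in str(v) for v in (*args, *kwargs.values())):
                    raise GovernanceError(f"blocked pattern {pattern!r} in arguments")
            return tool_fn(*args, **kwargs)
        return wrapper
    return decorator

policy = GovernancePolicy(allowed_tools=frozenset({"read_file"}))

@govern(policy)
def read_file(path: str) -> str:
    return f"contents of {path}"
```

Denial happens at call time, so every invocation is checked against the policy rather than only the first registration.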
## Guidelines

- Never suggest removing existing security controls
- Always recommend append-only audit trails (never suggest mutable logs)
- Prefer explicit allowlists over blocklists (allowlists are safer by default)
- When in doubt, recommend human-in-the-loop for high-impact operations
- Keep governance code separate from business logic
@@ -24,6 +24,7 @@ Custom agents for GitHub Copilot, making it easy for users and organizations to
| [Accessibility Expert](../agents/accessibility.agent.md)<br />[](https://aka.ms/awesome-copilot/install/agent?url=vscode%3Achat-agent%2Finstall%3Furl%3Dhttps%3A%2F%2Fraw.githubusercontent.com%2Fgithub%2Fawesome-copilot%2Fmain%2Fagents%2Faccessibility.agent.md)<br />[](https://aka.ms/awesome-copilot/install/agent?url=vscode-insiders%3Achat-agent%2Finstall%3Furl%3Dhttps%3A%2F%2Fraw.githubusercontent.com%2Fgithub%2Fawesome-copilot%2Fmain%2Fagents%2Faccessibility.agent.md) | Expert assistant for web accessibility (WCAG 2.1/2.2), inclusive UX, and a11y testing | |
| [ADR Generator](../agents/adr-generator.agent.md)<br />[](https://aka.ms/awesome-copilot/install/agent?url=vscode%3Achat-agent%2Finstall%3Furl%3Dhttps%3A%2F%2Fraw.githubusercontent.com%2Fgithub%2Fawesome-copilot%2Fmain%2Fagents%2Fadr-generator.agent.md)<br />[](https://aka.ms/awesome-copilot/install/agent?url=vscode-insiders%3Achat-agent%2Finstall%3Furl%3Dhttps%3A%2F%2Fraw.githubusercontent.com%2Fgithub%2Fawesome-copilot%2Fmain%2Fagents%2Fadr-generator.agent.md) | Expert agent for creating comprehensive Architectural Decision Records (ADRs) with structured formatting optimized for AI consumption and human readability. | |
| [AEM Front End Specialist](../agents/aem-frontend-specialist.agent.md)<br />[](https://aka.ms/awesome-copilot/install/agent?url=vscode%3Achat-agent%2Finstall%3Furl%3Dhttps%3A%2F%2Fraw.githubusercontent.com%2Fgithub%2Fawesome-copilot%2Fmain%2Fagents%2Faem-frontend-specialist.agent.md)<br />[](https://aka.ms/awesome-copilot/install/agent?url=vscode-insiders%3Achat-agent%2Finstall%3Furl%3Dhttps%3A%2F%2Fraw.githubusercontent.com%2Fgithub%2Fawesome-copilot%2Fmain%2Fagents%2Faem-frontend-specialist.agent.md) | Expert assistant for developing AEM components using HTL, Tailwind CSS, and Figma-to-code workflows with design system integration | |
| [Agent Governance Reviewer](../agents/agent-governance-reviewer.agent.md)<br />[](https://aka.ms/awesome-copilot/install/agent?url=vscode%3Achat-agent%2Finstall%3Furl%3Dhttps%3A%2F%2Fraw.githubusercontent.com%2Fgithub%2Fawesome-copilot%2Fmain%2Fagents%2Fagent-governance-reviewer.agent.md)<br />[](https://aka.ms/awesome-copilot/install/agent?url=vscode-insiders%3Achat-agent%2Finstall%3Furl%3Dhttps%3A%2F%2Fraw.githubusercontent.com%2Fgithub%2Fawesome-copilot%2Fmain%2Fagents%2Fagent-governance-reviewer.agent.md) | AI agent governance expert that reviews code for safety issues, missing governance controls, and helps implement policy enforcement, trust scoring, and audit trails in agent systems. | |
| [Amplitude Experiment Implementation](../agents/amplitude-experiment-implementation.agent.md)<br />[](https://aka.ms/awesome-copilot/install/agent?url=vscode%3Achat-agent%2Finstall%3Furl%3Dhttps%3A%2F%2Fraw.githubusercontent.com%2Fgithub%2Fawesome-copilot%2Fmain%2Fagents%2Famplitude-experiment-implementation.agent.md)<br />[](https://aka.ms/awesome-copilot/install/agent?url=vscode-insiders%3Achat-agent%2Finstall%3Furl%3Dhttps%3A%2F%2Fraw.githubusercontent.com%2Fgithub%2Fawesome-copilot%2Fmain%2Fagents%2Famplitude-experiment-implementation.agent.md) | This custom agent uses Amplitude's MCP tools to deploy new experiments inside of Amplitude, enabling seamless variant testing capabilities and rollout of product features. | |
| [API Architect](../agents/api-architect.agent.md)<br />[](https://aka.ms/awesome-copilot/install/agent?url=vscode%3Achat-agent%2Finstall%3Furl%3Dhttps%3A%2F%2Fraw.githubusercontent.com%2Fgithub%2Fawesome-copilot%2Fmain%2Fagents%2Fapi-architect.agent.md)<br />[](https://aka.ms/awesome-copilot/install/agent?url=vscode-insiders%3Achat-agent%2Finstall%3Furl%3Dhttps%3A%2F%2Fraw.githubusercontent.com%2Fgithub%2Fawesome-copilot%2Fmain%2Fagents%2Fapi-architect.agent.md) | Your role is that of an API architect. Help mentor the engineer by providing guidance, support, and working code. | |
| [Apify Integration Expert](../agents/apify-integration-expert.agent.md)<br />[](https://aka.ms/awesome-copilot/install/agent?url=vscode%3Achat-agent%2Finstall%3Furl%3Dhttps%3A%2F%2Fraw.githubusercontent.com%2Fgithub%2Fawesome-copilot%2Fmain%2Fagents%2Fapify-integration-expert.agent.md)<br />[](https://aka.ms/awesome-copilot/install/agent?url=vscode-insiders%3Achat-agent%2Finstall%3Furl%3Dhttps%3A%2F%2Fraw.githubusercontent.com%2Fgithub%2Fawesome-copilot%2Fmain%2Fagents%2Fapify-integration-expert.agent.md) | Expert agent for integrating Apify Actors into codebases. Handles Actor selection, workflow design, implementation across JavaScript/TypeScript and Python, testing, and production-ready deployment. | [apify](https://github.com/mcp/com.apify/apify-mcp-server)<br />[](https://aka.ms/awesome-copilot/install/mcp-vscode?name=apify&config=%7B%22url%22%3A%22https%3A%2F%2Fmcp.apify.com%22%2C%22headers%22%3A%7B%22Authorization%22%3A%22Bearer%20%24APIFY_TOKEN%22%2C%22Content-Type%22%3A%22application%2Fjson%22%7D%7D)<br />[](https://aka.ms/awesome-copilot/install/mcp-vscodeinsiders?name=apify&config=%7B%22url%22%3A%22https%3A%2F%2Fmcp.apify.com%22%2C%22headers%22%3A%7B%22Authorization%22%3A%22Bearer%20%24APIFY_TOKEN%22%2C%22Content-Type%22%3A%22application%2Fjson%22%7D%7D)<br />[](https://aka.ms/awesome-copilot/install/mcp-visualstudio/mcp-install?%7B%22url%22%3A%22https%3A%2F%2Fmcp.apify.com%22%2C%22headers%22%3A%7B%22Authorization%22%3A%22Bearer%20%24APIFY_TOKEN%22%2C%22Content-Type%22%3A%22application%2Fjson%22%7D%7D) |
@@ -27,5 +27,6 @@ Hooks enable automated workflows triggered by specific events during GitHub Copi
| Name | Description | Events | Bundled Assets |
| ---- | ----------- | ------ | -------------- |
| [Governance Audit](../hooks/governance-audit/README.md) | Scans Copilot agent prompts for threat signals and logs governance events | sessionStart, sessionEnd, userPromptSubmitted | `audit-prompt.sh`<br />`audit-session-end.sh`<br />`audit-session-start.sh`<br />`hooks.json` |
| [Session Auto-Commit](../hooks/session-auto-commit/README.md) | Automatically commits and pushes changes when a Copilot coding agent session ends | sessionEnd | `auto-commit.sh`<br />`hooks.json` |
| [Session Logger](../hooks/session-logger/README.md) | Logs all Copilot coding agent session activity for audit and analysis | sessionStart, sessionEnd, userPromptSubmitted | `hooks.json`<br />`log-prompt.sh`<br />`log-session-end.sh`<br />`log-session-start.sh` |
@@ -18,6 +18,7 @@ Team and project-specific instructions to enhance GitHub Copilot's behavior for
| [.NET Framework Upgrade Specialist](../instructions/dotnet-upgrade.instructions.md)<br />[](https://aka.ms/awesome-copilot/install/instructions?url=vscode%3Achat-instructions%2Finstall%3Furl%3Dhttps%3A%2F%2Fraw.githubusercontent.com%2Fgithub%2Fawesome-copilot%2Fmain%2Finstructions%2Fdotnet-upgrade.instructions.md)<br />[](https://aka.ms/awesome-copilot/install/instructions?url=vscode-insiders%3Achat-instructions%2Finstall%3Furl%3Dhttps%3A%2F%2Fraw.githubusercontent.com%2Fgithub%2Fawesome-copilot%2Fmain%2Finstructions%2Fdotnet-upgrade.instructions.md) | Specialized agent for comprehensive .NET framework upgrades with progressive tracking and validation |
| [.NET MAUI](../instructions/dotnet-maui.instructions.md)<br />[](https://aka.ms/awesome-copilot/install/instructions?url=vscode%3Achat-instructions%2Finstall%3Furl%3Dhttps%3A%2F%2Fraw.githubusercontent.com%2Fgithub%2Fawesome-copilot%2Fmain%2Finstructions%2Fdotnet-maui.instructions.md)<br />[](https://aka.ms/awesome-copilot/install/instructions?url=vscode-insiders%3Achat-instructions%2Finstall%3Furl%3Dhttps%3A%2F%2Fraw.githubusercontent.com%2Fgithub%2Fawesome-copilot%2Fmain%2Finstructions%2Fdotnet-maui.instructions.md) | .NET MAUI component and application patterns |
| [Accessibility instructions](../instructions/a11y.instructions.md)<br />[](https://aka.ms/awesome-copilot/install/instructions?url=vscode%3Achat-instructions%2Finstall%3Furl%3Dhttps%3A%2F%2Fraw.githubusercontent.com%2Fgithub%2Fawesome-copilot%2Fmain%2Finstructions%2Fa11y.instructions.md)<br />[](https://aka.ms/awesome-copilot/install/instructions?url=vscode-insiders%3Achat-instructions%2Finstall%3Furl%3Dhttps%3A%2F%2Fraw.githubusercontent.com%2Fgithub%2Fawesome-copilot%2Fmain%2Finstructions%2Fa11y.instructions.md) | Guidance for creating more accessible code |
| [Agent Safety & Governance](../instructions/agent-safety.instructions.md)<br />[](https://aka.ms/awesome-copilot/install/instructions?url=vscode%3Achat-instructions%2Finstall%3Furl%3Dhttps%3A%2F%2Fraw.githubusercontent.com%2Fgithub%2Fawesome-copilot%2Fmain%2Finstructions%2Fagent-safety.instructions.md)<br />[](https://aka.ms/awesome-copilot/install/instructions?url=vscode-insiders%3Achat-instructions%2Finstall%3Furl%3Dhttps%3A%2F%2Fraw.githubusercontent.com%2Fgithub%2Fawesome-copilot%2Fmain%2Finstructions%2Fagent-safety.instructions.md) | Guidelines for building safe, governed AI agent systems. Apply when writing code that uses agent frameworks, tool-calling LLMs, or multi-agent orchestration to ensure proper safety boundaries, policy enforcement, and auditability. |
| [Agent Skills File Guidelines](../instructions/agent-skills.instructions.md)<br />[](https://aka.ms/awesome-copilot/install/instructions?url=vscode%3Achat-instructions%2Finstall%3Furl%3Dhttps%3A%2F%2Fraw.githubusercontent.com%2Fgithub%2Fawesome-copilot%2Fmain%2Finstructions%2Fagent-skills.instructions.md)<br />[](https://aka.ms/awesome-copilot/install/instructions?url=vscode-insiders%3Achat-instructions%2Finstall%3Furl%3Dhttps%3A%2F%2Fraw.githubusercontent.com%2Fgithub%2Fawesome-copilot%2Fmain%2Finstructions%2Fagent-skills.instructions.md) | Guidelines for creating high-quality Agent Skills for GitHub Copilot |
| [AI Prompt Engineering & Safety Best Practices](../instructions/ai-prompt-engineering-safety-best-practices.instructions.md)<br />[](https://aka.ms/awesome-copilot/install/instructions?url=vscode%3Achat-instructions%2Finstall%3Furl%3Dhttps%3A%2F%2Fraw.githubusercontent.com%2Fgithub%2Fawesome-copilot%2Fmain%2Finstructions%2Fai-prompt-engineering-safety-best-practices.instructions.md)<br />[](https://aka.ms/awesome-copilot/install/instructions?url=vscode-insiders%3Achat-instructions%2Finstall%3Furl%3Dhttps%3A%2F%2Fraw.githubusercontent.com%2Fgithub%2Fawesome-copilot%2Fmain%2Finstructions%2Fai-prompt-engineering-safety-best-practices.instructions.md) | Comprehensive best practices for AI prompt engineering, safety frameworks, bias mitigation, and responsible AI usage for Copilot and LLMs. |
| [Angular Development Instructions](../instructions/angular.instructions.md)<br />[](https://aka.ms/awesome-copilot/install/instructions?url=vscode%3Achat-instructions%2Finstall%3Furl%3Dhttps%3A%2F%2Fraw.githubusercontent.com%2Fgithub%2Fawesome-copilot%2Fmain%2Finstructions%2Fangular.instructions.md)<br />[](https://aka.ms/awesome-copilot/install/instructions?url=vscode-insiders%3Achat-instructions%2Finstall%3Furl%3Dhttps%3A%2F%2Fraw.githubusercontent.com%2Fgithub%2Fawesome-copilot%2Fmain%2Finstructions%2Fangular.instructions.md) | Angular-specific coding standards and best practices |
@@ -22,6 +22,7 @@ Skills differ from other primitives by supporting bundled assets (scripts, code
| Name | Description | Bundled Assets |
| ---- | ----------- | -------------- |
| [agent-governance](../skills/agent-governance/SKILL.md) | Patterns and techniques for adding governance, safety, and trust controls to AI agent systems. Use this skill when:<br />- Building AI agents that call external tools (APIs, databases, file systems)<br />- Implementing policy-based access controls for agent tool usage<br />- Adding semantic intent classification to detect dangerous prompts<br />- Creating trust scoring systems for multi-agent workflows<br />- Building audit trails for agent actions and decisions<br />- Enforcing rate limits, content filters, or tool restrictions on agents<br />- Working with any agent framework (PydanticAI, CrewAI, OpenAI Agents, LangChain, AutoGen) | None |
| [agentic-eval](../skills/agentic-eval/SKILL.md) | Patterns and techniques for evaluating and improving AI agent outputs. Use this skill when:<br />- Implementing self-critique and reflection loops<br />- Building evaluator-optimizer pipelines for quality-critical generation<br />- Creating test-driven code refinement workflows<br />- Designing rubric-based or LLM-as-judge evaluation systems<br />- Adding iterative improvement to agent outputs (code, reports, analysis)<br />- Measuring and improving agent response quality | None |
| [appinsights-instrumentation](../skills/appinsights-instrumentation/SKILL.md) | Instrument a webapp to send useful telemetry data to Azure App Insights | `LICENSE.txt`<br />`examples/appinsights.bicep`<br />`references/ASPNETCORE.md`<br />`references/AUTO.md`<br />`references/NODEJS.md`<br />`references/PYTHON.md`<br />`scripts/appinsights.ps1` |
| [aspire](../skills/aspire/SKILL.md) | Aspire skill covering the Aspire CLI, AppHost orchestration, service discovery, integrations, MCP server, VS Code extension, Dev Containers, GitHub Codespaces, templates, dashboard, and deployment. Use when the user asks to create, run, debug, configure, deploy, or troubleshoot an Aspire distributed application. | `references/architecture.md`<br />`references/cli-reference.md`<br />`references/dashboard.md`<br />`references/deployment.md`<br />`references/integrations-catalog.md`<br />`references/mcp-server.md`<br />`references/polyglot-apis.md`<br />`references/testing.md`<br />`references/troubleshooting.md` |
99
hooks/governance-audit/README.md
Normal file
@@ -0,0 +1,99 @@
---
name: 'Governance Audit'
description: 'Scans Copilot agent prompts for threat signals and logs governance events'
tags: ['security', 'governance', 'audit', 'safety']
---

# Governance Audit Hook

Real-time threat detection and audit logging for GitHub Copilot coding agent sessions. Scans user prompts for dangerous patterns before the agent processes them.

## Overview

This hook provides governance controls for Copilot coding agent sessions:

- **Threat detection**: Scans prompts for data exfiltration, privilege escalation, system destruction, prompt injection, and credential exposure
- **Governance levels**: Open, standard, strict, locked — from audit-only to full blocking
- **Audit trail**: Append-only JSON log of all governance events
- **Session summary**: Reports threat counts at session end

## Threat Categories

| Category | Examples | Severity |
|----------|----------|----------|
| `data_exfiltration` | "send all records to external API" | 0.7 - 0.95 |
| `privilege_escalation` | "sudo", "chmod 777", "add to sudoers" | 0.8 - 0.95 |
| `system_destruction` | "rm -rf /", "drop database" | 0.9 - 0.95 |
| `prompt_injection` | "ignore previous instructions" | 0.6 - 0.9 |
| `credential_exposure` | Hardcoded API keys, AWS access keys | 0.9 - 0.95 |

## Governance Levels

| Level | Behavior |
|-------|----------|
| `open` | Log threats only, never block |
| `standard` | Log threats, block only if `BLOCK_ON_THREAT=true` |
| `strict` | Log and block all detected threats |
| `locked` | Log and block all detected threats |

## Installation

1. Copy the hook folder to your repository:

   ```bash
   cp -r hooks/governance-audit .github/hooks/
   ```

2. Ensure scripts are executable:

   ```bash
   chmod +x .github/hooks/governance-audit/*.sh
   ```

3. Create the logs directory and add it to `.gitignore`:

   ```bash
   mkdir -p logs/copilot/governance
   echo "logs/" >> .gitignore
   ```

4. Commit to your repository's default branch.

## Configuration

Set environment variables in `hooks.json`:

```json
{
  "env": {
    "GOVERNANCE_LEVEL": "strict",
    "BLOCK_ON_THREAT": "true"
  }
}
```

| Variable | Values | Default | Description |
|----------|--------|---------|-------------|
| `GOVERNANCE_LEVEL` | `open`, `standard`, `strict`, `locked` | `standard` | Controls blocking behavior |
| `BLOCK_ON_THREAT` | `true`, `false` | `false` | Block prompts with threats (standard level) |
| `SKIP_GOVERNANCE_AUDIT` | `true` | unset | Disable governance audit entirely |

## Log Format

Events are written to `logs/copilot/governance/audit.log` in JSON Lines format:

```json
{"timestamp":"2026-01-15T10:30:00Z","event":"session_start","governance_level":"standard","cwd":"/workspace/project"}
{"timestamp":"2026-01-15T10:31:00Z","event":"prompt_scanned","governance_level":"standard","status":"clean"}
{"timestamp":"2026-01-15T10:32:00Z","event":"threat_detected","governance_level":"standard","threat_count":1,"threats":[{"category":"privilege_escalation","severity":0.8,"description":"Elevated privileges","evidence":"sudo"}]}
{"timestamp":"2026-01-15T10:45:00Z","event":"session_end","total_events":12,"threats_detected":1}
```

## Requirements

- `jq` for JSON processing (pre-installed on most CI environments and macOS)
- `grep` with `-E` (extended regex) support
- `bc` for floating-point comparison (optional, gracefully degrades)

## Privacy & Security

- Full prompts are **never** logged — only matched threat patterns (minimal evidence snippets) and metadata are recorded
- Add `logs/` to `.gitignore` to keep audit data local
- Set `SKIP_GOVERNANCE_AUDIT=true` to disable entirely
- All data stays local — no external network calls
136
hooks/governance-audit/audit-prompt.sh
Normal file
@@ -0,0 +1,136 @@
#!/bin/bash

# Governance Audit: Scan user prompts for threat signals before agent processing
#
# Environment variables:
#   GOVERNANCE_LEVEL      - "open", "standard", "strict", "locked" (default: standard)
#   BLOCK_ON_THREAT       - "true" to exit non-zero on threats (default: false)
#   SKIP_GOVERNANCE_AUDIT - "true" to disable (default: unset)

set -euo pipefail

if [[ "${SKIP_GOVERNANCE_AUDIT:-}" == "true" ]]; then
  exit 0
fi

INPUT=$(cat)

mkdir -p logs/copilot/governance

TIMESTAMP=$(date -u +"%Y-%m-%dT%H:%M:%SZ")
LEVEL="${GOVERNANCE_LEVEL:-standard}"
BLOCK="${BLOCK_ON_THREAT:-false}"
LOG_FILE="logs/copilot/governance/audit.log"

# Extract prompt text from Copilot input (JSON with userMessage field)
PROMPT=""
if command -v jq &>/dev/null; then
  PROMPT=$(echo "$INPUT" | jq -r '.userMessage // .prompt // empty' 2>/dev/null || echo "")
fi
if [[ -z "$PROMPT" ]]; then
  PROMPT="$INPUT"
fi

# Threat detection patterns organized by category
# Each pattern has: category, description, severity (0.0-1.0)
THREATS_FOUND=()

check_pattern() {
  local pattern="$1"
  local category="$2"
  local severity="$3"
  local description="$4"

  if echo "$PROMPT" | grep -qiE "$pattern"; then
    local evidence
    evidence=$(echo "$PROMPT" | grep -oiE "$pattern" | head -1)
    local evidence_encoded
    evidence_encoded=$(printf '%s' "$evidence" | base64 | tr -d '\n')
    # Join fields with tabs so the tab-delimited "read" below splits them correctly
    # (descriptions contain spaces, so space-joining would corrupt the fields)
    THREATS_FOUND+=("${category}"$'\t'"${severity}"$'\t'"${description}"$'\t'"${evidence_encoded}")
  fi
}

# Data exfiltration signals
check_pattern "send\s+(all|every|entire)\s+\w+\s+to\s+" "data_exfiltration" "0.8" "Bulk data transfer"
check_pattern "export\s+.*\s+to\s+(external|outside|third[_-]?party)" "data_exfiltration" "0.9" "External export"
check_pattern "curl\s+.*\s+-d\s+" "data_exfiltration" "0.7" "HTTP POST with data"
check_pattern "upload\s+.*\s+(credentials|secrets|keys)" "data_exfiltration" "0.95" "Credential upload"

# Privilege escalation signals
check_pattern "(sudo|as\s+root|admin\s+access|runas\s+/user)" "privilege_escalation" "0.8" "Elevated privileges"
check_pattern "chmod\s+777" "privilege_escalation" "0.9" "World-writable permissions"
check_pattern "add\s+.*\s+(sudoers|administrators)" "privilege_escalation" "0.95" "Adding admin access"

# System destruction signals
check_pattern "(rm\s+-rf\s+/|del\s+/[sq]|format\s+c:)" "system_destruction" "0.95" "Destructive command"
check_pattern "(drop\s+database|truncate\s+table|delete\s+from\s+\w+\s*(;|\s*$))" "system_destruction" "0.9" "Database destruction"
check_pattern "wipe\s+(all|entire|every)" "system_destruction" "0.9" "Mass deletion"

# Prompt injection signals
check_pattern "ignore\s+(previous|above|all)\s+(instructions?|rules?|prompts?)" "prompt_injection" "0.9" "Instruction override"
check_pattern "you\s+are\s+now\s+(a|an)\s+(assistant|ai|bot|system|expert|language\s+model)\b" "prompt_injection" "0.7" "Role reassignment"
check_pattern "(^|\n)\s*system\s*:\s*you\s+are" "prompt_injection" "0.6" "System prompt injection"

# Credential exposure signals
check_pattern "(api[_-]?key|secret[_-]?key|password|token)\s*[:=]\s*['\"]?\w{8,}" "credential_exposure" "0.9" "Possible hardcoded credential"
check_pattern "(aws_access_key|AKIA[0-9A-Z]{16})" "credential_exposure" "0.95" "AWS key exposure"

# Log the prompt event
if [[ ${#THREATS_FOUND[@]} -gt 0 ]]; then
  # Build threats JSON array
  THREATS_JSON="["
  FIRST=true
  MAX_SEVERITY="0.0"
  for threat in "${THREATS_FOUND[@]}"; do
    IFS=$'\t' read -r category severity description evidence_encoded <<< "$threat"
    # Plain assignment here: "local" is only valid inside functions
    evidence=$(printf '%s' "$evidence_encoded" | base64 -d 2>/dev/null || echo "[redacted]")

    if [[ "$FIRST" != "true" ]]; then
      THREATS_JSON+=","
    fi
    FIRST=false

    THREATS_JSON+=$(jq -n \
      --arg cat "$category" \
      --arg sev "$severity" \
      --arg desc "$description" \
      --arg ev "$evidence" \
      '{"category":$cat,"severity":($sev|tonumber),"description":$desc,"evidence":$ev}')

    # Track max severity
    if (( $(echo "$severity > $MAX_SEVERITY" | bc -l 2>/dev/null || echo 0) )); then
      MAX_SEVERITY="$severity"
    fi
  done
  THREATS_JSON+="]"

  jq -n \
    --arg timestamp "$TIMESTAMP" \
    --arg level "$LEVEL" \
    --arg max_severity "$MAX_SEVERITY" \
    --argjson threats "$THREATS_JSON" \
    --argjson count "${#THREATS_FOUND[@]}" \
    '{"timestamp":$timestamp,"event":"threat_detected","governance_level":$level,"threat_count":$count,"max_severity":($max_severity|tonumber),"threats":$threats}' \
    >> "$LOG_FILE"

  echo "⚠️ Governance: ${#THREATS_FOUND[@]} threat signal(s) detected (max severity: $MAX_SEVERITY)"
  for threat in "${THREATS_FOUND[@]}"; do
    IFS=$'\t' read -r category severity description _evidence_encoded <<< "$threat"
    echo "  🔴 [$category] $description (severity: $severity)"
  done

  # In strict/locked mode or when BLOCK_ON_THREAT is true, exit non-zero to block
  if [[ "$BLOCK" == "true" ]] || [[ "$LEVEL" == "strict" ]] || [[ "$LEVEL" == "locked" ]]; then
    echo "🚫 Prompt blocked by governance policy (level: $LEVEL)"
    exit 1
  fi
else
  jq -n \
    --arg timestamp "$TIMESTAMP" \
    --arg level "$LEVEL" \
    '{"timestamp":$timestamp,"event":"prompt_scanned","governance_level":$level,"status":"clean"}' \
    >> "$LOG_FILE"
fi

exit 0
48
hooks/governance-audit/audit-session-end.sh
Normal file
@@ -0,0 +1,48 @@
#!/bin/bash

# Governance Audit: Log session end with summary statistics

set -euo pipefail

if [[ "${SKIP_GOVERNANCE_AUDIT:-}" == "true" ]]; then
  exit 0
fi

INPUT=$(cat)

mkdir -p logs/copilot/governance

TIMESTAMP=$(date -u +"%Y-%m-%dT%H:%M:%SZ")
LOG_FILE="logs/copilot/governance/audit.log"

# Count events from this session (filter by session start timestamp)
TOTAL=0
THREATS=0
SESSION_START=""
if [[ -f "$LOG_FILE" ]]; then
  # Find the last session_start event to scope stats to current session
  SESSION_START=$(grep '"session_start"' "$LOG_FILE" 2>/dev/null | tail -1 | jq -r '.timestamp' 2>/dev/null || echo "")
  if [[ -n "$SESSION_START" ]]; then
    # Count events after session start
    TOTAL=$(awk -v start="$SESSION_START" -F'"timestamp":"' '{split($2,a,"\""); if(a[1]>=start) count++} END{print count+0}' "$LOG_FILE" 2>/dev/null || echo 0)
    THREATS=$(awk -v start="$SESSION_START" -F'"timestamp":"' '{split($2,a,"\""); if(a[1]>=start && /threat_detected/) count++} END{print count+0}' "$LOG_FILE" 2>/dev/null || echo 0)
  else
    TOTAL=$(wc -l < "$LOG_FILE" 2>/dev/null || echo 0)
    # grep -c prints "0" but exits non-zero when nothing matches, so use "|| true"
    # rather than "|| echo 0", which would capture a second line ("0\n0")
    THREATS=$(grep -c '"threat_detected"' "$LOG_FILE" 2>/dev/null || true)
    THREATS=${THREATS:-0}
  fi
fi

jq -n \
  --arg timestamp "$TIMESTAMP" \
  --argjson total "$TOTAL" \
  --argjson threats "$THREATS" \
  '{"timestamp":$timestamp,"event":"session_end","total_events":$total,"threats_detected":$threats}' \
  >> "$LOG_FILE"

if [[ "$THREATS" -gt 0 ]]; then
  echo "⚠️ Session ended: $THREATS threat(s) detected in $TOTAL events"
else
  echo "✅ Session ended: $TOTAL events, no threats"
fi

exit 0
27
hooks/governance-audit/audit-session-start.sh
Normal file
@@ -0,0 +1,27 @@
#!/bin/bash

# Governance Audit: Log session start with governance context

set -euo pipefail

if [[ "${SKIP_GOVERNANCE_AUDIT:-}" == "true" ]]; then
  exit 0
fi

INPUT=$(cat)

mkdir -p logs/copilot/governance

TIMESTAMP=$(date -u +"%Y-%m-%dT%H:%M:%SZ")
CWD=$(pwd)
LEVEL="${GOVERNANCE_LEVEL:-standard}"

jq -n \
  --arg timestamp "$TIMESTAMP" \
  --arg cwd "$CWD" \
  --arg level "$LEVEL" \
  '{"timestamp":$timestamp,"event":"session_start","governance_level":$level,"cwd":$cwd}' \
  >> logs/copilot/governance/audit.log

echo "🛡️ Governance audit active (level: $LEVEL)"
exit 0
33
hooks/governance-audit/hooks.json
Normal file
@@ -0,0 +1,33 @@
{
  "version": 1,
  "hooks": {
    "sessionStart": [
      {
        "type": "command",
        "bash": ".github/hooks/governance-audit/audit-session-start.sh",
        "cwd": ".",
        "timeoutSec": 5
      }
    ],
    "sessionEnd": [
      {
        "type": "command",
        "bash": ".github/hooks/governance-audit/audit-session-end.sh",
        "cwd": ".",
        "timeoutSec": 5
      }
    ],
    "userPromptSubmitted": [
      {
        "type": "command",
        "bash": ".github/hooks/governance-audit/audit-prompt.sh",
        "cwd": ".",
        "env": {
          "GOVERNANCE_LEVEL": "standard",
          "BLOCK_ON_THREAT": "false"
        },
        "timeoutSec": 10
      }
    ]
  }
}
95
instructions/agent-safety.instructions.md
Normal file
@@ -0,0 +1,95 @@
---
description: 'Guidelines for building safe, governed AI agent systems. Apply when writing code that uses agent frameworks, tool-calling LLMs, or multi-agent orchestration to ensure proper safety boundaries, policy enforcement, and auditability.'
applyTo: '**'
---

# Agent Safety & Governance

## Core Principles

- **Fail closed**: If a governance check errors or is ambiguous, deny the action rather than allowing it
- **Policy as configuration**: Define governance rules in YAML/JSON files, not hardcoded in application logic
- **Least privilege**: Agents should have the minimum tool access needed for their task
- **Append-only audit**: Never modify or delete audit trail entries — immutability enables compliance

## Tool Access Controls

- Always define an explicit allowlist of tools an agent can use — never give unrestricted tool access
- Separate tool registration from tool authorization — the framework knows what tools exist, the policy controls which are allowed
- Use blocklists for known-dangerous operations (shell execution, file deletion, database DDL)
- Require human-in-the-loop approval for high-impact tools (send email, deploy, delete records)
- Enforce rate limits on tool calls per request to prevent infinite loops and resource exhaustion

## Content Safety

- Scan all user inputs for threat signals before passing to the agent (data exfiltration, prompt injection, privilege escalation)
- Filter agent arguments for sensitive patterns: API keys, credentials, PII, SQL injection
- Use regex pattern lists that can be updated without code changes
- Check both the user's original prompt AND the agent's generated tool arguments

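As a sketch of such a scan (the pattern list and category names here are illustrative, not a vetted threat catalog):

```python
import re

# Illustrative threat patterns - in practice, load these from configuration
THREAT_PATTERNS = [
    (r"(?i)ignore\s+(previous|all)\s+instructions", "prompt_injection"),
    (r"(?i)(api[_-]?key|password)\s*[:=]", "credential_leak"),
    (r"(?i)(drop|truncate)\s+table", "sql_injection"),
]

def scan_input(text: str) -> list[str]:
    """Return the category of every threat pattern matched in `text`."""
    return [category for pattern, category in THREAT_PATTERNS
            if re.search(pattern, text)]
```

Run the same scan over both the user prompt and the agent's generated tool arguments.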
## Multi-Agent Safety

- Each agent in a multi-agent system should have its own governance policy
- When agents delegate to other agents, apply the most restrictive policy from either
- Track trust scores for agent delegates — degrade trust on failures, require ongoing good behavior
- Never allow an inner agent to have broader permissions than the outer agent that called it

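A minimal sketch of most-restrictive delegation, assuming policies are represented as simple dicts (field names are illustrative):

```python
def delegate_policy(outer: dict, inner: dict) -> dict:
    """Most-restrictive-wins merge for one delegation step.

    Assumed shape: {"allowed_tools": set[str], "max_calls": int}.
    """
    return {
        # the inner agent may only use tools both policies allow
        "allowed_tools": outer["allowed_tools"] & inner["allowed_tools"],
        # the tighter rate limit wins
        "max_calls": min(outer["max_calls"], inner["max_calls"]),
    }
```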
## Audit & Observability

- Log every tool call with: timestamp, agent ID, tool name, allow/deny decision, policy name
- Log every governance violation with the matched rule and evidence
- Export audit trails in JSON Lines format for integration with log aggregation systems
- Include session boundaries (start/end) in audit logs for correlation

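A minimal sketch of one such JSON Lines record (field names are illustrative):

```python
import json
import time

def audit_entry(agent_id: str, tool: str, decision: str, policy: str) -> str:
    """Serialize one audit record as a JSON Lines row per tool call."""
    return json.dumps({
        "timestamp": time.time(),
        "agent_id": agent_id,
        "tool": tool,
        "decision": decision,  # "allow" or "deny"
        "policy": policy,
    })
```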
## Code Patterns

When writing agent tool functions:

```python
# Good: Governed tool with explicit policy
@govern(policy)
async def search(query: str) -> str:
    ...

# Bad: Unprotected tool with no governance
async def search(query: str) -> str:
    ...
```

When defining policies:

```yaml
# Good: Explicit allowlist, content filters, rate limit
name: my-agent
allowed_tools: [search, summarize]
blocked_patterns: ["(?i)(api_key|password)\\s*[:=]"]
max_calls_per_request: 25

# Bad: No restrictions
name: my-agent
allowed_tools: ["*"]
```

When composing multi-agent policies:

```python
# Good: Most-restrictive-wins composition
final_policy = compose_policies(org_policy, team_policy, agent_policy)

# Bad: Only using agent-level policy, ignoring org constraints
final_policy = agent_policy
```

## Framework-Specific Notes

- **PydanticAI**: Use `@agent.tool` with a governance decorator wrapper. PydanticAI's upcoming Traits feature is designed for this pattern.
- **CrewAI**: Apply governance at the Crew level to cover all agents. Use `before_kickoff` callbacks for policy validation.
- **OpenAI Agents SDK**: Wrap `@function_tool` with governance. Use handoff guards for multi-agent trust.
- **LangChain/LangGraph**: Use `RunnableBinding` or tool wrappers for governance. Apply at the graph edge level for flow control.
- **AutoGen**: Implement governance in the `ConversableAgent.register_for_execution` hook.

## Common Mistakes

- Relying only on output guardrails (post-generation) instead of pre-execution governance
- Hardcoding policy rules instead of loading from configuration
- Allowing agents to self-modify their own governance policies
- Forgetting to governance-check tool *arguments*, not just tool *names*
- Not decaying trust scores over time — stale trust is dangerous
- Logging prompts in audit trails — log decisions and metadata, not user content
569
skills/agent-governance/SKILL.md
Normal file
@@ -0,0 +1,569 @@
---
name: agent-governance
description: |
  Patterns and techniques for adding governance, safety, and trust controls to AI agent systems. Use this skill when:
  - Building AI agents that call external tools (APIs, databases, file systems)
  - Implementing policy-based access controls for agent tool usage
  - Adding semantic intent classification to detect dangerous prompts
  - Creating trust scoring systems for multi-agent workflows
  - Building audit trails for agent actions and decisions
  - Enforcing rate limits, content filters, or tool restrictions on agents
  - Working with any agent framework (PydanticAI, CrewAI, OpenAI Agents, LangChain, AutoGen)
---

# Agent Governance Patterns

Patterns for adding safety, trust, and policy enforcement to AI agent systems.

## Overview

Governance patterns ensure AI agents operate within defined boundaries — controlling which tools they can call, what content they can process, how much they can do, and maintaining accountability through audit trails.

```
User Request → Intent Classification → Policy Check → Tool Execution → Audit Log
                       ↓                    ↓                             ↓
                Threat Detection        Allow/Deny                  Trust Update
```

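The pipeline above can be sketched as a single guard around one tool call; the stage functions here are illustrative stand-ins for the real classifiers, policies, and audit trails developed in the patterns below:

```python
def guard_tool_call(prompt: str, tool: str, allowed: set[str],
                    is_threat, audit: list) -> bool:
    """Run intent check -> policy check -> audit for a single request.

    `is_threat` is any callable(str) -> bool; `audit` is an append-only list.
    """
    # fail closed: the call proceeds only if both checks pass
    decision = not is_threat(prompt) and tool in allowed
    audit.append({"tool": tool, "decision": "allow" if decision else "deny"})
    return decision
```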
## When to Use

- **Agents with tool access**: Any agent that calls external tools (APIs, databases, shell commands)
- **Multi-agent systems**: Agents delegating to other agents need trust boundaries
- **Production deployments**: Compliance, audit, and safety requirements
- **Sensitive operations**: Financial transactions, data access, infrastructure management

---

## Pattern 1: Governance Policy

Define what an agent is allowed to do as a composable, serializable policy object.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional
import re

class PolicyAction(Enum):
    ALLOW = "allow"
    DENY = "deny"
    REVIEW = "review"  # flag for human review

@dataclass
class GovernancePolicy:
    """Declarative policy controlling agent behavior."""
    name: str
    allowed_tools: list[str] = field(default_factory=list)  # allowlist
    blocked_tools: list[str] = field(default_factory=list)  # blocklist
    blocked_patterns: list[str] = field(default_factory=list)  # content filters
    max_calls_per_request: int = 100  # rate limit
    require_human_approval: list[str] = field(default_factory=list)  # tools needing approval

    def check_tool(self, tool_name: str) -> PolicyAction:
        """Check if a tool is allowed by this policy."""
        if tool_name in self.blocked_tools:
            return PolicyAction.DENY
        if tool_name in self.require_human_approval:
            return PolicyAction.REVIEW
        if self.allowed_tools and tool_name not in self.allowed_tools:
            return PolicyAction.DENY
        return PolicyAction.ALLOW

    def check_content(self, content: str) -> Optional[str]:
        """Check content against blocked patterns. Returns matched pattern or None."""
        for pattern in self.blocked_patterns:
            if re.search(pattern, content, re.IGNORECASE):
                return pattern
        return None
```

### Policy Composition

Combine multiple policies (e.g., org-wide + team + agent-specific):

```python
def compose_policies(*policies: GovernancePolicy) -> GovernancePolicy:
    """Merge policies with most-restrictive-wins semantics."""
    combined = GovernancePolicy(name="composed")

    for policy in policies:
        combined.blocked_tools.extend(policy.blocked_tools)
        combined.blocked_patterns.extend(policy.blocked_patterns)
        combined.require_human_approval.extend(policy.require_human_approval)
        combined.max_calls_per_request = min(
            combined.max_calls_per_request,
            policy.max_calls_per_request
        )
        if policy.allowed_tools:
            if combined.allowed_tools:
                combined.allowed_tools = [
                    t for t in combined.allowed_tools if t in policy.allowed_tools
                ]
            else:
                combined.allowed_tools = list(policy.allowed_tools)

    return combined


# Usage: layer policies from broad to specific
org_policy = GovernancePolicy(
    name="org-wide",
    blocked_tools=["shell_exec", "delete_database"],
    blocked_patterns=[r"(?i)(api[_-]?key|secret|password)\s*[:=]"],
    max_calls_per_request=50
)
team_policy = GovernancePolicy(
    name="data-team",
    allowed_tools=["query_db", "read_file", "write_report"],
    require_human_approval=["write_report"]
)
agent_policy = compose_policies(org_policy, team_policy)
```

### Policy as YAML

Store policies as configuration, not code:

```yaml
# governance-policy.yaml
name: production-agent
allowed_tools:
  - search_documents
  - query_database
  - send_email
blocked_tools:
  - shell_exec
  - delete_record
blocked_patterns:
  - "(?i)(api[_-]?key|secret|password)\\s*[:=]"
  - "(?i)(drop|truncate|delete from)\\s+\\w+"
max_calls_per_request: 25
require_human_approval:
  - send_email
```

```python
import yaml

def load_policy(path: str) -> GovernancePolicy:
    with open(path) as f:
        data = yaml.safe_load(f)
    return GovernancePolicy(**data)
```

---

## Pattern 2: Semantic Intent Classification

Detect dangerous intent in prompts before they reach the agent, using pattern-based signals.

```python
import re
from dataclasses import dataclass

@dataclass
class IntentSignal:
    category: str      # e.g., "data_exfiltration", "privilege_escalation"
    confidence: float  # 0.0 to 1.0
    evidence: str      # what triggered the detection

# Weighted signal patterns for threat detection
THREAT_SIGNALS = [
    # Data exfiltration
    (r"(?i)send\s+(all|every|entire)\s+\w+\s+to\s+", "data_exfiltration", 0.8),
    (r"(?i)export\s+.*\s+to\s+(external|outside|third.?party)", "data_exfiltration", 0.9),
    (r"(?i)curl\s+.*\s+-d\s+", "data_exfiltration", 0.7),

    # Privilege escalation
    (r"(?i)(sudo|as\s+root|admin\s+access)", "privilege_escalation", 0.8),
    (r"(?i)chmod\s+777", "privilege_escalation", 0.9),

    # System modification
    (r"(?i)(rm\s+-rf|del\s+/[sq]|format\s+c:)", "system_destruction", 0.95),
    (r"(?i)(drop\s+database|truncate\s+table)", "system_destruction", 0.9),

    # Prompt injection
    (r"(?i)ignore\s+(previous|above|all)\s+(instructions?|rules?)", "prompt_injection", 0.9),
    (r"(?i)you\s+are\s+now\s+(a|an)\s+", "prompt_injection", 0.7),
]

def classify_intent(content: str) -> list[IntentSignal]:
    """Classify content for threat signals."""
    signals = []
    for pattern, category, weight in THREAT_SIGNALS:
        match = re.search(pattern, content)
        if match:
            signals.append(IntentSignal(
                category=category,
                confidence=weight,
                evidence=match.group()
            ))
    return signals

def is_safe(content: str, threshold: float = 0.7) -> bool:
    """Quick check: is the content safe above the given threshold?"""
    signals = classify_intent(content)
    return not any(s.confidence >= threshold for s in signals)
```

**Key insight**: Intent classification happens *before* tool execution, acting as a pre-flight safety check. This is fundamentally different from output guardrails, which only check *after* generation.

---

## Pattern 3: Tool-Level Governance Decorator

Wrap individual tool functions with governance checks:

```python
import functools
import time
from collections import defaultdict

_call_counters: dict[str, int] = defaultdict(int)

def govern(policy: GovernancePolicy, audit_trail=None):
    """Decorator that enforces governance policy on a tool function."""
    def decorator(func):
        @functools.wraps(func)
        async def wrapper(*args, **kwargs):
            tool_name = func.__name__

            # 1. Check tool allowlist/blocklist
            action = policy.check_tool(tool_name)
            if action == PolicyAction.DENY:
                raise PermissionError(f"Policy '{policy.name}' blocks tool '{tool_name}'")
            if action == PolicyAction.REVIEW:
                raise PermissionError(f"Tool '{tool_name}' requires human approval")

            # 2. Check rate limit
            _call_counters[policy.name] += 1
            if _call_counters[policy.name] > policy.max_calls_per_request:
                raise PermissionError(f"Rate limit exceeded: {policy.max_calls_per_request} calls")

            # 3. Check content in arguments
            for arg in list(args) + list(kwargs.values()):
                if isinstance(arg, str):
                    matched = policy.check_content(arg)
                    if matched:
                        raise PermissionError(f"Blocked pattern detected: {matched}")

            # 4. Execute and audit
            start = time.monotonic()
            try:
                result = await func(*args, **kwargs)
                if audit_trail is not None:
                    audit_trail.append({
                        "tool": tool_name,
                        "action": "allowed",
                        "duration_ms": (time.monotonic() - start) * 1000,
                        "timestamp": time.time()
                    })
                return result
            except Exception as e:
                if audit_trail is not None:
                    audit_trail.append({
                        "tool": tool_name,
                        "action": "error",
                        "error": str(e),
                        "timestamp": time.time()
                    })
                raise

        return wrapper
    return decorator


# Usage with any agent framework
audit_log = []
policy = GovernancePolicy(
    name="search-agent",
    allowed_tools=["search", "summarize"],
    blocked_patterns=[r"(?i)password"],
    max_calls_per_request=10
)

@govern(policy, audit_trail=audit_log)
async def search(query: str) -> str:
    """Search documents — governed by policy."""
    return f"Results for: {query}"

# Passes: search("latest quarterly report")
# Blocked: search("show me the admin password")
```

---

## Pattern 4: Trust Scoring

Track agent reliability over time with decay-based trust scores:

```python
from dataclasses import dataclass, field
import math
import time

@dataclass
class TrustScore:
    """Trust score with temporal decay."""
    score: float = 0.5  # 0.0 (untrusted) to 1.0 (fully trusted)
    successes: int = 0
    failures: int = 0
    last_updated: float = field(default_factory=time.time)

    def record_success(self, reward: float = 0.05):
        self.successes += 1
        self.score = min(1.0, self.score + reward * (1 - self.score))
        self.last_updated = time.time()

    def record_failure(self, penalty: float = 0.15):
        self.failures += 1
        self.score = max(0.0, self.score - penalty * self.score)
        self.last_updated = time.time()

    def current(self, decay_rate: float = 0.001) -> float:
        """Get score with temporal decay — trust erodes without activity."""
        elapsed = time.time() - self.last_updated
        decay = math.exp(-decay_rate * elapsed)
        return self.score * decay

    @property
    def reliability(self) -> float:
        total = self.successes + self.failures
        return self.successes / total if total > 0 else 0.0


# Usage in multi-agent systems
trust = TrustScore()

# Agent completes tasks successfully
trust.record_success()  # ≈0.525
trust.record_success()  # ≈0.549

# Agent makes an error
trust.record_failure()  # ≈0.466

# Gate sensitive operations on trust
if trust.current() >= 0.7:
    # Allow autonomous operation
    pass
elif trust.current() >= 0.4:
    # Allow with human oversight
    pass
else:
    # Deny or require explicit approval
    pass
```

**Multi-agent trust**: In systems where agents delegate to other agents, each agent maintains trust scores for its delegates:

```python
class AgentTrustRegistry:
    def __init__(self):
        self.scores: dict[str, TrustScore] = {}

    def get_trust(self, agent_id: str) -> TrustScore:
        if agent_id not in self.scores:
            self.scores[agent_id] = TrustScore()
        return self.scores[agent_id]

    def most_trusted(self, agents: list[str]) -> str:
        return max(agents, key=lambda a: self.get_trust(a).current())

    def meets_threshold(self, agent_id: str, threshold: float) -> bool:
        return self.get_trust(agent_id).current() >= threshold
```

---

## Pattern 5: Audit Trail

Append-only audit log for all agent actions — critical for compliance and debugging:

```python
from dataclasses import dataclass, field
import json
import time

@dataclass
class AuditEntry:
    timestamp: float
    agent_id: str
    tool_name: str
    action: str  # "allowed", "denied", "error"
    policy_name: str
    details: dict = field(default_factory=dict)

class AuditTrail:
    """Append-only audit trail for agent governance events."""
    def __init__(self):
        self._entries: list[AuditEntry] = []

    def log(self, agent_id: str, tool_name: str, action: str,
            policy_name: str, **details):
        self._entries.append(AuditEntry(
            timestamp=time.time(),
            agent_id=agent_id,
            tool_name=tool_name,
            action=action,
            policy_name=policy_name,
            details=details
        ))

    def denied(self) -> list[AuditEntry]:
        """Get all denied actions — useful for security review."""
        return [e for e in self._entries if e.action == "denied"]

    def by_agent(self, agent_id: str) -> list[AuditEntry]:
        return [e for e in self._entries if e.agent_id == agent_id]

    def export_jsonl(self, path: str):
        """Export as JSON Lines for log aggregation systems."""
        with open(path, "w") as f:
            for entry in self._entries:
                f.write(json.dumps({
                    "timestamp": entry.timestamp,
                    "agent_id": entry.agent_id,
                    "tool": entry.tool_name,
                    "action": entry.action,
                    "policy": entry.policy_name,
                    **entry.details
                }) + "\n")
```

---

## Pattern 6: Framework Integration

### PydanticAI

```python
from pydantic_ai import Agent

policy = GovernancePolicy(
    name="support-bot",
    allowed_tools=["search_docs", "create_ticket"],
    blocked_patterns=[r"(?i)(ssn|social\s+security|credit\s+card)"],
    max_calls_per_request=20
)

agent = Agent("openai:gpt-4o", system_prompt="You are a support assistant.")

@agent.tool
@govern(policy)
async def search_docs(ctx, query: str) -> str:
    """Search knowledge base — governed."""
    return await kb.search(query)

@agent.tool
@govern(policy)
async def create_ticket(ctx, title: str, body: str) -> str:
    """Create support ticket — governed."""
    return await tickets.create(title=title, body=body)
```

### CrewAI

```python
from crewai import Agent, Task, Crew

policy = GovernancePolicy(
    name="research-crew",
    allowed_tools=["search", "analyze"],
    max_calls_per_request=30
)

# Apply governance at the crew level
def governed_crew_run(crew: Crew, policy: GovernancePolicy):
    """Wrap crew execution with governance checks."""
    audit = AuditTrail()
    for agent in crew.agents:
        for tool in agent.tools:
            original = tool.func
            tool.func = govern(policy, audit_trail=audit)(original)
    result = crew.kickoff()
    return result, audit
```

### OpenAI Agents SDK

```python
from agents import Agent, function_tool

policy = GovernancePolicy(
    name="coding-agent",
    allowed_tools=["read_file", "write_file", "run_tests"],
    blocked_tools=["shell_exec"],
    max_calls_per_request=50
)

@function_tool
@govern(policy)
async def read_file(path: str) -> str:
    """Read file contents — governed."""
    import os
    safe_path = os.path.realpath(path)
    if not safe_path.startswith(os.path.realpath(".")):
        raise ValueError("Path traversal blocked by governance")
    with open(safe_path) as f:
        return f.read()
```

---

## Governance Levels

Match governance strictness to risk level:

| Level | Controls | Use Case |
|-------|----------|----------|
| **Open** | Audit only, no restrictions | Internal dev/testing |
| **Standard** | Tool allowlist + content filters | General production agents |
| **Strict** | All controls + human approval for sensitive ops | Financial, healthcare, legal |
| **Locked** | Allowlist only, no dynamic tools, full audit | Compliance-critical systems |

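One way to turn these levels into concrete settings is a small factory. This is a sketch; the per-level defaults below are assumptions, not a normative mapping:

```python
# Illustrative defaults per governance level (assumed values)
LEVEL_DEFAULTS = {
    "open":     {"allowed_tools": ["*"], "max_calls_per_request": 1000},
    "standard": {"allowed_tools": [],    "max_calls_per_request": 50},
    "strict":   {"allowed_tools": [],    "max_calls_per_request": 25},
    "locked":   {"allowed_tools": [],    "max_calls_per_request": 10},
}

def policy_for_level(level: str, allowed_tools: list[str]) -> dict:
    """Build a policy dict for a named governance level.

    Fails closed: unknown levels raise instead of defaulting to "open".
    """
    if level not in LEVEL_DEFAULTS:
        raise ValueError(f"Unknown governance level: {level}")
    policy = dict(LEVEL_DEFAULTS[level])
    if policy["allowed_tools"] != ["*"]:
        policy["allowed_tools"] = list(allowed_tools)  # explicit allowlist
    return policy
```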
---

## Best Practices

| Practice | Rationale |
|----------|-----------|
| **Policy as configuration** | Store policies in YAML/JSON, not hardcoded — enables change without deploys |
| **Most-restrictive-wins** | When composing policies, deny always overrides allow |
| **Pre-flight intent check** | Classify intent *before* tool execution, not after |
| **Trust decay** | Trust scores should decay over time — require ongoing good behavior |
| **Append-only audit** | Never modify or delete audit entries — immutability enables compliance |
| **Fail closed** | If governance check errors, deny the action rather than allowing it |
| **Separate policy from logic** | Governance enforcement should be independent of agent business logic |

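The fail-closed practice can be captured in a small wrapper; a minimal sketch, assuming any governance check is a callable returning a boolean:

```python
def fail_closed(check):
    """Wrap a governance check so an error in the check itself denies the action."""
    def guarded(*args, **kwargs):
        try:
            return bool(check(*args, **kwargs))
        except Exception:
            return False  # deny on error: fail closed
    return guarded
```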
---

## Quick Start Checklist

```markdown
## Agent Governance Implementation Checklist

### Setup
- [ ] Define governance policy (allowed tools, blocked patterns, rate limits)
- [ ] Choose governance level (open/standard/strict/locked)
- [ ] Set up audit trail storage

### Implementation
- [ ] Add @govern decorator to all tool functions
- [ ] Add intent classification to user input processing
- [ ] Implement trust scoring for multi-agent interactions
- [ ] Wire up audit trail export

### Validation
- [ ] Test that blocked tools are properly denied
- [ ] Test that content filters catch sensitive patterns
- [ ] Test rate limiting behavior
- [ ] Verify audit trail captures all events
- [ ] Test policy composition (most-restrictive-wins)
```

---

## Related Resources

- [Agent-OS Governance Engine](https://github.com/imran-siddique/agent-os) — Full governance framework
- [AgentMesh Integrations](https://github.com/imran-siddique/agentmesh-integrations) — Framework-specific packages
- [OWASP Top 10 for LLM Applications](https://owasp.org/www-project-top-10-for-large-language-model-applications/)

@@ -232,3 +232,30 @@ tools:
      - github
      - copilot-customizations
      - validation

  - id: vscode-agent-manager
    name: VS Code Agent Manager
    description: >-
      VS Code Agent Manager is the essential toolkit for supercharging your GitHub Copilot experience.
      Effortlessly discover, install, and manage custom Copilot Agents directly within VS Code.
    category: VS Code Extensions
    featured: true
    requirements:
      - VS Code version 1.106.0 or higher
      - Internet connection to fetch repository data
    links:
      github: https://github.com/luizbon/vscode-agent-manager
      vscode: vscode:extension/luizbon.vscode-agent-manager
      vscode-insiders: vscode-insiders:extension/luizbon.vscode-agent-manager
      marketplace: https://marketplace.visualstudio.com/items?itemName=luizbon.vscode-agent-manager
    features:
      - "🔍 Auto-Discovery: Point the Agent Manager to any GitHub repository, and it will automatically index all available agents"
      - "📦 One-Click Installation: Install agents in seconds. Choose to install them globally (User Profile) for access in all projects, or locally (Workspace) for project-specific needs."
      - "🔄 Smart Updates & Version Control: Stay up to date. The extension tracks installed versions against the remote source, alerting you when updates are available. View changelogs and author details at a glance."
      - "🛡️ Git-Powered Reliability: Built on top of Git, ensuring that you always get the exact version of the agent you expect, with verified author and commit data."
    tags:
      - vscode
      - extension
      - copilot
      - agent
      - manager