mirror of
https://github.com/github/awesome-copilot.git
synced 2026-02-20 10:25:13 +00:00
Add duplicate resource detector agentic workflow
Weekly scheduled workflow that scans agents, prompts, instructions, and skills for potential duplicates based on name, description, and content similarity. Reports findings as a GitHub issue with task list checkboxes for review. Checks previous duplicate-review issues to exclude known accepted pairs.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
1045
.github/workflows/duplicate-resource-detector.lock.yml
generated
vendored
Normal file
File diff suppressed because it is too large
127
.github/workflows/duplicate-resource-detector.md
vendored
Normal file
@@ -0,0 +1,127 @@
---
description: Weekly scan of agents, prompts, instructions, and skills to identify potential duplicate resources and report them for review
on:
  schedule: weekly
permissions:
  contents: read
  issues: read
tools:
  github:
    toolsets: [repos, issues]
safe-outputs:
  create-issue:
    max: 1
    close-older-issues: true
    labels:
      - duplicate-review
  noop:
---

# Duplicate Resource Detector

You are an AI agent that audits the resources in this repository to find potential duplicates: resources that appear to serve the same or a very similar purpose.

## Your Task

Scan all resources in the following directories and identify groups of resources that may be duplicates or near-duplicates based on their **name**, **description**, and **content**:

- `agents/` (`.agent.md` files)
- `prompts/` (`.prompt.md` files)
- `instructions/` (`.instructions.md` files)
- `skills/` (folders; check `SKILL.md` inside each)

### Step 1: Gather Resource Metadata

For each resource, extract:

1. **File name** (the path)
2. **Front matter `description`** field
3. **Front matter `name`** field (if present)
4. **First ~20 lines of body content** (the markdown after the front matter)

Use bash to read files efficiently. For skills, read `skills/<name>/SKILL.md`.

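As a rough illustration of the extraction described in Step 1, the following Python sketch splits a resource into front matter and body. It assumes the front matter is delimited by `---` lines and that `name` and `description` are simple top-level `key: value` scalars; `ResourceMeta` and `parse_resource` are hypothetical names, not part of this repository.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ResourceMeta:
    path: str
    name: Optional[str]
    description: Optional[str]
    body_head: list


def parse_resource(path: str, text: str, head_lines: int = 20) -> ResourceMeta:
    """Extract `name`, `description`, and the first body lines of a resource."""
    lines = text.splitlines()
    fields = {}
    body_start = 0
    if lines and lines[0].strip() == "---":
        for i in range(1, len(lines)):
            if lines[i].strip() == "---":
                body_start = i + 1
                break
            key, sep, value = lines[i].partition(":")
            # Assumption: only unindented scalar keys matter for this sketch.
            if sep and not lines[i].startswith((" ", "\t")):
                fields[key.strip()] = value.strip().strip("'\"")
        else:
            # No closing delimiter found: treat the whole file as body.
            fields = {}
            body_start = 0
    return ResourceMeta(
        path=path,
        name=fields.get("name"),
        description=fields.get("description"),
        body_head=lines[body_start:body_start + head_lines],
    )
```

A real run would likely use a proper YAML parser; the hand-rolled parsing here only keeps the sketch dependency-free.
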
### Step 2: Identify Potential Duplicates

Compare resources and flag groups that look like potential duplicates. Treat resources as potential duplicates when they share **two or more** of the following signals:

- **Similar names**: file names or `name` fields that share key terms (e.g., `react-testing.prompt.md` and `react-unit-testing.prompt.md`)
- **Similar descriptions**: descriptions that describe the same task, technology, or domain with only minor wording differences
- **Overlapping scope**: resources that target the same language/framework/tool and the same activity (e.g., two separate "Python best practices" instructions)
- **Cross-type overlap**: an agent and a prompt (or an instruction and an agent) that cover the same topic so thoroughly that one may make the other redundant

Be pragmatic. Resources that cover related but distinct topics are NOT duplicates. For example:
- `react.instructions.md` (general React coding standards) and `react-testing.prompt.md` (React testing prompts) are **not** duplicates; they serve different purposes.
- `python-fastapi.instructions.md` and `python-flask.instructions.md` are **not** duplicates; they target different frameworks.
- By contrast, `code-review.agent.md` and `code-review.prompt.md` that both do the same style of code review **are** potential duplicates worth flagging.

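The "two or more signals" rule in Step 2 can be sketched in Python as below. This is only a lexical approximation: it scores the name and description signals with token-set Jaccard similarity, and leaves the scope and cross-type signals to judgment via a `judged_signals` count. The thresholds, the stopword list, and all function names are assumptions chosen for illustration.

```python
import re
from pathlib import PurePosixPath

STOPWORDS = {"a", "an", "and", "for", "in", "of", "the", "to", "with"}
NAME_THRESHOLD = 0.6         # hypothetical cut-off
DESCRIPTION_THRESHOLD = 0.6  # hypothetical cut-off


def tokens(text: str) -> set:
    """Lowercased alphanumeric tokens, minus a small stopword list."""
    return set(re.findall(r"[a-z0-9]+", text.lower())) - STOPWORDS


def jaccard(a: set, b: set) -> float:
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def name_stem(path: str) -> str:
    """`prompts/foo.prompt.md` -> `foo`; `skills/foo/SKILL.md` -> `foo`."""
    p = PurePosixPath(path)
    if p.name == "SKILL.md":
        return p.parent.name
    return p.name.split(".", 1)[0]


def lexical_signals(a: dict, b: dict) -> list:
    """Return the lexical signals shared by two resources."""
    found = []
    names = jaccard(tokens(name_stem(a["path"])), tokens(name_stem(b["path"])))
    if names >= NAME_THRESHOLD:
        found.append("similar names")
    descs = jaccard(tokens(a["description"]), tokens(b["description"]))
    if descs >= DESCRIPTION_THRESHOLD:
        found.append("similar descriptions")
    return found


def is_candidate(a: dict, b: dict, judged_signals: int = 0) -> bool:
    # judged_signals counts scope or cross-type overlap decided by review.
    return len(lexical_signals(a, b)) + judged_signals >= 2
```

With these thresholds, `react-testing` and `react-unit-testing` count as similar names (2 of 3 tokens shared), while `react` and `react-testing` do not (1 of 2).
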
### Step 3: Check for Known Accepted Duplicates

Before finalizing the report, search for **previous issues** labeled `duplicate-review` in this repository:

```
Search for issues with label "duplicate-review" that are closed
```

Read the comments and body of those past issues to find any pairs or groups that reviewers have explicitly marked as **"accepted"** or **"not duplicates"**. Look for phrases like:
- "accepted as-is"
- "not duplicates"
- "intentionally separate"
- "keep both"
- checked task list items (i.e., `- [x]`)

Exclude those known-accepted pairs from the current report. If you still include a group that appeared in a previous review (for example, because it was never marked as accepted), add a note: `(previously reviewed, see #<issue-number>)`.

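One way to recover accepted groups from a past report is sketched below in Python. It assumes the old issue body follows the Step 4 layout (a `#### Group N:` heading, a task list checkbox, then table rows whose first cell is a backticked path) and only handles the checked-checkbox case, not free-text phrases in comments. `accepted_groups` is a hypothetical helper name.

```python
import re

GROUP_HEADING = re.compile(r"^#### Group \d+:.*$", re.MULTILINE)
CHECKED_BOX = re.compile(r"^- \[[xX]\]", re.MULTILINE)
RESOURCE_CELL = re.compile(r"^\|\s*`([^`]+)`", re.MULTILINE)


def accepted_groups(issue_body: str) -> set:
    """Return each checked group as a frozenset of resource paths."""
    accepted = set()
    # The first split piece is the preamble before Group 1, so skip it.
    for section in GROUP_HEADING.split(issue_body)[1:]:
        if CHECKED_BOX.search(section):
            paths = RESOURCE_CELL.findall(section)
            if len(paths) >= 2:
                accepted.add(frozenset(paths))
    return accepted
```

Storing groups as frozensets makes the later exclusion check independent of the order in which a pair was listed.
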
### Step 4: Produce the Report

Create an issue titled: `🔍 Duplicate Resource Review`

Format the body as follows:

```markdown
### Summary

- **Potential duplicate groups found:** N
- **Resources involved:** M
- **Known accepted (excluded):** K pairs from previous reviews

### How to Use This Report

Review each group below. If the resources are intentionally separate, check the box to mark them as accepted. These will be excluded from future reports.

### Potential Duplicates

#### Group 1: <Short description of what they share>

- [ ] Reviewed: these are intentionally separate

| Resource | Type | Description |
|----------|------|-------------|
| `agents/foo.agent.md` | Agent | Does X for Y |
| `prompts/foo.prompt.md` | Prompt | Also does X for Y |

**Why flagged:** <Brief explanation of the similarity>

---

#### Group 2: ...

<repeat for each group>
```

Use `<details>` blocks to collapse groups if there are more than 10.

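For illustration, the report layout above could be assembled with a small Python sketch like the one below. The input shape (a dict with `title`, `rows`, and `why`) and the function names are hypothetical, the "How to Use This Report" section is omitted for brevity, and the sketch assumes that once there are more than 10 groups every group is wrapped in `<details>`.

```python
def render_group(index: int, group: dict) -> str:
    """Render one group: heading, checkbox, table, and reason."""
    rows = "\n".join(
        f"| `{path}` | {kind} | {desc} |" for path, kind, desc in group["rows"]
    )
    return (
        f"#### Group {index}: {group['title']}\n\n"
        "- [ ] Reviewed: these are intentionally separate\n\n"
        "| Resource | Type | Description |\n"
        "|----------|------|-------------|\n"
        f"{rows}\n\n"
        f"**Why flagged:** {group['why']}\n"
    )


def render_report(groups: list, excluded_pairs: int, collapse_after: int = 10) -> str:
    """Build the issue body; collapse groups when there are many."""
    resources = {path for g in groups for path, _, _ in g["rows"]}
    parts = [
        "### Summary\n",
        f"- **Potential duplicate groups found:** {len(groups)}",
        f"- **Resources involved:** {len(resources)}",
        f"- **Known accepted (excluded):** {excluded_pairs} pairs from previous reviews\n",
        "### Potential Duplicates\n",
    ]
    collapse = len(groups) > collapse_after
    for i, group in enumerate(groups, start=1):
        block = render_group(i, group)
        if collapse:
            block = f"<details>\n<summary>Group {i}</summary>\n\n{block}\n</details>\n"
        parts.append(block)
    return "\n".join(parts)
```
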
### Safe Output Guidance

- If you find potential duplicates: use `create-issue` to file the report.
- If **no** potential duplicates are found (after excluding known accepted ones): call `noop` with the message: "No potential duplicate resources detected. All resources appear to serve distinct purposes."

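The branch between the two safe outputs amounts to a single check, sketched here in Python. The returned dict is a hypothetical shape used only to show the decision; it is not the actual payload format of `create-issue` or `noop`.

```python
ISSUE_TITLE = "🔍 Duplicate Resource Review"
NOOP_MESSAGE = (
    "No potential duplicate resources detected. "
    "All resources appear to serve distinct purposes."
)


def choose_safe_output(groups: list, report_body: str) -> dict:
    """Pick `create-issue` when groups remain after exclusion, else `noop`."""
    if groups:
        return {"type": "create-issue", "title": ISSUE_TITLE, "body": report_body}
    return {"type": "noop", "message": NOOP_MESSAGE}
```
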
## Guidelines

- Be conservative: only flag resources where there is a genuine risk of redundancy.
- Group related duplicates together (don't list the same pair twice in separate groups).
- Sort groups by confidence (strongest duplicate signals first).
- Include cross-type duplicates (e.g., an agent and a prompt doing the same thing).
- Limit the report to the top 20 most likely duplicate groups to keep it actionable.
- For skills, use the folder name and description from `SKILL.md`.
- Process resources in batches to stay within time limits: prioritize name and description comparison, then spot-check content for top candidates.

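The grouping, sorting, and top-20 guidelines can be sketched in Python as follows. The sketch assumes each flagged pair carries a numeric confidence score (a hypothetical input), merges pairs that share a resource using a small union-find, scores each group by its strongest pair, and keeps the highest-scoring groups.

```python
def group_pairs(pairs: list, limit: int = 20) -> list:
    """Merge overlapping pairs into groups, sorted by best score, capped at `limit`.

    `pairs` is a list of (resource_a, resource_b, score) tuples.
    Returns a list of (score, sorted_member_paths) tuples.
    """
    parent = {}

    def find(x: str) -> str:
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Union every pair so resources linked through a chain share one root.
    for a, b, _ in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    members = {}
    best = {}
    for a, b, score in pairs:
        root = find(a)
        members.setdefault(root, set()).update((a, b))
        best[root] = max(best.get(root, 0.0), score)

    groups = [(best[root], sorted(members[root])) for root in members]
    # Strongest signal first; member list breaks ties deterministically.
    groups.sort(key=lambda g: (-g[0], g[1]))
    return groups[:limit]
```

Because pairs that share a resource collapse into one group, no pair can appear in two separate groups of the report.
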