mirror of
https://github.com/github/awesome-copilot.git
synced 2026-06-16 20:51:26 +00:00
chore: publish from staged
This commit is contained in:
@@ -0,0 +1,227 @@
|
||||
---
|
||||
name: harness-engineering
|
||||
description: 'Adopt repository-level harness engineering for coding agents. Use when a user wants to prevent repeated AI coding-agent mistakes by turning failures into durable instructions, drift checks, regression tests, failure memory, and adoption reports tailored to the target repository.'
|
||||
---
|
||||
|
||||
# Harness Engineering
|
||||
|
||||
Harness engineering turns repeated coding-agent mistakes into durable
|
||||
repository artifacts:
|
||||
|
||||
```text
|
||||
Harness = Instructions + Constraints + Feedback + Memory + Evaluation + Governance
|
||||
```
|
||||
|
||||
Use this skill when the user asks to:
|
||||
|
||||
- make a repository more reliable for GitHub Copilot or other coding agents
|
||||
- add durable agent instructions, repository rules, or guardrails
|
||||
- prevent repeated AI coding-agent mistakes
|
||||
- record known failure paths and the checks that prevent recurrence
|
||||
- add lightweight drift checks for project rules
|
||||
- review, refresh, or update an existing agent harness
|
||||
|
||||
Do not use this skill for ordinary feature implementation unless the user asks
|
||||
to improve the repository's agent operating environment.
|
||||
|
||||
## Core Principles
|
||||
|
||||
- Treat the target repository as the source of truth.
|
||||
- Inspect before editing. Preserve the existing stack, package manager, CI,
|
||||
docs, naming, and architecture.
|
||||
- Add the smallest useful harness. Prefer updating existing files over adding
|
||||
duplicate guidance.
|
||||
- Make important rules enforceable where practical through tests, linters,
|
||||
type checks, CI, pre-commit hooks, or drift scripts.
|
||||
- Use manual review points only when automation would be brittle or misleading.
|
||||
- Record high-risk failures that should not recur, and name the check or review
|
||||
point that catches recurrence.
|
||||
- Do not copy generic templates blindly. Adapt every artifact to real evidence
|
||||
in the target repository.
|
||||
|
||||
## Discovery
|
||||
|
||||
Before proposing or making harness changes, inspect the repository for existing
|
||||
rules and evidence.
|
||||
|
||||
Read these files and folders when they exist:
|
||||
|
||||
- `README.md`
|
||||
- `AGENTS.md`
|
||||
- `.github/copilot-instructions.md`
|
||||
- `.github/instructions/`
|
||||
- `.github/workflows/`
|
||||
- `CONTRIBUTING.md`
|
||||
- package manifests such as `package.json`, `pyproject.toml`, `go.mod`,
|
||||
`Cargo.toml`, `pom.xml`, or `build.gradle`
|
||||
- existing docs under `docs/`
|
||||
- existing scripts under `scripts/`
|
||||
- existing tests and CI checks
|
||||
|
||||
Then summarize:
|
||||
|
||||
- stack, package manager, and entry points
|
||||
- existing development and verification commands
|
||||
- current agent instructions or repository conventions
|
||||
- known failures, incidents, flaky paths, or repeated review comments
|
||||
- gaps where project rules are not enforced
|
||||
|
||||
## Adoption Workflow
|
||||
|
||||
Follow this sequence:
|
||||
|
||||
1. Choose the harness surface that fits the target repository.
|
||||
2. Write target-specific agent instructions.
|
||||
3. Add enforceable checks for high-value rules.
|
||||
4. Record failure memory for high-risk or recurring failures.
|
||||
5. Add drift checks for guidance that can silently become stale.
|
||||
6. Report the adoption with evidence, assumptions, and follow-up.
|
||||
|
||||
### 1. Choose the Harness Surface
|
||||
|
||||
Pick only the surfaces that fit the target repository:
|
||||
|
||||
| Need | Preferred artifact |
|
||||
| --- | --- |
|
||||
| Always-on agent behavior | `AGENTS.md` or `.github/copilot-instructions.md` |
|
||||
| File-scoped guidance | `.github/instructions/*.instructions.md` |
|
||||
| Recurring project checks | `scripts/check_*.py`, shell scripts, or package scripts |
|
||||
| CI enforcement | existing workflow files or a small new workflow |
|
||||
| Known failures | `docs/failures/*.md` |
|
||||
| Architecture or process decisions | `docs/decisions/*.md` |
|
||||
| Adoption evidence | `docs/harness/adoption-report.md` or similar |
|
||||
|
||||
If the repository already has an equivalent location, update it instead of
|
||||
creating a parallel system.
|
||||
|
||||
### 2. Write Agent Instructions
|
||||
|
||||
Agent instructions should be concrete and operational. Include:
|
||||
|
||||
- project purpose and major ownership boundaries
|
||||
- setup, test, lint, build, and verification commands
|
||||
- package manager and dependency rules
|
||||
- safe editing rules, generated file rules, and forbidden paths
|
||||
- testing expectations for changed code
|
||||
- PR and commit conventions if the repo has them
|
||||
- how to record new failures or decisions
|
||||
|
||||
Avoid broad personality guidance, generic best practices, and rules that cannot
|
||||
be checked or reviewed.
|
||||
|
||||
### 3. Add Enforceable Checks
|
||||
|
||||
Convert high-value rules into checks. Good harness checks are:
|
||||
|
||||
- narrow enough to avoid false positives
|
||||
- fast enough to run locally and in CI
|
||||
- named clearly so agents can run them before finishing
|
||||
- documented with the rule they protect
|
||||
|
||||
Examples:
|
||||
|
||||
```text
|
||||
Rule: Do not edit generated API clients.
|
||||
Check: script scans diffs for generated paths and fails with a clear message.
|
||||
|
||||
Rule: Every failure memory note names a regression check.
|
||||
Check: script validates docs/failures/*.md for a "Detection" section.
|
||||
|
||||
Rule: Profile docs and templates must stay aligned.
|
||||
Check: test compares profile README files to expected template files.
|
||||
```
|
||||
|
||||
### 4. Record Failure Memory
|
||||
|
||||
Record failures when they are user-visible, high-risk, or likely to recur.
|
||||
Use a new file under `docs/failures/` unless an existing note already covers
|
||||
the same root cause.
|
||||
|
||||
Recommended structure:
|
||||
|
||||
```markdown
|
||||
# Short Failure Title
|
||||
|
||||
## Summary
|
||||
|
||||
What failed, who saw it, and why it matters.
|
||||
|
||||
## Root Cause
|
||||
|
||||
The technical or process cause. Avoid blame.
|
||||
|
||||
## Prevention
|
||||
|
||||
Instruction, test, drift check, CI gate, fixture, or manual review point that
|
||||
prevents or detects recurrence.
|
||||
|
||||
## Evidence
|
||||
|
||||
Links to issue, PR, test, log, command output, or file paths.
|
||||
```
|
||||
|
||||
If no automated check is practical, record the manual review point and why
|
||||
automation would be unsafe or misleading.
|
||||
|
||||
### 5. Add Drift Checks
|
||||
|
||||
Use drift checks for guidance that can silently become stale. Common examples:
|
||||
|
||||
- docs mention commands that no longer exist
|
||||
- profile snippets and generated examples diverge
|
||||
- failure notes omit regression checks
|
||||
- decision records are missing for structural changes
|
||||
- CI references stale scripts or package commands
|
||||
|
||||
Prefer small scripts using the repository's existing language. If the repo has
|
||||
no scripting convention, Python with only the standard library is a portable
|
||||
default.
|
||||
|
||||
### 6. Report the Adoption
|
||||
|
||||
Finish substantial harness work with an adoption report that includes:
|
||||
|
||||
- files changed
|
||||
- rules added or updated
|
||||
- checks added or reused
|
||||
- commands run and results
|
||||
- assumptions and manual follow-up
|
||||
- failure memory created or intentionally skipped
|
||||
- how effectiveness will be measured
|
||||
|
||||
## Review Workflow
|
||||
|
||||
When asked to review a harness change, take an opposing perspective. Look for:
|
||||
|
||||
- generic rules copied without evidence from the target repository
|
||||
- duplicate or conflicting instruction files
|
||||
- broad checks that are likely to fail on valid changes
|
||||
- unenforced high-risk rules
|
||||
- missing failure memory for repeated mistakes or runtime failures
|
||||
- generated docs not refreshed after source changes
|
||||
- CI gates that do not run the relevant checks
|
||||
- target repository conventions being overwritten by harness defaults
|
||||
|
||||
Report findings first, ordered by severity, with file and line references when
|
||||
available. Do not modify files during a review unless the user explicitly asks
|
||||
for fixes.
|
||||
|
||||
## Output Contract
|
||||
|
||||
Before finishing harness adoption work, verify:
|
||||
|
||||
- the target repository was inspected before edits
|
||||
- new guidance is specific to the target repository
|
||||
- changed checks can be run locally or have a documented manual substitute
|
||||
- failure memory was recorded when required, or the final response explains why
|
||||
it was skipped
|
||||
- generated docs or indexes are refreshed
|
||||
- the final report names every command run and its result
|
||||
|
||||
## Optional Reference
|
||||
|
||||
The prompt-first workflow in
|
||||
`https://github.com/baskduf/harness-starter-kit` is a reference implementation
|
||||
of these ideas. Use it as reference material only when the user asks for it or
|
||||
when the repository already includes it. The target repository remains the
|
||||
source of truth.
|
||||
Reference in New Issue
Block a user