Add concise DevOps resources (agents, instructions, prompt) (#1) (#513)

* Initial plan
* Add DevOps resources: agents, instructions, and prompt
* Replace redundant GitHub Actions instructions with expert agent
* Make DevOps resources more generic for easier maintenance
* Remove optional model field to align with repository conventions
* Reduce code examples to focus on principles and guidance
* Add DevOps Expert agent following infinity loop principle

---------

Co-authored-by: Copilot <198982749+Copilot@users.noreply.github.com>
Co-authored-by: benjisho-aidome <218995725+benjisho-aidome@users.noreply.github.com>
Co-authored-by: Matt Soucoup <masoucou@microsoft.com>
2026-02-20 02:15:12 +00:00 · 2026-01-09 18:41:01 +02:00
parent e496ef1b9b
commit 57473945b0
9 changed files with 921 additions and 0 deletions
--- a/instructions/kubernetes-manifests.instructions.md
+++ b/instructions/kubernetes-manifests.instructions.md
@@ -0,0 +1,136 @@
+---
+applyTo: 'k8s/**/*.yaml,k8s/**/*.yml,manifests/**/*.yaml,manifests/**/*.yml,deploy/**/*.yaml,deploy/**/*.yml,charts/**/templates/**/*.yaml,charts/**/templates/**/*.yml'
+description: 'Best practices for Kubernetes YAML manifests including labeling conventions, security contexts, pod security, resource management, probes, and validation commands'
+---
+
+# Kubernetes Manifests Instructions
+
+## Your Mission
+
+Create production-ready Kubernetes manifests that prioritize security, reliability, and operational excellence with consistent labeling, proper resource management, and comprehensive health checks.
+
+## Labeling Conventions
+
+**Required Labels** (Kubernetes recommended):
+- `app.kubernetes.io/name`: Application name
+- `app.kubernetes.io/instance`: Instance identifier
+- `app.kubernetes.io/version`: Version
+- `app.kubernetes.io/component`: Component role
+- `app.kubernetes.io/part-of`: Application group
+- `app.kubernetes.io/managed-by`: Management tool
+
+**Additional Labels**:
+- `environment`: Environment name
+- `team`: Owning team
+- `cost-center`: For billing
+
+**Useful Annotations**:
+- Documentation and ownership
+- Monitoring: `prometheus.io/scrape`, `prometheus.io/port`, `prometheus.io/path`
+- Change tracking: git commit, deployment date
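+
+As a rough sketch, the labels and annotations above might sit together in a manifest's `metadata` block like this. Every value here (the `checkout` names, team, cost center, port, and commit hash) is a hypothetical placeholder, and the `example.com/git-commit` key is only an illustrative convention, not a Kubernetes standard. The `prometheus.io/*` annotations take effect only if the cluster's Prometheus scrape configuration is set up to honor them.
+
+```yaml
+metadata:
+  name: checkout-api                        # hypothetical workload name
+  labels:
+    app.kubernetes.io/name: checkout
+    app.kubernetes.io/instance: checkout-prod
+    app.kubernetes.io/version: "1.4.2"      # quoted so YAML keeps it a string
+    app.kubernetes.io/component: api
+    app.kubernetes.io/part-of: storefront
+    app.kubernetes.io/managed-by: helm
+    environment: production
+    team: payments
+    cost-center: cc-1234
+  annotations:
+    prometheus.io/scrape: "true"            # annotation values must be strings
+    prometheus.io/port: "8080"
+    prometheus.io/path: /metrics
+    example.com/git-commit: "0123abc"       # illustrative change-tracking key
+```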
+
+## SecurityContext Defaults
+
+**Pod-level**:
+- `runAsNonRoot: true`
+- `runAsUser` and `runAsGroup`: Specific IDs
+- `fsGroup`: File system group
+- `seccompProfile.type: RuntimeDefault`
+
+**Container-level**:
+- `allowPrivilegeEscalation: false`
+- `readOnlyRootFilesystem: true` (with tmpfs mounts for writable dirs)
+- `capabilities.drop: [ALL]` (add only what's needed)
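+
+A minimal sketch of how these defaults could be combined in a pod spec. The UID/GID `10001`, the container name, the image reference, and the `/tmp` mount are assumptions chosen for illustration; use whatever IDs and writable paths the image actually needs.
+
+```yaml
+spec:
+  securityContext:                 # pod-level defaults
+    runAsNonRoot: true
+    runAsUser: 10001               # assumed non-root UID
+    runAsGroup: 10001
+    fsGroup: 10001
+    seccompProfile:
+      type: RuntimeDefault
+  containers:
+    - name: app                    # hypothetical container
+      image: registry.example.com/app:1.4.2
+      securityContext:             # container-level hardening
+        allowPrivilegeEscalation: false
+        readOnlyRootFilesystem: true
+        capabilities:
+          drop: ["ALL"]
+      volumeMounts:
+        - name: tmp
+          mountPath: /tmp          # writable scratch space
+  volumes:
+    - name: tmp
+      emptyDir:
+        medium: Memory             # tmpfs-backed emptyDir
+```
+
+Note that a memory-backed `emptyDir` counts against the container's memory limit, so size the limit accordingly.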
+
+## Pod Security Standards
+
+Use Pod Security Admission:
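+- **Restricted** (recommended for production): Enforces current pod hardening best practices
+- **Baseline**: Minimally restrictive; blocks known privilege escalations
+- Apply per namespace with `pod-security.kubernetes.io/<mode>` labels (`enforce`, `audit`, `warn`)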
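+
+For illustration, a namespace could opt in to the Restricted level through the Pod Security Admission labels as sketched below. The namespace name `production` is a placeholder. Starting with only `audit` and `warn` before adding `enforce` is one way to find violations without rejecting pods.
+
+```yaml
+apiVersion: v1
+kind: Namespace
+metadata:
+  name: production                                   # hypothetical namespace
+  labels:
+    pod-security.kubernetes.io/enforce: restricted   # reject violating pods
+    pod-security.kubernetes.io/audit: restricted     # record violations in audit logs
+    pod-security.kubernetes.io/warn: restricted      # warn the client on apply
+```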
+
+## Resource Requests and Limits
+
+**Always define**:
+- Requests: Guaranteed minimum (scheduling)
+- Limits: Maximum allowed (prevents exhaustion)
+
+**QoS Classes**:
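+- **Guaranteed**: Every container sets CPU and memory requests equal to limits (best for critical apps)
+- **Burstable**: At least one request or limit is set, but the pod does not qualify as Guaranteed (flexible resource use)
+- **BestEffort**: No requests or limits defined (avoid in production)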
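+
+A small sketch of a container `resources` block, with made-up numbers that should be replaced by values measured for the real workload. Because the CPU limit is higher than the CPU request, a pod using this block would be classed as Burstable; setting `limits.cpu` to `250m` as well would make it Guaranteed, provided every container in the pod does the same.
+
+```yaml
+resources:
+  requests:
+    cpu: 250m          # used by the scheduler to place the pod
+    memory: 256Mi
+  limits:
+    cpu: 500m          # CPU above this is throttled
+    memory: 256Mi      # exceeding this gets the container OOM-killed
+```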
+
+## Health Probes
+
+- **Liveness**: Restart unhealthy containers
+- **Readiness**: Control traffic routing
+- **Startup**: Protect slow-starting applications
+
+Configure appropriate delays, periods, timeouts, and thresholds for each.
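+
+One possible shape for the three probes on a container is sketched below. The `/healthz` and `/ready` paths, port `8080`, and all timing values are assumptions; tune them to the application's real endpoints and startup behavior. Liveness and readiness checks do not begin until the startup probe has succeeded.
+
+```yaml
+startupProbe:
+  httpGet:
+    path: /healthz         # hypothetical endpoint
+    port: 8080
+  periodSeconds: 5
+  failureThreshold: 30     # up to 30 x 5s = 150s to finish starting
+livenessProbe:
+  httpGet:
+    path: /healthz
+    port: 8080
+  periodSeconds: 10
+  timeoutSeconds: 2
+  failureThreshold: 3      # restart after 3 consecutive failures
+readinessProbe:
+  httpGet:
+    path: /ready           # hypothetical endpoint
+    port: 8080
+  periodSeconds: 5
+  timeoutSeconds: 2
+  failureThreshold: 3      # removed from Service endpoints after 3 failures
+```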
+
+## Rollout Strategies
+
+**Deployment Strategy**:
+- `RollingUpdate` with `maxSurge` and `maxUnavailable`
+- Set `maxUnavailable: 0` with `maxSurge` of at least 1 for zero-downtime rollouts
+
+**High Availability**:
+- Minimum 2-3 replicas
+- Pod Disruption Budget (PDB)
+- Anti-affinity rules (spread across nodes/zones)
+- Horizontal Pod Autoscaler (HPA) for variable load
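+
+The sketch below shows how the rollout and availability settings might fit together in a Deployment with a matching Pod Disruption Budget. The `checkout` names, image, and replica counts are placeholders. `maxSurge` is set to 1 because `maxSurge` and `maxUnavailable` cannot both be zero. The anti-affinity rule here is the soft (`preferred`) form, so scheduling still succeeds when there are fewer nodes than replicas.
+
+```yaml
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+  name: checkout-api                 # hypothetical name
+spec:
+  replicas: 3
+  strategy:
+    type: RollingUpdate
+    rollingUpdate:
+      maxSurge: 1                    # one extra pod during a rollout
+      maxUnavailable: 0              # never drop below the desired count
+  selector:
+    matchLabels:
+      app.kubernetes.io/name: checkout
+  template:
+    metadata:
+      labels:
+        app.kubernetes.io/name: checkout
+    spec:
+      affinity:
+        podAntiAffinity:
+          preferredDuringSchedulingIgnoredDuringExecution:
+            - weight: 100
+              podAffinityTerm:
+                topologyKey: kubernetes.io/hostname   # spread across nodes
+                labelSelector:
+                  matchLabels:
+                    app.kubernetes.io/name: checkout
+      containers:
+        - name: app
+          image: registry.example.com/checkout:1.4.2
+---
+apiVersion: policy/v1
+kind: PodDisruptionBudget
+metadata:
+  name: checkout-api
+spec:
+  minAvailable: 2                    # keep at least 2 pods during voluntary disruptions
+  selector:
+    matchLabels:
+      app.kubernetes.io/name: checkout
+```
+
+Swapping `topologyKey` to `topology.kubernetes.io/zone` would spread replicas across zones instead of nodes.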
+
+## Validation Commands
+
+**Pre-deployment**:
+- `kubectl apply --dry-run=client -f manifest.yaml`
+- `kubectl apply --dry-run=server -f manifest.yaml`
+- `kubeconform -strict manifest.yaml` (schema validation)
+- `helm template ./chart | kubeconform -strict` (for Helm)
+
+**Policy Validation**:
+- OPA Conftest, Kyverno, or Datree
+
+## Rollout & Rollback
+
+**Deploy**:
+- `kubectl apply -f manifest.yaml`
+- `kubectl rollout status deployment/NAME`
+
+**Rollback**:
+- `kubectl rollout undo deployment/NAME`
+- `kubectl rollout undo deployment/NAME --to-revision=N`
+- `kubectl rollout history deployment/NAME`
+
+**Restart**:
+- `kubectl rollout restart deployment/NAME`
+
+## Manifest Checklist
+
+- [ ] Labels: Standard labels applied
+- [ ] Annotations: Documentation and monitoring
+- [ ] Security: runAsNonRoot, readOnlyRootFilesystem, dropped capabilities
+- [ ] Resources: Requests and limits defined
+- [ ] Probes: Liveness, readiness, startup configured
+- [ ] Images: Specific tags (never :latest)
+- [ ] Replicas: Minimum 2-3 for production
+- [ ] Strategy: RollingUpdate with appropriate surge/unavailable
+- [ ] PDB: Defined for production
+- [ ] Anti-affinity: Configured for HA
+- [ ] Graceful shutdown: terminationGracePeriodSeconds set
+- [ ] Validation: Dry-run and kubeconform passed
+- [ ] Secrets: Stored in Secret resources, not ConfigMaps
+- [ ] NetworkPolicy: Least-privilege access (if applicable)
+
+## Best Practices Summary
+
+1. Use standard labels and annotations
+2. Always run as non-root with dropped capabilities
+3. Define resource requests and limits
+4. Implement all three probe types
+5. Pin image tags to specific versions
+6. Configure anti-affinity for HA
+7. Set Pod Disruption Budgets
+8. Use rolling updates with zero unavailability
+9. Validate manifests before applying
+10. Enable read-only root filesystem when possible