mirror of https://github.com/github/awesome-copilot.git synced 2026-02-20 02:15:12 +00:00

benjisho-aidome 57473945b0 Add concise DevOps resources (agents, instructions, prompt) (#1) (#513)

* Initial plan

* Add DevOps resources: agents, instructions, and prompt



* Replace redundant GitHub Actions instructions with expert agent



* Make DevOps resources more generic for easier maintenance



* Remove optional model field to align with repository conventions



* Reduce code examples to focus on principles and guidance



* Add DevOps Expert agent following infinity loop principle



---------

Co-authored-by: Copilot <198982749+Copilot@users.noreply.github.com>
Co-authored-by: benjisho-aidome <218995725+benjisho-aidome@users.noreply.github.com>
Co-authored-by: Matt Soucoup <masoucou@microsoft.com>

2026-01-09 08:41:01 -08:00

4.4 KiB

applyTo	description
k8s/**/*.yaml,k8s/**/*.yml,manifests/**/*.yaml,manifests/**/*.yml,deploy/**/*.yaml,deploy/**/*.yml,charts/**/templates/**/*.yaml,charts/**/templates/**/*.yml	Best practices for Kubernetes YAML manifests including labeling conventions, security contexts, pod security, resource management, probes, and validation commands

Kubernetes Manifests Instructions

Your Mission

Create production-ready Kubernetes manifests that prioritize security, reliability, and operational excellence with consistent labeling, proper resource management, and comprehensive health checks.

Labeling Conventions

Required Labels (Kubernetes recommended):

app.kubernetes.io/name: Application name
app.kubernetes.io/instance: Instance identifier
app.kubernetes.io/version: Version
app.kubernetes.io/component: Component role
app.kubernetes.io/part-of: Application group
app.kubernetes.io/managed-by: Management tool

Additional Labels:

environment: Environment name
team: Owning team
cost-center: For billing

Useful Annotations:

Documentation and ownership
Monitoring: prometheus.io/scrape, prometheus.io/port, prometheus.io/path
Change tracking: git commit, deployment date
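
As a rough illustration of the labels and annotations listed above, here is a minimal Deployment sketch. All names, values, the image reference, and the `example.com/...` annotation keys are hypothetical placeholders. The `prometheus.io/*` annotations are a common convention that only takes effect if the Prometheus scrape configuration is set up to honor them.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web                                  # hypothetical name
  labels:
    app.kubernetes.io/name: web
    app.kubernetes.io/instance: web-prod
    app.kubernetes.io/version: "1.4.2"       # quoted so YAML keeps it a string
    app.kubernetes.io/component: frontend
    app.kubernetes.io/part-of: storefront
    app.kubernetes.io/managed-by: helm
    environment: production
    team: payments
    cost-center: cc-1234
  annotations:
    example.com/owner: "payments-team"       # hypothetical ownership annotation
    example.com/git-commit: "abc1234"        # hypothetical change-tracking annotation
spec:
  replicas: 3
  selector:
    matchLabels:                             # keep selectors small and stable
      app.kubernetes.io/name: web
      app.kubernetes.io/instance: web-prod
  template:
    metadata:
      labels:
        app.kubernetes.io/name: web
        app.kubernetes.io/instance: web-prod
        app.kubernetes.io/version: "1.4.2"
      annotations:
        prometheus.io/scrape: "true"         # annotation values must be strings
        prometheus.io/port: "8080"
        prometheus.io/path: "/metrics"
    spec:
      containers:
        - name: web
          image: registry.example.com/web:1.4.2   # hypothetical image
```

The selector deliberately omits the version label, since a Deployment's selector is immutable and the version changes with every release.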

SecurityContext Defaults

Pod-level:

runAsNonRoot: true
runAsUser and runAsGroup: Specific IDs
fsGroup: File system group
seccompProfile.type: RuntimeDefault

Container-level:

allowPrivilegeEscalation: false
readOnlyRootFilesystem: true (with tmpfs mounts for writable dirs)
capabilities.drop: [ALL] (add only what's needed)
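
The pod-level and container-level defaults above might be combined as in the following sketch. The UID/GID values, names, and image are assumptions chosen for illustration; the writable `/tmp` is provided by a memory-backed `emptyDir`, which is one way to get a tmpfs mount.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: hardened-app                         # hypothetical name
spec:
  securityContext:                           # pod-level settings
    runAsNonRoot: true
    runAsUser: 10001                         # assumed non-root UID
    runAsGroup: 10001
    fsGroup: 10001
    seccompProfile:
      type: RuntimeDefault
  containers:
    - name: app
      image: registry.example.com/app:1.4.2  # hypothetical image
      securityContext:                       # container-level settings
        allowPrivilegeEscalation: false
        readOnlyRootFilesystem: true
        capabilities:
          drop: ["ALL"]                      # add back only what is needed
      volumeMounts:
        - name: tmp
          mountPath: /tmp                    # writable scratch space
  volumes:
    - name: tmp
      emptyDir:
        medium: Memory                       # tmpfs; usage counts toward memory limits
```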

Pod Security Standards

Use Pod Security Admission:

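Restricted (recommended for production): Enforces current pod hardening best practices
Baseline: Blocks known privilege escalations while allowing default pod configurations
Apply at the namespace level with pod-security.kubernetes.io labels (enforce, audit, warn)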
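
A namespace opting into Pod Security Admission could look like the sketch below. The namespace name is hypothetical; the `pod-security.kubernetes.io/*` labels are the ones read by the built-in admission controller.

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: payments-prod                        # hypothetical namespace
  labels:
    # Reject pods that violate the Restricted standard
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/enforce-version: latest
    # Also surface violations as client warnings and audit log entries
    pod-security.kubernetes.io/warn: restricted
    pod-security.kubernetes.io/audit: restricted
```

One possible migration path is to start with only `warn` and `audit` set to `restricted`, then add `enforce` once existing workloads pass.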

Resource Requests and Limits

Always define:

Requests: Guaranteed minimum (scheduling)
Limits: Maximum allowed (prevents exhaustion)

QoS Classes:

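Guaranteed: every container sets CPU and memory requests equal to its limits (best for critical apps)
Burstable: at least one container sets a request or limit, but the Pod does not meet the Guaranteed criteria (flexible resource use)
BestEffort: no requests or limits on any container (avoid in production)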
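
To make the QoS classes concrete, the sketch below shows one Pod that would be classified as Guaranteed and one as Burstable. Names, image, and the specific quantities are illustrative assumptions, not sizing recommendations.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: qos-guaranteed                       # hypothetical name
spec:
  containers:
    - name: app
      image: registry.example.com/app:1.4.2  # hypothetical image
      resources:
        requests:
          cpu: 500m
          memory: 256Mi
        limits:                              # equal to requests -> Guaranteed
          cpu: 500m
          memory: 256Mi
---
apiVersion: v1
kind: Pod
metadata:
  name: qos-burstable                        # hypothetical name
spec:
  containers:
    - name: app
      image: registry.example.com/app:1.4.2
      resources:
        requests:                            # used by the scheduler
          cpu: 250m
          memory: 128Mi
        limits:                              # higher than requests -> Burstable
          cpu: "1"
          memory: 512Mi
```

The assigned class can be checked after creation in the Pod's `status.qosClass` field.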

Health Probes

Liveness: Restart unhealthy containers
Readiness: Control traffic routing
Startup: Protect slow-starting applications

Tune initialDelaySeconds, periodSeconds, timeoutSeconds, and failureThreshold for each probe based on the application's measured startup and response times.
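
The three probe types might be wired up as follows. The endpoint paths, port, and timing values are assumptions for illustration and should be adjusted to the application's actual behavior.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: probed-app                           # hypothetical name
spec:
  containers:
    - name: app
      image: registry.example.com/app:1.4.2  # hypothetical image
      ports:
        - name: http
          containerPort: 8080
      startupProbe:                          # other probes wait until this succeeds
        httpGet:
          path: /healthz                     # assumed endpoint
          port: http
        periodSeconds: 10
        failureThreshold: 30                 # allows up to 30 * 10s = 300s to start
      livenessProbe:                         # failure restarts the container
        httpGet:
          path: /healthz
          port: http
        periodSeconds: 10
        timeoutSeconds: 2
        failureThreshold: 3
      readinessProbe:                        # failure removes the pod from Service endpoints
        httpGet:
          path: /ready                       # assumed endpoint
          port: http
        periodSeconds: 5
        timeoutSeconds: 2
        failureThreshold: 3
```

Keeping the liveness check cheap and free of downstream dependencies helps avoid restart loops when a dependency is down; dependency checks usually fit better in the readiness probe.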

Rollout Strategies

Deployment Strategy:

RollingUpdate with maxSurge and maxUnavailable
Set maxUnavailable: 0 together with maxSurge of at least 1 for zero-downtime rollouts (the two cannot both be 0)
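
A rolling update configured along these lines could look like the sketch below; the names, image, and replica count are hypothetical.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web                                  # hypothetical name
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1                            # at most one extra pod during the rollout
      maxUnavailable: 0                      # never drop below the desired replica count
  selector:
    matchLabels:
      app.kubernetes.io/name: web
  template:
    metadata:
      labels:
        app.kubernetes.io/name: web
    spec:
      containers:
        - name: web
          image: registry.example.com/web:1.4.2   # hypothetical image
```

With these values, each new pod must become Ready before an old one is removed, so the setting is only as reliable as the readiness probe behind it.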

High Availability:

Minimum 2-3 replicas
Pod Disruption Budget (PDB)
Anti-affinity rules (spread across nodes/zones)
Horizontal Pod Autoscaler (HPA) for variable load
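
The high-availability items above might fit together as in this sketch: a Deployment with soft pod anti-affinity, a Pod Disruption Budget, and a CPU-based HPA. All names, thresholds, and the image are illustrative assumptions.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web                                  # hypothetical name
spec:
  # replicas is omitted on purpose so applying the manifest does not fight the HPA
  selector:
    matchLabels:
      app.kubernetes.io/name: web
  template:
    metadata:
      labels:
        app.kubernetes.io/name: web
    spec:
      affinity:
        podAntiAffinity:                     # prefer spreading replicas across nodes
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                topologyKey: kubernetes.io/hostname
                labelSelector:
                  matchLabels:
                    app.kubernetes.io/name: web
      containers:
        - name: web
          image: registry.example.com/web:1.4.2   # hypothetical image
          resources:
            requests:
              cpu: 250m                      # required for CPU utilization scaling
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-pdb
spec:
  minAvailable: 2                            # voluntary evictions keep at least 2 pods
  selector:
    matchLabels:
      app.kubernetes.io/name: web
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web
  minReplicas: 3
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70             # assumed target, tune per workload
```

To spread across zones instead of nodes, the `topologyKey` could be changed to `topology.kubernetes.io/zone`.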

Validation Commands

Pre-deployment:

kubectl apply --dry-run=client -f manifest.yaml
kubectl apply --dry-run=server -f manifest.yaml
kubeconform -strict manifest.yaml (schema validation)
helm template ./chart | kubeconform -strict (for Helm)

Policy Validation:

OPA Conftest, Kyverno, or Datree

Rollout & Rollback

Deploy:

kubectl apply -f manifest.yaml
kubectl rollout status deployment/NAME

Rollback:

kubectl rollout undo deployment/NAME
kubectl rollout undo deployment/NAME --to-revision=N
kubectl rollout history deployment/NAME

Restart:

kubectl rollout restart deployment/NAME

Manifest Checklist

Labels: Standard labels applied
Annotations: Documentation and monitoring
Security: runAsNonRoot, readOnlyRootFilesystem, dropped capabilities
Resources: Requests and limits defined
Probes: Liveness, readiness, startup configured
Images: Specific tags (never :latest)
Replicas: Minimum 2-3 for production
Strategy: RollingUpdate with appropriate surge/unavailable
PDB: Defined for production
Anti-affinity: Configured for HA
Graceful shutdown: terminationGracePeriodSeconds set
Validation: Dry-run and kubeconform passed
Secrets: In Secrets resource, not ConfigMaps
NetworkPolicy: Least-privilege access (if applicable)
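
For the NetworkPolicy item, a least-privilege ingress rule could be sketched as follows. The namespace, labels, and port are hypothetical, and the policy only has an effect on clusters whose network plugin enforces NetworkPolicy.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: api-allow-frontend                   # hypothetical name
  namespace: payments-prod                   # hypothetical namespace
spec:
  podSelector:                               # pods this policy protects
    matchLabels:
      app.kubernetes.io/name: api
  policyTypes:
    - Ingress                                # all other ingress to these pods is denied
  ingress:
    - from:
        - podSelector:                       # only pods labeled as the web frontend
            matchLabels:
              app.kubernetes.io/name: web
      ports:
        - protocol: TCP
          port: 8080
```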

Best Practices Summary

Use standard labels and annotations
Always run as non-root with dropped capabilities
Define resource requests and limits
Implement all three probe types
Pin image tags to specific versions
Configure anti-affinity for HA
Set Pod Disruption Budgets
Use rolling updates with zero unavailability
Validate manifests before applying
Enable read-only root filesystem when possible
