mirror of
https://github.com/github/awesome-copilot.git
synced 2026-04-13 11:45:56 +00:00
feat: add GDPR-compliant engineering practices skill documentation (#1230)
* feat: add GDPR-compliant engineering practices skill documentation
* Add GDPR compliance references for Security and Data Rights
  - Introduced a comprehensive Security.md file detailing encryption, password hashing, secrets management, anonymization, cloud practices, CI/CD controls, and incident response protocols.
  - Created a Data Rights.md file outlining user rights implementation, Record of Processing Activities (RoPA), consent management, sub-processor management, and DPIA triggers.
* Refine GDPR compliance documentation by removing unnecessary symbols and ensuring clarity in security and data rights references
* refactor: streamline description formatting in GDPR compliance skill documentation

---------

Co-authored-by: Aaron Powell <me@aaron-powell.com>
266
skills/gdpr-compliant/references/Security.md
Normal file
@@ -0,0 +1,266 @@
# GDPR Reference — Security, Operations & Architecture

Load this file when you need implementation detail on:
encryption, password hashing, secrets management, anonymization/pseudonymization,
cloud/DevOps practices, CI/CD controls, incident response, architecture patterns.

---

## Encryption

### At-Rest Encryption

| Data sensitivity | Minimum standard |
|---|---|
| Standard personal data (name, address, email) | AES-256 disk/volume encryption (cloud provider default) |
| Sensitive personal data (health, biometric, financial, national ID) | AES-256 **column-level** encryption + envelope encryption via KMS |
| Encryption keys | HSM-backed KMS (Azure Key Vault Premium / AWS KMS CMK / GCP Cloud KMS) |

**Envelope encryption pattern:**
1. Encrypt data with a **Data Encryption Key (DEK)** (AES-256, generated per record or per table).
2. Encrypt the DEK with a **Key Encryption Key (KEK)** stored in the KMS.
3. Store the encrypted DEK alongside the encrypted data.
4. Deleting the KEK makes every DEK wrapped by it unrecoverable, which crypto-shreds all data protected by those DEKs.

### In-Transit Encryption

- **MUST** enforce TLS 1.2 minimum; prefer TLS 1.3.
- **MUST** set `Strict-Transport-Security: max-age=31536000; includeSubDomains; preload`.
- **MUST NOT** allow TLS 1.0, TLS 1.1, null cipher suites, or export-grade ciphers.
- **MUST NOT** use self-signed certificates in production.

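
For Python clients, the TLS floor above can be sketched with the standard `ssl` module. The helper name `make_client_context` is an assumption made for this example; `ssl.create_default_context` and `ssl.TLSVersion` are standard-library APIs. The default context already verifies certificates and hostnames, so only the minimum protocol version is set explicitly.

```python
import ssl


def make_client_context() -> ssl.SSLContext:
    """Client-side TLS context that refuses anything older than TLS 1.2."""
    ctx = ssl.create_default_context()  # certificate and hostname checks enabled
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2  # rejects TLS 1.0 and 1.1
    return ctx
```
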

### Key Management

- Rotate DEKs at least annually; rotate immediately upon suspected compromise.
- Use separate key namespaces per environment (dev / staging / prod).
- Log all KMS key access events — alert on anomalous access patterns.
- **MUST NOT** hardcode encryption keys in source code or configuration files.

---

## Password Hashing

| Algorithm | Parameters | Notes |
|---|---|---|
| **Argon2id** recommended | memory ≥ 64 MB, iterations ≥ 3, parallelism ≥ 4 | First choice in the OWASP Password Storage Cheat Sheet; memory-hard, as NIST SP 800-63B advises |
| **bcrypt** acceptable | cost factor ≥ 12 | Widely supported; use if Argon2id unavailable |
| **scrypt** acceptable | N=32768, r=8, p=1 | Good alternative |
| MD5 | — | Never — trivially broken |
| SHA-1 / SHA-256 | — | Never for passwords — not designed for this purpose |

**MUST**
- Use a unique salt per password (built into all three algorithms above).
- Store only the hash — never the plaintext, never a reversible encoding.
- Re-hash on login if the stored hash uses an outdated algorithm — upgrade transparently.

**SHOULD**
- Add a **pepper** (server-side secret added before hashing) stored in the KMS, not in the DB.
- Check passwords against known breach lists at registration (Have I Been Pwned "Pwned Passwords" API, k-anonymity range search).
- Enforce minimum password length of 12 characters.

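
The k-anonymity breach check mentioned above can be sketched as follows. Only the first five hex characters of the password's SHA-1 digest are sent to `https://api.pwnedpasswords.com/range/{prefix}`; the response lists `SUFFIX:COUNT` lines that are matched locally, so neither the password nor its full hash leaves the server. The function names are assumptions for this example, and the HTTP call itself is left out so the sketch stays self-contained.

```python
import hashlib


def pwned_range_query(password: str) -> tuple[str, str]:
    """Split the SHA-1 digest into the 5-char prefix to send and the suffix to keep local."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]


def count_in_range_response(suffix: str, response_body: str) -> int:
    """Return the breach count for our suffix from a 'SUFFIX:COUNT' response, or 0."""
    for line in response_body.splitlines():
        candidate, _, count = line.partition(":")
        if candidate.strip().upper() == suffix:
            return int(count)
    return 0
```
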

**MUST NOT**
- Log passwords in any form — not during registration, not during failed login.
- Transmit passwords in URLs or query strings.
- Store password reset tokens in plaintext — hash them before storage.

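
A minimal sketch of the salt, verify, and re-hash rules above, using the scrypt parameters from the table because `hashlib.scrypt` ships with the Python standard library (Argon2id and bcrypt need third-party packages). The storage format `scrypt$N$r$p$salt$hash` and all function names are assumptions for this example, not an established convention. `maxmem` is raised because these parameters need about 32 MiB, and the salt is generated explicitly since `hashlib.scrypt` takes it as an argument.

```python
import hashlib
import hmac
import secrets

N, R, P = 32768, 8, 1        # scrypt row of the table above
MAXMEM = 64 * 1024 * 1024    # headroom over the ~32 MiB these parameters require


def _scrypt(password: str, salt: bytes, n: int, r: int, p: int) -> bytes:
    return hashlib.scrypt(password.encode("utf-8"), salt=salt, n=n, r=r, p=p,
                          maxmem=MAXMEM, dklen=32)


def hash_password(password: str) -> str:
    salt = secrets.token_bytes(16)  # unique random salt per password
    digest = _scrypt(password, salt, N, R, P)
    return f"scrypt${N}${R}${P}${salt.hex()}${digest.hex()}"


def verify_password(password: str, stored: str) -> bool:
    parts = stored.split("$")
    if len(parts) != 6 or parts[0] != "scrypt":
        return False
    _, n, r, p, salt_hex, digest_hex = parts
    candidate = _scrypt(password, bytes.fromhex(salt_hex), int(n), int(r), int(p))
    return hmac.compare_digest(candidate, bytes.fromhex(digest_hex))  # constant-time compare


def needs_rehash(stored: str) -> bool:
    """True when the stored hash uses another scheme or weaker parameters."""
    parts = stored.split("$")
    return len(parts) != 6 or parts[:4] != ["scrypt", str(N), str(R), str(P)]
```
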

---

## Secrets Management

**MUST**
- Store all secrets in a dedicated secret manager: Azure Key Vault, AWS Secrets Manager,
  GCP Secret Manager, or HashiCorp Vault.
- Use pre-commit hooks (`gitleaks`, `detect-secrets`) and GitHub secret scanning with push protection to prevent secret commits.
- Rotate secrets immediately upon developer offboarding or suspected compromise, and at least annually otherwise.
- Maintain a **secrets inventory document** — every secret listed with its purpose and rotation date.

**SHOULD**
- Use **short-lived credentials** via OIDC federation (GitHub Actions → Azure/AWS/GCP) instead of long-lived API keys.
- Audit all secret and key access in the secret manager / KMS; alert on access outside business hours or from unexpected sources.
- Use separate secret namespaces per environment.

**`.gitignore` MUST include:**
```
.env
.env.*
*.pem
*.key
*.pfx
*.p12
secrets/
# if it may contain connection strings
appsettings.*.json
```

**MUST NOT**
- Commit secrets to source code repositories.
- Pass secrets as plain-text CLI arguments (they appear in process lists and shell history).
- Hardcode secrets as fallback defaults for environment variables in code.

---

## Anonymization & Pseudonymization

### Definitions

| Term | Reversible? | GDPR scope? | Use case |
|---|---|---|---|
| **Anonymization** | No | Outside GDPR scope | Retained records after erasure, analytics datasets |
| **Pseudonymization** | Yes (with key) | Still personal data | Analytics pipelines, audit logs, reduced-risk processing |

### Anonymization Techniques

| Technique | How | When |
|---|---|---|
| Suppression | Remove the field entirely | Fields with no analytical value |
| Masking | Replace with fixed placeholder (`"ANONYMIZED_USER"`) | Audit log identifiers after erasure |
| Generalization | Replace exact value with a range (age 34 → "30–40") | Analytics |
| Noise addition | Add statistical noise to numerical values | Aggregate analytics |
| Aggregation | Report group statistics, never individual values | Reporting |
| K-anonymity | Ensure each record is indistinguishable from at least k-1 others on its quasi-identifiers | Analytics datasets |

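
Generalization and the k-anonymity check from the table can be sketched like this. The record fields (`age_band`, `postcode_prefix`) and both function names are hypothetical. The age bands here are non-overlapping ten-year buckets ("30-39"), a slight variation on the "30–40" example in the table.

```python
from collections import Counter


def generalize_age(age: int, width: int = 10) -> str:
    """Map an exact age to a non-overlapping band, e.g. 34 -> '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def is_k_anonymous(records, quasi_identifiers, k):
    """True if every combination of quasi-identifier values occurs at least k times."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return all(size >= k for size in groups.values())


rows = [
    {"age_band": generalize_age(34), "postcode_prefix": "75"},
    {"age_band": generalize_age(38), "postcode_prefix": "75"},
    {"age_band": generalize_age(52), "postcode_prefix": "69"},
]
# The single 50-59 / "69" record is unique, so the dataset is not 2-anonymous.
dataset_ok = is_k_anonymous(rows, ["age_band", "postcode_prefix"], k=2)
```
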

### Pseudonymization Techniques

| Technique | How |
|---|---|
| HMAC-SHA256 with secret key | Consistent, one-way, keyed. Use for user IDs in analytics. Key in KMS. |
| Tokenization | Replace value with opaque token; mapping in separate secure vault. |
| Encryption with separate key | Decrypt only with explicit KMS authorization. |

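
The HMAC-SHA256 row can be illustrated with the standard `hmac` module. The function name is an assumption, and the key is passed in as a parameter because in practice it would be fetched from the KMS rather than defined in code. The same user ID and key always yield the same pseudonym, which keeps analytics joins working without exposing the original identifier.

```python
import hashlib
import hmac


def pseudonymize(user_id: str, key: bytes) -> str:
    """Keyed, deterministic pseudonym for a user ID (hex-encoded HMAC-SHA256)."""
    return hmac.new(key, user_id.encode("utf-8"), hashlib.sha256).hexdigest()
```
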

**MUST**
- When erasing a user, **anonymize** records that must be retained (financial, audit logs): replace identifying fields with `"ANONYMIZED"` or a placeholder that cannot be linked back to the original value.
- Store the pseudonymization key in the KMS — never in the same database as the pseudonymized data.
- Test anonymization routines with assertions: the original value MUST NOT be recoverable from the output.

**Crypto-shredding pattern (event sourcing):**
Encrypt personal data in events with a per-user DEK. Store the DEK in the KMS.
On erasure: delete the DEK from the KMS → all events for that user are effectively anonymized, provided no copy of the DEK survives elsewhere (caches, key backups).

**MUST NOT**
- Call data "anonymized" if re-identification is possible through linkage with other datasets.
- Apply pseudonymization and store the mapping key in the same table as the pseudonymized data.

---

## Cloud & DevOps Practices

**MUST**
- Enable encryption at rest for all cloud storage: blobs, managed databases, queues, caches.
- Use **private endpoints** — databases MUST NOT be publicly accessible.
- Apply network security groups / firewall rules: restrict DB access to application layers only.
- Enable cloud-native audit logging: Azure Monitor / AWS CloudTrail / GCP Cloud Audit Logs.
- Store personal data only in **approved geographic regions** (the EEA, or a third country covered by an adequacy decision or SCCs).
- Tag all cloud resources processing personal data with a `DataClassification` tag.

**SHOULD**
- Enable Microsoft Defender for Cloud / AWS Security Hub / GCP SCC — review recommendations weekly.
- Use **managed identities** (Azure) or **IAM roles** (AWS/GCP) instead of long-lived access keys.
- Enable soft delete and versioning on object storage.
- Apply DLP policies on cloud storage to detect PII written to unprotected buckets.
- Enable database-level audit logging for SELECT on sensitive tables.

**MUST NOT**
- Store personal data in public storage buckets without access controls.
- Deploy databases with public IPs in production.
- Use the same cloud account/subscription for production and non-production if data could bleed across.

---

## CI/CD Controls

**MUST**
- Run **secret scanning** on every commit: `gitleaks`, `detect-secrets`, GitHub secret scanning.
- Run **dependency vulnerability scanning** on every build: `npm audit`, `dotnet list package --vulnerable`, `trivy`, `snyk`.

**MUST NOT**
- Use real personal data in CI test jobs.
- Log environment variables in CI pipelines — mask all secrets.

**SHOULD**
- Run **SAST**: SonarQube, Semgrep, or CodeQL on every PR.
- Run **container image scanning**: `trivy`, Snyk Container, or AWS ECR scanning.
- Add a **GDPR compliance gate** to the pipeline:
  - New migrations without a documented retention period → fail.
  - Log statements containing known PII field names → warn.

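
The "log statements containing known PII field names" check in the compliance gate could look roughly like this. The field-name list, the logger-call pattern, and the function name are all assumptions to be adapted per codebase; a line-based regular expression will miss multi-line log calls, so this is a warning heuristic rather than a guarantee.

```python
import re

# Hypothetical list; in practice derive it from the data model or the RoPA.
PII_FIELD_NAMES = ("email", "password", "ssn", "date_of_birth", "phone")

LOG_CALL = re.compile(
    r"\b(?:log|logger|logging)\.(?:debug|info|warning|error|exception|critical)\s*\(",
    re.IGNORECASE,
)
PII_NAME = re.compile("|".join(re.escape(n) for n in PII_FIELD_NAMES), re.IGNORECASE)


def find_pii_log_statements(source: str):
    """Return (line number, field name) for each log call that mentions a PII field."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        if LOG_CALL.search(line):
            match = PII_NAME.search(line)
            if match:
                findings.append((lineno, match.group(0).lower()))
    return findings
```
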

**Pipeline secret rules:**
```yaml
# MUST: mask secrets before use (values read from `secrets.*` are masked automatically;
# add-mask matters most for derived or runtime-generated values)
- name: Mask secret
  run: echo "::add-mask::${{ secrets.MY_SECRET }}"

# MUST NOT: echo secrets to console
- run: echo "Key=$API_KEY" # Never

# SHOULD: use OIDC federation (no long-lived keys); the job needs `permissions: id-token: write`
- uses: azure/login@v1
  with:
    client-id: ${{ vars.AZURE_CLIENT_ID }}
    tenant-id: ${{ vars.AZURE_TENANT_ID }}
    subscription-id: ${{ vars.AZURE_SUBSCRIPTION_ID }}
```

---

## Incident & Breach Handling

### Regulatory Timeline

| Window | Obligation |
|---|---|
| **72 hours** from awareness | Notify the supervisory authority (CNIL, APD, ICO…), unless the breach is unlikely to result in a risk to individuals' rights and freedoms |
| **Without undue delay** | Notify affected data subjects if breach is likely to result in **high risk** to their rights |

Log **all** personal data breaches internally — even those that do not require DPA notification.

### Breach Response Runbook (template)

1. **Detection** — Define criteria: what triggers an incident (credential leak, DB dump exposed, ransomware, accidental public bucket).
2. **Severity classification** — Low / Medium / High / Critical based on data sensitivity and volume.
3. **Containment** — Revoke compromised credentials; isolate affected systems; preserve evidence (do NOT delete logs).
4. **Assessment** — What data was exposed? How many subjects? What is the risk level?
5. **DPA notification** — Use the supervisory authority's online portal; include: nature of breach, categories and approximate number of data subjects, categories and approximate number of records, contact point, likely consequences, measures taken.
6. **Data subject notification** — If high risk: clear language, nature of breach, likely consequences, measures taken, DPO contact.
7. **Post-incident review** — Root cause analysis; corrective measures; update runbook.

### Automated Breach Detection Alerts

Configure alerts for:
- Unusual volume of data exports (threshold per hour)
- Access to sensitive tables outside business hours
- Bulk deletion events
- Failed authentication spikes
- New credentials appearing in public breach databases (Have I Been Pwned monitoring)

Store breach records internally for at least **5 years**.

---

## Architecture Patterns

### Data Store Separation
Separate operational data (transactional DB) from analytical data (data warehouse).
Apply different retention periods and access controls to each.
The analytics store MUST NOT read directly from production operational tables.

### Dedicated Consent Store
Track consent as an immutable event log in a separate store, not a boolean column on the user table.
This enables: auditable consent history, version tracking, easy withdrawal without data loss.

### Audit Log Segregation
Store audit logs in a separate, append-only store.
The application service account MUST NOT be able to delete audit log entries.
Use a separate DB user with INSERT-only rights on the audit table.

### DSR Queue Pattern
Implement Data Subject Requests as an asynchronous workflow:
`POST /api/v1/me/erasure-request` → enqueue a job → worker scrubs all stores → notify user on completion.
This handles the complexity of multi-store scrubbing reliably and provides a retry mechanism.

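
A minimal in-process sketch of the queue pattern above, assuming each store exposes an idempotent scrub function. The class name, the job layout, and the use of `queue.Queue` are assumptions for illustration; a production system would use a durable queue. Only the stores that failed are retried, and the `completed` list stands in for the "notify user on completion" step.

```python
import queue


class ErasureWorker:
    def __init__(self, scrubbers, max_attempts=3):
        self.scrubbers = scrubbers        # store name -> callable(user_id)
        self.max_attempts = max_attempts
        self.jobs = queue.Queue()
        self.completed = []               # user IDs fully scrubbed (notify here)
        self.failed = []                  # (user_id, stores still pending) for manual follow-up

    def enqueue(self, user_id):
        self.jobs.put({"user_id": user_id, "pending": list(self.scrubbers), "attempts": 0})

    def run_once(self):
        job = self.jobs.get()
        job["attempts"] += 1
        still_pending = []
        for store in job["pending"]:
            try:
                self.scrubbers[store](job["user_id"])
            except Exception:
                still_pending.append(store)   # retry only what failed
        job["pending"] = still_pending
        if not still_pending:
            self.completed.append(job["user_id"])
        elif job["attempts"] < self.max_attempts:
            self.jobs.put(job)                # re-enqueue for another attempt
        else:
            self.failed.append((job["user_id"], still_pending))

    def drain(self):
        while not self.jobs.empty():
            self.run_once()
```
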

### Pseudonymization Gateway
For analytics pipelines, implement a pseudonymization service at the boundary between
operational and analytical systems.
The mapping key (HMAC secret or tokenization vault) never leaves the operational zone.
The analytics zone receives only pseudonymized identifiers.

### Crypto-Shredding (Event Sourcing)
Encrypt personal data in events with a per-user DEK stored in the KMS.
On user erasure: delete the DEK → all historical events for that user are effectively anonymized
without modifying the event log.
177
skills/gdpr-compliant/references/data-rights.md
Normal file
@@ -0,0 +1,177 @@
# GDPR Reference — Data Rights, Accountability & Governance

Load this file when you need implementation detail on:
user rights endpoints, Data Subject Request (DSR) workflow,
Record of Processing Activities (RoPA), consent management,
sub-processor management, DPIA triggers.

---

## User Rights Implementation (Articles 15–22)

Every right MUST have a tested API endpoint or documented back-office process
before the system goes live. Respond to verified requests within **30 calendar days**
(Art. 12(3) allows one month, extendable by two further months for complex or numerous requests).

| Right | Article | Engineering implementation |
|---|---|---|
| Right of access | 15 | `GET /api/v1/me/data-export` — all personal data, JSON or CSV |
| Right to rectification | 16 | `PUT /api/v1/me/profile` — propagate to all downstream stores |
| Right to erasure | 17 | `DELETE /api/v1/me` — scrub all stores per erasure checklist |
| Right to restriction | 18 | `ProcessingRestricted` flag on user record; gate non-essential processing |
| Right to portability | 20 | Same as access endpoint; structured, machine-readable (JSON) |
| Right to object | 21 | Opt-out endpoint for legitimate-interest processing; honor immediately |
| Automated decision-making | 22 | Expose a human review path + explanation of the logic |

### Erasure Checklist — MUST cover all stores

When `DELETE /api/v1/me` is called, the erasure pipeline MUST scrub:

- Primary relational database (anonymize or delete rows)
- Read replicas
- Search index (Elasticsearch, Azure Cognitive Search, etc.)
- In-memory cache (Redis, IMemoryCache)
- Object storage (S3, Azure Blob — profile pictures, documents)
- Email service logs (Brevo, SendGrid — delivery logs)
- Analytics platform (Mixpanel, Amplitude, GA4 — user deletion API)
- Audit logs (anonymize identifying fields — do not delete the event)
- Backups (document the backup TTL; accept that backups expire naturally)
- CDN edge cache (purge if personal data may be cached)
- Third-party sub-processors (trigger their deletion API or document the manual step)

### Data Export Format (`GET /api/v1/me/data-export`)

```json
{
  "exportedAt": "2025-03-30T10:00:00Z",
  "subject": {
    "id": "uuid",
    "email": "user@example.com",
    "createdAt": "2024-01-15T08:30:00Z"
  },
  "profile": { ... },
  "orders": [ ... ],
  "consents": [ ... ],
  "auditEvents": [ ... ]
}
```

- MUST be machine-readable (JSON preferred, CSV acceptable).
- MUST NOT be a PDF screenshot or HTML page.
- MUST include data from every store listed in the RoPA for this user.

### DSR Tracker (back-office)

Implement a **Data Subject Request tracker** with:
- Incoming request date
- Request type (access / rectification / erasure / portability / restriction / objection)
- Verification status (identity confirmed y/n)
- Deadline (received date + 30 days)
- Assigned handler
- Completion date and outcome
- Notes

Automate the primary store scrubbing; document manual steps for third-party stores.

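
The tracker's deadline field (received date + 30 days) is simple date arithmetic. The sketch below uses hypothetical function names and follows this document's 30-day window rather than the calendar-month wording of the regulation; a negative result from `days_remaining` means the request is overdue.

```python
from datetime import date, timedelta

RESPONSE_WINDOW_DAYS = 30  # window used by this reference


def dsr_deadline(received: date) -> date:
    """Deadline for a request received on the given date."""
    return received + timedelta(days=RESPONSE_WINDOW_DAYS)


def days_remaining(received: date, today: date) -> int:
    """Days left before the deadline; negative once the request is overdue."""
    return (dsr_deadline(received) - today).days
```
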

---

## Record of Processing Activities (RoPA)

Maintain as a living document (Markdown, YAML, or JSON) version-controlled in the repo.
Update with **every** new feature that introduces a processing activity.

### Minimum fields per processing activity

```yaml
- name: "User account management"
  purpose: "Create and manage user accounts for service access"
  legalBasis: "Contract (Art. 6(1)(b))"
  dataSubjects: ["Registered users"]
  personalDataCategories: ["Name", "Email", "Password hash", "IP address"]
  recipients: ["Internal engineering team", "Brevo (email delivery)"]
  retentionPeriod: "Account lifetime + 12 months"
  transfers:
    outside_eea: true
    safeguard: "Brevo — Standard Contractual Clauses (SCCs)"
  securityMeasures: ["TLS 1.3", "AES-256 at rest", "bcrypt password hashing"]
  dpia_required: false
```
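
Because the RoPA lives in the repo, a small check can flag entries with missing fields during review. This sketch assumes the entries have already been parsed into Python dictionaries (parsing YAML needs a third-party package, so it is left out), and the function name is hypothetical. The field list mirrors the example entry above.

```python
REQUIRED_FIELDS = (
    "name", "purpose", "legalBasis", "dataSubjects", "personalDataCategories",
    "recipients", "retentionPeriod", "transfers", "securityMeasures", "dpia_required",
)


def missing_ropa_fields(activity: dict) -> list[str]:
    """Names of required fields that are absent or empty in one RoPA entry."""
    empty_values = (None, "", [], {})
    return [f for f in REQUIRED_FIELDS if activity.get(f) in empty_values]
```
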

### Legal basis options (Art. 6)

| Basis | When to use |
|---|---|
| `Contract (6(1)(b))` | Processing necessary to fulfill the service contract |
| `Legitimate interest (6(1)(f))` | Fraud prevention, security, analytics (requires balancing test) |
| `Consent (6(1)(a))` | Marketing, non-essential cookies, optional profiling |
| `Legal obligation (6(1)(c))` | Tax records, anti-money-laundering |
| `Vital interest (6(1)(d))` | Emergency situations only |
| `Public task (6(1)(e))` | Public authorities |

---

## Consent Management

### MUST

- Store consent as an **immutable event log**, not a mutable boolean flag.
- Record: what was consented to, when, which version of the privacy policy, the mechanism.
- Load analytics / marketing SDKs **conditionally** — only after consent is granted.
- Provide a consent withdrawal mechanism as easy to use as the consent grant.

### Consent store schema (minimum)

```sql
CREATE TABLE ConsentRecords (
    Id            UUID PRIMARY KEY,
    UserId        UUID NOT NULL,
    Purpose       VARCHAR(100) NOT NULL, -- e.g. "marketing_emails", "analytics"
    Granted       BOOLEAN NOT NULL,
    PolicyVersion VARCHAR(20) NOT NULL,
    ConsentedAt   TIMESTAMPTZ NOT NULL,
    IpAddressHash VARCHAR(64), -- HMAC-SHA256 of anonymized IP
    UserAgent     VARCHAR(500)
);
```
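
With an append-only log, the current consent state is derived rather than stored: the most recent event for a user and purpose wins, and a withdrawal is just another event. The sketch below models rows of the table above as dictionaries in a list; the function names are assumptions, and the absence of any event is treated as "no consent".

```python
from datetime import datetime, timezone


def record_consent(events, user_id, purpose, granted, policy_version, at=None):
    """Append a new consent event; existing events are never modified."""
    events.append({
        "UserId": user_id,
        "Purpose": purpose,
        "Granted": granted,
        "PolicyVersion": policy_version,
        "ConsentedAt": at or datetime.now(timezone.utc),
    })


def current_consent(events, user_id, purpose):
    """Latest event wins; no event at all means no consent."""
    relevant = [e for e in events if e["UserId"] == user_id and e["Purpose"] == purpose]
    if not relevant:
        return False
    return max(relevant, key=lambda e: e["ConsentedAt"])["Granted"]
```
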

### MUST NOT

- Pre-tick consent checkboxes.
- Bundle consent for marketing with consent for service delivery.
- Make service access conditional on marketing consent.
- Use dark patterns (e.g., "Accept all" prominent, "Reject" buried).

---

## Sub-processor Management

Maintain a **sub-processor list** updated with every new SaaS tool or cloud service
that touches personal data.

Minimum fields per sub-processor:

| Field | Example |
|---|---|
| Name | Brevo |
| Service | Transactional email |
| Data categories transferred | Email address, name, email content |
| Processing location | EU (Paris) |
| DPA signed | 2024-01-10 |
| DPA URL / reference | [link] |
| SCCs applicable | N/A (EU-based) |

**MUST** review the sub-processor list annually and upon any change.
**MUST NOT** allow data to flow to a new sub-processor before a DPA is signed.

---

## DPIA Triggers (Article 35)

A DPIA is **mandatory** before processing that is likely to result in a high risk. Triggers include:

- Systematic and extensive profiling with significant effects on individuals
- Large-scale processing of special category data (health, biometric, racial origin, sexual orientation, religion)
- Systematic monitoring of publicly accessible areas (CCTV, location tracking)
- Processing of children's data at scale
- Innovative technology with unknown privacy implications
- Matching or combining datasets from multiple sources

When in doubt: conduct the DPIA anyway. Document the outcome.