GDPR Technical Measures: Encryption, Pseudonymization, and Access Controls
Article 32 of GDPR requires appropriate technical measures for personal data security. Learn encryption, pseudonymization, and access control implementations that satisfy regulators.
Article 32 of the GDPR requires controllers and processors to implement "appropriate technical and organisational measures" to ensure a level of security appropriate to the risk. The regulation deliberately avoids mandating specific technologies. Instead, Article 32(1) lists four examples of measures that may be appropriate: pseudonymization and encryption of personal data; the ability to ensure the ongoing confidentiality, integrity, availability, and resilience of processing systems; the ability to restore availability and access to personal data promptly after an incident; and a process for regularly testing and evaluating the effectiveness of security measures.
"Appropriate to the risk" is the operative phrase. What you implement must be calibrated to the state of the art, the cost of implementation, the nature, scope, context, and purposes of the processing, and the likelihood and severity of the risks to individuals. This gives organizations flexibility but also creates ambiguity. This guide translates Article 32 into concrete technical implementations.
Encryption: What Article 32 Actually Requires
GDPR does not require encryption in all cases — it lists it as one example of an appropriate measure. However, supervisory authority enforcement actions and guidance have made clear that failing to encrypt personal data in high-risk contexts is a serious failure.
Encryption at Rest
For databases and file systems storing personal data, encryption at rest protects against unauthorized physical or logical access to storage media.
Database level: AWS RDS, Google Cloud SQL, and Azure SQL Database all support encryption at rest using managed keys. Enable this for all databases containing personal data. If you use a managed database service and have not explicitly enabled encryption, verify whether it is on by default — some services require explicit configuration.
Application level: For highly sensitive fields (health data, financial data, government IDs), consider application-level encryption in addition to database encryption. This means the application encrypts the value before writing it to the database, so even someone with database access cannot read the plaintext. Libraries like libsodium provide authenticated encryption; use AES-256-GCM or ChaCha20-Poly1305.
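# Example: encrypting a sensitive field before storage
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

def encrypt_field(plaintext: str, key: bytes) -> bytes:
    # key must be 32 bytes for AES-256-GCM
    aesgcm = AESGCM(key)
    # 96-bit random nonce; generate a fresh one for every encryption under the same key
    nonce = os.urandom(12)
    ciphertext = aesgcm.encrypt(nonce, plaintext.encode(), None)
    # The nonce is not secret; store it in front of the ciphertext so decryption can recover it
    return nonce + ciphertext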
Key management: Encryption is only as strong as your key management. Use a dedicated key management service (AWS KMS, GCP Cloud KMS, HashiCorp Vault) rather than storing encryption keys alongside the encrypted data. Rotate keys annually or upon suspected compromise.
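A rotation policy is easier to enforce when code checks it. The following is a minimal sketch, assuming key metadata (creation time and a compromise flag) can be read from your key management service. `KeyMetadata` and `needs_rotation` are hypothetical names for illustration and are not part of any KMS API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional

# Assumed policy, taken from the text: rotate annually or on suspected compromise.
MAX_KEY_AGE = timedelta(days=365)


@dataclass(frozen=True)
class KeyMetadata:
    """Hypothetical record describing one key version."""
    key_id: str
    created_at: datetime  # timezone-aware UTC timestamp
    suspected_compromise: bool = False


def needs_rotation(key: KeyMetadata, now: Optional[datetime] = None) -> bool:
    """Return True if the key is flagged as compromised or has reached the maximum age."""
    now = now or datetime.now(timezone.utc)
    return key.suspected_compromise or (now - key.created_at) >= MAX_KEY_AGE
```

A scheduled job could run this check over all key versions and open a ticket for each one that is due. Where the key management service offers automatic rotation, enabling it is usually simpler than a custom check.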
Encryption in Transit
All personal data transmitted over networks must be encrypted. TLS 1.2 is the minimum; TLS 1.3 is preferred. Enforce TLS in your web server configuration:
# nginx: enforce TLS 1.2+ and strong ciphers
ssl_protocols TLSv1.2 TLSv1.3;
ssl_ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384;
ssl_prefer_server_ciphers off;
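Enable HSTS with a max-age of at least one year so that browsers refuse to fall back to plain HTTP. Include the preload directive only if every subdomain serves HTTPS, because removal from browser preload lists is slow: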
add_header Strict-Transport-Security "max-age=31536000; includeSubDomains; preload" always;
For internal service-to-service communication, mutual TLS (mTLS) is increasingly expected. Service meshes like Istio or Linkerd can enforce mTLS transparently across microservices.
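The server configuration above covers inbound traffic. Outbound connections made by application code need the same floor. The following is a minimal sketch using Python's standard `ssl` module; the function name `make_client_tls_context` is an illustrative choice.

```python
import ssl


def make_client_tls_context() -> ssl.SSLContext:
    """Build a client-side context that verifies certificates and refuses TLS below 1.2."""
    # create_default_context() turns on certificate and hostname verification
    context = ssl.create_default_context()
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context
```

The resulting context can be passed to standard-library clients, for example `urllib.request.urlopen(url, context=make_client_tls_context())`.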
Encryption of Backups
Backup files are a common source of data breaches. Ensure that database backups, filesystem snapshots, and data exports are encrypted with the same rigor as production data. Test restoration from encrypted backups periodically — an untested backup is not a backup.
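One simple way to make a restoration test checkable is to record a digest of the export at backup time and compare it with the digest of the file produced by the restore. This is a sketch under that assumption. The helper names are hypothetical, and a full restore test would also load the data into a database and run sanity queries.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large dumps do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def restore_matches(recorded_digest: str, restored_file: Path) -> bool:
    """Compare the digest recorded at backup time with the digest of the restored file."""
    return hmac.compare_digest(recorded_digest, sha256_of_file(restored_file))
```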
Pseudonymization
Article 4(5) of GDPR defines pseudonymization as "the processing of personal data in such a manner that the personal data can no longer be attributed to a specific data subject without the use of additional information." The definition also requires that the additional information be kept separately and protected by technical and organisational measures. The key distinction from anonymization: pseudonymized data is still personal data, but the regulation treats it favorably in several provisions.
When Pseudonymization Reduces Risk
Pseudonymization is particularly valuable for:
- Analytics and reporting: Replace customer IDs with internal tokens before sending data to an analytics service. If the analytics vendor suffers a breach, the leaked data cannot be directly attributed to individuals without access to your mapping table.
- Development and testing: Use pseudonymized copies of production data in development environments so engineers can work with realistic data without accessing real personal data.
- Cross-system data sharing: When sharing datasets internally across teams with different access levels, pseudonymize before sharing.
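The analytics case above can be sketched as a small export filter. This version uses a keyed hash instead of a mapping table, so the secret key plays the role of the "additional information" and must be kept away from the analytics vendor. The field names and the function `pseudonymize_event` are assumptions made for illustration.

```python
import hashlib
import hmac

# Hypothetical field names; adjust to the event schema actually in use.
DIRECT_IDENTIFIERS = {"email", "name", "phone"}


def pseudonymize_event(event: dict, secret_key: bytes) -> dict:
    """Return a copy of the event with direct identifiers removed and the customer ID tokenized."""
    exported = {k: v for k, v in event.items() if k not in DIRECT_IDENTIFIERS}
    customer_id = str(exported.pop("customer_id"))
    exported["customer_token"] = hmac.new(
        secret_key, customer_id.encode("utf-8"), hashlib.sha256
    ).hexdigest()
    return exported
```

The same customer always maps to the same token, so the vendor can still count and group by customer without seeing the real identifier.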
Tokenization
Tokenization replaces a sensitive identifier (email address, phone number, national ID number) with a randomly generated token. The mapping between the token and the original value is stored in a separate, access-controlled token vault.
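-- Example schema: token vault table (PostgreSQL)
CREATE TABLE pii_tokens (
    token           UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    data_type       VARCHAR(50) NOT NULL,  -- 'email', 'phone', etc.
    encrypted_value BYTEA NOT NULL,        -- encrypted original value
    created_at      TIMESTAMPTZ DEFAULT NOW()
);
-- PostgreSQL does not accept an inline INDEX clause, so the index is created separately.
-- It supports reverse lookup (value to token) only if encrypted_value is deterministic;
-- with randomized encryption, index a keyed hash of the value in its own column instead.
CREATE INDEX idx_encrypted_value ON pii_tokens (encrypted_value);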
The main application stores only the token. When displaying data to an authorized user, the application retrieves the original value from the token vault. Access to the token vault is restricted and audited separately.
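The tokenize and detokenize flow can be illustrated with an in-memory stand-in. This is only a sketch: `InMemoryTokenVault` is a hypothetical class, it keeps values unencrypted in a dictionary, and a real vault would be a separate, encrypted, access-controlled service.

```python
import uuid
from typing import Dict, List, Tuple


class InMemoryTokenVault:
    """Illustrative stand-in for a token vault; not suitable for production use."""

    def __init__(self) -> None:
        self._values: Dict[str, Tuple[str, str]] = {}  # token -> (data_type, value)
        self.access_log: List[Tuple[str, str]] = []    # (requester, token)

    def tokenize(self, data_type: str, value: str) -> str:
        """Store the value and return a random token that reveals nothing about it."""
        token = str(uuid.uuid4())
        self._values[token] = (data_type, value)
        return token

    def detokenize(self, token: str, requester: str) -> str:
        """Return the original value and record who asked for it."""
        self.access_log.append((requester, token))
        return self._values[token][1]
```

Because every call to `detokenize` is recorded, the vault's access log can be reviewed independently of the main application's logs.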
Hashing for Lookups
For identifiers used in lookups (e.g., matching records across systems), HMAC-based pseudonymization is practical. Use a keyed hash function so the same input always produces the same output, but the output cannot be reversed without the key:
import hmac
import hashlib
def pseudonymize_identifier(identifier: str, secret_key: bytes) -> str:
return hmac.new(secret_key, identifier.encode(), hashlib.sha256).hexdigest()
Store the pseudonymized value in the system that does not need the original. Keep the key in a secrets manager, not in application configuration.
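Matching across systems only works if both sides feed identical bytes into the hash. The sketch below normalizes email addresses before hashing. Lowercasing the whole address is an assumption that suits most mail providers but not all, and the function names are illustrative.

```python
import hashlib
import hmac


def normalize_email(email: str) -> str:
    """Trim whitespace and lowercase so trivially different spellings map to the same value."""
    return email.strip().lower()


def pseudonymize_email(email: str, secret_key: bytes) -> str:
    """Keyed hash of the normalized address, usable as a join key between systems."""
    normalized = normalize_email(email).encode("utf-8")
    return hmac.new(secret_key, normalized, hashlib.sha256).hexdigest()
```

Two systems that share the key will produce the same pseudonym for " Alice@Example.com " and "alice@example.com". A system without the key can neither reproduce nor reverse it.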
Access Controls
Article 32 implicitly requires access control as part of maintaining confidentiality. The principle of least privilege — giving each user and system component access only to what they need — is foundational.
Identity and Access Management
Authentication: Enforce MFA for all access to systems processing personal data. For web applications, support TOTP (authenticator apps) at minimum; FIDO2/WebAuthn is preferred for staff-facing systems.
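For readers who want to see what TOTP verification involves, here is a compact sketch of the RFC 6238 algorithm using only the standard library. It is meant to show the mechanism. A production system would normally rely on a maintained library and add rate limiting and replay protection, which this sketch omits.

```python
import hashlib
import hmac
import struct
import time
from typing import Optional


def totp(secret: bytes, at: Optional[float] = None, step: int = 30, digits: int = 6) -> str:
    """Compute an RFC 6238 TOTP code (HMAC-SHA1) for the given Unix time."""
    counter = int((time.time() if at is None else at) // step)
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # dynamic truncation as defined in RFC 4226
    value = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(value % (10 ** digits)).zfill(digits)


def verify_totp(secret: bytes, submitted: str, at: Optional[float] = None, window: int = 1) -> bool:
    """Accept codes from the current 30-second step and `window` steps either side."""
    now = time.time() if at is None else at
    return any(
        hmac.compare_digest(totp(secret, now + drift * 30), submitted)
        for drift in range(-window, window + 1)
    )
```

The one-step window tolerates small clock drift between the server and the authenticator app. Widening it makes stolen codes usable for longer.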
Authorization: Implement role-based access control (RBAC) aligned to job functions. Define roles before assigning access. Common roles for a SaaS application:
- viewer: read-only access to non-PII fields
- support: access to customer records necessary for support tickets
- admin: full access to configuration, restricted to a small number of named individuals
- data_processor: service account access scoped to specific operations
Service accounts: Applications accessing databases or APIs should use dedicated service accounts with the minimum permissions required for their function. A service that only reads customer records should not have write or delete permissions.
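A deny-by-default permission check is the core of an RBAC implementation. The sketch below mirrors the four roles listed above. The permission names are hypothetical, and a real system would load the mapping from configuration or an identity provider instead of hard-coding it.

```python
from typing import Dict, FrozenSet

# Hypothetical permission names; the roles follow the list in the text.
ROLE_PERMISSIONS: Dict[str, FrozenSet[str]] = {
    "viewer": frozenset({"read_non_pii"}),
    "support": frozenset({"read_non_pii", "read_customer_record"}),
    "admin": frozenset({"read_non_pii", "read_customer_record", "manage_configuration"}),
    "data_processor": frozenset({"run_export_job"}),
}


def is_allowed(role: str, permission: str) -> bool:
    """Deny by default: unknown roles and unlisted permissions are refused."""
    return permission in ROLE_PERMISSIONS.get(role, frozenset())
```

Keeping the mapping in one place also produces the "current access list" that auditors tend to ask for.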
Database Access Controls
Restrict direct database access to a small set of named DBAs. Application access should go through application users with limited permissions:
-- Create application user with least privilege
CREATE USER app_user WITH PASSWORD 'generated_strong_password';
GRANT SELECT, INSERT, UPDATE ON customers TO app_user;
GRANT SELECT ON orders TO app_user;
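-- Do NOT grant DELETE or TRUNCATE to application users, and do not make them
-- table owners: in PostgreSQL the right to DROP a table belongs to its owner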
Audit direct database queries. If someone connects directly to the production database and runs a SELECT on the customers table, that should generate a log entry and, ideally, an alert.
Audit Logging
Article 32 requires the ability to ensure ongoing confidentiality, integrity, and availability. Audit logs are a core mechanism for detecting and investigating access violations.
Log at minimum:
- Authentication events (success and failure, with user ID and IP)
- Access to records containing personal data (user ID, record ID, timestamp, action)
- Administrative actions (permission changes, user creation/deletion, configuration changes)
- Export and download events
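Retain logs for at least 12 months in a tamper-evident store (e.g., write-once S3 bucket with object lock, or a dedicated SIEM). GDPR itself sets no fixed retention period for logs, and audit logs contain personal data, so document the period you choose and the reason for it. Logs must also be reviewed: alerts for anomalous access patterns should trigger investigation.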
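Where a write-once store is not available, hash chaining is one way to make a log tamper-evident: each entry's hash covers its own content and the previous entry's hash. The sketch below assumes a simple in-memory list and hypothetical field names. It detects edits to, or removal of, earlier entries, but not truncation of the tail unless the latest hash is also recorded somewhere else.

```python
import hashlib
import json
from typing import Dict, List

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first entry
BODY_FIELDS = ("user_id", "action", "record_id", "timestamp")


def _entry_hash(body: Dict[str, str], prev_hash: str) -> str:
    """Hash the canonical JSON form of the entry together with the previous hash."""
    payload = json.dumps(body, sort_keys=True) + prev_hash
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_entry(log: List[dict], user_id: str, action: str, record_id: str, timestamp: str) -> dict:
    """Append an audit entry whose hash covers its content and the previous entry's hash."""
    body = {"user_id": user_id, "action": action, "record_id": record_id, "timestamp": timestamp}
    prev_hash = log[-1]["hash"] if log else GENESIS_HASH
    entry = {**body, "prev_hash": prev_hash, "hash": _entry_hash(body, prev_hash)}
    log.append(entry)
    return entry


def verify_chain(log: List[dict]) -> bool:
    """Recompute every hash; a modified or missing earlier entry breaks the chain."""
    prev_hash = GENESIS_HASH
    for entry in log:
        body = {field: entry[field] for field in BODY_FIELDS}
        if entry["prev_hash"] != prev_hash or entry["hash"] != _entry_hash(body, prev_hash):
            return False
        prev_hash = entry["hash"]
    return True
```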
Documentation Requirements
Article 32 compliance is not just about implementing controls. Under the accountability principle in Article 5(2), you must be able to demonstrate to the supervisory authority that appropriate measures are in place. Documentation that supports this typically includes:
- A record of the technical measures implemented (often part of the Article 30 records of processing activities)
- Evidence of encryption key management procedures
- Access control policies and current access lists
- Penetration test results and remediation records
- Results of regular security testing and evaluation
Under Article 58(1), supervisory authorities can order controllers and processors to provide the information they need for an investigation, often on short notice. Maintaining this documentation proactively avoids the scramble.