Database Backup Security: Encryption, Testing, and Ransomware Protection

A complete guide to securing database backups: encrypting backup files, implementing the 3-2-1-1-0 rule, creating air-gapped copies, automating restore testing, and defining realistic RTO/RPO targets.

September 1, 2025 · 8 min read · ShipSafer Team

Backups are your last line of defense against ransomware, accidental deletion, and catastrophic infrastructure failure. They are also one of the most neglected parts of a security program. Teams spend significant effort securing production databases and then store unencrypted backups on the same network, never test restores, and have no idea what their actual recovery time objective is until they're in a crisis at 2 AM.

This guide builds a defense-in-depth backup strategy: encryption at rest and in transit, the 3-2-1-1-0 rule, air-gapped storage, automated restore validation, and meaningful RTO/RPO targets.

Why Backup Security Is a Security Problem

Backup files contain the same sensitive data as your production database. An attacker who cannot break into your hardened PostgreSQL instance may be able to:

  • Download an unencrypted .sql.gz backup from a misconfigured S3 bucket
  • Exfiltrate backup files that bypass DLP controls because they look like binary blobs
  • Encrypt your backups with ransomware if they're stored on an accessible network share
  • Restore your backup to their own instance and query it offline

The Marriott breach (500 million records) persisted for four years partly because attackers were exfiltrating encrypted backup files that were later decrypted offline. The backup was the exfiltration vector.

Encrypting Backups

Encryption at the Dump Level

The simplest approach: encrypt the backup file itself before writing to disk or storage.

PostgreSQL with GPG encryption:

#!/bin/bash
set -euo pipefail

DB_NAME="myapp"
TIMESTAMP=$(date +%Y%m%d_%H%M%S)
BACKUP_FILE="/tmp/backup_${DB_NAME}_${TIMESTAMP}.sql.gz.gpg"
GPG_RECIPIENT="backup-team@example.com"

# Dump, compress, and encrypt in a single pipeline
pg_dump \
  --host=postgres.internal \
  --username=backup_agent \
  --no-password \
  --format=plain \
  --compress=9 \
  "${DB_NAME}" \
  | gpg \
    --encrypt \
    --recipient "${GPG_RECIPIENT}" \
    --trust-model always \
    --batch \
  > "${BACKUP_FILE}"

echo "Backup created: ${BACKUP_FILE}"
echo "SHA256: $(sha256sum "${BACKUP_FILE}")"

Store the SHA256 checksum separately. On restore, verify the checksum before decrypting to detect corruption or tampering.
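
That verify-before-decrypt step can be sketched as a small shell function. File names here are illustrative; sha256sum --check reads "HASH  FILENAME" lines from the checksum file and re-hashes each listed file:

```shell
#!/bin/bash
set -euo pipefail

# Verify a backup's detached checksum before decrypting it.
# Usage: verify_backup_checksum backup.sql.gz.gpg.sha256
verify_backup_checksum() {
  local checksum_file="$1"
  # --status suppresses output; the exit code alone signals pass/fail
  if sha256sum --check --status "${checksum_file}"; then
    echo "checksum OK"
  else
    echo "checksum MISMATCH: do not decrypt or restore" >&2
    return 1
  fi
}
```

Because the checksum file is stored separately from the backup, an attacker who tampers with the backup object also has to tamper with the checksum in a second location to go undetected.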

Symmetric encryption with AES-256 (for automated systems):

# Encrypt with a key stored in your secrets manager; pass it via the
# environment so it never appears in `ps` output
export BACKUP_KEY=$(aws secretsmanager get-secret-value \
  --secret-id prod/backup-encryption-key \
  --query SecretString \
  --output text)

pg_dump myapp | gzip | \
  openssl enc -aes-256-cbc \
    -pbkdf2 \
    -iter 100000 \
    -pass env:BACKUP_KEY \
    -out "backup_${TIMESTAMP}.sql.gz.enc"

Note that the openssl enc command does not support authenticated (AEAD) modes such as AES-256-GCM, so this CBC pipeline provides confidentiality but not tamper detection. Pair it with the separately stored SHA-256 checksum, or use a tool with built-in integrity protection (GPG, as above, or age). Never use AES-ECB.

MySQL/MariaDB with Percona XtraBackup:

xtrabackup \
  --backup \
  --target-dir=/tmp/xtrabackup \
  --encrypt=AES256 \
  --encrypt-key-file=/etc/backup/encryption.key \
  --encrypt-threads=4 \
  --compress \
  --compress-threads=4

Encryption at the Storage Level

If your backup tool does not support encryption, encrypt the storage layer. AWS S3 with SSE-KMS:

# Upload with server-side encryption
aws s3 cp backup.sql.gz.gpg \
  s3://myapp-backups/postgresql/2025/08/ \
  --sse aws:kms \
  --sse-kms-key-id arn:aws:kms:us-east-1:123456789:key/abc123

# Enforce encryption on the bucket
aws s3api put-bucket-encryption \
  --bucket myapp-backups \
  --server-side-encryption-configuration '{
    "Rules": [{
      "ApplyServerSideEncryptionByDefault": {
        "SSEAlgorithm": "aws:kms",
        "KMSMasterKeyID": "arn:aws:kms:us-east-1:123456789:key/abc123"
      },
      "BucketKeyEnabled": true
    }]
  }'

Combine both layers: encrypt the backup file with GPG/AES, then store on encrypted S3. An attacker who exfiltrates the S3 object still cannot read the data without the backup encryption key.

The 3-2-1-1-0 Rule

The original 3-2-1 rule (3 copies, 2 different media, 1 offsite) has been extended to account for ransomware:

  • 3 total copies of your data
  • 2 different storage media/platforms (e.g., S3 + Glacier, or cloud + tape)
  • 1 copy offsite (different cloud region or physical location)
  • 1 copy offline or air-gapped (cannot be reached by ransomware)
  • 0 errors on restore tests (every backup must be verified restorable)
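
As a quick sanity check, the checklist above can be encoded so a backup-inventory job asserts it automatically. This is a toy sketch; the five inputs (copy count, media count, offsite copies, offline copies, failed restore tests) come from whatever your inventory system reports:

```shell
#!/bin/bash
set -euo pipefail

# Return success only if the inventory satisfies 3-2-1-1-0
check_32110() {
  local copies="$1" media="$2" offsite="$3" offline="$4" failed_restores="$5"
  [ "${copies}" -ge 3 ] &&
  [ "${media}" -ge 2 ] &&
  [ "${offsite}" -ge 1 ] &&
  [ "${offline}" -ge 1 ] &&
  [ "${failed_restores}" -eq 0 ]
}

check_32110 3 2 1 1 0 && echo "3-2-1-1-0: compliant"
```

Any shortfall makes the function return non-zero, which a scheduled monitoring job can turn into an alert.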

In practice for a cloud-native application:

Copy       | Storage               | Location                         | Online?
-----------|-----------------------|----------------------------------|--------
Primary    | RDS automated backup  | AWS us-east-1                    | Yes
Secondary  | S3 encrypted export   | AWS us-west-2                    | Yes
Tertiary   | Glacier Deep Archive  | AWS eu-west-1                    | Offline (12 h retrieval)
Air-gapped | Object Lock or tape   | Separate AWS account or physical | Immutable

Air-Gapped Backup Implementation

An air-gapped backup cannot be modified or deleted by ransomware, even if your primary AWS account is fully compromised.

S3 Object Lock (Immutable Backups)

S3 Object Lock with Compliance mode prevents deletion or modification even by the account root user for the retention period:

# Create the backup bucket with Object Lock enabled
aws s3api create-bucket \
  --bucket myapp-immutable-backups \
  --region us-east-1 \
  --object-lock-enabled-for-bucket

# Configure default retention: 90 days, Compliance mode
aws s3api put-object-lock-configuration \
  --bucket myapp-immutable-backups \
  --object-lock-configuration '{
    "ObjectLockEnabled": "Enabled",
    "Rule": {
      "DefaultRetention": {
        "Mode": "COMPLIANCE",
        "Days": 90
      }
    }
  }'

# Upload a backup — it cannot be deleted for 90 days
aws s3 cp backup_encrypted.sql.gz.gpg \
  s3://myapp-immutable-backups/2025/08/01/

The backup AWS account should have no IAM roles or users that can disable Object Lock or delete objects. Use a dedicated AWS account with minimal blast radius.

Cross-Account Replication

In the backup account, attach a bucket policy that allows the production account's replication role to write new objects in, and nothing else:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "BackupAccountReplication",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::PRODUCTION_ACCOUNT:role/backup-replication"
      },
      "Action": [
        "s3:ReplicateObject",
        "s3:ReplicateTags"
      ],
      "Resource": "arn:aws:s3:::myapp-immutable-backups/*"
    }
  ]
}

The production account can write new backups but cannot delete or modify existing ones.

Automated Restore Testing

A backup you have never restored is a hypothesis, not a guarantee. The "0" in 3-2-1-1-0 means zero errors on restore tests: every backup in the production rotation must have a passing, verified restore.

Restore Test Script

#!/bin/bash
set -euo pipefail

BACKUP_BUCKET="s3://myapp-backups"
TEST_DB_NAME="restore_test_$(date +%Y%m%d)"
ALERT_WEBHOOK="${SLACK_WEBHOOK_URL}"

send_alert() {
  local status="$1"
  local message="$2"
  curl -s -X POST "${ALERT_WEBHOOK}" \
    -H 'Content-type: application/json' \
    --data "{\"text\":\"Backup restore test ${status}: ${message}\"}"
}

# Find latest backup
LATEST=$(aws s3 ls "${BACKUP_BUCKET}/postgresql/" \
  --recursive \
  | sort | tail -n 1 | awk '{print $4}')

# Download and decrypt
aws s3 cp "${BACKUP_BUCKET}/${LATEST}" /tmp/latest_backup.enc

export BACKUP_KEY=$(aws secretsmanager get-secret-value \
  --secret-id prod/backup-encryption-key \
  --query SecretString --output text)

openssl enc -aes-256-cbc -d \
  -pbkdf2 -iter 100000 \
  -pass env:BACKUP_KEY \
  -in /tmp/latest_backup.enc \
  | gunzip > /tmp/latest_backup.sql

# Create test database and restore
psql --host=test-postgres.internal \
  --username=restore_agent \
  --command="CREATE DATABASE ${TEST_DB_NAME};"

psql --host=test-postgres.internal \
  --username=restore_agent \
  --dbname="${TEST_DB_NAME}" \
  < /tmp/latest_backup.sql

# Validate restore: check row counts against expected baseline
ROW_COUNT=$(psql --host=test-postgres.internal \
  --username=restore_agent \
  --dbname="${TEST_DB_NAME}" \
  --tuples-only \
  --command="SELECT COUNT(*) FROM users;")

EXPECTED_MIN=1000  # Adjust to your real minimum

if [ "${ROW_COUNT}" -lt "${EXPECTED_MIN}" ]; then
  send_alert "FAILED" "User count ${ROW_COUNT} below minimum ${EXPECTED_MIN}"
  exit 1
fi

# Cleanup
psql --host=test-postgres.internal \
  --username=restore_agent \
  --command="DROP DATABASE ${TEST_DB_NAME};"

rm -f /tmp/latest_backup.enc /tmp/latest_backup.sql

send_alert "PASSED" "Restored ${LATEST}, ${ROW_COUNT} users verified"

Run this script daily in a cron job or CI pipeline. Page the on-call engineer if it fails.
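
A minimal crontab entry might look like this (the script path and log file are assumptions):

```shell
# Run the restore test daily at 03:00; the script's own send_alert
# handles pass/fail notification, and stdout/stderr go to a log file
0 3 * * * /opt/backup/restore_test.sh >> /var/log/restore_test.log 2>&1
```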

What to Validate in a Restore Test

Beyond row counts, validate:

  1. Schema integrity: Run your migration tool in dry-run mode against the restored DB — it should report no pending migrations
  2. Foreign key constraints: Run ANALYZE and check for constraint violations
  3. Application smoke test: Point a staging environment at the restored DB and run a subset of integration tests
  4. Backup age: Assert the backup is no older than your RPO — a restore from a backup that's 3 days old when your RPO is 24 hours is a policy violation

# Check backup age (take the first 8-digit date in the key; GNU date)
BACKUP_DATE=$(echo "${LATEST}" | grep -oP '\d{8}' | head -n 1)
BACKUP_EPOCH=$(date -d "${BACKUP_DATE}" +%s)
NOW_EPOCH=$(date +%s)
AGE_HOURS=$(( (NOW_EPOCH - BACKUP_EPOCH) / 3600 ))
RPO_HOURS=24

if [ "${AGE_HOURS}" -gt "${RPO_HOURS}" ]; then
  send_alert "FAILED" "Backup is ${AGE_HOURS}h old, exceeds RPO of ${RPO_HOURS}h"
  exit 1
fi

RTO and RPO: Setting Realistic Targets

Recovery Point Objective (RPO): How much data loss is acceptable? If your RPO is 1 hour, you must take backups at least every hour.

Recovery Time Objective (RTO): How long can the system be down? If your RTO is 2 hours, your entire restore process — detection, decision, download, decrypt, restore, validate, cutover — must complete in under 2 hours.

Measure actual restore time against a realistic dataset:

time (
  aws s3 cp s3://myapp-backups/latest.enc /tmp/ &&
  decrypt /tmp/latest.enc /tmp/latest.sql &&  # your decryption step from above
  psql restore_target < /tmp/latest.sql
)

For a 100 GB database, a full restore from S3 might take 45 minutes just for the download. Add decryption (minutes), restore (30-60 minutes for PostgreSQL), validation (minutes), and DNS/connection string update (minutes), and a 2-hour RTO may be very tight.
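
One way to see where that time goes is to time each phase separately rather than the pipeline as a whole. A sketch, with placeholder commands standing in for the real download, decrypt, and restore steps:

```shell
#!/bin/bash
set -euo pipefail

# Run a command and report how many seconds it took, labeled by phase
phase() {
  local name="$1"; shift
  local start end
  start=$(date +%s)
  "$@"
  end=$(date +%s)
  echo "${name}: $(( end - start ))s"
}

phase download sleep 1   # e.g. aws s3 cp s3://myapp-backups/latest.enc /tmp/
phase decrypt  true      # e.g. openssl enc -d ... or gpg --decrypt ...
phase restore  true      # e.g. psql restore_target < /tmp/latest.sql
```

Recording per-phase durations on every drill tells you whether to invest in faster storage, parallel restore, or a warm standby.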

Common strategies for reducing RTO:

  • Point-in-time recovery (PITR): AWS RDS and Aurora support PITR to any second within the retention window. Restore time is minutes, not hours.
  • Warm standby: A continuously-replicated standby that can be promoted in under 60 seconds (RDS Multi-AZ).
  • Snapshot pre-warming: Keep a recent snapshot in the same region so download time is near zero.
  • Tabletop exercises: Run a full recovery drill quarterly. The first time you run through a restore under pressure should not be a real incident.

Document your RTO and RPO in your runbooks, measure against them on every restore test, and escalate to leadership if actual restore times exceed targets. The gap between "we think our RTO is 2 hours" and "our last test took 7 hours" is a business risk that needs a budget decision.

Tags: database security, backup, encryption, ransomware, disaster recovery, RTO, RPO
