
Kubernetes Security Hardening: CIS Benchmarks and NSA/CISA Guidance

A comprehensive guide to hardening Kubernetes clusters based on the CIS Benchmarks and NSA/CISA guidance, covering RBAC, Pod Security Standards, etcd encryption, Falco, and default-deny network policies.

November 1, 2025 · 8 min read · ShipSafer Team

The NSA and CISA published their Kubernetes Hardening Guide citing container and cluster-level misconfigurations as the primary source of Kubernetes incidents. Most attacks they documented didn't involve novel exploits — they exploited default configurations, overpermissioned service accounts, and absent network segmentation. This guide covers the concrete hardening steps from CIS Kubernetes Benchmark v1.8 and NSA/CISA guidance.

Control Plane Hardening

Secure API Server Configuration

The API server is the entry point to all Kubernetes operations. Key hardening flags:

# /etc/kubernetes/manifests/kube-apiserver.yaml
spec:
  containers:
  - command:
    - kube-apiserver
    # Disable anonymous authentication
    - --anonymous-auth=false
    # Enable RBAC and Node authorization
    - --authorization-mode=Node,RBAC
    # Require client certificates for kubelet connections
    - --kubelet-client-certificate=/etc/kubernetes/pki/apiserver-kubelet-client.crt
    - --kubelet-client-key=/etc/kubernetes/pki/apiserver-kubelet-client.key
    # Enable audit logging
    - --audit-log-path=/var/log/kubernetes/audit.log
    - --audit-log-maxage=30
    - --audit-log-maxbackup=10
    - --audit-log-maxsize=100
    - --audit-policy-file=/etc/kubernetes/audit-policy.yaml
    # Disable profiling endpoints
    - --profiling=false
    # Restrict admission controllers
    - --enable-admission-plugins=NodeRestriction,PodSecurity,ResourceQuota,LimitRanger
    # Bind to specific interface
    - --bind-address=10.0.0.1
    # TLS settings
    - --tls-min-version=VersionTLS12
    - --tls-cipher-suites=TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256

Audit Policy Configuration

A comprehensive audit policy captures security-relevant events without logging every list/watch request:

# /etc/kubernetes/audit-policy.yaml
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
  # Log auth changes at RequestResponse level
  - level: RequestResponse
    resources:
    - group: ""
      resources: ["secrets", "configmaps"]
    - group: "rbac.authorization.k8s.io"
      resources: ["roles", "rolebindings", "clusterroles", "clusterrolebindings"]
  # Log pod exec, port-forward, and attach
  - level: RequestResponse
    verbs: ["create"]
    resources:
    - group: ""
      resources: ["pods/exec", "pods/portforward", "pods/attach"]
  # Log all create/update/delete at Metadata level
  - level: Metadata
    verbs: ["create", "update", "patch", "delete"]
  # Exclude noisy read requests
  - level: None
    verbs: ["get", "list", "watch"]
    resources:
    - group: ""
      resources: ["nodes", "pods", "services", "endpoints"]
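Once audit logs are flowing, security questions can be answered with jq. A minimal sketch assuming JSON-lines audit output; the sample event and file path are illustrative:

```shell
# Write an illustrative audit event (one JSON object per line, as the API server emits)
cat <<'EOF' > /tmp/audit-sample.jsonl
{"kind":"Event","verb":"create","user":{"username":"alice"},"objectRef":{"resource":"pods","subresource":"exec","name":"web-1","namespace":"production"}}
EOF

# Who exec'd into which pod?
jq -r 'select(.objectRef.subresource == "exec")
  | "\(.user.username) exec -> \(.objectRef.namespace)/\(.objectRef.name)"' \
  /tmp/audit-sample.jsonl
```

For the sample event this prints `alice exec -> production/web-1`; the same filter works on a real audit log shipped to your log pipeline.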

Encrypt etcd at Rest

etcd stores all cluster state including Secrets. Without encryption, anyone with access to the etcd data directory can read all secrets:

# /etc/kubernetes/encryption-config.yaml
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
- resources:
  - secrets
  providers:
  - aescbc:
      keys:
      - name: key1
        secret: <base64-encoded-32-byte-key>
  - identity: {}  # Fallback for unencrypted data (remove after migration)

Apply to kube-apiserver:

- --encryption-provider-config=/etc/kubernetes/encryption-config.yaml

Then re-encrypt all existing secrets:

kubectl get secrets --all-namespaces -o json | kubectl replace -f -

For key rotation, add the new key as the first entry in the keys list, restart the API server, re-encrypt all secrets, then remove the old key.
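In the config, that rotation looks like this: the new key is listed first so all writes use it, while the old key remains readable until re-encryption completes (key names are illustrative):

```yaml
# /etc/kubernetes/encryption-config.yaml during rotation
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
- resources:
  - secrets
  providers:
  - aescbc:
      keys:
      - name: key2            # New key: used for all writes
        secret: <base64-encoded-32-byte-key>
      - name: key1            # Old key: still available for reads
        secret: <base64-encoded-32-byte-key>
  - identity: {}
```

After the API server restarts and every Secret has been rewritten, key1 can be removed from the list.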

RBAC: Least Privilege Service Accounts

Disable Automounting of Service Account Tokens

By default, Kubernetes mounts a service account token in every pod. This is unnecessary for most workloads and provides attackers with API credentials if they compromise a container:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: my-app
  namespace: production
automountServiceAccountToken: false

Or disable at the pod level:

apiVersion: v1
kind: Pod
spec:
  automountServiceAccountToken: false
  serviceAccountName: my-app
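When a workload genuinely needs API credentials, prefer a short-lived, audience-bound projected token over the default long-lived mount; the kubelet rotates it automatically (the audience and lifetime below are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-app
  namespace: production
spec:
  serviceAccountName: my-app
  automountServiceAccountToken: false
  containers:
  - name: app
    image: my-app:1.0
    volumeMounts:
    - name: api-token
      mountPath: /var/run/secrets/tokens
      readOnly: true
  volumes:
  - name: api-token
    projected:
      sources:
      - serviceAccountToken:
          path: token
          expirationSeconds: 600   # 10-minute lifetime; kubelet refreshes before expiry
          audience: my-app-audience
```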

Create Minimal-Permission Service Accounts

Each workload should have its own service account with only the permissions it needs:

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: config-reader
  namespace: production
rules:
- apiGroups: [""]
  resources: ["configmaps"]
  resourceNames: ["app-config"]  # Scope to specific ConfigMap names
  verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: config-reader-binding
  namespace: production
subjects:
- kind: ServiceAccount
  name: my-app
  namespace: production
roleRef:
  kind: Role
  name: config-reader
  apiGroup: rbac.authorization.k8s.io

Audit Overpermissioned ClusterRoles

Find service accounts with cluster-admin or wildcard permissions:

# List all cluster-admin bindings
kubectl get clusterrolebindings -o json | \
  jq '.items[] | select(.roleRef.name=="cluster-admin") | {name:.metadata.name, subjects:.subjects}'

# Find roles with wildcard verbs or resources
kubectl get clusterroles -o json | \
  jq '.items[] | select(.rules[]?.verbs[]? == "*") | .metadata.name'
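To check what a service account can actually do once roles are tightened, impersonate it with kubectl (output depends on your cluster's RBAC state):

```shell
# Should answer "yes" if only the config-reader Role above is bound
kubectl auth can-i get configmaps \
  --as=system:serviceaccount:production:my-app -n production

# Enumerate everything the account may do in the namespace
kubectl auth can-i --list \
  --as=system:serviceaccount:production:my-app -n production
```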

Pod Security Standards

Pod Security Standards (PSS), enforced by the built-in Pod Security Admission controller, replaced PodSecurityPolicy, which was removed in Kubernetes 1.25. They operate at the namespace level via labels and enforce three profiles:

  • Privileged: No restrictions (for system namespaces)
  • Baseline: Prevents known privilege escalation vectors
  • Restricted: Heavily restricted, requires dropping all capabilities and running as non-root

# Apply restricted policy to production namespace
apiVersion: v1
kind: Namespace
metadata:
  name: production
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/enforce-version: v1.28
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/warn: restricted
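Before turning enforcement on for an existing namespace, a server-side dry run reports which running workloads would violate the profile (output depends on cluster state):

```shell
kubectl label --dry-run=server --overwrite ns production \
  pod-security.kubernetes.io/enforce=restricted
```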

Workloads in this namespace must comply with the restricted profile:

apiVersion: v1
kind: Pod
spec:
  securityContext:
    runAsNonRoot: true
    runAsUser: 1000
    runAsGroup: 3000
    fsGroup: 2000
    seccompProfile:
      type: RuntimeDefault
  containers:
  - name: app
    image: my-app:1.0
    securityContext:
      allowPrivilegeEscalation: false
      readOnlyRootFilesystem: true
      capabilities:
        drop: ["ALL"]
    resources:
      limits:
        cpu: "500m"
        memory: "128Mi"
      requests:
        cpu: "100m"
        memory: "64Mi"
    volumeMounts:
    - name: tmp
      mountPath: /tmp
  volumes:
  - name: tmp
    emptyDir: {}

Network Policies: Default-Deny Architecture

Without network policies, all pods can communicate with all other pods across all namespaces. Implement a default-deny policy in every namespace:

# Default deny all ingress and egress
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: production
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  - Egress

Then explicitly allow required traffic:

# Allow frontend to reach backend on port 8080
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-backend
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: backend
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend
    ports:
    - protocol: TCP
      port: 8080
---
# Allow DNS resolution (required for all pods)
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns
  namespace: production
spec:
  podSelector: {}
  policyTypes:
  - Egress
  egress:
  - ports:
    - port: 53
      protocol: UDP
    - port: 53
      protocol: TCP
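Beyond DNS, every egress path is opened explicitly. A sketch allowing backend pods HTTPS access to an external API (the CIDR is a documentation placeholder, not a real destination):

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-backend-egress-https
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: backend
  policyTypes:
  - Egress
  egress:
  - to:
    - ipBlock:
        cidr: 203.0.113.0/24   # Placeholder: your external API's address range
    ports:
    - protocol: TCP
      port: 443
```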

Note: Network policies require a CNI plugin that supports them (Calico, Cilium, Weave Net). The built-in kubenet plugin does not enforce network policies.

Runtime Security with Falco

Falco monitors system calls in real time and alerts on suspicious behavior. It can detect container breakouts, privilege escalation, cryptomining, and data exfiltration:

# Install Falco via Helm
helm repo add falcosecurity https://falcosecurity.github.io/charts
helm install falco falcosecurity/falco \
  --namespace falco \
  --create-namespace \
  --set falco.grpc.enabled=true \
  --set falco.grpcOutput.enabled=true \
  --set driver.kind=ebpf

Key Falco rules to enable:

# rules/custom.yaml

# Alert on shell execution inside containers
- rule: Shell Spawned in Container
  desc: Shell was spawned inside a container
  condition: >
    spawned_process and container and
    shell_procs and proc.pname != "sudo"
  output: >
    Shell spawned in container (user=%user.name container=%container.name
    image=%container.image.repository:%container.image.tag
    shell=%proc.name parent=%proc.pname)
  priority: WARNING

# Alert on sensitive file reads
- rule: Read Sensitive File
  desc: An attempt to read sensitive files
  condition: >
    open_read and container and
    (fd.name startswith /etc/shadow or
     fd.name startswith /etc/ssh or
     fd.name = /proc/1/environ)
  output: >
    Sensitive file opened for reading (file=%fd.name user=%user.name
    container=%container.name image=%container.image.repository)
  priority: ERROR

# Alert on network connection to unexpected destinations
- rule: Unexpected Network Outbound
  desc: Container making unexpected outbound connection
  condition: >
    outbound and container and
    not proc.name in (allowed_processes) and
    not fd.sip in (allowed_ips)
  output: >
    Unexpected outbound connection (container=%container.name
    connection=%fd.name ip=%fd.sip port=%fd.sport)
  priority: WARNING

Route Falco alerts to your SIEM via Falcosidekick, which supports Slack, PagerDuty, Elasticsearch, Datadog, and dozens of other outputs.
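Falcosidekick can be enabled from the same Helm chart; a sketch wiring alerts to Slack (values assume the current falcosecurity/falco chart, and the webhook URL is a placeholder):

```shell
helm upgrade --install falco falcosecurity/falco \
  --namespace falco \
  --set falcosidekick.enabled=true \
  --set falcosidekick.config.slack.webhookurl="https://hooks.slack.com/services/XXX"
```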

Image Security

Scan Images in CI/CD

Integrate vulnerability scanning into your pipeline so images with critical CVEs never reach production:

# GitHub Actions example
- name: Scan image with Trivy
  uses: aquasecurity/trivy-action@master
  with:
    image-ref: 'my-registry/my-app:${{ github.sha }}'
    format: 'sarif'
    output: 'trivy-results.sarif'
    severity: 'CRITICAL,HIGH'
    exit-code: '1'  # Fail the build on critical/high CVEs

Use Distroless or Minimal Base Images

Standard images like ubuntu:latest or python:3.11 contain hundreds of packages that expand the attack surface. Distroless images contain only the application and its runtime dependencies:

# Multi-stage build with distroless final image
FROM python:3.11-slim AS builder
WORKDIR /app
COPY requirements.txt .
RUN pip install --prefix=/install -r requirements.txt
COPY . .

FROM gcr.io/distroless/python3-debian12
# Keep site-packages outside /root so the nonroot user can read them
ENV PYTHONPATH=/opt/site-packages
COPY --from=builder /install/lib/python3.11/site-packages /opt/site-packages
COPY --from=builder /app /app
USER nonroot:nonroot
ENTRYPOINT ["python", "/app/main.py"]

Enforce Image Signing with Sigstore

Use cosign to sign images in CI/CD and verify signatures at admission with a policy engine such as Kyverno or Sigstore's policy-controller:

# Sign the image by digest after build (tags are mutable; digests pin exact content)
cosign sign --key cosign.key my-registry/my-app@sha256:<digest>

# Verify in Kyverno policy
kubectl apply -f - <<EOF
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: verify-image-signature
spec:
  validationFailureAction: Enforce
  rules:
  - name: check-image-signature
    match:
      resources:
        kinds: [Pod]
    verifyImages:
    - imageReferences: ["my-registry/*"]
      attestors:
      - entries:
        - keys:
            publicKeys: |
              -----BEGIN PUBLIC KEY-----
              ...
              -----END PUBLIC KEY-----
EOF

CIS Benchmark Automated Scanning

Run kube-bench to audit your cluster configuration against CIS Kubernetes Benchmark:

kubectl apply -f https://raw.githubusercontent.com/aquasecurity/kube-bench/main/job.yaml
kubectl logs -f job/kube-bench

kube-bench checks over 100 controls covering API server flags, etcd configuration, kubelet settings, and RBAC configuration, mapping each finding to CIS control IDs.
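Because each check is prefixed with [PASS], [FAIL], or [WARN], failures can be tallied straight from a saved report; the excerpt below is illustrative:

```shell
# Illustrative excerpt of kube-bench text output
cat <<'EOF' > /tmp/kube-bench.txt
[PASS] 1.2.7 Ensure that the --authorization-mode argument is not set to AlwaysAllow
[FAIL] 1.2.1 Ensure that the --anonymous-auth argument is set to false
EOF

# Count failing controls
grep -c '^\[FAIL\]' /tmp/kube-bench.txt
# → 1
```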

The most impactful controls to prioritize are: etcd encryption at rest (#1 in terms of blast radius if compromised), RBAC with minimal service account permissions (#2 most commonly misconfigured), default-deny network policies (stops lateral movement), and Falco runtime monitoring (detects active exploitation). These four address the majority of documented Kubernetes attack patterns.

Kubernetes
container security
CIS benchmark
RBAC
Pod Security Standards
Falco
network policies
