Infrastructure security encompasses the practices, tools, and processes used to protect your systems, networks, and data from threats. In a DevSecOps model, security is integrated throughout the development and operations lifecycle rather than bolted on at the end.

Zero Trust Architecture

Traditional security assumes everything inside the network perimeter is trusted. Zero trust assumes nothing is trusted by default.

Core Principles

  1. Never trust, always verify — Authenticate and authorise every request
  2. Assume breach — Design as if attackers are already inside
  3. Verify explicitly — Use all available data points (identity, location, device, behaviour)
  4. Least privilege access — Grant minimum necessary permissions
  5. Micro-segmentation — Limit blast radius with fine-grained network controls

Implementing Zero Trust

  • Strong identity verification for all users and services
  • Device health validation before granting access
  • Micro-segmentation of networks and workloads
  • Encryption of all data in transit
  • Continuous monitoring and validation
  • Just-in-time and just-enough access
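
As a rough illustration of how these controls combine into a single per-request decision, here is a minimal Python sketch. The `AccessRequest` shape, the `ROLE_PERMISSIONS` table, and the role names are hypothetical and not taken from any particular product; a real deployment would delegate these checks to an identity provider and a policy engine.

```python
from dataclasses import dataclass

# Hypothetical permission table: role -> allowed (action, resource) pairs.
ROLE_PERMISSIONS = {
    "billing-reader": {("read", "invoices")},
    "billing-admin": {("read", "invoices"), ("write", "invoices")},
}


@dataclass(frozen=True)
class AccessRequest:
    identity_verified: bool  # a valid, unexpired credential was presented
    mfa_passed: bool
    device_healthy: bool     # e.g. patched OS, disk encryption enabled
    role: str
    action: str
    resource: str


def decide(request: AccessRequest) -> tuple[bool, str]:
    """Evaluate every request from scratch; network location grants nothing."""
    if not (request.identity_verified and request.mfa_passed):
        return False, "identity not verified"
    if not request.device_healthy:
        return False, "device failed health check"
    allowed = ROLE_PERMISSIONS.get(request.role, set())
    if (request.action, request.resource) not in allowed:
        return False, "not permitted for role"
    return True, "allowed"
```

Note that the function has no notion of an "internal" caller: a request from inside the network goes through exactly the same checks as one from outside.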

Identity and Access Management (IAM)

Principles

Least privilege: Grant only the permissions needed to perform a task, nothing more.

Separation of duties: Divide responsibilities so no single person can compromise the system (e.g., developers can’t deploy to production without review).

Defence in depth: Apply multiple independent layers of controls, so that if one fails, the others still protect the system.

Authentication

Multi-factor authentication (MFA): Require at least two different factors: something you know (password), something you have (token, phone), or something you are (biometrics).

  • Enforce MFA for all human users, especially privileged access
  • Use hardware tokens (YubiKey) for highest security
  • Avoid SMS-based MFA where possible (vulnerable to SIM swapping)

Service-to-service authentication:

  • Use short-lived credentials (tokens, certificates)
  • Mutual TLS (mTLS) between services
  • Avoid long-lived API keys where possible
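
To make the idea of a short-lived credential concrete, the sketch below issues and verifies an expiring token signed with HMAC-SHA256, using only the Python standard library. The token layout and the `issue_token` / `verify_token` names are assumptions made for illustration; production systems would more commonly rely on mTLS certificates or tokens issued by an identity provider, and the shared signing key used here would itself need to be stored and rotated securely.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def issue_token(secret: bytes, service: str, ttl_seconds: int,
                now: Optional[float] = None) -> str:
    """Return a signed token naming the caller and carrying an expiry time."""
    issued_at = time.time() if now is None else now
    payload = json.dumps({"sub": service, "exp": issued_at + ttl_seconds}).encode()
    signature = hmac.new(secret, payload, hashlib.sha256).digest()
    return (base64.urlsafe_b64encode(payload).decode() + "." +
            base64.urlsafe_b64encode(signature).decode())


def verify_token(secret: bytes, token: str,
                 now: Optional[float] = None) -> Optional[str]:
    """Return the service name if the token is authentic and unexpired, else None."""
    try:
        payload_b64, signature_b64 = token.split(".")
        payload = base64.urlsafe_b64decode(payload_b64)
        signature = base64.urlsafe_b64decode(signature_b64)
    except ValueError:  # wrong number of parts or invalid base64
        return None
    expected = hmac.new(secret, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):  # constant-time comparison
        return None
    claims = json.loads(payload)
    current = time.time() if now is None else now
    if current >= claims["exp"]:
        return None
    return claims["sub"]
```

Because the expiry is part of the signed payload, a stolen token stops working once its lifetime ends, which is the property that long-lived API keys lack.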

Authorisation

Role-Based Access Control (RBAC):

  • Assign permissions to roles, assign roles to users
  • Easier to manage than individual permissions
  • Common in Kubernetes, cloud IAM

Attribute-Based Access Control (ABAC):

  • Permissions based on attributes (user department, resource owner, time of day)
  • More flexible but more complex
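
The difference between the two models can be sketched in a few lines of Python. The role names, users, attributes, and the business-hours rule below are all hypothetical and chosen only to show the shape of each check.

```python
# RBAC: permissions hang off roles, and users are assigned roles.
ROLE_PERMISSIONS = {
    "viewer": {"pods:get", "pods:list"},
    "deployer": {"pods:get", "pods:list", "deployments:update"},
}
USER_ROLES = {
    "alice": {"deployer"},
    "bob": {"viewer"},
}


def rbac_allows(user: str, permission: str) -> bool:
    """Allow if any role assigned to the user carries the permission."""
    return any(
        permission in ROLE_PERMISSIONS.get(role, set())
        for role in USER_ROLES.get(user, set())
    )


def abac_allows(user_attrs: dict, resource_attrs: dict, hour: int) -> bool:
    """Allow only same-department access during (hypothetical) business hours."""
    department = user_attrs.get("department")
    same_department = (
        department is not None
        and department == resource_attrs.get("department")
    )
    within_hours = 9 <= hour < 17
    return same_department and within_hours
```

The RBAC check is a simple lookup, which is why it is easier to audit; the ABAC check can express richer rules, but every extra attribute is another input that has to be trustworthy and tested.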

Policy as Code:

  • Define access policies in code (e.g., OPA/Rego)
  • Version control and review policies like application code
  • Test policies before deployment

Privileged Access Management

  • Use just-in-time (JIT) access — grant elevated permissions only when needed
  • Require approval workflows for sensitive operations
  • Time-bound access — automatically revoke after a period
  • Audit all privileged access
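
A minimal sketch of approved, time-bound elevation with an audit trail might look like the following. The `Grant` type, the function names, and the in-memory `audit_log` list are illustrative assumptions; a real system would persist grants and write audit events to tamper-resistant storage.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Grant:
    user: str
    role: str
    approved_by: str
    expires_at: datetime


def grant_access(user: str, role: str, approved_by: str, duration: timedelta,
                 now: datetime, audit_log: list) -> Grant:
    """Create an elevated grant that needs a second person and expires by itself."""
    if approved_by == user:
        raise ValueError("requester cannot approve their own access")
    grant = Grant(user, role, approved_by, now + duration)
    audit_log.append(("granted", user, role, approved_by, grant.expires_at))
    return grant


def is_active(grant: Grant, now: datetime) -> bool:
    """Access lapses automatically once the expiry time is reached."""
    return now < grant.expires_at
```

For example, `grant_access("alice", "db-admin", "bob", timedelta(hours=1), datetime.now(timezone.utc), log)` would give one hour of access, after which `is_active` returns `False` without anyone having to remember to revoke it.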

Secrets Management

Secrets include API keys, database credentials, certificates, encryption keys, and tokens.

Bad Practices

  • Hardcoding secrets in source code
  • Storing secrets in environment variables in plain text
  • Committing secrets to version control
  • Sharing secrets via email or chat
  • Using the same secret across environments
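
Hardcoded and committed secrets can often be caught mechanically before they reach version control. The sketch below is a deliberately small pattern-based detector; the three patterns are illustrative assumptions and far from exhaustive, and dedicated secret scanners cover many more formats with fewer false positives.

```python
import re

# Illustrative patterns only; real scanners ship much larger rule sets.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key": re.compile(r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"),
    "hardcoded_assignment": re.compile(
        r"(?i)(?:password|secret|api_key|token)\s*=\s*[\"'][^\"']{4,}[\"']"
    ),
}


def find_secrets(text: str) -> list[tuple[int, str]]:
    """Return (line_number, pattern_name) for every suspicious line."""
    findings = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((line_number, name))
    return findings
```

A check like this is typically run as a pre-commit hook or an early CI step, so that a finding blocks the commit rather than being discovered after the secret is already in history.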

Good Practices

  • Use a dedicated secrets manager
  • Rotate secrets regularly (and automatically)
  • Audit secret access
  • Use different secrets per environment
  • Encrypt secrets at rest and in transit

Secrets Management Tools

Kubernetes Secrets

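Native Kubernetes Secrets are base64-encoded, not encrypted, by default. Enhance security by encrypting etcd data at rest, restricting access to Secrets with RBAC, and sourcing values from a dedicated secrets manager.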
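
To see why base64 encoding provides no confidentiality, note that decoding needs no key at all. The short Python sketch below uses a made-up value; `decode_secret_value` is a hypothetical helper name.

```python
import base64


def decode_secret_value(encoded: str) -> str:
    """Recover the original value from the base64 text stored in a Secret manifest."""
    return base64.b64decode(encoded).decode()


# Anyone who can read the manifest can do the same thing.
encoded_value = base64.b64encode(b"s3cr3t-password").decode()
plain_value = decode_secret_value(encoded_value)
```

Anyone with read access to the Secret object, or to an unencrypted etcd backup, can therefore recover the plaintext.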

Network Security

Network Segmentation

Divide networks into zones with controlled traffic between them.

Typical zones:

  • Public — Internet-facing (load balancers, CDN)
  • DMZ — Semi-trusted (web servers, API gateways)
  • Private — Internal services (application servers)
  • Restricted — Sensitive data (databases, secrets)

Firewall and Security Groups

  • Default deny — only allow explicitly permitted traffic
  • Use security groups/NSGs to control traffic between resources
  • Review rules regularly; remove unused rules
  • Log denied traffic for security analysis
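
The default-deny model can be sketched as follows: traffic passes only if some rule explicitly allows it, and everything else is rejected and recorded. `AllowRule` and `is_allowed` are hypothetical names for illustration and do not mirror any cloud provider's API.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AllowRule:
    source_cidr: str
    port: int
    protocol: str = "tcp"


def is_allowed(rules: list, source_ip: str, port: int, protocol: str = "tcp",
               denied_log: Optional[list] = None) -> bool:
    """Permit traffic only when an explicit allow rule matches; deny and log otherwise."""
    address = ipaddress.ip_address(source_ip)
    for rule in rules:
        if (address in ipaddress.ip_network(rule.source_cidr)
                and port == rule.port
                and protocol == rule.protocol):
            return True
    if denied_log is not None:
        denied_log.append((source_ip, port, protocol))  # kept for later analysis
    return False
```

With an empty rule list nothing gets through, which is the safe starting point; each rule added afterwards is a deliberate, reviewable exception.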

See AWS Security Groups for cloud-specific guidance.

Web Application Firewall (WAF)

Protect web applications from common attacks:

  • SQL injection
  • Cross-site scripting (XSS)
  • Other OWASP Top 10 vulnerabilities

Services: AWS WAF, Cloudflare WAF, Azure WAF, Fastly

DDoS Protection

  • Use CDN providers with built-in DDoS mitigation
  • Enable cloud provider DDoS protection (AWS Shield, Google Cloud Armor)
  • Implement rate limiting
  • Design for horizontal scaling
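
Rate limiting is often implemented with a token bucket, which permits short bursts while capping the sustained request rate. The class below is a minimal single-process sketch with hypothetical parameter names; in practice the limit is usually enforced at the load balancer, API gateway, or CDN, with state shared across instances.

```python
class TokenBucket:
    """Allow bursts up to `capacity`, refilled at `rate_per_second` tokens per second."""

    def __init__(self, rate_per_second: float, capacity: float, now: float = 0.0):
        self.rate = rate_per_second
        self.capacity = capacity
        self.tokens = capacity      # start full so an initial burst is allowed
        self.updated_at = now

    def allow(self, now: float) -> bool:
        """Consume one token if available; the caller supplies the current time."""
        elapsed = max(0.0, now - self.updated_at)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated_at = max(self.updated_at, now)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

Passing the clock in as an argument (for example `time.monotonic()`) keeps the class easy to test; a typical setup keeps one bucket per client key, such as an API token or source IP.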

Private Connectivity

  • Use private endpoints/Private Link to access cloud services without traversing the public internet
  • VPN or Direct Connect for on-premises connectivity
  • VPC peering or transit gateways for cross-account/cross-region connectivity

Supply Chain Security

Software Bill of Materials (SBOM)

An SBOM lists all components (dependencies, libraries) in your software. Essential for:

  • Knowing what you’re running
  • Responding quickly to vulnerabilities (e.g., Log4Shell)
  • Compliance requirements

Tools and formats: Syft and Trivy (SBOM generation), CycloneDX (SBOM format)
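
The value of an SBOM during an incident is that "are we affected?" becomes a lookup. The sketch below assumes a CycloneDX-style JSON document with a top-level `components` array whose entries carry `name` and `version`; the `advisories` mapping is a hypothetical, hand-written input rather than a real vulnerability feed.

```python
import json


def affected_components(sbom_json: str, advisories: dict[str, set[str]]) -> list[str]:
    """Return 'name@version' for each SBOM component listed in the advisories."""
    sbom = json.loads(sbom_json)
    hits = []
    for component in sbom.get("components", []):
        name = component.get("name")
        version = component.get("version")
        if version in advisories.get(name, set()):
            hits.append(f"{name}@{version}")
    return hits
```

Real advisories usually describe version ranges rather than exact versions, so a production matcher would need proper version comparison; the exact-match lookup here only shows the overall flow.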

Dependency Scanning

Automatically scan dependencies for known vulnerabilities:

  • Scan in CI/CD pipelines
  • Fail builds for critical vulnerabilities
  • Monitor continuously (new CVEs affect existing code)
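
Failing the build on critical findings usually comes down to turning scanner output into a process exit code. The sketch below assumes a hypothetical, simplified finding format (a list of dicts with `id` and `severity` keys); real scanners each have their own report schema, which would need to be mapped into this shape.

```python
def gate(findings: list, fail_on: frozenset = frozenset({"CRITICAL"})) -> int:
    """Return 1 (fail the build) if any finding has a blocking severity, else 0."""
    blocking = [f for f in findings if f["severity"].upper() in fail_on]
    for finding in blocking:
        print(f"BLOCKING {finding['severity'].upper()}: {finding['id']}")
    return 1 if blocking else 0
```

In a pipeline step this would typically end with `sys.exit(gate(findings))`, so that a non-zero exit code stops the build. Making `fail_on` configurable lets a team start with critical findings only and tighten the threshold later.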

Tools: Trivy and Snyk, both listed under Vulnerability Management, cover dependency scanning.

Container Image Security

Build:

  • Use minimal base images (distroless, Alpine)
  • Don’t run as root
  • Pin image versions to a specific tag or digest (avoid the mutable latest tag)
  • Scan images for vulnerabilities
  • Sign images to verify authenticity
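
The version-pinning rule is easy to check automatically. The helper below is a simplified sketch that treats an image reference as pinned if it carries a digest or an explicit tag other than `latest`; `is_pinned` is a hypothetical name, and the parsing does not cover every corner of the image reference grammar.

```python
def is_pinned(image_ref: str) -> bool:
    """Return True if the image reference uses a digest or a non-latest tag."""
    if "@sha256:" in image_ref:
        return True  # a digest identifies exact image content
    # Look only at the last path segment so a registry port is not read as a tag.
    last_segment = image_ref.rsplit("/", 1)[-1]
    if ":" not in last_segment:
        return False  # no tag: the runtime falls back to "latest"
    tag = last_segment.rsplit(":", 1)[1]
    return tag not in ("", "latest")
```

A check like this can run over Dockerfiles or Kubernetes manifests in CI. Tags other than `latest` can still be moved by whoever controls the registry, so digests give the stronger guarantee.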

Runtime:

  • Use read-only filesystems where possible
  • Drop unnecessary capabilities
  • Enforce image policies (only allow signed/scanned images)

Tools: Trivy, listed under Vulnerability Management, scans container images.

Code Signing

Sign commits and artefacts to ensure integrity and authenticity:

  • Sign Git commits with GPG or SSH keys
  • Sign container images
  • Sign Helm charts and other deployment artefacts

Vulnerability Management

Scanning

What to scan:

  • Application code (SAST — Static Application Security Testing)
  • Dependencies (SCA — Software Composition Analysis)
  • Container images
  • Infrastructure as Code (misconfigurations)
  • Cloud configurations
  • Running systems (DAST — Dynamic Application Security Testing)

When to scan:

  • In CI/CD pipelines (shift left)
  • Continuously in production (runtime scanning)
  • Before major releases

Triage and Prioritisation

Not all vulnerabilities are equal. Prioritise based on:

  • Severity — CVSS score
  • Exploitability — Is there a known exploit? Is it easy to exploit?
  • Exposure — Is the vulnerable component reachable from the internet?
  • Business impact — What could an attacker do if they exploited it?
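
One way to combine these factors is a simple weighted score, sketched below. The multipliers are arbitrary illustrative values, not an industry standard, and the `Finding` fields are hypothetical; the point is only that context can legitimately outrank raw severity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    cvss: float               # base severity, 0.0 to 10.0
    exploit_available: bool   # a known exploit exists
    internet_exposed: bool    # the component is reachable from the internet
    business_critical: bool   # compromise would have a serious business impact


def priority(finding: Finding) -> float:
    """Scale the CVSS score by context; higher means fix sooner."""
    score = finding.cvss
    if finding.exploit_available:
        score *= 1.5  # illustrative weight
    if finding.internet_exposed:
        score *= 1.5  # illustrative weight
    if finding.business_critical:
        score *= 1.2  # illustrative weight
    return score
```

Under these weights, a CVSS 7.5 issue with a public exploit on an internet-facing service ranks above a CVSS 9.8 issue on an isolated internal component, which matches how many teams would order the work in practice.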

Patch Management

  • Automate patching where possible (auto-merge Dependabot pull requests, auto-update base images)
  • Define SLAs for patching (e.g., critical within 24 hours, high within 7 days)
  • Test patches before production deployment
  • Track patch compliance
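
Tracking patch compliance against SLAs is mostly date arithmetic. The sketch below encodes only the two example tiers mentioned above (critical within 24 hours, high within 7 days); the function names are hypothetical, and other severities would need SLAs defined before they could be tracked.

```python
from datetime import datetime, timedelta

# Example SLAs from the text above; adjust to your own policy.
PATCH_SLA = {
    "critical": timedelta(hours=24),
    "high": timedelta(days=7),
}


def patch_due(severity: str, detected_at: datetime) -> datetime:
    """Return the deadline for patching a finding of the given severity."""
    return detected_at + PATCH_SLA[severity]


def is_overdue(severity: str, detected_at: datetime, now: datetime) -> bool:
    """True once the SLA deadline has passed without a patch."""
    return now > patch_due(severity, detected_at)
```

Running `is_overdue` over all open findings on a schedule gives a simple compliance report and a natural trigger for escalation.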

Tools

  • Trivy — All-in-one scanner (containers, IaC, SBOM)
  • Snyk — Developer-first security platform
  • Checkov — IaC security scanning
  • tfsec — Terraform security scanner
  • Semgrep — Static analysis
  • SonarQube — Code quality and security

Compliance and Governance

Common Frameworks

  • SOC 2 — Trust service criteria for service organisations
  • ISO 27001 — Information security management
  • PCI DSS — Payment card industry standards
  • HIPAA — Healthcare data protection (US)
  • GDPR — Data protection (EU)
  • FedRAMP — US government cloud security

Policy as Code

Codify compliance requirements:

  • Use OPA or Kyverno for policy enforcement
  • Scan infrastructure with Checkov, tfsec
  • Block non-compliant deployments in CI/CD

Audit Logging

  • Log all access to sensitive systems and data
  • Log administrative actions
  • Centralise logs in immutable storage
  • Retain logs per compliance requirements
  • Enable cloud provider audit trails (AWS CloudTrail, Google Cloud Audit Logs, Azure Activity Log)

Kubernetes Security

Cluster Hardening

  • Keep Kubernetes up to date
  • Enable RBAC (disable ABAC)
  • Encrypt etcd data at rest
  • Restrict access to the API server
  • Use network policies to control pod-to-pod traffic
  • Enable audit logging

Pod Security

  • Don’t run containers as root
  • Use read-only root filesystems
  • Drop all capabilities, add only what’s needed
  • Apply Pod Security Standards (privileged, baseline, restricted), targeting restricted wherever workloads allow
  • Set resource limits to prevent denial of service
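
Several of these rules can be checked against a pod spec before it is deployed. The sketch below inspects a pod spec that has already been parsed into a Python dict and reads the standard `securityContext` and `resources` fields; `pod_violations` is a hypothetical helper, and admission controllers or policy engines are the usual way to enforce this in a cluster.

```python
def pod_violations(pod_spec: dict) -> list[str]:
    """Return human-readable problems found in a parsed pod spec."""
    problems = []
    pod_context = pod_spec.get("securityContext", {})
    for container in pod_spec.get("containers", []):
        name = container.get("name", "<unnamed>")
        context = container.get("securityContext", {})
        # runAsNonRoot may be set on the container or inherited from the pod.
        run_as_non_root = context.get(
            "runAsNonRoot", pod_context.get("runAsNonRoot", False)
        )
        if not run_as_non_root:
            problems.append(f"{name}: may run as root")
        if not context.get("readOnlyRootFilesystem", False):
            problems.append(f"{name}: root filesystem is writable")
        if "ALL" not in context.get("capabilities", {}).get("drop", []):
            problems.append(f"{name}: does not drop all capabilities")
        if not container.get("resources", {}).get("limits"):
            problems.append(f"{name}: no resource limits")
    return problems
```

An empty result means the spec passes these four checks only; it is not a substitute for the full restricted Pod Security Standard.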

Network Policies

Control traffic between pods:

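# Deny all ingress traffic to every pod in the namespace where this policy is created.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
spec:
  podSelector: {}   # an empty selector matches all pods in the namespace
  policyTypes:
  - Ingress         # no ingress rules are listed, so all ingress is denied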

Start with default deny, then allow specific traffic.

Runtime Security

Detect threats at runtime:

  • Falco — Runtime security and threat detection
  • Tetragon — eBPF-based security observability
  • KubeArmor — Runtime security enforcement

Incident Response

See Incident Management for general incident response practices.

Security-specific considerations:

  • Have a dedicated security incident process
  • Know when to involve legal, PR, and executive leadership
  • Preserve evidence for forensics
  • Have containment strategies ready (isolate affected systems)
  • Plan for disclosure and communication