CAP Theorem
A distributed system can only guarantee two of three:
- Consistency — every read receives the most recent write
- Availability — every request receives a response
- Partition tolerance — system continues despite network failures
In practice, network partitions always happen, so you’re really choosing between CP (consistency) and AP (availability). Most real-world systems are somewhere on the spectrum.
Consistency Patterns
- Strong consistency — reads always return the latest write. Simpler to reason about, but slower (e.g. ACID databases)
- Eventual consistency — reads may return stale data temporarily, but will converge. Higher availability and lower latency (e.g. DynamoDB, DNS)
- Read-your-writes — a user always sees their own writes, even if other users see stale data
Scaling
Vertical vs Horizontal
- Vertical (scale up) — bigger machine. Simple, but has a ceiling
- Horizontal (scale out) — more machines. Complex, but no ceiling
Database Scaling
- Read replicas — offload reads to copies
- Sharding — split data across databases by a key (e.g. user ID). Hard to change later, so avoid until necessary
- Connection pooling — reuse database connections (PgBouncer, ProxySQL)
Load Balancing
Distributes traffic across multiple servers. Strategies:
- Round robin — simple rotation
- Least connections — send to the server with fewest active connections
- Weighted — route more traffic to stronger servers
- Consistent hashing — useful for caching layers, minimises redistribution when servers are added/removed
- Visual guide to load balancing
Caching
- Cache invalidation is one of the hardest problems in CS
- Cache-aside — application checks cache first, loads from DB on miss, writes to cache
- Write-through — write to cache and DB simultaneously
- Write-behind — write to cache, asynchronously persist to DB. Faster writes, risk of data loss
- TTL (time to live) — expire entries after a duration. Simple but may serve stale data
- Layers: browser → CDN → reverse proxy → application cache → database cache
Rate Limiting
- Protect services from overload
- Token bucket — tokens refill at a fixed rate, each request consumes a token
- Sliding window — count requests in a rolling time window
- Apply at API gateway level, not deep in the application
Retries
- Use exponential backoff with jitter — don’t hammer a failing service
- Set a max retry count
- Make operations idempotent so retries are safe
- Retries guide
Communication Patterns
- Synchronous (request/response) — HTTP/REST, gRPC. Simpler, but creates coupling
- Asynchronous (event-driven) — message Queues, pub/sub. Decoupled, but harder to debug
- gRPC — binary protocol over HTTP/2, faster than REST, strongly typed via protobuf. Good for service-to-service communication