CAP Theorem

A distributed system can only guarantee two of three:

  • Consistency — every read receives the most recent write
  • Availability — every request receives a response
  • Partition tolerance — system continues despite network failures

In practice, network partitions always happen, so you’re really choosing between CP (consistency) and AP (availability). Most real-world systems are somewhere on the spectrum.

Consistency Patterns

  • Strong consistency — reads always return the latest write. Simpler to reason about, but slower (e.g. ACID databases)
  • Eventual consistency — reads may return stale data temporarily, but will converge. Higher availability and lower latency (e.g. DynamoDB, DNS)
  • Read-your-writes — a user always sees their own writes, even if other users see stale data
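Read-your-writes can be sketched as a routing decision: send a user's reads to the primary for a short window after their own write, since replicas may still be catching up. This is a minimal illustration (the `Router` class, the `REPLICATION_LAG` constant, and the server names are all hypothetical):

```python
import time

REPLICATION_LAG = 1.0  # assumed worst-case replica lag, in seconds

class Router:
    """Route reads to a replica, except right after a user's own write."""
    def __init__(self):
        self.last_write = {}  # user_id -> timestamp of their last write

    def record_write(self, user_id):
        self.last_write[user_id] = time.monotonic()

    def choose_backend(self, user_id):
        wrote_at = self.last_write.get(user_id)
        # If this user wrote recently, replicas may still be stale *for them*:
        # send their read to the primary to guarantee read-your-writes.
        if wrote_at is not None and time.monotonic() - wrote_at < REPLICATION_LAG:
            return "primary"
        return "replica"

router = Router()
router.record_write("alice")
print(router.choose_backend("alice"))  # primary (just wrote)
print(router.choose_backend("bob"))    # replica (no recent write)
```

Other users (who didn't write) still read from replicas, so the system stays eventually consistent overall.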

Scaling

Vertical vs Horizontal

  • Vertical (scale up) — bigger machine. Simple, but has a ceiling
  • Horizontal (scale out) — more machines. Complex, but no ceiling

Database Scaling

  • Read replicas — offload reads to copies
  • Sharding — split data across databases by a key (e.g. user ID). Hard to change later, so avoid until necessary
  • Connection pooling — reuse database connections (PgBouncer, ProxySQL)
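A minimal sketch of why sharding by key is hard to change later: with hash-modulo sharding, changing the shard count remaps most keys, forcing a data migration. (`shard_for` and `NUM_SHARDS` are illustrative names, not a real library API.)

```python
import hashlib

NUM_SHARDS = 4  # changing this later remaps most keys -> expensive re-shard

def shard_for(user_id: str) -> int:
    """Map a key to a shard deterministically via a stable hash."""
    digest = hashlib.md5(user_id.encode()).hexdigest()
    return int(digest, 16) % NUM_SHARDS

# The same key always lands on the same shard:
assert shard_for("user-42") == shard_for("user-42")
print(shard_for("user-42"), shard_for("user-7"))
```

Note the use of a stable hash (`hashlib.md5`) rather than Python's built-in `hash()`, which is randomized per process and would break routing across restarts.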

Load Balancing

Distributes traffic across multiple servers. Strategies:

  • Round robin — simple rotation
  • Least connections — send to the server with fewest active connections
  • Weighted — route more traffic to stronger servers
  • Consistent hashing — useful for caching layers, minimises redistribution when servers are added/removed
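The least-connections strategy can be sketched in a few lines: track in-flight requests per backend and pick the minimum. (A toy sketch with hypothetical server names; real load balancers like HAProxy or NGINX implement this for you.)

```python
class LeastConnections:
    """Pick the backend with the fewest in-flight requests."""
    def __init__(self, servers):
        self.active = {s: 0 for s in servers}

    def acquire(self):
        # min() over the dict keys, comparing active connection counts;
        # ties resolve to the first server in insertion order.
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        self.active[server] -= 1

lb = LeastConnections(["app1", "app2"])
a = lb.acquire()  # app1 (both idle, tie goes to the first)
b = lb.acquire()  # app2 (app1 now has one active connection)
lb.release(a)
```

Round robin is even simpler (rotate an index); least connections adapts better when request durations vary widely.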

Caching

  • Cache invalidation is one of the hardest problems in CS
  • Cache-aside — application checks cache first, loads from DB on miss, writes to cache
  • Write-through — write to cache and DB simultaneously
  • Write-behind — write to cache, asynchronously persist to DB. Faster writes, risk of data loss
  • TTL (time to live) — expire entries after a duration. Simple but may serve stale data
  • Layers: browser → CDN → reverse proxy → application cache → database cache
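Cache-aside, the most common pattern above, can be sketched with plain dicts standing in for the cache and the database:

```python
cache = {}
db = {"user:1": {"name": "Ada"}}  # stand-in for the real database

def get(key):
    """Cache-aside read: check cache, fall back to DB, populate cache."""
    if key in cache:
        return cache[key]          # hit: skip the database entirely
    value = db.get(key)            # miss: load from the source of truth
    if value is not None:
        cache[key] = value         # populate so the next read is a hit
    return value

get("user:1")              # miss: reads DB, fills cache
assert "user:1" in cache
get("user:1")              # hit: served from cache
```

The invalidation problem shows up here: if `db["user:1"]` changes, the cached copy is stale until it is explicitly deleted or its TTL expires.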

Rate Limiting

  • Protect services from overload
  • Token bucket — tokens refill at a fixed rate, each request consumes a token
  • Sliding window — count requests in a rolling time window
  • Apply at API gateway level, not deep in the application
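The token bucket can be sketched in a few lines: tokens accumulate at a fixed rate up to a burst capacity, and each request spends one. (A single-process toy; production rate limiters usually keep this state in something shared like Redis.)

```python
import time

class TokenBucket:
    """Allow bursts up to `capacity`; refill at `rate` tokens per second."""
    def __init__(self, rate: float, capacity: int):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill based on elapsed time, capped at the bucket's capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(rate=1, capacity=2)
print(bucket.allow(), bucket.allow(), bucket.allow())  # True True False
```

Unlike a fixed window, the bucket permits short bursts (up to `capacity`) while still enforcing the average rate.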

Retries

  • Use exponential backoff with jitter — don’t hammer a failing service
  • Set a max retry count
  • Make operations idempotent so retries are safe
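The points above can be combined into one sketch: exponential backoff with full jitter and a max attempt count. (The `retry` helper and its parameters are illustrative, not a library API.)

```python
import random
import time

def retry(fn, max_attempts=5, base=0.1, cap=5.0):
    """Call fn(); on failure, sleep up to base * 2**attempt (capped), then retry."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # Full jitter: sleep a random amount up to the exponential cap,
            # so many synchronized clients don't retry in lockstep.
            time.sleep(random.uniform(0, min(cap, base * 2 ** attempt)))

calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient failure")
    return "ok"

print(retry(flaky))  # succeeds on the third attempt
```

This only helps if `fn` is idempotent; retrying a non-idempotent write (e.g. "charge the card") can duplicate the effect.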

Communication Patterns

  • Synchronous (request/response) — HTTP/REST, gRPC. Simpler, but creates coupling
  • Asynchronous (event-driven) — message queues, pub/sub. Decoupled, but harder to debug
  • gRPC — binary protocol over HTTP/2, faster than REST, strongly typed via protobuf. Good for service-to-service communication
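The asynchronous pattern can be sketched with an in-process queue standing in for a message broker: the producer enqueues events and moves on, while a worker consumes them independently. (Event names and the sentinel shutdown are illustrative; real systems would use a broker like RabbitMQ or Kafka.)

```python
import queue
import threading

events = queue.Queue()
processed = []

def worker():
    """Consumer: drains events independently of the producer's pace."""
    while True:
        event = events.get()
        if event is None:  # sentinel: shut down
            break
        processed.append(f"handled {event}")

t = threading.Thread(target=worker)
t.start()

# Producer enqueues and returns immediately -- it never waits for
# processing to finish. That's the decoupling (and the debugging cost).
events.put("user.created")
events.put("user.updated")
events.put(None)
t.join()
print(processed)  # ['handled user.created', 'handled user.updated']
```

Contrast with the synchronous case, where the caller blocks until the callee responds and a slow callee stalls the whole chain.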
