🧠 AI Computer Institute
Content is AI-generated for educational purposes. Verify critical information independently. A bharath.ai initiative.

System Design Cheat Sheet

systems · Grades 11-12 · 7 sections

System Design Goals

// Core metrics
Latency: Response time (ms, lower is better)
Throughput: Requests/second (higher is better)
Availability: % uptime (99.9% = 8.76 hours downtime/year)
Scalability: Handle growth (users, data, traffic)

// QoS Levels
99% (two nines) = 3.65 days downtime/year
99.9% (three nines) = 8.76 hours/year
99.99% (four nines) = 52.6 minutes/year
99.999% (five nines) = 5.26 minutes/year
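The downtime figures above follow directly from the availability percentage; a quick calculation:

```python
# Downtime per year implied by an availability percentage.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600

def downtime_minutes_per_year(availability_pct: float) -> float:
    return MINUTES_PER_YEAR * (1 - availability_pct / 100)

print(round(downtime_minutes_per_year(99.0) / 60 / 24, 2))  # 3.65 (days)
print(round(downtime_minutes_per_year(99.9) / 60, 2))       # 8.76 (hours)
print(round(downtime_minutes_per_year(99.99), 1))           # 52.6 (minutes)
```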

// Design constraints
Latency: < 100ms for web, < 1s for interactive
Throughput: 1000s to millions req/sec
Storage: MB to PB scale
Cost: Infrastructure + operational

// Trade-offs (choose 2)
Fast + Scalable: Expensive
Fast + Cheap: Not scalable
Scalable + Cheap: Slow

Load Balancing

// Purpose: Distribute traffic across servers
Increases: Throughput, availability, resilience

// Algorithms
Round-robin: Rotate through servers
Least connections: Send to least busy
Weighted: Send more to powerful servers
IP hash: Same client → same server (sticky)
Least response time: Send to fastest
Random: Pick a server uniformly at random
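Two of the algorithms above, round-robin and IP hash, can be sketched in a few lines (the server pool names are illustrative):

```python
import itertools
import hashlib

servers = ["app1", "app2", "app3"]  # hypothetical server pool

# Round-robin: rotate through the servers in order.
rr = itertools.cycle(servers)
def round_robin() -> str:
    return next(rr)

# IP hash: the same client IP always maps to the same server (sticky).
def ip_hash(client_ip: str) -> str:
    digest = hashlib.md5(client_ip.encode()).hexdigest()
    return servers[int(digest, 16) % len(servers)]

print([round_robin() for _ in range(4)])  # ['app1', 'app2', 'app3', 'app1']
```

Sticky routing matters when servers hold per-client session state; stateless services can use any algorithm.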

// Layers
L4 (Transport): TCP/UDP based (fast, less awareness)
L7 (Application): HTTP based (smart, can route by URL)

Example L7 rules:
- /api/* → API servers
- /images/* → CDN
- /users/* → User service
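The L7 rules above amount to first-match routing on the request path; a toy sketch (backend names are hypothetical):

```python
# Ordered L7 routing rules: first matching path prefix wins.
ROUTES = [
    ("/api/",    "api-servers"),
    ("/images/", "cdn"),
    ("/users/",  "user-service"),
]

def route(path: str, default: str = "web-servers") -> str:
    for prefix, backend in ROUTES:
        if path.startswith(prefix):
            return backend
    return default

print(route("/api/orders/42"))  # api-servers
print(route("/index.html"))     # web-servers
```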

// Global load balancer
Route users to nearest datacenter
GeoDNS: DNS returns closest IP
Consider latency + availability

// Health checks
Periodic checks (every 10s)
If unhealthy: Remove from pool
Re-check periodically
Circuit breaker: Stop sending traffic (open the circuit) after a failure threshold
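One health-check sweep can be sketched as follows; `probe` stands in for a real TCP/HTTP check, and the three-strike threshold is an illustrative choice:

```python
# Health-check sweep: probe each server, evict persistent failures from the pool.
def sweep(pool: dict, probe) -> dict:
    healthy = {}
    for server, failures in pool.items():
        if probe(server):
            healthy[server] = 0             # healthy: reset failure count
        elif failures + 1 < 3:              # tolerate transient failures
            healthy[server] = failures + 1
        # else: 3 strikes -> removed from pool until a re-check passes
    return healthy

pool = {"app1": 0, "app2": 2}
pool = sweep(pool, probe=lambda s: s == "app1")
print(pool)  # {'app1': 0} -- app2 hit its 3rd consecutive failure
```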

Caching Strategies

// Cache levels
Browser: Client-side cache
CDN: Geographic distribution
Application: In-memory (Redis, Memcached)
Database: Query result cache
Disk: Slow persistent cache

// Cache patterns
Write-through: Write to cache + DB simultaneously
Write-back: Write to cache, flush to DB asynchronously (data loss risk if cache fails first)
Write-around: Write to DB only, cache on read
Read-through: App reads from cache (miss → load from DB)
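The read path shared by cache-aside and read-through can be sketched as (the dicts stand in for a real cache and database):

```python
# Cache-aside read: check the cache first, fall back to the DB, then populate.
cache = {}
db = {"user:1": {"name": "Ada"}}    # stand-in for a real database

def get(key):
    if key in cache:                # cache hit
        return cache[key]
    value = db.get(key)             # cache miss -> load from DB
    if value is not None:
        cache[key] = value          # populate for subsequent readers
    return value

get("user:1")          # miss: loads from the DB
print(get("user:1"))   # hit: served from the cache
```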

// Eviction policies (when cache full)
LRU (Least Recently Used): Remove least recent
LFU (Least Frequently Used): Remove least accessed
FIFO (First In First Out): Remove oldest
Random: Remove random
TTL: Time-to-live, expire after time
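LRU, the most common of these policies, falls out naturally from an ordered map; a minimal sketch:

```python
from collections import OrderedDict

# Minimal LRU cache: evict the least recently used key when over capacity.
class LRUCache:
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.data = OrderedDict()

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)         # mark as most recently used
        return self.data[key]

    def put(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)  # evict least recently used

c = LRUCache(2)
c.put("a", 1); c.put("b", 2)
c.get("a")             # touch "a", so "b" becomes least recent
c.put("c", 3)          # over capacity: evicts "b"
print(c.get("b"))      # None
```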

// Cache invalidation
TTL: Auto-expire after N seconds
Event-based: Clear on data update
Manual: Explicit invalidate
Cache-aside: App handles loading

// Problems
Cache invalidation: Famously hard ("two hard things in computer science: cache invalidation, naming things, and off-by-one errors")
Stale data: Old cached value
Thundering herd: Many clients hit the DB simultaneously on a cache miss
Cache stampede: A hot key expires and many requests recompute it at once

// Solutions
Probabilistic early expiry: Refresh before expiry
Locks: Serialize updates
Eventual consistency: Accept stale data
Cache warming: Preload hot data
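The lock-based solution above can be sketched with a double-checked lock: on a miss, only the first thread recomputes while the rest wait and then read the refilled value.

```python
import threading

cache = {}
lock = threading.Lock()
db_reads = 0

def load_from_db(key):                  # stand-in for an expensive query
    global db_reads
    db_reads += 1
    return f"value-of-{key}"

def get(key):
    if key in cache:
        return cache[key]
    with lock:                          # serialize the refill
        if key in cache:                # another thread may have filled it
            return cache[key]
        cache[key] = load_from_db(key)
        return cache[key]

threads = [threading.Thread(target=get, args=("hot",)) for _ in range(10)]
for t in threads: t.start()
for t in threads: t.join()
print(db_reads)  # 1 -- only one thread hit the DB
```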

Content Delivery Network (CDN)

// CDN: Geographically distributed servers
Users get content from nearest location
Reduces: Latency, bandwidth, server load

// How it works
1. Upload content to CDN provider (Cloudflare, Akamai)
2. Content distributed to edge servers globally
3. User's request goes to nearest edge server
4. If not cached, edge fetches from origin
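Steps 3-4 describe a pull-through cache; a toy sketch of one edge server (the origin dict is a stand-in):

```python
# Pull-through edge cache: serve locally if cached, else fetch from origin.
origin = {"/logo.png": b"...image bytes..."}   # stand-in for the origin server

class EdgeServer:
    def __init__(self):
        self.cache = {}

    def serve(self, path):
        if path in self.cache:
            return self.cache[path], "HIT"
        body = origin[path]           # miss -> fetch from origin
        self.cache[path] = body       # cache for later requests
        return body, "MISS"

edge = EdgeServer()
print(edge.serve("/logo.png")[1])  # MISS (first request fills the cache)
print(edge.serve("/logo.png")[1])  # HIT
```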

// Benefits
Lower latency (closer to users)
Reduced origin load
Better availability
DDoS protection
HTTPS acceleration

// Cache headers
Cache-Control: public, max-age=3600  (1 hour)
Cache-Control: private  (user-only cache)
Cache-Control: no-cache  (revalidate before use)
Expires: Absolute time
ETag: Version identifier (validate with server)

// Purging
Invalidate cache on updates
TTL-based: Auto-expire
Webhook: Push updates to CDN

// Static vs Dynamic
Static: Images, CSS, JS, HTML (CDN perfect)
Dynamic: Personalized content (harder to cache)
Partial: Cache where possible, origin for dynamic parts

Message Queues & Asynchronous Processing

// Why async?
Decouple services
Handle spikes (queue absorbs traffic)
Retry failed jobs
Load balance workers
Enable scalability

// Message queue brokers
RabbitMQ: Reliable, complex routing
Apache Kafka: High throughput, log-based
AWS SQS: Managed, simple
Redis: Fast in-memory
ActiveMQ: Enterprise

// Publish-Subscribe
Publisher: Sends message
Broker: Routes to subscribers
Subscribers: Receive and process

Example:
- Order created event
- Email service subscribes
- Payment service subscribes
- Inventory service subscribes
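The order-created example above can be sketched with a minimal in-process broker, where every subscriber receives every message on its topic:

```python
from collections import defaultdict

# Minimal in-process pub/sub broker.
class Broker:
    def __init__(self):
        self.subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self.subscribers[topic].append(handler)

    def publish(self, topic, message):
        for handler in self.subscribers[topic]:
            handler(message)

broker = Broker()
received = []
broker.subscribe("order.created", lambda m: received.append(("email", m)))
broker.subscribe("order.created", lambda m: received.append(("inventory", m)))
broker.publish("order.created", {"order_id": 42})
print(len(received))  # 2 -- both subscribers got the event
```

Real brokers add persistence, delivery over the network, and retries, but the fan-out shape is the same.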

// Point-to-Point (Queue)
Producer → Broker → Consumer
Message consumed once
Good for: Task distribution, job queues

// Benefits
Resilience: Producer doesn't wait
Scalability: Consumers can scale independently
Decoupling: Services don't know each other
Flexibility: Add new consumers without code changes

// Guarantees
At most once: May lose messages
At least once: May duplicate (handle idempotency)
Exactly once: Hardest to achieve

// Challenges
Ordering: May not preserve order (distributed)
Deduplication: Handle duplicates
Poison pills: Broken messages cause loops
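Under at-least-once delivery, deduplication is usually handled by making the consumer idempotent: track processed message IDs so a redelivered duplicate has no effect. A sketch:

```python
# Idempotent consumer: duplicates of an already-processed message are ignored.
processed_ids = set()
balance = 0

def handle(message):
    global balance
    if message["id"] in processed_ids:   # duplicate -> skip
        return
    processed_ids.add(message["id"])
    balance += message["amount"]

msg = {"id": "m-1", "amount": 100}
handle(msg)
handle(msg)        # broker redelivered the same message
print(balance)     # 100, not 200
```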

Microservices Architecture

// Microservices: Small, independent services
Instead of monolith, split into:
- User service
- Order service
- Payment service
- Inventory service
- Notification service

// Benefits
Independent scaling: Scale only what's needed
Technology variety: Different techs per service
Fast deployment: Deploy independently
Fault isolation: One service down ≠ all down
Organizational alignment: Teams own services

// Challenges
Network latency: More remote calls
Eventual consistency: No ACID across services
Operational complexity: Deploy/monitor many services
Testing: Integration tests harder
Debugging: Trace requests across services

// Communication
REST API: Synchronous, simple
gRPC: Binary, fast, efficient
Message queues: Asynchronous (decoupled)
Service mesh: (Istio, Linkerd) handle cross-cutting concerns

// Service discovery
Hardcoding IPs: Bad (not scalable)
Load balancer: Single entry point
Service registry: Services register, clients query
Example: Consul, etcd, Kubernetes
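The register-then-query flow of a service registry can be sketched as follows; the service and address names are illustrative, and real registries like Consul add health checks and TTLs:

```python
import random

# Toy service registry: services register instances, clients query by name.
registry = {}

def register(service: str, address: str):
    registry.setdefault(service, []).append(address)

def discover(service: str) -> str:
    instances = registry.get(service, [])
    if not instances:
        raise LookupError(f"no instances of {service}")
    return random.choice(instances)   # naive client-side load balancing

register("user-service", "10.0.0.5:8080")
register("user-service", "10.0.0.6:8080")
print(discover("user-service"))  # one of the two registered addresses
```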

Typical Architecture Layers

// 3-tier architecture
Presentation: Web/mobile UI
Business logic: API servers
Data: Database

// Modern multi-layer
├─ Client (Web, Mobile)
├─ CDN (Static assets)
├─ Load Balancer (Distribute traffic)
├─ API Gateway (Auth, rate limit, routing)
├─ Microservices (Separate services)
├─ Message Queue (Async jobs)
├─ Cache (Redis, Memcached)
├─ Primary Database (SQL)
├─ Replica Database (Read-only)
├─ Search Index (Elasticsearch)
├─ Data Warehouse (Analytics)
└─ Backup/Archival (Cold storage)

// Request flow (e-commerce)
1. Client requests product page
2. CDN serves static assets
3. Load balancer routes to API server
4. API gateway authenticates
5. Service calls:
   - Product service (cache hit)
   - Review service (database query)
   - Recommendation service (ML model)
6. Response compiled & returned

// Monitoring
Metrics: CPU, memory, disk, network
Logs: Application events
Traces: Request flow across services
Alerts: Notify on anomalies
