# Guide: Caching – A Systems Design Deep Dive
Caching is a critical system design pattern used to improve performance, reduce latency, and scale applications by avoiding repeated computation or I/O.
This guide covers:
- What caching is and why it matters
- Caching placement strategies (client, CDN, edge, app, DB)
- Cache invalidation and consistency
- Eviction policies
- Common caching architectures and tools
## 1. What Is Caching?
Caching stores precomputed or previously retrieved data in a faster-access storage layer, allowing future requests to bypass more expensive operations.
### Why Cache?
- Reduce database or API load
- Lower latency for frequently accessed data
- Smooth traffic spikes
- Enable offline access (client-side)
```js
// Without cache: every request hits the database
user = db.query("SELECT * FROM users WHERE id = 42")

// With cache: check the fast path first
user = cache.get("user:42")
if (!user) {
  user = db.query(...)
  cache.set("user:42", user)
}
```
## 2. Caching Layers & Placement
| Layer | Description | Example |
| --- | --- | --- |
| Client-side | In browser/app | LocalStorage, IndexedDB |
| CDN | Caches static assets near users | Cloudflare, Akamai |
| Edge caching | Dynamic content at PoPs | Fastly, CloudFront Lambda@Edge |
| App-level | Cache inside the app | In-memory (e.g. Map, Guava) |
| Server-side | Central cache layer | Redis, Memcached |
| DB/Storage | Internal query caching | PostgreSQL buffer cache |
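As a concrete sketch of the app-level layer above, a minimal in-memory cache with per-entry TTLs can be built on a plain `Map`. The `TtlCache` name and API here are illustrative, not a specific library:

```js
// Minimal app-level cache: a Map plus per-entry expiry timestamps.
class TtlCache {
  constructor() {
    this.entries = new Map(); // key -> { value, expiresAt }
  }

  set(key, value, ttlMs) {
    this.entries.set(key, { value, expiresAt: Date.now() + ttlMs });
  }

  get(key) {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (Date.now() > entry.expiresAt) {
      this.entries.delete(key); // expired: evict lazily on read
      return undefined;
    }
    return entry.value;
  }
}

const cache = new TtlCache();
cache.set("user:42", { name: "Ada" }, 60_000); // cache for 60 seconds
```

Real app-level caches (Guava, Caffeine) add eviction, size bounds, and concurrency control on top of exactly this kind of key-to-entry map.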
## 3. Caching Strategies
### Read-through Cache
The app asks the cache first; on a miss, the value is loaded from the source and stored in the cache before being returned. In a true read-through setup, the cache layer itself performs this loading, transparently to the app.
```js
function getUser(id) {
  let user = cache.get(`user:${id}`);
  if (!user) {
    user = db.query(...);
    cache.set(`user:${id}`, user);
  }
  return user;
}
```
### Write-through Cache
Every write goes to both the cache and the DB synchronously, so the cache never lags behind the source of truth.
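A write-through wrapper can be sketched in a few lines; `db` and `cache` below are plain in-memory stand-ins, not a real driver or client:

```js
// Write-through: every write hits the backing store and the cache together,
// so reads from the cache are never newer or older than the DB.
const cache = new Map();
const db = new Map();

function writeThrough(key, value) {
  db.set(key, value);    // write the source of truth first
  cache.set(key, value); // then keep the cache in sync
}

function read(key) {
  return cache.has(key) ? cache.get(key) : db.get(key);
}

writeThrough("user:42", { name: "Ada" });
```

The cost is that every write pays for both operations, which is the "safer, slightly slower" trade-off noted in the consistency table below.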
### Cache-aside (Lazy Loading)
The app manages the cache itself: read from the cache, fall back to the source on a miss, and populate the cache explicitly (as in the `getUser` example above).
### Write-back (Write-behind)
Writes go to the cache only and are flushed to the DB asynchronously. This is fast, but writes that have not yet been flushed are lost if the cache crashes.
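A write-back sketch, using the same kind of in-memory `db`/`cache` stand-ins; note that anything still in `dirty` is lost if the process dies before `flush()` runs:

```js
// Write-back: writes land in the cache immediately and are persisted later.
const cache = new Map();
const db = new Map();
const dirty = new Set(); // keys written to the cache but not yet persisted

function writeBack(key, value) {
  cache.set(key, value); // cheap, synchronous
  dirty.add(key);        // defer the expensive DB write
}

function flush() {
  // in a real system this runs on a timer or a queue consumer
  for (const key of dirty) db.set(key, cache.get(key));
  dirty.clear();
}

writeBack("user:42", { name: "Ada" });
// at this point db is still empty; only the cache has the value
flush();
```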
## 4. Cache Invalidation
> "There are only two hard things in computer science: cache invalidation and naming things."
>
> – Phil Karlton
### Why It's Hard
- When data changes, how do we know which cache keys to evict?
- Cached data might become stale or inconsistent
### Techniques
- Time-to-live (TTL): Expire keys automatically
- Explicit Invalidation: App deletes/updates cache when DB changes
- Versioning: Store keys with a version or hash (e.g. `user:42:v3`)
- Event-driven: Use pub/sub (e.g. Redis Streams, Kafka) to evict across systems
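The versioning technique can be sketched like this; the key scheme mirrors the `user:42:v3` example above, and all function names are illustrative:

```js
// Versioned keys: instead of deleting stale entries, bump a version counter
// so reads simply stop matching the old keys.
const cache = new Map();
const userVersion = new Map(); // userId -> current key version

function cacheKey(userId) {
  const v = userVersion.get(userId) || 1;
  return `user:${userId}:v${v}`;
}

function invalidateUser(userId) {
  // O(1) invalidation: old keys become unreachable and age out via TTL/LRU
  userVersion.set(userId, (userVersion.get(userId) || 1) + 1);
}

cache.set(cacheKey(42), { name: "Ada" }); // stored under user:42:v1
invalidateUser(42);                        // reads now target user:42:v2
```

The stale `user:42:v1` entry is never explicitly deleted; it is simply orphaned, which is why versioning is usually paired with TTLs or LRU eviction.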
## 5. Eviction Policies
| Policy | Description | Use Case |
| --- | --- | --- |
| LRU (Least Recently Used) | Evict the entry accessed longest ago | Most popular general-purpose default |
| LFU (Least Frequently Used) | Evict the least frequently accessed entry | Hot/cold data separation |
| FIFO | Evict the oldest entry regardless of use | Simple but naive |
| TTL-based | Evict after N seconds | Time-bound data like sessions |
| Manual | Application deletes keys explicitly | Fine-grained control |
```
// Redis: set a value with a 60-second TTL
SET user:42 "data" EX 60
```
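For intuition, LRU itself is simple to sketch: a JavaScript `Map` iterates in insertion order, so re-inserting a key on every access keeps the least recently used key at the front. This is an illustrative toy, not a production cache:

```js
// Tiny LRU cache built on Map's insertion-order iteration.
class LruCache {
  constructor(capacity) {
    this.capacity = capacity;
    this.map = new Map();
  }

  get(key) {
    if (!this.map.has(key)) return undefined;
    const value = this.map.get(key);
    this.map.delete(key); // re-insert to mark as most recently used
    this.map.set(key, value);
    return value;
  }

  set(key, value) {
    if (this.map.has(key)) this.map.delete(key);
    this.map.set(key, value);
    if (this.map.size > this.capacity) {
      // first key in iteration order is the least recently used
      const oldest = this.map.keys().next().value;
      this.map.delete(oldest);
    }
  }
}
```

With capacity 2, inserting `a`, `b`, touching `a`, then inserting `c` evicts `b`: the read of `a` made `b` the least recently used entry.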
## 6. Consistency Models
| Consistency Model | Description | Trade-offs |
| --- | --- | --- |
| Strong | Cache and source always in sync | Hard to scale |
| Eventual | Cache updated after source writes | Simpler, may serve stale reads |
| Write-through | Write to cache + DB together | Safer, slightly slower |
| Write-back | Write to cache only, sync later | Fast, but risk of data loss |
## 7. Common Use Cases
| Use Case | Caching Strategy | Tools |
| --- | --- | --- |
| API responses | TTL or CDN-based caching | Fastly, Varnish |
| User profiles | Cache-aside or read-through | Redis |
| Search suggestions | In-memory with LRU | Guava, Caffeine (Java) |
| Product catalog | Versioned cache + TTL | Redis + async updates |
| ML features or scores | Write-through, time-bounded | Redis, feature stores |
## 8. Tools and Technologies
| Tool | Type | Notes |
| --- | --- | --- |
| Redis | In-memory, LRU, TTL, pub/sub | Versatile, widely used |
| Memcached | In-memory, LRU only | Lightweight, simpler than Redis |
| Caffeine (Java) | Local, LRU, async loading | High-performance in-JVM caching |
| Varnish | HTTP reverse proxy | Edge and CDN caching |
| Cloudflare / Fastly | CDN | Global static and dynamic caching |
## 9. Pitfalls to Avoid
| Pitfall | Recommendation |
| --- | --- |
| Serving stale data | Use TTLs, versioned keys, or event triggers |
| Inconsistent multi-node caches | Use a central Redis or distributed cache |
| Cache stampede (thundering herd) | Use locking or request coalescing |
| Large unbounded keys | Use size limits and LRU eviction |
| Over-caching | Don't cache low-traffic or fast queries |
```js
// Prevent cache stampede: only one caller loads the value on a miss
let val = cache.get(key);
if (!val) {
  if (acquireLock(key)) {
    val = db.query(...);
    cache.set(key, val);
    releaseLock(key);
  } else {
    val = waitAndRetry(key); // someone else is loading; wait for their result
  }
}
```
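Request coalescing, mentioned in the table above, is a lock-free alternative: concurrent misses for the same key share one in-flight promise instead of each hitting the DB. `loadFromDb` below is a hypothetical loader you would supply:

```js
// Request coalescing: deduplicate concurrent loads of the same key.
const cache = new Map();
const inFlight = new Map(); // key -> Promise of the ongoing load

async function getCoalesced(key, loadFromDb) {
  if (cache.has(key)) return cache.get(key);
  if (inFlight.has(key)) return inFlight.get(key); // join the existing load

  const promise = loadFromDb(key).then((value) => {
    cache.set(key, value);
    inFlight.delete(key);
    return value;
  });
  inFlight.set(key, promise);
  return promise;
}
```

A production version would also clear `inFlight` when the loader rejects, so one failed load does not poison later requests.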
## 10. Advanced Patterns
### Sharded Caches
- Split cache across multiple nodes
- Redis Cluster, consistent hashing
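A minimal consistent-hashing ring can be sketched as follows; the FNV-1a hash, the virtual-replica count, and the node names are all illustrative choices:

```js
// Consistent hashing: keys map to the first node clockwise on a hash ring,
// so adding or removing a node only remaps a fraction of the keys.
function hash(str) {
  let h = 2166136261; // FNV-1a: good enough for a demo
  for (let i = 0; i < str.length; i++) {
    h ^= str.charCodeAt(i);
    h = Math.imul(h, 16777619);
  }
  return h >>> 0;
}

function makeRing(nodes, replicas = 100) {
  const ring = [];
  for (const node of nodes) {
    // virtual replicas smooth out the key distribution across nodes
    for (let r = 0; r < replicas; r++) {
      ring.push({ point: hash(`${node}#${r}`), node });
    }
  }
  ring.sort((a, b) => a.point - b.point);
  return ring;
}

function nodeFor(ring, key) {
  const h = hash(key);
  // first virtual node at or after the key's hash; wrap around to ring[0]
  const entry = ring.find((e) => e.point >= h) || ring[0];
  return entry.node;
}

const ring = makeRing(["cache-a", "cache-b", "cache-c"]);
nodeFor(ring, "user:42"); // deterministically one of the three nodes
```

Redis Cluster uses a fixed hash-slot scheme rather than a ring, but the routing idea is the same: the key alone determines which shard owns it.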
### Cache Invalidation by Events
- Invalidate based on pub/sub changes
- Kafka, Redis Streams, Debezium
### Cache Observability
- Monitor hit/miss rates, TTL behavior, evictions
- Use Prometheus exporters, Redis Insights, Datadog
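Hit/miss tracking is easy to bolt on before reaching for a full exporter; this sketch wraps reads of a plain `Map` (all names are illustrative):

```js
// Observability sketch: count hits and misses, derive a hit rate to export.
const cache = new Map();
const stats = { hits: 0, misses: 0 };

function instrumentedGet(key) {
  if (cache.has(key)) {
    stats.hits += 1;
    return cache.get(key);
  }
  stats.misses += 1;
  return undefined;
}

function hitRate() {
  const total = stats.hits + stats.misses;
  return total === 0 ? 0 : stats.hits / total;
}

cache.set("a", 1);
instrumentedGet("a"); // hit
instrumentedGet("b"); // miss
hitRate();            // 0.5
```

The same counters map directly onto Prometheus-style metrics (a hit counter, a miss counter, and a derived ratio).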
## Summary
| Topic | Key Point |
| --- | --- |
| Strategy | Cache-aside is the most common |
| Placement | Choose based on latency + scale |
| Invalidation | TTL + versioning is often the best combination |
| Eviction | LRU is the best default |
| Consistency | Choose based on criticality & latency |
## Further Reading