Cache-Aside Cloud Pattern: Redis Lazy Loading

Intermediate · 11 min read · Updated 2026-02-11

After reading this topic, you will be able to:

  • Implement cache-aside pattern with proper invalidation strategies
  • Design cache key strategies for different data access patterns
  • Evaluate cache-aside vs write-through vs write-behind trade-offs

TL;DR

Cache-aside (also called lazy loading) puts the application in control of cache management: the app checks the cache first, loads from the database on miss, and explicitly invalidates cache entries after writes. Unlike write-through or write-behind, the cache is not in the critical write path, making it simpler but requiring careful invalidation logic. This is the most common caching pattern at companies like Amazon, Facebook, and Netflix because it’s straightforward to implement and naturally handles cache failures gracefully.

Cheat Sheet: Read: check cache → miss? load DB + populate cache. Write: update DB → invalidate cache. Pros: simple, cache failures don’t break writes. Cons: cache stampede risk, stale data window, extra code in app layer.

The Problem It Solves

Database queries are often 100-1000x slower than cache lookups, but maintaining cache consistency is notoriously difficult. The fundamental tension: you want fast reads without stale data, but you can’t afford to query the database for every request. Early systems tried keeping caches automatically synchronized with databases, but this created tight coupling and complex failure modes. When your cache goes down, should writes fail? When the database updates, how does it notify all cache nodes? These questions led to brittle architectures.

The cache-aside pattern emerged from production systems at Amazon and Facebook that needed a pragmatic solution: accept that caches will occasionally be stale, but give the application explicit control over when and how to refresh them. This shifts complexity from infrastructure (automatic synchronization) to application logic (explicit invalidation), which is actually easier to reason about and debug. The pattern solves three specific problems: (1) reducing database load for read-heavy workloads, (2) surviving cache failures without impacting writes, and (3) allowing gradual cache warming without complex preloading logic.

Solution Overview

Cache-aside makes the application responsible for cache management through a simple contract: on reads, check the cache first and populate it on misses; on writes, update the database and then invalidate (or update) the cache. The cache sits “aside” from the main data flow rather than inline, meaning the database remains the source of truth and the cache is purely an optimization layer.

The read path follows a three-step pattern: (1) application queries cache with a key, (2) on cache miss, application queries database and stores result in cache with a TTL, (3) on cache hit, application returns cached value immediately. The write path is simpler: (1) application updates database, (2) application invalidates the cache key (or updates it directly). This explicit control means the application decides caching granularity, TTL values, and invalidation strategies based on data characteristics.
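To make the contract concrete, here is a minimal Python sketch of both paths, with plain dicts standing in for Redis and the database (the `CacheAside` class and its method names are illustrative, not a library API):

```python
import time

class CacheAside:
    """Cache-aside sketch: check cache, load from the DB on a miss,
    invalidate after writes. Dicts stand in for Redis and PostgreSQL."""

    def __init__(self, db, ttl_seconds=300):
        self.db = db              # source of truth
        self.cache = {}           # key -> (value, expires_at)
        self.ttl = ttl_seconds

    def get(self, key):
        entry = self.cache.get(key)
        if entry is not None and entry[1] > time.time():
            return entry[0]                                 # cache hit
        value = self.db[key]                                # miss: query DB
        self.cache[key] = (value, time.time() + self.ttl)   # populate w/ TTL
        return value

    def put(self, key, value):
        self.db[key] = value            # 1. update the database first
        self.cache.pop(key, None)       # 2. invalidate the cache key

db = {"product:12345": {"name": "iPhone 15", "price": 999, "stock": 47}}
store = CacheAside(db)
store.get("product:12345")              # first read: miss, loads and caches
store.put("product:12345", {"name": "iPhone 15", "price": 999, "stock": 46})
fresh = store.get("product:12345")      # read after invalidation: fresh data
```

Because `put` invalidates rather than updates, the read that follows it misses the cache, fetches the new stock level from the database, and repopulates the cache.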

The pattern’s power comes from its simplicity and failure tolerance. If the cache is unavailable during a read, the application simply queries the database. If cache invalidation fails after a write, the TTL eventually expires the stale data. This graceful degradation makes cache-aside the default choice for most web applications, from e-commerce product catalogs to social media feeds.

How It Works

Let’s walk through a concrete example: an e-commerce product detail page at Amazon scale, serving 10,000 requests/second for popular products.

Read Flow (Cache Hit) A user requests product ID 12345. The application generates cache key product:12345 and queries Redis. The key exists with value {name: "iPhone 15", price: 999, stock: 47} and TTL of 5 minutes remaining. The application returns this data immediately—total latency 1ms. The database never sees this request. At 10,000 req/s with 95% cache hit rate, this pattern saves 9,500 database queries per second.

Read Flow (Cache Miss) A user requests product ID 67890, which isn’t cached. The application queries Redis, gets a miss, then queries PostgreSQL: SELECT * FROM products WHERE id = 67890. Database returns the product after 50ms. The application stores this in Redis with SET product:67890 {json} EX 300 (5-minute TTL) and returns the result to the user. Total latency: 51ms. Subsequent requests for this product will hit the cache.

Write Flow (Invalidation Strategy) An inventory system updates product 12345’s stock from 47 to 46 units. The application executes UPDATE products SET stock = 46 WHERE id = 12345 in PostgreSQL (20ms). Immediately after, it executes DEL product:12345 in Redis (1ms). The next read request will miss the cache and fetch fresh data from the database, repopulating the cache with the correct stock level. This creates a brief window (milliseconds) where reads might get stale data if they hit the cache between the database update and cache invalidation.

Cache Key Design Effective cache-aside requires thoughtful key strategies. For product details, a simple key like product:{id} works. For user-specific data like shopping carts, use cart:user:{user_id}. For paginated lists, include pagination parameters: search:iphone:page:2:sort:price. The key must uniquely identify the cached data and include all parameters that affect the query result. At Netflix, recommendation cache keys include user ID, device type, and timestamp bucket: recs:user:123:device:tv:hour:2024010115 to balance freshness with cache efficiency.

TTL Configuration TTL acts as a safety net for failed invalidations. Product details might use 5-minute TTL (frequently changing inventory), while product descriptions use 1-hour TTL (rarely changing content). User sessions might use 30-minute TTL matching session timeout. The TTL should be shorter than your tolerance for stale data but long enough to provide meaningful cache hit rates. At Stripe, API response caches use 60-second TTL because financial data staleness is unacceptable beyond that window.

Cache Stampede Prevention When a popular cache key expires, hundreds of concurrent requests might simultaneously miss the cache and query the database—a “stampede.” The solution: probabilistic early expiration. Instead of expiring at exactly TTL=300 seconds, recompute early when TTL < 30 and random(0,1) < 0.1. This means roughly 10% of requests in the final 30 seconds will proactively refresh the cache, preventing the thundering herd. Facebook’s implementation adds a “computing” flag: the first miss sets it with SET product:12345:computing 1 NX EX 10 (plain SETNX cannot attach a TTL), queries the database, updates the cache, and deletes the flag. Other requests see the flag and wait briefly or return stale data rather than all hitting the database.
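The probabilistic check is a one-liner; this sketch uses the 30-second window and 10% probability from above (`should_refresh_early` is a hypothetical helper name):

```python
import random

def should_refresh_early(ttl_remaining, window=30, probability=0.1):
    """Probabilistic early expiration: within the last `window` seconds of
    a key's TTL, a `probability` fraction of requests refresh the key
    proactively, so the whole herd never misses at once."""
    return ttl_remaining < window and random.random() < probability

# Rough sanity check: with 25s of TTL left, about 10% of requests refresh.
random.seed(0)
refreshes = sum(should_refresh_early(25) for _ in range(10_000))
```

Outside the window the check always returns False without consulting the random number generator, so the hot path pays essentially nothing until a key is near expiry.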

Cache-Aside Read Flow: Hit vs Miss

graph LR
    subgraph Cache Hit Path
        A1["Application"] --"1. GET product:12345"--> B1[("Redis Cache")]
        B1 --"2. Return cached data<br/>(1ms)"--> A1
    end
    
    subgraph Cache Miss Path
        A2["Application"] --"1. GET product:67890"--> B2[("Redis Cache")]
        B2 --"2. Cache MISS"--> A2
        A2 --"3. SELECT * FROM products<br/>WHERE id=67890"--> C2[("PostgreSQL")]
        C2 --"4. Return data<br/>(50ms)"--> A2
        A2 --"5. SET product:67890<br/>TTL=300s"--> B2
        A2 --"6. Return to user"--> D2["User"]
    end

Cache hit returns data in 1ms directly from Redis. Cache miss requires a 50ms database query, then populates the cache with a 5-minute TTL for subsequent requests. At 95% hit rate with 10,000 req/s, this saves 9,500 database queries per second.

Cache-Aside Write Flow with Invalidation

sequenceDiagram
    participant App as Application
    participant DB as PostgreSQL
    participant Cache as Redis Cache
    participant User as Next Reader
    
    Note over App,Cache: Write Operation
    App->>DB: 1. UPDATE products<br/>SET stock=46<br/>WHERE id=12345<br/>(20ms)
    DB-->>App: Update confirmed
    App->>Cache: 2. DEL product:12345<br/>(1ms)
    Cache-->>App: Key deleted
    
    Note over App,User: Stale Data Window<br/>(milliseconds)
    
    Note over App,Cache: Next Read Operation
    User->>App: 3. GET product 12345
    App->>Cache: 4. GET product:12345
    Cache-->>App: Cache MISS
    App->>DB: 5. SELECT * FROM products<br/>WHERE id=12345
    DB-->>App: Fresh data (stock=46)
    App->>Cache: 6. SET product:12345<br/>with new data
    App-->>User: Return fresh data

Write flow updates the database first (source of truth), then invalidates the cache key. A brief stale data window exists between steps 1 and 2 where concurrent reads might return old data. The next read after invalidation fetches fresh data from the database and repopulates the cache.

Cache Stampede Problem and Prevention

graph TB
    subgraph Without Prevention - Stampede
        T1["t=0: Cache key expires"]
        T2["t=1ms: 1000 concurrent<br/>requests arrive"]
        T3["All 1000 requests<br/>miss cache"]
        T4["All 1000 query<br/>database simultaneously"]
        T5["Database overload<br/>P99 latency spikes"]
        T1 --> T2 --> T3 --> T4 --> T5
    end
    
    subgraph With Probabilistic Early Expiration
        P1["TTL=300s remaining"]
        P2["Request at TTL=25s<br/>random < 0.1"]
        P3["Proactively refresh cache<br/>before expiration"]
        P4["Other requests continue<br/>hitting warm cache"]
        P5["No stampede occurs"]
        P1 --> P2 --> P3 --> P4 --> P5
    end
    
    subgraph With Lease-Based Locking
        L1["First request gets cache miss"]
        L2["SET product:12345:computing 1<br/>NX EX 10"]
        L3["First request queries DB<br/>and updates cache"]
        L4["Concurrent requests see<br/>computing flag"]
        L5["Wait briefly or return<br/>stale data"]
        L1 --> L2 --> L3
        L1 --> L4 --> L5
    end

Cache stampede occurs when a popular key expires and concurrent requests simultaneously query the database. Probabilistic early expiration refreshes cache before expiration (10% of requests when TTL < 30s). Lease-based locking lets the first miss query the database while others wait, reducing Facebook’s database load by 25% during traffic spikes.

Variants

Standard Invalidation (Delete on Write) After updating the database, delete the cache key. The next read will miss and repopulate with fresh data. This is the safest approach because you never risk caching stale data from a failed write. Use this when: data changes frequently, consistency is critical, cache misses are acceptable. Pros: simple, no stale data risk. Cons: every write causes a cache miss, potential stampede on popular keys.

Write-Through Update (Update on Write) After updating the database, immediately update the cache with the new value instead of deleting. This keeps the cache warm and avoids the post-write cache miss. Use this when: reads vastly outnumber writes, the write operation already has the new value computed, you want to avoid cache misses on hot keys. Pros: no cache miss after writes, simpler read path. Cons: race conditions if writes are concurrent, wasted cache updates if data isn’t read soon.

Refresh-Ahead (Proactive Refresh) Before cache expiration, asynchronously refresh the cache in the background. Monitor cache access patterns and refresh popular keys before they expire. Use this when: you can predict which keys will be accessed, cache misses are expensive, you have background worker capacity. Pros: eliminates cache misses for hot keys, prevents stampedes. Cons: complex implementation, wasted refreshes for unpopular keys, requires access pattern tracking.

Versioned Keys Instead of invalidating product:12345, write to product:12345:v2 and update a pointer. This allows atomic cache updates and rollback capability. Use this when: you need atomic cache updates across multiple keys, you want to support gradual rollouts, debugging requires comparing cache versions. Pros: atomic updates, rollback support, no race conditions. Cons: increased storage (multiple versions), pointer indirection adds complexity, requires cleanup of old versions.
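The pointer indirection can be sketched in a few lines; readers always dereference the pointer key, and rollback is just flipping the pointer back (function names here are illustrative):

```python
def publish_version(cache, base_key, value, version):
    """Versioned-keys sketch: write the value under a versioned key, then
    flip a pointer key to it. Old versions stay around for rollback."""
    cache[f"{base_key}:v{version}"] = value
    cache[f"{base_key}:ptr"] = f"{base_key}:v{version}"

def read_current(cache, base_key):
    """Dereference the pointer key to find the current version's value."""
    ptr = cache.get(f"{base_key}:ptr")
    return None if ptr is None else cache.get(ptr)

cache = {}
publish_version(cache, "product:12345", {"stock": 47}, version=1)
publish_version(cache, "product:12345", {"stock": 46}, version=2)
current = read_current(cache, "product:12345")       # sees version 2
cache["product:12345:ptr"] = "product:12345:v1"      # rollback: flip pointer
rolled_back = read_current(cache, "product:12345")   # sees version 1 again
```

The cleanup cost mentioned above is visible here: both `:v1` and `:v2` remain in the cache until something expires or deletes the superseded version.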

Cache-Aside Pattern Variants Comparison

graph TB
    subgraph Standard Invalidation
        SI1["Write to DB"] --> SI2["DELETE cache key"]
        SI2 --> SI3["Next read: cache miss"]
        SI3 --> SI4["Fetch from DB<br/>+ populate cache"]
        SI5["✓ No stale data risk<br/>✗ Cache miss after write"]
    end
    
    subgraph Write-Through Update
        WT1["Write to DB"] --> WT2["UPDATE cache with<br/>new value"]
        WT2 --> WT3["Next read: cache hit"]
        WT3 --> WT4["Return immediately"]
        WT5["✓ No cache miss<br/>✗ Race condition risk"]
    end
    
    subgraph Refresh-Ahead
        RA1["Monitor access patterns"] --> RA2["Detect popular keys"]
        RA2 --> RA3["Background worker<br/>refreshes before expiry"]
        RA3 --> RA4["Cache always warm"]
        RA5["✓ Eliminates misses<br/>✗ Complex implementation"]
    end
    
    subgraph Versioned Keys
        VK1["Write to DB"] --> VK2["Write product:12345:v2"]
        VK2 --> VK3["Update pointer:<br/>product:12345 → v2"]
        VK3 --> VK4["Atomic cache update"]
        VK5["✓ Atomic + rollback<br/>✗ Storage overhead"]
    end

Four cache-aside variants offer different trade-offs. Standard invalidation (delete on write) is safest but causes cache misses. Write-through update keeps cache warm but risks race conditions. Refresh-ahead eliminates misses for hot keys but requires background workers. Versioned keys enable atomic updates and rollback at the cost of storage overhead.

Trade-offs

Consistency vs Performance Cache-aside trades strict consistency for performance. Between database update and cache invalidation (or until TTL expires), readers might see stale data. In practice, this window is milliseconds to minutes depending on TTL. Decision criteria: if you need read-after-write consistency (banking, inventory reservations), use write-through caching or skip caching entirely. If eventual consistency is acceptable (product descriptions, user profiles), cache-aside provides 10-100x performance improvement. Netflix accepts 30-second staleness for movie metadata because the user experience improvement from fast loads outweighs occasional stale descriptions.

Simplicity vs Automation Cache-aside requires explicit cache management code in every service versus automatic cache synchronization (write-through, change data capture). You gain: simple failure modes, clear debugging path, flexible invalidation strategies. You lose: automatic consistency, centralized cache logic, protection against developer mistakes. Decision criteria: for small teams or microservices, explicit control is better—you understand exactly what’s cached and why. For large organizations with many developers, consider write-through or CDC to prevent cache management bugs.

Cache Miss Cost vs Storage Cost Aggressive caching (long TTLs, many keys) reduces database load but increases cache storage costs and stale data risk. Conservative caching (short TTLs, selective keys) keeps data fresh but increases database load. Decision criteria: calculate your cache hit rate improvement per dollar of cache storage. At Amazon’s scale, even 1% hit rate improvement on product pages saves millions in database costs, justifying large Redis clusters. For a startup, cache only the top 10% of hot keys and use short TTLs.

Invalidation vs Expiration Explicit invalidation (delete on write) provides immediate consistency but requires tracking all write paths. TTL-only expiration is simpler but guarantees staleness. Decision criteria: use explicit invalidation for data with clear write paths (user profiles, product details). Use TTL-only for data with complex dependencies (aggregated analytics, computed recommendations) where tracking all invalidation triggers is impractical. Stripe uses explicit invalidation for customer data but TTL-only for fraud score caches because fraud models have hundreds of input signals.

When to Use (and When Not To)

Use cache-aside when:

Your workload is read-heavy (10:1 or higher read:write ratio). Product catalogs, user profiles, configuration data, and content management systems are ideal candidates. The pattern shines when the same data is read repeatedly—if every request is unique, caching provides no benefit.

You can tolerate eventual consistency. The gap between database update and cache refresh must be acceptable for your use case. Social media feeds, search results, and recommendation systems work well. Real-time bidding, financial transactions, and inventory reservations do not.

Your application already handles database queries. Cache-aside requires application-level cache management, so you need control over the data access layer. This works naturally in application code but poorly in legacy systems where the ORM or database driver handles all queries.

Cache failures shouldn’t break writes. Because the cache is not in the write path, your system continues functioning (slower) when Redis is down. This graceful degradation is critical for high-availability systems.

Avoid cache-aside when:

You need read-after-write consistency. If a user updates their profile and immediately views it, they must see the new data. Cache-aside’s invalidation delay breaks this guarantee. Use write-through caching or skip caching for these paths.

Write volume is high relative to reads. If you’re updating data as often as reading it, cache invalidation overhead exceeds the benefit. Metrics, logs, and time-series data are poor candidates.

Your data has complex dependencies. If updating one table requires invalidating cache keys across multiple services, the invalidation logic becomes error-prone. Consider event-driven invalidation or write-behind patterns instead.

You lack application-level cache control. If your framework or ORM manages data access opaquely, implementing cache-aside requires significant refactoring. Evaluate database-level caching (query cache, materialized views) first.

Real-World Examples

Amazon Product Catalog Amazon’s product detail pages serve millions of requests per second using cache-aside with Redis. Each product’s core data (title, price, images, description) is cached with a 5-minute TTL using key pattern product:{asin}. When inventory updates, the warehouse system writes to Aurora PostgreSQL and invalidates the cache key. The 5-minute TTL provides a safety net for missed invalidations. Interesting detail: Amazon uses probabilistic early expiration to prevent cache stampedes on popular products during flash sales. When a product goes viral, the first request to detect TTL < 30 seconds refreshes the cache, preventing thousands of simultaneous database queries.

Facebook News Feed Facebook’s news feed uses cache-aside for individual post data while using more sophisticated patterns for feed assembly. Each post’s content, author info, and engagement metrics are cached in Memcached with keys like post:{post_id}. When a user edits a post, the application updates MySQL and deletes the cache key. The next view fetches fresh data and repopulates the cache. Interesting detail: Facebook uses a “lease” system to prevent stampedes—the first cache miss gets a lease token, queries the database, and stores the result. Concurrent misses see the lease and wait briefly rather than all querying the database. This reduced database load by 25% during traffic spikes.

Stripe API Responses Stripe caches API responses for idempotent requests using cache-aside with a 60-second TTL. When a client makes a GET request for a customer object, Stripe checks Redis with key api:customer:{id}:v{schema_version}. On miss, it queries PostgreSQL and caches the response. The schema version in the key ensures cache invalidation when API response format changes. Interesting detail: Stripe uses write-through updates (not invalidation) for customer objects because the write operation already has the complete new state, eliminating the post-write cache miss. This optimization improved P99 latency by 40ms for customer update operations followed by immediate reads.

Facebook Lease-Based Stampede Prevention

sequenceDiagram
    participant R1 as Request 1<br/>(First Miss)
    participant R2 as Request 2-100<br/>(Concurrent)
    participant Cache as Memcached
    participant DB as MySQL
    
    Note over R1,DB: Popular post cache expires
    
    R1->>Cache: GET post:98765
    Cache-->>R1: MISS
    R1->>Cache: SETNX post:98765:computing 1<br/>EX 10 seconds
    Cache-->>R1: Lease granted
    
    par Concurrent requests arrive
        R2->>Cache: GET post:98765
        Cache-->>R2: MISS
        R2->>Cache: SETNX post:98765:computing
        Cache-->>R2: Lease already held<br/>(key exists)
        R2->>R2: Wait 50ms or<br/>return stale data
    end
    
    R1->>DB: SELECT * FROM posts<br/>WHERE id=98765
    DB-->>R1: Post data
    R1->>Cache: SET post:98765 {data}<br/>EX 300
    R1->>Cache: DEL post:98765:computing
    
    Note over R1,R2: Reduced DB load by 25%<br/>during traffic spikes
    
    R2->>Cache: GET post:98765<br/>(retry after wait)
    Cache-->>R2: HIT - return data

Facebook’s lease system prevents cache stampedes by granting the first cache miss exclusive rights to query the database. Concurrent misses detect the lease (computing flag) and wait briefly rather than all hitting the database. This optimization reduced database load by 25% during viral content spikes while maintaining sub-second response times.


Interview Essentials

Mid-Level

Explain the basic read and write flows: check cache, query database on miss, populate cache; update database, invalidate cache. Discuss why you invalidate rather than update (simpler, avoids race conditions). Describe TTL as a safety net for failed invalidations. Walk through a concrete example like caching user profiles. Demonstrate understanding of cache key design—include all query parameters that affect the result. Explain the stale data window between database update and cache invalidation and why it’s usually acceptable.

Senior

Design a complete cache-aside implementation with stampede prevention. Explain probabilistic early expiration or lease-based approaches and when each is appropriate. Discuss cache key versioning strategies for schema changes. Compare invalidation vs update-on-write trade-offs with specific use cases. Calculate cache hit rate improvements and cost savings: a 95% hit rate at 10,000 req/s saves 9,500 DB queries/second (roughly 820 million queries per day, or about $82/day at $0.10 per million queries). Explain monitoring strategy: track hit rate, miss rate, invalidation failures, TTL distribution. Discuss failure modes: what happens when Redis is down, when invalidation fails, when database is slow.
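The savings arithmetic is worth being able to reproduce on the spot; the request rate, hit rate, and per-query price here are illustrative numbers, not real pricing:

```python
req_per_sec = 10_000
hit_rate = 0.95
cost_per_million_queries = 0.10   # dollars; illustrative pricing

queries_saved_per_sec = req_per_sec * hit_rate            # 9,500 queries/s
queries_saved_per_day = queries_saved_per_sec * 86_400    # ~820.8 million/day
savings_per_day = queries_saved_per_day / 1e6 * cost_per_million_queries
```

In practice the larger savings usually come from provisioning a smaller database tier, which this per-query figure understates.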

Staff+

Architect cache-aside for a multi-region system with consistency requirements. Explain how to handle cache invalidation across regions—do you invalidate all regions or accept temporary inconsistency? Design a cache warming strategy for new deployments to avoid cold start stampedes. Discuss observability: how do you detect cache poisoning, measure staleness impact, identify invalidation bugs? Explain when to graduate from cache-aside to more sophisticated patterns (write-behind for write-heavy, CDC for complex dependencies). Provide a decision framework: given workload characteristics (read/write ratio, consistency requirements, data dependencies), recommend cache-aside vs alternatives. Discuss organizational challenges: how do you prevent cache management bugs across 50 microservices?

Common Interview Questions

Why invalidate instead of updating the cache on writes? (Simpler, avoids race conditions where concurrent updates arrive out of order, eliminates wasted updates for data that won’t be read)

How do you prevent cache stampedes? (Probabilistic early expiration, lease-based locking, serving stale data while refreshing)

What happens if cache invalidation fails? (TTL eventually expires the stale data, monitoring should alert on invalidation failures, some systems use a dead letter queue to retry)

How do you choose TTL values? (Balance between staleness tolerance and cache hit rate, use shorter TTL for frequently changing data, longer for stable data)

Cache-aside vs write-through vs write-behind? (Cache-aside: app controls cache, simpler, stale data window. Write-through: cache in write path, consistent, slower writes. Write-behind: async writes, fast, complex failure modes)

Red Flags to Avoid

Not mentioning TTL as a safety net for failed invalidations—shows lack of production experience with cache failures

Claiming cache-aside provides strong consistency—it doesn’t, there’s always a stale data window

Ignoring cache stampede problem—this causes real production incidents at scale

Not discussing cache key design—poor keys lead to stale data bugs

Suggesting cache-aside for write-heavy workloads—shows poor pattern selection judgment


Key Takeaways

Cache-aside puts the application in control: explicitly check cache on reads, populate on miss, and invalidate on writes. This simplicity makes it the most common caching pattern, but requires careful invalidation logic to prevent stale data bugs.

The pattern trades strict consistency for performance—there’s always a window between database update and cache invalidation where readers might see stale data. TTL acts as a safety net, ensuring stale data eventually expires even if invalidation fails.

Cache stampedes are a real production problem: when a popular cache key expires, concurrent requests can overwhelm the database. Prevent this with probabilistic early expiration, lease-based locking, or serving stale data while refreshing.

Cache key design is critical: include all parameters that affect the query result, use consistent naming conventions, and consider versioning for schema changes. Poor key design leads to stale data bugs that are difficult to debug.

Use cache-aside for read-heavy workloads (10:1 ratio or higher) where eventual consistency is acceptable. Avoid it for write-heavy workloads, read-after-write consistency requirements, or data with complex invalidation dependencies.