Cache Eviction Policies: LRU, LFU, FIFO Compared

intermediate 13 min read Updated 2026-02-11

After this topic, you will be able to:

  • Compare LRU, LFU, FIFO, and MRU eviction policies across different access patterns
  • Evaluate which eviction policy to choose based on workload characteristics
  • Analyze the time and space complexity trade-offs of different eviction algorithms
  • Justify eviction policy selection for specific use cases (e.g., video streaming vs social feed)

TL;DR

Cache eviction policies determine which items to remove when cache capacity is reached. LRU (Least Recently Used) works best for temporal locality, LFU (Least Frequently Used) for frequency-based patterns, FIFO for simple streaming, and MRU for sequential scans. The choice depends on your access pattern: recency-focused workloads favor LRU, frequency-focused favor LFU, and scan-heavy workloads need MRU. Cheat sheet: LRU = recent access matters (social feeds), LFU = popularity matters (trending content), FIFO = simple time-based (logs), MRU = avoid caching scans (analytics queries).

Context

Every cache has finite capacity. When Twitter’s timeline cache fills up, should it evict the oldest tweet, the least popular one, or the one accessed longest ago? This decision directly impacts cache hit rates, which cascade into database load, latency, and infrastructure costs. A 5% improvement in cache hit rate at Netflix scale means millions of dollars in CDN savings annually. The wrong eviction policy can turn your cache into a liability—imagine evicting viral content right as it trends, or keeping stale data while fresh requests miss.

This isn’t academic: Redis defaults to LRU variants, Memcached uses LRU, and CDNs like Cloudflare offer multiple policies because no single approach fits all workloads. Understanding eviction policies means understanding your access patterns. Do users repeatedly view the same content (temporal locality)? Do some items get accessed far more than others (frequency skew)? Are you serving sequential scans that shouldn’t pollute the cache?

The interviewer wants to see you connect workload characteristics to policy selection, not just recite algorithms. They’re testing whether you can diagnose why a cache isn’t performing and prescribe the right fix. At senior levels, they expect you to discuss implementation trade-offs—LRU’s O(1) operations versus LFU’s complexity, or why Facebook built a custom eviction policy for their social graph cache. The stakes are real: Airbnb’s search cache serves millions of queries per second, and choosing the wrong eviction policy would either waste memory on irrelevant results or thrash the cache with constant misses.

Side-by-Side Comparison

Policy Comparison Matrix

Core Policies

LRU (Least Recently Used)

  • What it does: Evicts the item that hasn’t been accessed for the longest time. Maintains a recency-ordered list where every access moves an item to the front.
  • Time complexity: O(1) for both access and eviction with hash map + doubly-linked list
  • Space overhead: O(n) for pointers (8-16 bytes per entry)
  • Best for: Workloads with temporal locality—recently accessed items are likely to be accessed again soon
  • Example: Social media feeds where users scroll recent posts repeatedly
  • Weakness: Doesn’t account for access frequency. A one-time viral spike can evict consistently popular content.

LFU (Least Frequently Used)

  • What it does: Evicts the item with the lowest access count. Tracks frequency counters for each item.
  • Time complexity: O(log n) with priority queue, or O(1) with complex frequency bucketing
  • Space overhead: O(n) for counters plus data structure overhead
  • Best for: Workloads where popularity matters more than recency—some items are consistently hot
  • Example: Video streaming where certain shows (Stranger Things) get watched repeatedly over weeks
  • Weakness: Suffers from “pollution” problem—old popular items accumulate high counts and never evict, even when no longer relevant. New items struggle to gain traction.

FIFO (First-In-First-Out)

  • What it does: Evicts items in insertion order, like a queue. Ignores access patterns entirely.
  • Time complexity: O(1) for both operations
  • Space overhead: Minimal—just insertion timestamps or queue pointers
  • Best for: Time-based data where age correlates with relevance, or when access patterns are uniformly random
  • Example: Log aggregation systems where recent logs matter most, regardless of access
  • Weakness: Completely ignores whether items are being used. Can evict hot data simply because it was cached early.

MRU (Most Recently Used)

  • What it does: Evicts the most recently accessed item—opposite of LRU
  • Time complexity: O(1) with same structure as LRU
  • Space overhead: Same as LRU
  • Best for: Sequential scan workloads where recently accessed items won’t be accessed again
  • Example: Database query cache for analytics—full table scans shouldn’t pollute cache with data that won’t be reused
  • Weakness: Counterintuitive for most workloads. Only makes sense in specific scan scenarios.

Advanced Variants

LRU-K (K-distance LRU)

  • Tracks the timestamp of the Kth most recent access (typically K=2). Evicts items with the oldest Kth access.
  • Why it matters: Solves LRU’s one-hit-wonder problem. A single access doesn’t immediately make an item “hot.”
  • Used by: PostgreSQL buffer cache (with K=2)
  • Trade-off: Requires tracking K timestamps per item—more memory overhead

2Q (Two Queue)

  • Maintains two LRU queues: A1 (probationary) for new items, Am (main) for proven hot items
  • Items graduate from A1 to Am after second access. Eviction happens from A1 first.
  • Why it matters: Protects against cache pollution from scans while keeping truly hot items
  • Trade-off: More complex implementation, tuning the queue size ratio

ARC (Adaptive Replacement Cache)

  • Dynamically balances between recency (LRU) and frequency (LFU) based on workload
  • Maintains four lists and adapts the balance point based on recent eviction performance
  • Why it matters: Self-tuning—adapts to workload changes without manual configuration
  • Used by: ZFS filesystem cache
  • Trade-off: Patent-encumbered (IBM), complex implementation, higher overhead

LRU Cache Implementation: Hash Map + Doubly-Linked List

graph LR
    subgraph Hash Map
        K1["Key: 'user_123'"]
        K2["Key: 'post_456'"]
        K3["Key: 'feed_789'"]
    end
    
    subgraph Doubly-Linked List - Ordered by Recency
        Head["HEAD<br/><i>Most Recent</i>"]
        N1["Node: 'feed_789'<br/>Value: {...}<br/>prev | next"]
        N2["Node: 'post_456'<br/>Value: {...}<br/>prev | next"]
        N3["Node: 'user_123'<br/>Value: {...}<br/>prev | next"]
        Tail["TAIL<br/><i>Least Recent</i>"]
    end
    
    K1 -."O(1) pointer".-> N3
    K2 -."O(1) pointer".-> N2
    K3 -."O(1) pointer".-> N1
    
    Head --> N1
    N1 <--> N2
    N2 <--> N3
    N3 --> Tail
    
    Access["Access 'post_456'"] --"1. Hash lookup O(1)"--> K2
    K2 --"2. Get node pointer"--> N2
    N2 --"3. Remove from current position<br/>4. Insert at head"--> Head
    
    Evict["Eviction needed"] --"Remove from tail O(1)"--> N3

LRU achieves O(1) operations by combining a hash map for fast lookups with a doubly-linked list for maintaining recency order. Every access moves the node to the head; eviction removes from the tail. This requires ~40 bytes per entry (pointers + hash map overhead) but enables constant-time cache operations.
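The structure in the diagram can be sketched in Python. This is an illustrative implementation of the general hash-map-plus-linked-list technique, not any particular system’s code:

```python
class _Node:
    __slots__ = ("key", "value", "prev", "next")

    def __init__(self, key=None, value=None):
        self.key, self.value = key, value
        self.prev = self.next = None


class LRUCache:
    """LRU cache: hash map for O(1) lookup, doubly-linked list for recency."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.map = {}                                  # key -> node
        self.head = _Node()                            # sentinel: most recent side
        self.tail = _Node()                            # sentinel: least recent side
        self.head.next, self.tail.prev = self.tail, self.head

    def _unlink(self, node):
        node.prev.next, node.next.prev = node.next, node.prev

    def _push_front(self, node):
        node.prev, node.next = self.head, self.head.next
        self.head.next.prev = node
        self.head.next = node

    def get(self, key):
        node = self.map.get(key)
        if node is None:
            return None
        self._unlink(node)                             # every access moves the node
        self._push_front(node)                         # to the head (most recent)
        return node.value

    def put(self, key, value):
        if key in self.map:
            node = self.map[key]
            node.value = value
            self._unlink(node)
            self._push_front(node)
            return
        if len(self.map) >= self.capacity:
            lru = self.tail.prev                       # eviction: remove from tail
            self._unlink(lru)
            del self.map[lru.key]
        node = _Node(key, value)
        self.map[key] = node
        self._push_front(node)
```

Example: with capacity 2, inserting `a` and `b`, reading `a`, then inserting `c` evicts `b`, because `b` is now the least recently used entry.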

Cache Eviction Policies: Access Pattern Behavior Comparison

graph TB
    subgraph LRU - Temporal Locality
        LRU_Access["Access Sequence:<br/>A, B, C, D, A, B"]
        LRU_Cache["Cache State:<br/>[B, A, D, C]<br/><i>Recent items at front</i>"]
        LRU_Evict["Next Eviction: C<br/><i>Oldest access time</i>"]
        LRU_Access --> LRU_Cache --> LRU_Evict
    end
    
    subgraph LFU - Frequency Skew
        LFU_Access["Access Sequence:<br/>A, B, C, A, A, B"]
        LFU_Cache["Cache State:<br/>A:3, B:2, C:1<br/><i>Frequency counters</i>"]
        LFU_Evict["Next Eviction: C<br/><i>Lowest frequency</i>"]
        LFU_Access --> LFU_Cache --> LFU_Evict
    end
    
    subgraph FIFO - Insertion Order
        FIFO_Access["Access Sequence:<br/>A, B, C, D, A, B"]
        FIFO_Cache["Cache State:<br/>[A, B, C, D]<br/><i>Insertion order preserved</i>"]
        FIFO_Evict["Next Eviction: A<br/><i>First inserted, ignores access</i>"]
        FIFO_Access --> FIFO_Cache --> FIFO_Evict
    end
    
    subgraph MRU - Sequential Scans
        MRU_Access["Scan Sequence:<br/>Row1, Row2, Row3, Row4"]
        MRU_Cache["Cache State:<br/>[Row1, Row2, Row3, Row4]<br/><i>Recent scan data</i>"]
        MRU_Evict["Next Eviction: Row4<br/><i>Most recent, won't reuse</i>"]
        MRU_Access --> MRU_Cache --> MRU_Evict
    end

Each eviction policy responds differently to the same access patterns. LRU tracks recency, LFU tracks frequency, FIFO ignores access entirely, and MRU evicts recent items to prevent scan pollution. The choice depends on whether your workload exhibits temporal locality, frequency skew, time-based relevance, or sequential scanning.

Deep Analysis

LRU: The Industry Standard

LRU dominates production systems because it balances effectiveness with implementation simplicity. The classic implementation uses a hash map for O(1) lookups and a doubly-linked list for O(1) reordering. Every cache access moves the item to the list head; eviction removes from the tail. Redis implements “approximate LRU” by sampling a small subset of keys rather than maintaining perfect ordering—this trades slight accuracy for massive memory savings (no pointers needed). The approximation works because you don’t need perfect LRU ordering; you just need to avoid evicting hot items. Twitter’s timeline cache uses LRU because users repeatedly check recent tweets—temporal locality is strong. The weakness emerges during traffic spikes: a celebrity tweet that gets accessed once by millions of users can evict consistently popular content, even though those spike accesses won’t repeat. This is the “one-hit-wonder” problem.

LFU: When Popularity Trumps Recency

LFU shines when certain items remain popular over long periods. Netflix’s video metadata cache (titles, thumbnails, descriptions) uses frequency-based eviction because popular shows stay popular for weeks. Implementing efficient LFU is harder than LRU. A naive approach using a min-heap gives O(log n) eviction. The optimal approach uses frequency buckets—a hash map where keys are frequencies and values are doubly-linked lists of items with that frequency. This achieves O(1) operations but requires careful bookkeeping when incrementing frequencies. The “pollution problem” is LFU’s Achilles heel: items that were popular months ago still have high counts, preventing new trending content from staying cached. Solutions include periodic counter decay (halve all counts every hour) or time-windowed counting (only count accesses in the last N hours). Spotify’s playlist cache uses a variant that decays frequency over time to handle shifting music trends.
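The “halve all counts” style of decay mentioned above can be sketched as a periodic pass over the counters. This is a minimal illustration of the idea, assuming a simple dict of per-key counts; real systems often decay lazily or per-access instead of in one sweep:

```python
def decay_counters(freq, factor=0.5, floor=1):
    """Periodically decay LFU counters so stale popularity fades.

    freq: dict mapping key -> access count. Counts are scaled by `factor`
    but never drop below `floor`, so entries stay comparable.
    """
    for key in freq:
        freq[key] = max(floor, int(freq[key] * factor))
    return freq
```

Run hourly, this lets newly trending items overtake content that was popular months ago, directly addressing the pollution problem.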

FIFO: Simplicity for Specific Scenarios

FIFO is rarely optimal but sometimes sufficient. It works when access patterns are truly random or when data has a natural time-to-live. Log aggregation systems use FIFO because recent logs are inherently more valuable—access patterns don’t matter. The implementation is trivial: a circular buffer or queue. Cloudflare’s edge cache offers FIFO as an option for customers serving time-sensitive content like news articles, where age is the primary relevance signal. The danger is that FIFO ignores usage entirely—your most-accessed item can be evicted simply because it was cached early. This makes FIFO unsuitable for general-purpose caching.

MRU: The Counterintuitive Choice

MRU seems backwards—why evict what you just used? It makes sense for sequential scans that exhibit anti-temporal locality. Consider a database query that scans an entire table: each row is accessed once and never again. With LRU, this scan would fill the cache with useless data, evicting actually hot items. MRU evicts the scan data immediately, protecting the cache. Database buffer pools use MRU for full table scans while using LRU for index lookups. The key insight: not all accesses are equal. A scan access shouldn’t promote an item to “hot” status. Modern systems detect scan patterns and apply MRU selectively rather than globally.

Advanced Variants: Solving Real Problems

LRU-K emerged from database research to solve the one-hit-wonder problem. By requiring K accesses before considering an item “hot,” it filters out temporary spikes. PostgreSQL uses 2-distance LRU (tracks the last two access times) for its shared buffer cache. This prevents a single sequential scan from evicting the entire working set. The cost is K timestamps per item—for K=2, that’s 16 extra bytes per cache entry.
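The core LRU-K bookkeeping can be sketched as follows. This is a simplified tracker of the eviction rule only (track the last K access times; evict the key whose Kth most recent access is oldest, treating keys with fewer than K accesses as coldest), with class and method names invented for illustration:

```python
from collections import deque


class LRUKTracker:
    """Sketch of LRU-K eviction bookkeeping (default K=2).

    Keys with fewer than K recorded accesses are treated as coldest,
    so one-hit wonders are evicted before established hot items.
    """

    def __init__(self, k=2):
        self.k = k
        self.history = {}  # key -> deque of the last K access times

    def record_access(self, key, now):
        h = self.history.setdefault(key, deque(maxlen=self.k))
        h.append(now)

    def eviction_victim(self):
        def k_distance(key):
            h = self.history[key]
            # Time of the K-th most recent access; -inf if under K accesses.
            return h[0] if len(h) == self.k else float("-inf")

        return min(self.history, key=k_distance)
```

A page touched once by a sequential scan never accumulates K accesses, so it is always preferred for eviction over genuinely hot pages.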

2Q takes a different approach: new items enter a probationary queue (A1) and only graduate to the main queue (Am) after a second access. This two-tier system protects against pollution while being simpler than LRU-K. The tuning challenge is setting the A1/Am size ratio—too small A1 and you don’t filter enough; too large and you waste space on cold items.
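The probation-and-promotion flow can be sketched like this. It is a simplified 2Q: the full algorithm also keeps a ghost queue (A1out) of recently evicted keys, omitted here, and queue sizes are illustrative assumptions:

```python
from collections import OrderedDict


class TwoQueueCache:
    """Simplified 2Q sketch: new keys enter a probationary FIFO (A1);
    a second access promotes them to the main LRU queue (Am)."""

    def __init__(self, a1_size, am_size):
        self.a1 = OrderedDict()  # probationary, FIFO order
        self.am = OrderedDict()  # main, LRU order
        self.a1_size, self.am_size = a1_size, am_size

    def get(self, key):
        if key in self.am:
            self.am.move_to_end(key)            # LRU touch
            return self.am[key]
        if key in self.a1:
            value = self.a1.pop(key)            # second access: promote to Am
            if len(self.am) >= self.am_size:
                self.am.popitem(last=False)     # evict LRU from main queue
            self.am[key] = value
            return value
        return None

    def put(self, key, value):
        if key in self.am:
            self.am[key] = value
            self.am.move_to_end(key)
        elif key in self.a1:
            self.a1[key] = value
        else:
            if len(self.a1) >= self.a1_size:
                self.a1.popitem(last=False)     # evict oldest probationary key
            self.a1[key] = value
```

A scan that touches each key once fills and drains A1 without ever displacing anything in Am, which is the scan-resistance property described above.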

ARC is the most sophisticated: it adapts the balance between recency and frequency based on workload. It maintains four lists (recent singles, recent multiples, frequent singles, frequent multiples) and dynamically adjusts their sizes based on which evictions would have been hits. ZFS uses ARC because filesystem workloads vary dramatically—sometimes sequential (backups), sometimes random (databases). The self-tuning eliminates manual configuration. The downside is complexity: ARC’s implementation spans thousands of lines and requires careful tuning of adaptation parameters.

LFU Frequency Bucket Implementation for O(1) Operations

graph TB
    subgraph Frequency Buckets - Doubly-Linked Lists
        F1["Freq=1<br/>[item_C ↔ item_D]"]
        F2["Freq=2<br/>[item_B]"]
        F3["Freq=3<br/>[item_A]"]
        MinFreq["Min Frequency: 1<br/><i>Track for O(1) eviction</i>"]
    end
    
    subgraph Key to Node Map
        KA["'item_A' → Node@Freq3"]
        KB["'item_B' → Node@Freq2"]
        KC["'item_C' → Node@Freq1"]
        KD["'item_D' → Node@Freq1"]
    end
    
    KA -.-> F3
    KB -.-> F2
    KC -.-> F1
    KD -.-> F1
    
    Access["Access 'item_B'"] --"1. Lookup in map O(1)"--> KB
    KB --"2. Get current freq (2)"--> F2
    F2 --"3. Remove from freq=2 list O(1)"--> Update["4. Increment freq to 3<br/>5. Insert into freq=3 list O(1)"]
    Update --> F3
    
    Eviction["Eviction Needed"] --"1. Check min frequency (1)"--> MinFreq
    MinFreq --"2. Remove tail from freq=1 O(1)"--> F1
    F1 --"Evict 'item_D'"--> Result["Update min freq if<br/>freq=1 list now empty"]

Efficient LFU uses frequency buckets (hash map of doubly-linked lists) to achieve O(1) operations. Each frequency has a list of items with that count. Incrementing frequency requires removing from one list and inserting into the next. Tracking the minimum frequency enables O(1) eviction. This avoids the O(log n) cost of a min-heap but requires careful bookkeeping.
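The bucket scheme in the diagram can be sketched in Python using an `OrderedDict` per frequency (standing in for the doubly-linked lists). This is an illustrative implementation of the technique, not any production cache’s code:

```python
from collections import defaultdict, OrderedDict


class LFUCache:
    """O(1) LFU via frequency buckets and a tracked minimum frequency."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.key_value = {}
        self.key_freq = {}
        self.buckets = defaultdict(OrderedDict)  # freq -> keys, oldest first
        self.min_freq = 0

    def _touch(self, key):
        """Move key from its current frequency bucket to the next one up."""
        freq = self.key_freq[key]
        del self.buckets[freq][key]
        if not self.buckets[freq]:
            del self.buckets[freq]
            if self.min_freq == freq:
                self.min_freq = freq + 1
        self.key_freq[key] = freq + 1
        self.buckets[freq + 1][key] = None

    def get(self, key):
        if key not in self.key_value:
            return None
        self._touch(key)
        return self.key_value[key]

    def put(self, key, value):
        if self.capacity <= 0:
            return
        if key in self.key_value:
            self.key_value[key] = value
            self._touch(key)
            return
        if len(self.key_value) >= self.capacity:
            # O(1) eviction: oldest key in the minimum-frequency bucket.
            evict_key, _ = self.buckets[self.min_freq].popitem(last=False)
            del self.key_value[evict_key]
            del self.key_freq[evict_key]
        self.key_value[key] = value
        self.key_freq[key] = 1
        self.buckets[1][key] = None
        self.min_freq = 1
```

All operations are dict and `OrderedDict` manipulations, so both access and eviction stay O(1); the careful bookkeeping mentioned above lives in `_touch`, which must keep `min_freq` consistent when a bucket empties.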

Implementation Complexity

Data Structures and Algorithmic Trade-offs

LRU Implementation: The canonical approach combines a hash map (key → node pointer) with a doubly-linked list (ordered by recency). Access: O(1) hash lookup, O(1) list reordering (remove node, insert at head). Eviction: O(1) remove tail. Space: 16 bytes per entry for prev/next pointers plus hash map overhead (typically 24 bytes per entry). Total: ~40 bytes per key beyond the cached value. For a 1M-entry cache, that’s 40MB of metadata. Redis’s approximate LRU eliminates pointers by sampling: on eviction, randomly sample N keys (default 5), evict the one with the oldest access time. This reduces metadata to 3 bytes per key (24-bit timestamp) but means eviction isn’t perfectly LRU—you might evict the 6th-least-recently-used item. In practice, this barely affects hit rates while saving massive memory.
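The sampling idea can be sketched in a few lines. This illustrates the general approach only, assuming last-access timestamps are stored per key; it is not Redis’s actual implementation:

```python
import random


def evict_approx_lru(access_times, sample_size=5):
    """Approximate LRU eviction by sampling.

    access_times: dict mapping key -> last-access timestamp.
    Randomly sample up to `sample_size` keys and evict the one with
    the oldest access time; no global recency ordering is maintained.
    """
    sample = random.sample(list(access_times), min(sample_size, len(access_times)))
    victim = min(sample, key=lambda k: access_times[k])
    del access_times[victim]
    return victim
```

Because hot keys have recent timestamps, they almost never lose the within-sample comparison, which is why the approximation barely hurts hit rates in practice.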

LFU Implementation: Naive approach uses a min-heap keyed by frequency: O(log n) eviction, O(log n) frequency updates. This is too slow for high-throughput caches. The optimal approach uses frequency buckets: a hash map where keys are frequencies (1, 2, 3…) and values are doubly-linked lists of items with that frequency. Another hash map maps cache keys to their frequency list nodes. Access: O(1) to find node, O(1) to remove from current frequency list, O(1) to insert into next frequency list. Eviction: O(1) to find minimum frequency bucket, O(1) to remove tail. Space: ~48 bytes per entry (two hash maps plus list pointers). The complexity is in maintaining consistency—incrementing frequency requires moving nodes between lists atomically. Memcached’s LFU implementation uses 8-bit counters with probabilistic increment (increment with probability 1/current_count) to prevent counter overflow and provide natural decay.
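The probabilistic-increment trick can be sketched as follows. This is an illustration of the idea described above (increment with probability 1/count, capped at 8 bits), not Memcached’s actual code:

```python
import random


def probabilistic_increment(counter, max_count=255):
    """Increment an 8-bit-style counter with probability 1/counter.

    High counts grow ever more slowly, so the counter approximates a
    logarithm of the true access count and cannot overflow past max_count.
    """
    if counter >= max_count:
        return counter
    if counter == 0 or random.random() < 1.0 / counter:
        return counter + 1
    return counter
```

An item accessed a million times might end up with a counter around 20, yet the relative ordering of hot versus cold items is preserved, which is all eviction needs.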

FIFO Implementation: Trivial—a queue or circular buffer. Insertion adds to tail, eviction removes from head. O(1) for both, minimal space overhead (just queue pointers). The hash map for key lookups is still needed, so total overhead is ~24 bytes per entry.

MRU Implementation: Identical structure to LRU (hash map + doubly-linked list) but evicts from head instead of tail. Same O(1) operations, same space overhead.

Practical Considerations: Lock contention is the real bottleneck in multi-threaded caches. A single global lock on the LRU list becomes a bottleneck at high concurrency. Solutions include lock striping (partition the cache into segments, each with its own lock) or lock-free data structures (using compare-and-swap). Redis is single-threaded, avoiding this entirely. Memcached uses fine-grained locking per slab class. The choice of eviction policy interacts with concurrency: LFU’s frequency updates create more write contention than LRU’s access-time updates.
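Lock striping can be sketched as follows. This minimal example partitions a plain dict by key hash, with segment count chosen arbitrarily; a real striped cache would also run its eviction policy per segment:

```python
import threading


class StripedCache:
    """Sketch of lock striping: keys are partitioned across segments,
    each guarded by its own lock, so threads touching different
    segments never contend."""

    def __init__(self, num_segments=16):
        self.segments = [{} for _ in range(num_segments)]
        self.locks = [threading.Lock() for _ in range(num_segments)]

    def _index(self, key):
        return hash(key) % len(self.segments)

    def put(self, key, value):
        i = self._index(key)
        with self.locks[i]:                 # only this segment is locked
            self.segments[i][key] = value

    def get(self, key):
        i = self._index(key)
        with self.locks[i]:
            return self.segments[i].get(key)
```

With 16 segments, worst-case contention drops to roughly 1/16th of a single global lock, at the cost of eviction decisions becoming local to each segment.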

Decision Framework

Choosing the Right Eviction Policy

Start with your access pattern:

  1. Strong temporal locality (recently accessed items are accessed again soon): Use LRU

    • Social media feeds, user session data, recent search results
    • Example: Twitter timeline cache—users refresh to see new tweets, repeatedly accessing recent content
  2. Frequency skew (some items are consistently popular): Use LFU or LRU-K

    • Video streaming, product catalogs, trending content
    • Example: Netflix show metadata—popular shows stay popular for weeks
    • Caveat: Add counter decay if popularity shifts over time (music streaming, news)
  3. Time-based relevance (age matters more than access): Use FIFO

    • Log aggregation, time-series data, news articles
    • Example: Metrics dashboard caching recent data points
  4. Sequential scans (large scans that shouldn’t pollute cache): Use MRU or scan-resistant variant

    • Database query cache, analytics workloads
    • Example: Data warehouse queries that scan entire tables

Consider workload characteristics:

  • Bursty traffic: LRU-K or 2Q to filter one-time spikes

    • E-commerce during flash sales—temporary product page views shouldn’t evict consistently popular items
  • Shifting patterns: ARC or LFU with decay

    • News sites where trending topics change daily
  • Mixed workload: Adaptive policies (ARC) or separate caches with different policies

    • Airbnb search: recent searches (LRU) + popular destinations (LFU) in separate cache tiers

Implementation constraints:

  • Memory-constrained: Approximate LRU (Redis-style sampling) or FIFO

    • Embedded systems, mobile apps
  • High concurrency: Simpler policies (FIFO, approximate LRU) to reduce lock contention

    • High-throughput APIs serving thousands of requests per second
  • Ease of implementation: Start with LRU, optimize later if needed

    • Most workloads have temporal locality; LRU is a safe default

Red flags:

  • Using FIFO for general-purpose caching (ignores access patterns)
  • Using LFU without counter decay (old items never evict)
  • Using LRU for scan-heavy workloads (cache pollution)
  • Choosing complex policies (ARC, LRU-K) without measuring that LRU is insufficient

Decision tree: Does your workload have sequential scans? → Yes: MRU or scan-resistant variant. No: Do some items stay popular for long periods? → Yes: LFU with decay. No: Is access pattern random? → Yes: FIFO. No: Default to LRU.
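The decision tree above can be encoded as a small function, useful for making the branch order explicit. The boolean flags are illustrative names; real selection should always be validated by measuring hit rates:

```python
def choose_eviction_policy(sequential_scans=False, long_term_popularity=False,
                           popularity_shifts=False, random_access=False,
                           bursty_spikes=False):
    """Encode the eviction-policy decision tree; checks run in priority order."""
    if sequential_scans:
        return "MRU"                       # protect the cache from scan pollution
    if long_term_popularity:
        return "LFU with decay" if popularity_shifts else "LFU"
    if random_access:
        return "FIFO"                      # access pattern carries no signal
    if bursty_spikes:
        return "LRU-K or 2Q"               # filter one-time spikes
    return "LRU"                           # safe default: temporal locality
```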

Eviction Policy Selection Decision Tree

flowchart TB
    Start["Analyze Access Pattern"] --> Scan{"Sequential scans<br/>that shouldn't<br/>pollute cache?"}
    
    Scan -->|Yes| MRU["Use MRU or<br/>Scan-Resistant Variant<br/><i>Example: Analytics queries</i>"]
    
    Scan -->|No| Popular{"Some items<br/>consistently popular<br/>over long periods?"}
    
    Popular -->|Yes| Shift{"Popularity<br/>shifts over time?"}
    Shift -->|Yes| LFU_Decay["Use LFU with<br/>Counter Decay<br/><i>Example: Music streaming</i>"]
    Shift -->|No| LFU["Use LFU<br/><i>Example: Video metadata</i>"]
    
    Popular -->|No| Random{"Access pattern<br/>uniformly random<br/>or time-based?"}
    
    Random -->|Yes| FIFO["Use FIFO<br/><i>Example: Log aggregation</i>"]
    
    Random -->|No| Spike{"Bursty traffic<br/>with one-time<br/>access spikes?"}
    
    Spike -->|Yes| LRU_K["Use LRU-K or 2Q<br/><i>Example: E-commerce flash sales</i>"]
    
    Spike -->|No| Default["Use LRU<br/><i>Default: Temporal locality</i><br/><i>Example: Social media feeds</i>"]
    
    MRU --> Measure["Monitor Hit Rate<br/>& Eviction Patterns"]
    LFU_Decay --> Measure
    LFU --> Measure
    FIFO --> Measure
    LRU_K --> Measure
    Default --> Measure
    
    Measure --> Optimize{"Hit rate<br/>acceptable?"}
    Optimize -->|No| Tune["Tune cache size,<br/>admission policy,<br/>or try adaptive (ARC)"]
    Optimize -->|Yes| Done["Production Ready"]

Select eviction policy by analyzing workload characteristics in order: sequential scans (MRU), frequency skew (LFU), random access (FIFO), traffic spikes (LRU-K), or default to LRU for temporal locality. Always measure hit rate in production—the decision tree provides a starting point, but real-world performance determines the final choice.

Real-World Examples

Twitter: LRU for Timeline Caching

Twitter’s timeline cache uses LRU because users exhibit strong temporal locality—they repeatedly check recent tweets. When you refresh your timeline, you’re likely to see the same tweets you saw 5 minutes ago, plus new ones. LRU keeps these recent tweets cached. The interesting detail: Twitter uses a two-tier cache. The L1 cache (in-memory, LRU) holds timelines for active users. The L2 cache (Redis, also LRU) holds timelines for recently active users. When a user hasn’t checked their timeline in hours, it’s evicted from L1 but stays in L2. This tiered approach balances hit rate with memory cost. During major events (Super Bowl, elections), traffic spikes cause cache thrashing—millions of users suddenly accessing timelines they haven’t checked in weeks. Twitter’s solution: temporarily increase cache size and use admission policies to prevent one-time accesses from evicting regular users’ timelines.

Airbnb: Hybrid Eviction for Search Results

Airbnb’s search cache uses a hybrid approach because search workloads have mixed patterns. Popular destinations (Paris, New York) are searched constantly—frequency matters. But users also search obscure locations once—these shouldn’t pollute the cache. Airbnb’s solution: a two-tier cache with different eviction policies. The “hot destinations” cache uses LFU with hourly counter decay to handle seasonal shifts (ski resorts in winter, beaches in summer). The “recent searches” cache uses LRU to serve users refining their search. The interesting detail: Airbnb tracks cache performance per destination and dynamically adjusts which tier a destination belongs to. If a previously cold destination suddenly gets hot (maybe a celebrity posted about it), it graduates to the LFU tier. This adaptive approach improved cache hit rates by 12% compared to pure LRU, reducing backend load significantly during peak booking season.

PostgreSQL: LRU-K for Buffer Cache

PostgreSQL’s shared buffer cache uses 2-distance LRU (a variant of LRU-K with K=2) to handle mixed database workloads. The problem: a single sequential scan (SELECT * FROM large_table) would evict the entire working set with standard LRU. The solution: track the last two access times for each buffer page. A page only becomes “hot” after its second access. Sequential scans access each page once, so they never become hot and are evicted first. Frequently accessed index pages get accessed repeatedly, so they stay cached. The interesting detail: PostgreSQL uses a clock-sweep algorithm (circular buffer with a “clock hand”) rather than a true doubly-linked list for efficiency. Each buffer has a usage counter that increments on access (up to 5) and decrements when the clock hand passes. This approximation avoids the overhead of maintaining perfect LRU ordering while achieving similar behavior. The 2-distance LRU policy improved TPC-C benchmark performance by 30% compared to standard LRU by preventing scan pollution.
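The clock-sweep idea described above can be sketched like this. It is an illustration of the general algorithm (usage counters bumped on access, decremented as the hand passes, evict at zero), not PostgreSQL’s buffer manager:

```python
class ClockSweep:
    """Clock-sweep approximation of LRU over a fixed set of buffer slots."""

    def __init__(self, slots, max_usage=5):
        self.usage = [0] * slots   # per-slot usage counter
        self.hand = 0              # position of the clock hand
        self.max_usage = max_usage

    def access(self, slot):
        # Bump the counter on access, capped (PostgreSQL-style cap of 5).
        self.usage[slot] = min(self.usage[slot] + 1, self.max_usage)

    def find_victim(self):
        # Sweep: decrement counters until a slot with usage 0 is found.
        while True:
            if self.usage[self.hand] == 0:
                victim = self.hand
                self.hand = (self.hand + 1) % len(self.usage)
                return victim
            self.usage[self.hand] -= 1
            self.hand = (self.hand + 1) % len(self.usage)
```

Frequently accessed slots keep their counters topped up and survive multiple sweeps, while once-touched scan pages hit zero quickly, giving LRU-like behavior without any linked-list maintenance.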

Airbnb’s Hybrid Cache Architecture with Multi-Tier Eviction

graph TB
    subgraph Client Layer
        User["User Search:<br/>'Paris hotels'"]
    end
    
    subgraph Hot Destinations Cache - LFU with Decay
        LFU_Cache["Popular Destinations<br/>Paris: freq=1000<br/>NYC: freq=850<br/>London: freq=720<br/><i>Hourly counter decay</i>"]
        LFU_Policy["Eviction: Lowest frequency<br/>Decay: freq = freq * 0.9 every hour"]
    end
    
    subgraph Recent Searches Cache - LRU
        LRU_Cache["User Recent Searches<br/>[Obscure_Town_A]<br/>[Small_Village_B]<br/>[Remote_Island_C]<br/><i>Recency-ordered</i>"]
        LRU_Policy["Eviction: Least recently used"]
    end
    
    subgraph Adaptive Promotion Logic
        Monitor["Track per-destination<br/>search frequency"]
        Promote["If freq > threshold:<br/>Graduate to Hot tier"]
        Demote["If freq drops:<br/>Move to Recent tier"]
    end
    
    subgraph Backend
        SearchAPI["Search Service<br/><i>PostgreSQL</i>"]
    end
    
    User --"1. Search request"--> LFU_Cache
    LFU_Cache --"2. Cache hit (Paris)"--> User
    
    User --"3. Search obscure location"--> LRU_Cache
    LRU_Cache --"4. Cache miss"--> SearchAPI
    SearchAPI --"5. Query results"--> LRU_Cache
    LRU_Cache --"6. Cache & return"--> User
    
    LFU_Cache -."Metrics".-> Monitor
    LRU_Cache -."Metrics".-> Monitor
    Monitor --> Promote
    Monitor --> Demote
    Promote -."Move destination".-> LFU_Cache
    Demote -."Move destination".-> LRU_Cache
    
    LFU_Cache --> LFU_Policy
    LRU_Cache --> LRU_Policy
    
    Result["Result: 12% hit rate improvement<br/>vs pure LRU during peak season"]

Airbnb’s search cache uses a hybrid two-tier approach: LFU with hourly decay for consistently popular destinations (Paris, New York) and LRU for one-off searches of obscure locations, with adaptive promotion moving destinations between tiers as their popularity shifts. This combination improved hit rates by 12% over pure LRU during peak booking season.


Interview Essentials

Mid-Level

Explain LRU, LFU, and FIFO clearly with examples. Describe LRU’s O(1) implementation using hash map + doubly-linked list. Discuss when each policy is appropriate: LRU for temporal locality, LFU for frequency skew, FIFO for time-based data. Be ready to implement LRU in code (common interview question). Understand the one-hit-wonder problem with LRU and how it affects cache hit rates during traffic spikes. Know that Redis uses approximate LRU and why (memory savings).

Senior

Compare eviction policies across multiple dimensions: time/space complexity, implementation difficulty, workload suitability. Discuss advanced variants (LRU-K, 2Q, ARC) and why they exist—what problems do they solve? Explain cache pollution and scan resistance. Analyze real-world trade-offs: why would you choose approximate LRU over perfect LRU? Discuss how eviction policy interacts with cache warming, admission policies, and multi-tier caching. Be ready to design a cache for a specific system (e.g., video streaming, social feed) and justify your eviction policy choice. Understand how to measure eviction policy effectiveness (hit rate, eviction rate, memory efficiency).

Staff+

Architect multi-tier caching strategies with different eviction policies per tier. Discuss adaptive eviction policies that respond to workload changes. Analyze the interaction between eviction policy and system-wide behavior: how does eviction policy affect database load, tail latency, and cost? Propose custom eviction policies for specific workloads (e.g., time-decayed LFU for news, scan-resistant LRU for analytics). Discuss implementation at scale: lock contention, memory overhead, monitoring, and tuning. Explain how to A/B test eviction policies in production safely. Understand the economics: calculate the ROI of improving cache hit rate by 5% (reduced database load, lower latency, cost savings). Be ready to debug cache performance issues: how would you diagnose whether eviction policy is the problem?

Common Interview Questions

Implement an LRU cache with O(1) get and put operations (classic coding question)

When would you choose LFU over LRU? Give a concrete example.

How does Redis implement approximate LRU, and why is it better than perfect LRU?

Design a cache for a video streaming service. What eviction policy would you use and why?

What is the cache pollution problem, and how do LRU-K and 2Q solve it?

How would you handle a workload with both sequential scans and random access?

Explain the trade-offs between LRU and FIFO. When is FIFO acceptable?

How would you measure whether your eviction policy is working well?

What happens to cache hit rate during a traffic spike with LRU? How would you mitigate this?

Design an eviction policy for a news site where trending topics change hourly.

Red Flags to Avoid

Not knowing LRU’s O(1) implementation (hash map + doubly-linked list)—this is fundamental

Claiming one eviction policy is always best—it depends on workload

Not understanding the one-hit-wonder problem with LRU

Suggesting complex policies (ARC, LRU-K) without justifying why LRU is insufficient

Ignoring implementation complexity—saying ‘use LFU’ without discussing the O(log n) naive approach vs. O(1) optimized approach

Not connecting eviction policy to business metrics (hit rate, latency, cost)

Forgetting about concurrency—eviction policies interact with locking strategies

Not considering memory overhead—pointers and metadata matter at scale

Treating cache as a black box—not understanding how eviction affects downstream systems (database load)


Key Takeaways

LRU is the default choice for most workloads because it handles temporal locality well, has O(1) operations, and is simple to implement. Use it unless you have a specific reason not to. Redis’s approximate LRU shows that perfect eviction ordering isn’t necessary—sampling works fine and saves memory.

Match eviction policy to access pattern: LRU for recency (social feeds), LFU for popularity (video streaming), FIFO for time-based data (logs), MRU for scans (analytics). The wrong policy can halve your hit rate. Measure your workload before choosing—don’t guess.

Advanced variants solve specific problems: LRU-K and 2Q prevent cache pollution from one-time accesses (traffic spikes, sequential scans). ARC adapts to changing workloads. Use these when standard LRU demonstrably fails, not as premature optimization. The added complexity must be justified by measurable improvement.

Implementation complexity matters at scale: LRU’s doubly-linked list requires 40+ bytes per entry. LFU’s frequency buckets are complex to maintain correctly. Approximate algorithms (sampling, clock-sweep) trade slight accuracy for massive simplicity. Concurrency is the real bottleneck—lock contention on eviction structures limits throughput more than the policy itself.

Eviction policy is one piece of the puzzle: It interacts with cache size, admission policies, warming strategies, and multi-tier architectures. A 5% hit rate improvement at Netflix scale means millions in savings, but you need monitoring to measure it. Design for observability: track hit rate, eviction rate, and latency per policy to know if it’s working.