Database Caching: Query Cache, Buffer Pool & Redis
TL;DR
Database caching stores frequently accessed data in memory layers within the database itself—query cache, buffer pool, and materialized views—to reduce disk I/O and query execution time. Unlike application-level caching (Redis/Memcached), database caching is transparent to applications but requires careful configuration to balance memory usage against invalidation complexity. Cheat sheet: Query cache stores result sets; buffer pool caches data pages; materialized views precompute expensive queries.
The Analogy
Think of database caching like a restaurant kitchen’s mise en place. The query cache is like pre-portioned sauces ready to serve (complete dishes stored). The buffer pool is your prep station with chopped vegetables and marinated proteins (raw ingredients ready for assembly). Materialized views are like pre-cooked components you reheat during service (expensive prep done once, served many times). Without these layers, every order would require starting from scratch—washing, chopping, cooking—making service impossibly slow during dinner rush.
Why This Matters in Interviews
Interviewers use database caching to assess whether you understand the full caching hierarchy, not just Redis. Senior candidates should explain why MySQL’s query cache was deprecated (invalidation storms) while PostgreSQL never implemented it, demonstrating awareness of real-world trade-offs. Staff+ engineers are expected to size buffer pools based on working set calculations and explain when materialized views justify their maintenance overhead. This topic separates candidates who’ve only used caches from those who’ve debugged production cache invalidation issues at scale.
Core Concept
Database caching operates at three distinct layers within the database engine itself, each optimizing different aspects of query execution. The query cache stores complete result sets keyed by the exact SQL query text, eliminating query parsing and execution entirely on cache hits. The buffer pool (InnoDB buffer pool in MySQL, shared buffers in PostgreSQL) caches data and index pages in memory, reducing disk I/O but still requiring query execution. Materialized views precompute and store expensive query results—particularly aggregations and joins—refreshing periodically or on-demand. Unlike application caching covered in Application Caching, database caching is transparent to application code but offers less flexibility in cache key design and eviction policies. The fundamental trade-off is operational simplicity versus control: database caches require no application changes but give you coarse-grained invalidation and limited observability compared to application-managed caches.
Database Caching Layers Architecture
graph LR
Client["Application Client"]
QC["Query Cache<br/>(Deprecated)"]
Parser["Query Parser &<br/>Optimizer"]
BP["Buffer Pool<br/>(Data & Index Pages)"]
MV[("Materialized<br/>Views")]
Disk[("Disk Storage<br/>(Tables & Indexes)")]
Client --"1. SELECT query"--> QC
QC --"Cache Hit<br/>(sub-ms)"--> Client
QC --"2. Cache Miss"--> Parser
Parser --"3. Execute query"--> BP
BP --"4. Page Hit<br/>(microseconds)"--> Parser
BP --"5. Page Miss"--> Disk
Disk --"6. Load pages<br/>(milliseconds)"--> BP
Parser --"Result"--> Client
Client --"Query MV directly"--> MV
MV --"Fast lookup"--> Client
Disk -."Periodic refresh".-> MV
Three-layer database caching hierarchy showing query cache (complete result sets), buffer pool (data pages), and materialized views (precomputed aggregations). Each layer has different latency characteristics and invalidation complexity, with query cache offering fastest hits but highest invalidation overhead.
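The latency ordering in the diagram can be made concrete with a toy cost model. The figures below (0.1 ms query cache hit, ~10 µs buffer pool page hit, ~5 ms disk page read) are illustrative assumptions taken from the diagram, not benchmarks:

```python
# Toy model of the three lookup layers shown above. All latency
# constants are illustrative assumptions, not measurements.

QUERY_CACHE_HIT_MS = 0.1    # complete result set served directly
BUFFER_POOL_HIT_MS = 0.01   # ~10 microseconds per cached page
DISK_READ_MS = 5.0          # per page loaded from disk

def query_latency_ms(in_query_cache, pages_needed, pages_in_buffer_pool):
    """Estimate latency for one SELECT under the layered model."""
    if in_query_cache:
        return QUERY_CACHE_HIT_MS
    hits = min(pages_needed, pages_in_buffer_pool)
    misses = pages_needed - hits        # pages that must come from disk
    return hits * BUFFER_POOL_HIT_MS + misses * DISK_READ_MS

# The same 100-page query under three scenarios:
print(query_latency_ms(True, 100, 100))    # query cache hit
print(query_latency_ms(False, 100, 100))   # all pages in buffer pool
print(query_latency_ms(False, 100, 0))     # cold cache: every page from disk
```

The three prints show the orders-of-magnitude gap the diagram describes: sub-millisecond, low milliseconds, and hundreds of milliseconds respectively.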
Query Cache Invalidation Storm
sequenceDiagram
participant App as Application
participant QC as Query Cache
participant DB as Database
Note over App,DB: Read-Heavy Phase (Cache Effective)
App->>QC: SELECT * FROM users WHERE id=1
QC->>App: Cache Hit (0.1ms)
App->>QC: SELECT * FROM users WHERE id=2
QC->>App: Cache Hit (0.1ms)
App->>QC: SELECT * FROM posts WHERE user_id=1
QC->>App: Cache Hit (0.1ms)
Note over App,DB: Single Write Triggers Invalidation Storm
App->>DB: UPDATE users SET status='active' WHERE id=5
DB->>QC: Invalidate ALL queries touching 'users' table
Note over QC: 10,000+ cached queries invalidated
Note over App,DB: All Queries Now Miss Cache
App->>QC: SELECT * FROM users WHERE id=1
QC->>DB: Cache Miss - Execute query
DB->>App: Result (15ms)
App->>QC: SELECT * FROM users WHERE id=2
QC->>DB: Cache Miss - Execute query
DB->>App: Result (15ms)
Note over App,DB: CPU spent on invalidation > CPU saved by caching
Query cache invalidation storm in write-heavy workloads. A single UPDATE to the users table invalidates all cached queries touching that table, forcing expensive re-execution. This is why MySQL deprecated query cache—high-throughput systems spent more CPU on invalidation than they saved.
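The "CPU spent on invalidation exceeds CPU saved" condition can be detected from counters. This sketch uses the counter names from MySQL's pre-8.0 `SHOW GLOBAL STATUS` output (`Qcache_hits`, `Qcache_inserts`, `Qcache_lowmem_prunes`); the sample values and the 10% pruning threshold are made up for illustration:

```python
# Sketch: decide whether the query cache is helping, from status counters.
# Counter names match MySQL's SHOW GLOBAL STATUS (pre-8.0); thresholds
# here are illustrative assumptions.

def query_cache_verdict(status):
    hits = status["Qcache_hits"]
    inserts = status["Qcache_inserts"]
    prunes = status["Qcache_lowmem_prunes"]
    if inserts == 0:
        return "disable"                # cache never populated
    # More inserts than hits, or heavy pruning, means the cache is churning:
    # entries are invalidated and refilled faster than they are reused.
    if inserts > hits or prunes > 0.1 * inserts:
        return "disable"
    return f"keep (hit ratio {hits / (hits + inserts):.0%})"

# Write-heavy workload like the sequence above: constant refills, few reuses.
print(query_cache_verdict({"Qcache_hits": 40_000,
                           "Qcache_inserts": 120_000,
                           "Qcache_lowmem_prunes": 30_000}))   # -> disable
```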
Buffer Pool Sizing Decision Tree
graph TD
Start["Analyze Database<br/>Workload"]
Dedicated{"Dedicated<br/>Database Server?"}
WorkingSet{"Working Set<br/>Size Known?"}
HitRate{"Current Hit Rate<br/>>95%?"}
Dedicated -->|Yes| Size75["Start: 75% of RAM<br/>for Buffer Pool"]
Dedicated -->|No| Size50["Start: 50% of RAM<br/>for Buffer Pool"]
Size75 --> Monitor["Monitor Hit Rate<br/>for 1 Week"]
Size50 --> Monitor
Monitor --> HitRate
HitRate -->|Yes| Optimal["✓ Optimal Size<br/>No Action Needed"]
HitRate -->|No| CheckRAM{"RAM Available<br/>to Increase?"}
CheckRAM -->|Yes| Increase["Increase Buffer Pool<br/>by 25%"]
CheckRAM -->|No| Vertical["Consider Vertical<br/>Scaling or Sharding"]
Increase --> Monitor
WorkingSet -->|Yes| Calculate["Buffer Pool =<br/>Working Set × 1.2"]
WorkingSet -->|No| Dedicated
Calculate --> Monitor
Start --> WorkingSet
Decision tree for buffer pool sizing based on server type, working set size, and hit rate metrics. Start with 75% of RAM on dedicated servers or 50% on shared instances, then tune based on hit rate monitoring. Target >95% hit rate; lower rates indicate undersizing or inefficient queries requiring investigation.
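The decision tree above can be sketched as two small helper functions. The 75%/50% starting points, the working-set × 1.2 headroom, and the 95% hit-rate target are the heuristics from the tree, not hard rules:

```python
# Buffer pool sizing sketch following the decision tree above.
# All thresholds are the starting-point heuristics from the text.

def initial_buffer_pool_gb(ram_gb, dedicated, working_set_gb=None):
    """Initial buffer pool size: working set x 1.2 if known, capped by RAM share."""
    cap = ram_gb * (0.75 if dedicated else 0.50)
    if working_set_gb is not None:
        return min(working_set_gb * 1.2, cap)
    return cap

def next_action(hit_rate, ram_headroom_gb):
    """Tuning step after a week of hit-rate monitoring."""
    if hit_rate > 0.95:
        return "optimal"
    return "increase pool 25%" if ram_headroom_gb > 0 else "scale up or shard"

print(initial_buffer_pool_gb(64, dedicated=True))                      # 48.0
print(initial_buffer_pool_gb(128, dedicated=True, working_set_gb=80))  # 96.0
print(next_action(0.90, ram_headroom_gb=16))    # increase pool 25%
```

The second call reproduces the Senior-level sizing example later in this section: an 80GB working set on a 128GB machine lands at a 96GB pool.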
How It Works
When a query arrives, the database checks the query cache first (if enabled). A cache hit returns results immediately, bypassing parsing, optimization, and execution—typically sub-millisecond response times. On a miss, the query proceeds to the optimizer, which generates an execution plan. During execution, the database checks the buffer pool for required data pages. Buffer pool hits avoid disk reads (microseconds vs. milliseconds), but the query still executes. Pages are loaded into the buffer pool using an LRU-based eviction policy, with frequently accessed pages remaining resident. For materialized views, the database maintains a precomputed table that applications query like any other table. Refreshes happen via scheduled jobs (complete refresh) or incrementally as base tables change (fast refresh), depending on the view definition and database capabilities. PostgreSQL’s materialized views require explicit REFRESH MATERIALIZED VIEW commands, giving developers control but requiring orchestration. Oracle and SQL Server support automatic incremental refreshes for certain view types, trading control for convenience.
Buffer Pool Page Management Flow
graph TB
Query["Query Execution"]
Check{"Page in<br/>Buffer Pool?"}
Hit["Buffer Pool Hit<br/>(~10 microseconds)"]
Miss["Buffer Pool Miss"]
DiskRead["Read from Disk<br/>(~5 milliseconds)"]
LRU{"Buffer Pool<br/>Full?"}
Evict["Evict LRU Page<br/>(least recently used)"]
Load["Load Page into<br/>Buffer Pool"]
Execute["Execute Query<br/>on Cached Pages"]
Query --> Check
Check -->|Yes| Hit
Check -->|No| Miss
Hit --> Execute
Miss --> DiskRead
DiskRead --> LRU
LRU -->|Yes| Evict
LRU -->|No| Load
Evict --> Load
Load --> Execute
subgraph Buffer Pool Management
Check
Hit
LRU
Evict
Load
end
Buffer pool page management showing the dramatic latency difference between cache hits (microseconds) and disk reads (milliseconds). LRU eviction ensures frequently accessed pages remain resident, but thrashing occurs when working set exceeds buffer pool size.
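The hit/miss/evict flow above can be simulated with a minimal LRU cache. Page IDs stand in for (tablespace, page number) pairs; note that real engines refine plain LRU (InnoDB uses a midpoint-insertion variant with old/new sublists), so this is a deliberate simplification:

```python
# Minimal LRU buffer pool mirroring the flow above. A simplification:
# InnoDB actually uses midpoint insertion, not plain LRU.
from collections import OrderedDict

class BufferPool:
    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()   # page_id -> page data, in LRU order
        self.hits = self.misses = 0

    def fetch(self, page_id):
        if page_id in self.pages:
            self.pages.move_to_end(page_id)     # mark most recently used
            self.hits += 1
            return self.pages[page_id]
        self.misses += 1                        # simulated disk read (~ms)
        if len(self.pages) >= self.capacity:
            self.pages.popitem(last=False)      # evict least recently used
        self.pages[page_id] = f"page-{page_id}"
        return self.pages[page_id]

    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0

# Working set (3 pages) fits in the pool: only the warm-up pass misses.
pool = BufferPool(capacity=3)
for _ in range(10):
    for page in (1, 2, 3):
        pool.fetch(page)
print(f"{pool.hit_rate():.0%}")   # 27 hits / 30 fetches -> 90%
```

Shrinking `capacity` below the working set size in this toy reproduces thrashing: every fetch evicts a page that is about to be needed again, and the hit rate collapses.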
Key Principles
Principle: Invalidation Complexity Scales with Write Patterns
Explanation: Query caches must invalidate all cached results when any underlying table changes, creating invalidation storms in write-heavy workloads. This is why MySQL deprecated query_cache in 8.0—high-throughput systems spent more CPU invalidating caches than they saved. Buffer pools avoid this by caching at the page level; writes invalidate only affected pages, not entire result sets.
Example: At LinkedIn’s early scale, enabling MySQL query cache actually decreased throughput by 15% because their social graph writes triggered constant invalidations. Disabling it and relying solely on the buffer pool improved p99 latency from 45ms to 12ms.
Principle: Working Set Size Determines Buffer Pool Effectiveness
Explanation: The buffer pool only helps if your frequently accessed data (working set) fits in memory. If your working set exceeds buffer pool size, you get cache thrashing—pages constantly evicted before reuse. The 80/20 rule often applies: 80% of queries touch 20% of data. Size your buffer pool to hold that 20%.
Example: Amazon RDS recommends setting innodb_buffer_pool_size to 75% of available RAM for dedicated database instances. For a 64GB instance with a 40GB working set, a 48GB buffer pool achieves 95%+ hit rates. Undersizing to 32GB drops hit rates to 60%, doubling query latency.
Principle: Materialized Views Trade Freshness for Query Performance
Explanation: Materialized views precompute expensive aggregations, turning O(n) scans into O(1) lookups, but introduce staleness. The refresh strategy—real-time, periodic, or on-demand—depends on whether your use case tolerates eventual consistency. Analytics dashboards often accept 5-minute staleness; fraud detection cannot.
Example: Stripe’s revenue dashboard uses materialized views refreshed every 5 minutes to aggregate billions of payment records. Direct queries took 30+ seconds; materialized view queries return in 200ms. For real-time fraud scores, they use streaming aggregations in Flink instead, accepting higher infrastructure cost for zero staleness.
Deep Dive
Types / Variants
Query Cache (deprecated in MySQL 8.0, never implemented in PostgreSQL): Stores complete result sets keyed by exact query text, including whitespace and comments. Invalidates all cached results when any table in the query changes. Effective only for read-heavy workloads with identical repeated queries.
Buffer Pool / Shared Buffers: Caches data and index pages using LRU eviction. MySQL’s InnoDB buffer pool supports multiple instances (innodb_buffer_pool_instances) to reduce contention on multi-core systems. PostgreSQL’s shared_buffers is typically smaller (25% of RAM) because it relies more on the OS page cache. InnoDB additionally supports page compression and an adaptive hash index for frequently accessed pages.
Materialized Views: PostgreSQL requires explicit refresh commands (REFRESH MATERIALIZED VIEW CONCURRENTLY allows queries during refresh). Oracle supports fast refresh using materialized view logs that track base table changes. SQL Server’s indexed views automatically maintain themselves but have strict syntax requirements (no outer joins, subqueries, or non-deterministic functions). Snowflake’s materialized views refresh automatically in the background, abstracting refresh management entirely.
Materialized View Refresh Strategies
graph TB
BaseTable[("Base Tables<br/>(Orders, Products)")]
subgraph Complete Refresh
CR["DROP + REBUILD<br/>Entire View"]
CRLock["View Locked<br/>During Refresh"]
CRSimple["Simple Logic<br/>Always Consistent"]
end
subgraph Incremental Refresh
MVLog[("Materialized<br/>View Log")]
IRDelta["Apply Only<br/>Changes Since Last Refresh"]
IRConcurrent["Queries Allowed<br/>During Refresh"]
IRComplex["Complex Failure Modes<br/>(Log Overflow)"]
end
subgraph Automatic Refresh - Snowflake
ARBackground["Background Service<br/>Monitors Base Tables"]
ARTransparent["Transparent to Users<br/>No Manual Commands"]
ARCost["Higher Compute Cost"]
end
BaseTable -->|"Full Scan"| CR
CR --> CRLock
CRLock --> CRSimple
BaseTable -->|"Track Changes"| MVLog
MVLog --> IRDelta
IRDelta --> IRConcurrent
IRConcurrent --> IRComplex
BaseTable -->|"Change Detection"| ARBackground
ARBackground --> ARTransparent
ARTransparent --> ARCost
Three materialized view refresh strategies with different trade-offs. Complete refresh is simple but locks the view; incremental refresh is faster but requires materialized view logs and has complex failure modes; automatic refresh (Snowflake) abstracts complexity but increases compute costs.
Trade-offs
Dimension: Invalidation Granularity
Option A (Query Cache): Invalidates all cached queries touching a table on any write. Simple logic but causes invalidation storms in write-heavy systems.
Option B (Buffer Pool): Invalidates only affected pages. More complex (tracks page versions) but scales to high write throughput.
Decision framework: Use query cache only if writes are rare (<1% of operations) and queries are identical. Otherwise, rely on buffer pool and application caching.
Dimension: Memory Allocation
Option A (Large Buffer Pool): Maximizes cache hit rate but leaves less memory for OS cache, connection buffers, and sort operations. Can cause OOM if oversized.
Option B (Moderate Buffer Pool): Balances database cache with OS cache and operational memory. Safer but may underutilize RAM on dedicated database servers.
Decision framework: On dedicated database instances, allocate 75% of RAM to the buffer pool. On shared instances, start at 50% and tune based on cache hit rate metrics (>95% is ideal).
Dimension: Materialized View Refresh Strategy
Option A (Complete Refresh): Rebuilds the entire view from scratch. Simple, consistent, but locks the view during refresh and scales poorly with view size.
Option B (Incremental Refresh): Applies only changes since the last refresh. Faster, allows concurrent queries, but requires materialized view logs and has complex failure modes.
Decision framework: Use complete refresh for small views (<1M rows) or when base tables change completely. Use incremental refresh for large views with append-mostly base tables (event logs, time-series data).
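The complete-vs-incremental trade-off can be sketched with a toy revenue rollup. The plain-list "view log" stands in for Oracle-style materialized view logs; table and column names are invented for illustration:

```python
# Toy comparison of complete vs. incremental materialized view refresh
# for a per-product revenue rollup. The "change log" list stands in for
# an Oracle-style materialized view log; names are illustrative.
from collections import defaultdict

def complete_refresh(base_rows):
    """Rebuild the whole aggregate from the base table (full scan)."""
    view = defaultdict(float)
    for product, amount in base_rows:
        view[product] += amount
    return dict(view)

def incremental_refresh(view, change_log):
    """Apply only the deltas recorded since the last refresh."""
    for product, delta in change_log:
        view[product] = view.get(product, 0.0) + delta
    return view

orders = [("widget", 10.0), ("gadget", 5.0), ("widget", 2.5)]
view = complete_refresh(orders)     # full scan of the base table

# A new order arrives: incremental refresh touches one key,
# not the whole table, and readers can keep querying `view`.
view = incremental_refresh(view, [("gadget", 7.5)])
print(view["gadget"])               # 12.5
```

The failure mode the text mentions also shows up in the toy: if the change log is lost or overflows, the only safe recovery is a complete refresh from the base rows.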
Common Pitfalls
Pitfall: Enabling Query Cache in Write-Heavy Systems
Why it happens: Developers see ‘cache’ and assume it always helps. Query cache invalidation overhead exceeds its benefit when the write rate is high, actually degrading performance.
How to avoid: Measure Qcache_hits against Qcache_inserts and Qcache_lowmem_prunes. If inserts exceed hits or prunes are frequent, disable the query cache. MySQL has disabled it by default since 5.6 and removed it entirely in 8.0.
Pitfall: Undersizing Buffer Pool on Dedicated Database Servers
Why it happens: Conservative defaults (128MB in MySQL) or fear of OOM. Leaving RAM unused while the database thrashes on disk I/O wastes hardware.
How to avoid: Monitor Innodb_buffer_pool_reads (physical disk reads) against Innodb_buffer_pool_read_requests (logical reads). If the hit rate (1 − reads/read_requests) falls below 95%, increase the buffer pool size. On dedicated servers, use 75% of RAM as a starting point.
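The hit-rate check described above is a one-liner over those two counters. The counter names are MySQL's; the sample values are illustrative:

```python
# Buffer pool hit rate from MySQL's SHOW GLOBAL STATUS counters.
# Sample numbers are made up for illustration.

def buffer_pool_hit_rate(read_requests, disk_reads):
    """Fraction of logical reads satisfied without touching disk."""
    if read_requests == 0:
        return 1.0                      # no traffic yet: nothing to report
    return 1 - disk_reads / read_requests

rate = buffer_pool_hit_rate(read_requests=2_000_000, disk_reads=40_000)
print(f"{rate:.1%}")                    # 98.0%
if rate < 0.95:
    print("consider growing innodb_buffer_pool_size")
```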
Pitfall: Materialized Views Without Monitoring Staleness
Why it happens: Developers create materialized views for performance but don’t track when refreshes fail or fall behind, leading to stale-data bugs in production.
How to avoid: Add monitoring for last_refresh_time and refresh_duration. Alert if staleness exceeds the SLA (e.g., >10 minutes for a 5-minute refresh schedule). Test refresh failure scenarios—what happens if the base table schema changes?
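A minimal version of that staleness alert might look like the following; the 10-minute threshold for a 5-minute schedule follows the example above, and the function names are invented:

```python
# Sketch of the staleness check described above: alert when the last
# successful refresh is older than the SLA. Threshold assumed from
# the example (10-minute SLA for a 5-minute refresh schedule).
from datetime import datetime, timedelta, timezone

STALENESS_SLA = timedelta(minutes=10)

def staleness_alert(last_refresh, now):
    """True if the materialized view has breached its freshness SLA."""
    return now - last_refresh > STALENESS_SLA

now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
print(staleness_alert(now - timedelta(minutes=4), now))    # False: fresh
print(staleness_alert(now - timedelta(minutes=25), now))   # True: refresh fell behind
```

In practice `last_refresh` would come from the database (e.g., a refresh-audit table your orchestration job writes to), and the alert would feed whatever monitoring system you already run.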
Real-World Examples
Company: LinkedIn
System: Social Graph Database
Usage detail: LinkedIn disabled MySQL query cache after discovering it caused 15% throughput degradation due to invalidation storms from high write rates in their social graph. They increased the InnoDB buffer pool from 32GB to 128GB on their primary database servers, achieving 98% buffer pool hit rates. For expensive profile aggregation queries (mutual connections, skill endorsements), they implemented materialized views refreshed every 15 minutes, reducing p95 query latency from 800ms to 50ms while accepting eventual consistency for non-critical features.
Company: Amazon RDS
System: Managed Database Service
Usage detail: Amazon RDS automatically tunes buffer pool size based on instance class, allocating 75% of RAM to innodb_buffer_pool_size for MySQL instances. For PostgreSQL, they set shared_buffers to 25% of RAM, relying on the OS page cache for additional caching. RDS Performance Insights exposes buffer pool hit rate metrics, alerting customers when hit rates drop below 95%, indicating undersized instances or inefficient queries. They recommend materialized views for analytics workloads but warn that refreshes can cause replication lag on read replicas if not carefully scheduled during low-traffic windows.
Interview Expectations
Mid-Level
Explain the difference between query cache and buffer pool, including why query cache was deprecated in MySQL. Describe when materialized views are appropriate and their refresh trade-offs. Calculate buffer pool hit rate from given metrics (e.g., 1000 reads, 950 from cache = 95% hit rate). Discuss basic invalidation challenges—why does updating one row invalidate query cache entries?
Senior
Design a caching strategy for a read-heavy analytics system, justifying buffer pool sizing and materialized view refresh intervals based on query patterns and SLA requirements. Explain why PostgreSQL never implemented query cache while MySQL did and later deprecated it. Discuss buffer pool internals: LRU eviction, page compression, adaptive hash indexes. Size buffer pool for a given working set (e.g., 80GB working set, 128GB RAM → 96GB buffer pool). Explain incremental materialized view refresh mechanisms and when they fail (non-deterministic functions, outer joins).
Staff+
Architect a multi-tier caching strategy combining database caching, application caching, and CDN, explaining which layer owns which data and why. Analyze a production incident where buffer pool thrashing caused cascading failures (e.g., buffer pool too small → disk I/O spikes → connection pool exhaustion → application timeouts). Design a materialized view refresh system that handles base table schema changes gracefully, including rollback strategies. Discuss advanced buffer pool tuning: multiple buffer pool instances for concurrency, page prefetching strategies, NUMA-aware allocation. Explain how database caching interacts with replication lag—why do materialized views on read replicas sometimes serve staler data than expected?
Common Interview Questions
Why did MySQL deprecate query cache in version 8.0? (Invalidation overhead exceeded benefits in modern workloads)
How do you size the InnoDB buffer pool? (75% of RAM on dedicated servers, tune based on working set size and hit rate metrics)
When should you use materialized views instead of application caching? (Complex aggregations across large datasets where refresh latency is acceptable)
What’s the difference between buffer pool and OS page cache? (Buffer pool is database-managed with query-aware eviction; OS cache is transparent but less intelligent)
How do you monitor buffer pool effectiveness? (Hit rate >95%, low eviction rate, monitor Innodb_buffer_pool_read_requests vs. Innodb_buffer_pool_reads)
Key Takeaways
Database caching operates at three layers: query cache (deprecated due to invalidation complexity), buffer pool (page-level caching that scales to high writes), and materialized views (precomputed query results trading freshness for performance).
Buffer pool sizing is critical—allocate 75% of RAM on dedicated database servers and monitor hit rates (target >95%). Undersizing causes disk I/O thrashing; oversizing risks OOM and starves other database operations.
Query cache invalidation doesn’t scale: any write to a table invalidates all cached queries touching that table. This is why MySQL deprecated it and PostgreSQL never implemented it. Modern systems rely on buffer pool and application caching instead.
Materialized views are powerful for expensive aggregations but require careful refresh strategy design. Complete refresh is simple but locks the view; incremental refresh is faster but has complex failure modes and syntax restrictions.
Database caching is transparent to applications but offers less control than application caching. Use database caching for broad performance improvements; use application caching (see Application Caching) for fine-grained control over cache keys, TTLs, and invalidation logic.
Related Topics
Prerequisites
Caching Overview - Understanding cache hierarchies and memory fundamentals
Data Modeling - Database schema design that enables effective caching
Next Steps
Application Caching - Redis and Memcached for application-managed caches
Cache Invalidation Strategies - TTL, write-through, and event-based invalidation patterns
Database Replication - How caching interacts with read replicas and replication lag
Related
Query Optimization - Index design and query tuning that complements caching
CDN Caching - Edge caching for static assets and API responses