Client-Side Caching: Browser & HTTP Cache Guide

After this topic, you will be able to:

Implement HTTP cache headers (Cache-Control, ETag, Last-Modified) for optimal client caching
Analyze the trade-offs between aggressive client caching and content freshness
Design a client caching strategy for different resource types (static vs dynamic)
Evaluate browser caching behavior and design cache-friendly APIs

TL;DR

Client caching stores resources directly in the user’s browser or device, eliminating network requests for repeated content. HTTP cache headers (Cache-Control, ETag, Last-Modified) control what gets cached, for how long, and when to revalidate. This is the fastest caching layer—memory cache serves resources in ~0ms, disk cache in ~10-50ms—but requires careful management to balance performance with content freshness.

Cheat Sheet: Use Cache-Control: max-age=31536000, immutable for versioned static assets (CSS, JS with hashes). Use Cache-Control: no-cache with ETag for dynamic content that needs validation. Use Cache-Control: no-store for sensitive data. Browser caches are private (single-user) and have 50MB-1GB limits depending on the browser.

Background

Client caching emerged from a simple observation: users visit the same websites repeatedly, downloading identical resources each time. In the early web (1990s), every page load meant re-downloading every image, stylesheet, and script, even if nothing had changed. The HTTP/1.0 specification introduced the Expires header in 1996, allowing servers to tell browsers “this resource is good until midnight.” HTTP/1.1 (1999) added Cache-Control and ETags, giving developers fine-grained control over caching behavior.

The problem client caching solves is fundamental: network latency is expensive. Even with modern CDNs, a round-trip to a nearby edge server takes 20-50ms. A typical webpage loads 50-100 resources. Without caching, fetching them one after another would cost 1-5 seconds of network time alone (50 × 20ms up to 100 × 50ms); browsers parallelize requests, but the round trips still dominate load time. With aggressive client caching, subsequent page loads can render in under 200ms because most resources come from local disk or memory.

Modern client caching has evolved beyond simple browser caches. Service Workers (introduced in 2015) give developers programmatic control over caching logic, enabling offline-first web apps. Progressive Web Apps (PWAs) use service workers to cache entire application shells, making web apps feel as fast as native apps. IndexedDB provides structured storage for application data, and LocalStorage offers a simple string key-value store. But the foundation remains HTTP cache headers: understanding them is critical for any system that serves content to browsers.

Internals

Browser caching operates in three layers, checked in order: memory cache, disk cache, and network. When a browser requests a resource, it first checks the memory cache (RAM)—this is blazingly fast (~0ms) but limited to the current browser session. If not found, it checks the disk cache (SSD/HDD), which persists across sessions but is slower (~10-50ms). Only if both caches miss does the browser make a network request.

The decision to cache and how long to cache is controlled by HTTP response headers. When a server sends a resource, it includes headers like Cache-Control: max-age=3600, telling the browser “this resource is fresh for 3600 seconds (1 hour).” During that time, the browser serves the cached copy without any network request—this is called a “strong cache hit.” The browser doesn’t even check if the resource has changed; it trusts the max-age directive.

After the max-age expires, the resource becomes “stale.” But stale doesn’t mean unusable. The browser can perform a “conditional request” to validate whether the cached copy is still good. This is where ETags and Last-Modified headers come in. An ETag is a unique identifier (usually a hash) for a resource version. When the browser makes a conditional request, it sends If-None-Match: "abc123" (the ETag from the cached copy). If the server’s current version still has ETag “abc123”, it responds with 304 Not Modified and an empty body—the browser reuses the cached copy. This saves bandwidth (no body transfer) and is much faster than downloading the full resource.
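The server side of this exchange can be sketched in a few lines. This is a minimal illustration rather than any particular framework's API: `make_etag` and `respond` are hypothetical helpers, and using a truncated SHA-256 digest as the ETag is an assumption made for the example.

```python
import hashlib
from typing import Dict, Optional, Tuple


def make_etag(body: bytes) -> str:
    """Derive an ETag from the content (assumed scheme: truncated SHA-256)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def _strip_weak(tag: str) -> str:
    """Drop the W/ prefix so weak and strong tags compare equal."""
    return tag[2:] if tag.startswith("W/") else tag


def respond(body: bytes, if_none_match: Optional[str]) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, body) for a GET, honoring If-None-Match."""
    etag = make_etag(body)
    headers = {"ETag": etag, "Cache-Control": "no-cache"}
    if if_none_match is not None:
        # The header may carry several comma-separated ETags, or "*".
        candidates = [_strip_weak(t.strip()) for t in if_none_match.split(",")]
        if "*" in candidates or etag in candidates:
            return 304, headers, b""  # cached copy is still valid: no body
    return 200, headers, body
```

A first request without `If-None-Match` yields 200 plus the ETag; repeating the request with that ETag yields 304 and an empty body until the content changes.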

The Last-Modified header works similarly but uses timestamps. The browser sends If-Modified-Since: Mon, 21 Oct 2024 07:28:00 GMT, and the server responds with 304 if the resource hasn’t changed since that time. ETags are more reliable because they’re typically content-based, while Last-Modified can have issues with clock skew or with several modifications inside the same second (HTTP dates have one-second resolution).
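A timestamp-based check might look like the sketch below. It uses the standard library's `email.utils` to parse and format HTTP dates; `http_date` and `not_modified_since` are hypothetical helper names, and `last_modified` is assumed to be a timezone-aware datetime.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime
from typing import Optional


def http_date(dt: datetime) -> str:
    """Format an aware datetime as an HTTP date, e.g. for Last-Modified."""
    return format_datetime(dt.astimezone(timezone.utc), usegmt=True)


def not_modified_since(last_modified: datetime, if_modified_since: Optional[str]) -> bool:
    """True if the server may answer 304 for this If-Modified-Since value."""
    if not if_modified_since:
        return False
    try:
        since = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return False  # unparseable header: ignore it and send the full response
    if since.tzinfo is None:
        since = since.replace(tzinfo=timezone.utc)
    # HTTP dates carry whole seconds only, so drop sub-second precision.
    return last_modified.replace(microsecond=0) <= since
```

The truncation to whole seconds is exactly where the sub-second weakness comes from: two writes within the same second produce the same Last-Modified value.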

Service Workers add a programmable layer between the browser and network. A service worker is JavaScript code that intercepts fetch requests and can implement custom caching strategies: cache-first (serve from cache, fall back to network), network-first (try network, fall back to cache), or stale-while-revalidate (serve cached copy immediately while fetching a fresh copy in the background). Spotify’s web player uses service workers to cache audio chunks, enabling seamless playback even when network quality degrades.
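Real service workers are written in JavaScript against the browser's fetch and Cache APIs. The sketch below is only a language-neutral model of the three strategies in Python: the cache is a plain dict, `fetch` is a hypothetical callable that returns None when the network fails, and the "background" refresh in stale-while-revalidate is run inline for simplicity.

```python
from typing import Callable, Dict, Optional

Fetch = Callable[[str], Optional[bytes]]  # returns None when the network fails


def cache_first(url: str, cache: Dict[str, bytes], fetch: Fetch) -> Optional[bytes]:
    """Serve from cache; only go to the network on a miss."""
    if url in cache:
        return cache[url]
    body = fetch(url)
    if body is not None:
        cache[url] = body
    return body


def network_first(url: str, cache: Dict[str, bytes], fetch: Fetch) -> Optional[bytes]:
    """Try the network; fall back to the cached copy if it fails."""
    body = fetch(url)
    if body is not None:
        cache[url] = body
        return body
    return cache.get(url)


def stale_while_revalidate(url: str, cache: Dict[str, bytes], fetch: Fetch) -> Optional[bytes]:
    """Return the cached copy and refresh the cache for the next request."""
    cached = cache.get(url)
    fresh = fetch(url)  # a real service worker would not wait for this
    if fresh is not None:
        cache[url] = fresh
    return cached if cached is not None else fresh
```

The difference between the strategies is only which source is consulted first and when the cache is updated.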

Browser Cache Lookup Flow

graph TB
    Start["Browser requests resource<br/>(e.g., app.js)"] --> MemCheck{"Check Memory Cache<br/>(RAM)"}
    MemCheck -->|"Hit (~0ms)"| MemServe["Serve from Memory<br/>No network request"]
    MemCheck -->|"Miss"| DiskCheck{"Check Disk Cache<br/>(SSD/HDD)"}
    DiskCheck -->|"Hit (~10-50ms)"| ValidCheck{"Is resource fresh?<br/>(within max-age)"}
    ValidCheck -->|"Yes (Strong Hit)"| DiskServe["Serve from Disk<br/>No network request"]
    ValidCheck -->|"No (Stale)"| Conditional["Conditional Request<br/>If-None-Match: ETag"]
    DiskCheck -->|"Miss"| Network["Network Request<br/>to CDN/Origin"]
    Conditional --> ServerCheck{"Server validates ETag"}
    ServerCheck -->|"304 Not Modified<br/>(~50ms)"| Reuse["Reuse cached copy<br/>Update freshness"]
    ServerCheck -->|"200 OK<br/>(~200-500ms)"| Download["Download new version<br/>Store in cache"]
    Network --> Download

Browser checks three cache layers in order: memory (fastest), disk (fast if fresh), then network (slowest). Stale resources trigger conditional requests, which return 304 if unchanged, avoiding full downloads.
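The flow in the diagram can be modeled as a small function. This is a simplified sketch, not how any browser is actually implemented: `Entry`, `lookup`, and the `origin` callable (which takes a URL and an optional ETag and returns status, body, ETag, and max-age) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple


@dataclass
class Entry:
    body: bytes
    etag: str
    stored_at: float  # seconds on an arbitrary clock
    max_age: float

    def is_fresh(self, now: float) -> bool:
        return now - self.stored_at < self.max_age


# origin(url, etag_or_None) -> (status, body, etag, max_age)
Origin = Callable[[str, Optional[str]], Tuple[int, bytes, str, float]]


def lookup(url: str, memory: Dict[str, Entry], disk: Dict[str, Entry],
           origin: Origin, now: float) -> Tuple[str, bytes]:
    """Return (source, body), where source names the layer that answered."""
    for layer, store in (("memory", memory), ("disk", disk)):
        entry = store.get(url)
        if entry is not None and entry.is_fresh(now):
            memory[url] = entry  # promote disk hits into memory
            return layer, entry.body

    # Nothing fresh: revalidate a stale copy if we have one, else fetch.
    stale = disk.get(url)
    if stale is None:
        stale = memory.get(url)
    status, body, etag, max_age = origin(url, stale.etag if stale else None)
    if status == 304 and stale is not None:
        entry, source = Entry(stale.body, stale.etag, now, max_age), "revalidated"
    else:
        entry, source = Entry(body, etag, now, max_age), "network"
    memory[url] = disk[url] = entry
    return source, entry.body
```

Clearing `memory` between calls models a new browser session: the next lookup is answered from disk as long as the entry is still fresh.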

ETag Validation Sequence

sequenceDiagram
    participant Browser
    participant Cache as Disk Cache
    participant Server
    
    Note over Browser,Server: Initial Request (Cache Miss)
    Browser->>Server: 1. GET /api/user/profile
    Server->>Browser: 2. 200 OK<br/>ETag: "5d8c72a5"<br/>Cache-Control: no-cache<br/>{user data}
    Browser->>Cache: 3. Store with ETag
    
    Note over Browser,Server: Subsequent Request (Cache Hit, Stale)
    Browser->>Cache: 4. Check cache
    Cache->>Browser: 5. Found (stale)<br/>ETag: "5d8c72a5"
    Browser->>Server: 6. GET /api/user/profile<br/>If-None-Match: "5d8c72a5"
    
    alt Content Unchanged
        Server->>Browser: 7a. 304 Not Modified<br/>(no body, ~500 bytes)
        Note over Browser: Reuse cached data<br/>Total time: ~50ms
    else Content Changed
        Server->>Browser: 7b. 200 OK<br/>ETag: "9f3a1b2c"<br/>{new user data}
        Browser->>Cache: 8. Update cache
        Note over Browser: Use new data<br/>Total time: ~200ms
    end

ETag validation allows efficient revalidation. The browser sends the cached ETag in If-None-Match. If content is unchanged, the server returns 304 with no body (saving bandwidth). If changed, it returns 200 with new content and new ETag.

HTTP Cache Headers Guide

Cache-Control Directives: The Cache-Control header is the primary caching control mechanism. Key directives include:

max-age=<seconds>: How long the resource is fresh. max-age=86400 means 24 hours. After this, the resource is stale and requires revalidation. Use long max-age (1 year) for versioned static assets like app.a3f2b1.js—if the content changes, the filename changes, so aggressive caching is safe.

no-cache: Confusingly named—it doesn’t mean “don’t cache.” It means “cache, but always revalidate before using.” The browser stores the resource but sends a conditional request (If-None-Match) every time. Use this for dynamic content that changes frequently but benefits from 304 responses, like API responses or user dashboards.

no-store: Actually means “don’t cache.” The browser must fetch fresh every time. Use for sensitive data like banking transactions or personal health information. This prevents the resource from being stored in any cache (memory, disk, or intermediary proxies).

must-revalidate: Once the resource is stale, the browser must revalidate with the origin server before using the cached copy. Without this, browsers might serve stale content in certain scenarios (like when offline).

immutable: Tells the browser this resource will never change while it is fresh. Even if the user hits refresh, a browser that supports the directive won’t revalidate; browsers that don’t support it simply ignore it and fall back to max-age. Combine with long max-age for versioned assets: Cache-Control: max-age=31536000, immutable. This is what Amazon uses for product images with content-addressed URLs.

private vs public: private means only the browser can cache (not CDNs or proxies). public allows shared caches. Use private for user-specific content like personalized recommendations.

ETag Validation: ETags enable efficient revalidation. The server generates an ETag (e.g., "5d8c72a5") based on content hash or version number. The browser stores this with the cached resource. On revalidation, it sends If-None-Match: "5d8c72a5". If the ETag matches, the server responds with 304 Not Modified (no body), saving bandwidth. If the content changed, the server sends 200 OK with the new content and new ETag.

Last-Modified: An older alternative to ETags using timestamps. The server sends Last-Modified: Mon, 21 Oct 2024 07:28:00 GMT. The browser revalidates with If-Modified-Since: Mon, 21 Oct 2024 07:28:00 GMT. Less precise than ETags (1-second granularity) but simpler to implement.

Expires: Legacy header from HTTP/1.0. Specifies an absolute expiration time: Expires: Tue, 21 Oct 2025 07:28:00 GMT. Cache-Control max-age is preferred because relative times are easier to manage and do not depend on synchronized clocks. If both are present, max-age takes precedence.
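The precedence rule between max-age and Expires can be sketched as follows. This is a simplified reading of the headers, not a complete HTTP cache: `parse_cache_control` and `freshness_lifetime` are hypothetical helpers, header names are assumed to be in canonical case, and the comma split does not handle quoted values that themselves contain commas.

```python
from email.utils import parsedate_to_datetime
from typing import Dict, Optional


def parse_cache_control(value: str) -> Dict[str, Optional[str]]:
    """Split a Cache-Control value into {directive: argument-or-None}."""
    directives: Dict[str, Optional[str]] = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') or None
    return directives


def freshness_lifetime(headers: Dict[str, str]) -> Optional[float]:
    """Seconds the response may be reused without revalidation.

    0.0 means "revalidate or refetch before use"; None means the response
    gives no explicit lifetime.
    """
    cc = parse_cache_control(headers.get("Cache-Control", ""))
    if "no-store" in cc or "no-cache" in cc:
        return 0.0
    if cc.get("max-age") is not None:  # max-age wins over Expires
        try:
            return max(0.0, float(int(cc["max-age"])))
        except ValueError:
            return 0.0
    if "Expires" in headers and "Date" in headers:
        try:
            expires = parsedate_to_datetime(headers["Expires"])
            date = parsedate_to_datetime(headers["Date"])
            return max(0.0, (expires - date).total_seconds())
        except (TypeError, ValueError):
            return 0.0  # an invalid Expires is treated as already expired
    return None
```

With both headers present, the function never looks at Expires, which mirrors the rule stated above.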

Resource-Specific Strategies: For HTML pages, use Cache-Control: no-cache with ETag—pages change frequently, but 304 responses are fast. For CSS/JS with version hashes, use max-age=31536000, immutable—these never change, so aggressive caching is safe. For images, use max-age=86400 (1 day) for frequently updated content or max-age=2592000 (30 days) for stable content. For API responses, use no-cache with ETag for personalized data or max-age=60 for public data that updates every minute.

Cache-Control Directives Decision Tree

graph TB
    Start{"What type of content?"}
    
    Start -->|"Static assets with<br/>version hash"| Versioned["Cache-Control:<br/>max-age=31536000, immutable"]
    Versioned --> V_Ex["Example: app.a3f2b1.js<br/>✓ Cached for 1 year<br/>✓ Never revalidated<br/>✓ URL changes if content changes"]
    
    Start -->|"Dynamic content<br/>needs freshness"| Dynamic["Cache-Control: no-cache<br/>+ ETag"]
    Dynamic --> D_Ex["Example: API responses, HTML<br/>✓ Always revalidates<br/>✓ 304 if unchanged<br/>✓ Balances speed & freshness"]
    
    Start -->|"Semi-static content<br/>changes rarely"| SemiStatic["Cache-Control:<br/>max-age=3600-86400"]
    SemiStatic --> S_Ex["Example: Product images<br/>✓ Cached for hours/days<br/>✓ Some staleness OK<br/>✓ Reduces server load"]
    
    Start -->|"Sensitive data<br/>must not cache"| Sensitive["Cache-Control: no-store"]
    Sensitive --> Sen_Ex["Example: Auth tokens, payments<br/>✓ Never cached<br/>✓ Always fetched fresh<br/>✓ Privacy & security"]
    
    Start -->|"User-specific<br/>content"| Private["Cache-Control:<br/>private, no-cache + ETag"]
    Private --> P_Ex["Example: User dashboard<br/>✓ Only browser caches<br/>✓ Not CDN/proxies<br/>✓ Always revalidates"]

Choose Cache-Control directives based on content type. Versioned assets get aggressive caching (immutable), dynamic content gets validation-based caching (no-cache + ETag), and sensitive data gets no caching (no-store).
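The decision tree translates naturally into a small policy function. The sketch below is one possible encoding: `cache_control_for` and its `sensitive` and `user_specific` flags are hypothetical, and the regular expression assumes hashed filenames of the form `name.<hex>.ext` as in the examples above.

```python
import re

# Assumed naming convention: at least 6 hex characters before the extension.
HASHED_ASSET = re.compile(r"\.[0-9a-f]{6,}\.(?:js|css|woff2?|png|jpe?g|svg)$")
IMAGE_SUFFIXES = (".png", ".jpg", ".jpeg", ".gif", ".webp", ".svg")


def cache_control_for(path: str, *, sensitive: bool = False,
                      user_specific: bool = False) -> str:
    """Pick a Cache-Control value following the decision tree."""
    if sensitive:
        return "no-store"  # never cached anywhere
    if HASHED_ASSET.search(path):
        return "max-age=31536000, immutable"  # URL changes when content does
    if user_specific:
        return "private, no-cache"  # browser only, always revalidated
    if path.endswith(IMAGE_SUFFIXES):
        return "max-age=86400"  # semi-static: some staleness is acceptable
    return "no-cache"  # HTML and API responses: validate with ETag
```

Checking `sensitive` first is deliberate: a sensitive response should never fall through to a more permissive rule because its path happens to look like a static asset.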

Performance Characteristics

Client caching delivers the best performance of any caching layer because it eliminates network latency entirely. Memory cache hits serve resources in ~0ms—the resource is already in RAM. Disk cache hits take ~10-50ms depending on SSD speed. Compare this to CDN cache hits (~20-50ms for nearby edges, 100-200ms for distant edges) or origin server requests (200-500ms+). For a typical webpage loading 100 resources, client caching can reduce load time from 3-5 seconds to under 200ms.

Cache hit rates depend on user behavior and cache configuration. For returning users on stable sites, hit rates of 80-95% are common. First-time visitors have 0% hit rate, which is why CDN caching (see CDN Caching) is critical for initial page loads. Mobile browsers have smaller cache sizes (50-200MB) than desktop browsers (250MB-1GB), affecting hit rates for users who visit many sites.
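Hit rate can be measured per request or per byte, and the two can differ a lot when large resources miss. A minimal sketch, assuming a hypothetical log of `(served_from_cache, response_bytes)` pairs:

```python
from typing import Iterable, Tuple


def hit_rates(log: Iterable[Tuple[bool, int]]) -> Tuple[float, float]:
    """Return (request hit rate, byte hit rate) for a request log."""
    requests = hits = total_bytes = hit_bytes = 0
    for from_cache, size in log:
        requests += 1
        total_bytes += size
        if from_cache:
            hits += 1
            hit_bytes += size
    hit_rate = hits / requests if requests else 0.0
    byte_hit_rate = hit_bytes / total_bytes if total_bytes else 0.0
    return hit_rate, byte_hit_rate
```

For example, three small cached responses and one large miss give a high request hit rate but a much lower byte hit rate.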

Validation requests (304 responses) are much faster than full downloads. A 304 response for a 500KB JavaScript file takes ~50ms (network round-trip) and transfers ~500 bytes (headers only), compared to ~500ms and 500KB for a full download. This is why no-cache with ETag is a good middle ground for dynamic content—you get freshness guarantees without full re-downloads.
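The bandwidth effect is simple arithmetic. The sketch below assumes roughly 500 bytes of headers per response (the figure used above); `bytes_transferred` is a hypothetical helper.

```python
def bytes_transferred(requests: int, unchanged: int, body_bytes: int,
                      header_bytes: int = 500) -> int:
    """Total bytes sent when `unchanged` of the requests are answered with 304."""
    changed = requests - unchanged
    # 304 responses carry headers only; 200 responses carry headers plus body.
    return unchanged * header_bytes + changed * (header_bytes + body_bytes)
```

For 100 requests of a 500KB file, `bytes_transferred(100, 0, 500_000)` is 50,050,000 bytes, while `bytes_transferred(100, 95, 500_000)` is 2,550,000 bytes, roughly 2.5MB instead of 50MB.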

Browser cache eviction generally follows an LRU (Least Recently Used) style policy, though the exact algorithm varies by browser. When the cache fills up, the browser evicts the resources that have gone unused the longest. This means frequently accessed resources (like your site’s main CSS file) stay cached, while one-off resources (like a news article image) get evicted. Service Workers cannot change how the HTTP cache evicts entries, but they manage their own Cache Storage, where they can apply custom eviction logic that prioritizes critical resources.
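A byte-budgeted LRU cache can be sketched with an ordered dictionary. This illustrates the policy only; real browser caches are more involved, and `LRUCache` is a hypothetical class.

```python
from collections import OrderedDict
from typing import Optional


class LRUCache:
    """Evicts the least recently used entries once a byte budget is exceeded."""

    def __init__(self, capacity_bytes: int) -> None:
        self.capacity_bytes = capacity_bytes
        self.used_bytes = 0
        self._entries: "OrderedDict[str, bytes]" = OrderedDict()

    def get(self, url: str) -> Optional[bytes]:
        body = self._entries.get(url)
        if body is not None:
            self._entries.move_to_end(url)  # mark as most recently used
        return body

    def put(self, url: str, body: bytes) -> None:
        if url in self._entries:
            self.used_bytes -= len(self._entries.pop(url))
        if len(body) > self.capacity_bytes:
            return  # larger than the whole cache: do not store
        self._entries[url] = body
        self.used_bytes += len(body)
        while self.used_bytes > self.capacity_bytes:
            _, evicted = self._entries.popitem(last=False)  # oldest first
            self.used_bytes -= len(evicted)
```

Because `get` moves an entry to the end, a stylesheet that is read on every page view keeps surviving evictions, while an image read once drifts to the front and is dropped first.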

Cache Layer Performance Comparison

graph LR
    subgraph "Client Caching (This Layer)"
        Mem["Memory Cache<br/>~0ms<br/>Session only"]
        Disk["Disk Cache<br/>~10-50ms<br/>Persistent"]
        Val["304 Validation<br/>~50ms<br/>Network RTT"]
    end
    
    subgraph "Other Layers"
        CDN["CDN Cache<br/>~20-200ms<br/>Geographic distance"]
        Origin["Origin Server<br/>~200-500ms+<br/>Full processing"]
    end
    
    Request["Resource Request"] --> Mem
    Mem -->|"Miss"| Disk
    Disk -->|"Miss or Stale"| Val
    Val -->|"Cache Miss"| CDN
    CDN -->|"Miss"| Origin
    
    Mem -.->|"Best: 0ms"| Result["Resource Delivered"]
    Disk -.->|"Great: 10-50ms"| Result
    Val -.->|"Good: 50ms"| Result
    CDN -.->|"OK: 20-200ms"| Result
    Origin -.->|"Slow: 200-500ms+"| Result

Client caching is the fastest layer. Memory cache serves almost instantly (~0ms) and disk cache is very fast (10-50ms). A 304 validation still costs one network round trip (~50ms), but it skips the body transfer, so it beats a full download from a CDN or origin (200-500ms+). Each cache miss cascades to the next slower layer.

Trade-offs

Client caching excels at performance (it’s the fastest caching layer) but introduces complexity around cache invalidation and content freshness. The classic problem, in a quip attributed to Phil Karlton: “There are only two hard things in Computer Science: cache invalidation and naming things.” When you deploy a new version of your JavaScript, how do you ensure users get it?

The standard solution is versioned URLs: app.a3f2b1.js instead of app.js. When the content changes, the filename changes, so browsers fetch the new version. This requires build tooling to generate hashed filenames and update HTML references. Amazon uses this extensively—product images have content-addressed URLs like /images/I/51Ga5GuElyL._AC_SL1500_.jpg, where the hash ensures the URL changes if the image changes.
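Build tools usually handle this step, but the idea fits in a few lines. The sketch below is an assumption-laden illustration, not a specific bundler's behavior: `hashed_name` and `rewrite_references` are hypothetical helpers, and the 8-character SHA-256 prefix is an arbitrary choice.

```python
import hashlib
from pathlib import PurePosixPath
from typing import Dict


def hashed_name(filename: str, content: bytes, digest_len: int = 8) -> str:
    """Insert a content hash before the extension: app.js -> app.<hash>.js."""
    digest = hashlib.sha256(content).hexdigest()[:digest_len]
    path = PurePosixPath(filename)
    return str(path.with_name(f"{path.stem}.{digest}{path.suffix}"))


def rewrite_references(html: str, manifest: Dict[str, str]) -> str:
    """Replace original asset URLs in the HTML with their hashed versions."""
    for original, hashed in manifest.items():
        html = html.replace(original, hashed)
    return html
```

The HTML that carries these references must itself be served with no-cache (or a short max-age); otherwise browsers keep loading the old page that points at the old filenames.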

Aggressive caching (long max-age) maximizes performance but increases staleness risk. If you set max-age=86400 (1 day) on your API responses, users might see day-old data. Conservative caching (short max-age or no-cache) ensures freshness but increases server load and latency. The right balance depends on your content: static assets can be cached aggressively, dynamic content needs shorter TTLs.

Browser cache storage is limited and uncontrolled. Users can clear their cache anytime, and browsers evict entries when storage fills up. You can’t rely on client caching for correctness—it’s a performance optimization, not a data persistence layer. For critical data, use server-side caching or databases.

Privacy is another consideration. Cached resources persist across sessions, which can leak information. If a user visits a sensitive site (e.g., healthcare portal), cached resources might reveal that visit to someone else using the same device. Use Cache-Control: no-store for sensitive content. Service Workers can run and read their caches even when no page is open, which has privacy and security implications; browsers restrict service workers to secure contexts (HTTPS) so that a network attacker cannot install or tamper with one.

Client caching is single-user (private). Unlike CDN caching, which serves many users from one cached copy, each browser has its own cache. This means cache efficiency doesn’t scale with user count—1 million users means 1 million separate caches. For shared resources, CDN caching is more efficient.

When to Use (and When Not To)

Use aggressive client caching (long max-age, immutable) for versioned static assets: CSS, JavaScript, images, fonts with content hashes in filenames. These resources never change at a given URL, so there’s no staleness risk. This is what Spotify does for their web player assets—player.f3a2b1.js is cached for a year because if the code changes, the filename changes.

Use moderate caching (max-age of hours to days) for semi-static content: product images, blog post images, marketing pages. These change infrequently, and some staleness is acceptable. Amazon caches product images for 24 hours—if a product image updates, users might see the old version for up to a day, which is acceptable for most products.

Use validation-based caching (no-cache with ETag) for dynamic content that changes frequently but benefits from 304 responses: API responses, user dashboards, personalized content. The browser always checks for updates, but if nothing changed, it reuses the cached copy. This balances freshness with performance.

Use no caching (no-store) for sensitive data: authentication tokens, payment information, personal health records. These should never be stored in browser caches. Also use no-store for content behind paywalls or login walls to prevent unauthorized access via cached copies.

Avoid client caching for real-time data: stock prices, live sports scores, chat messages. The staleness window of even max-age=1 (1 second) is too long. Use WebSockets or Server-Sent Events for real-time updates instead.

Consider service workers for offline-first applications: email clients, note-taking apps, news readers. Service workers can cache the application shell and data, enabling the app to work without network connectivity. But this adds complexity—you need to handle cache updates, version migrations, and storage limits.

Real-World Caching Strategy (Amazon Product Page)

graph TB
    subgraph "Amazon Product Page Components"
        HTML["HTML Page<br/>product.html"]
        CSS["Stylesheet<br/>styles.f3a2b1.css"]
        JS["JavaScript<br/>app.9d4c5e.js"]
        ProductImg["Product Images<br/>/images/I/51Ga5G...jpg"]
        API["API: Price & Inventory<br/>/api/product/B08X123"]
    end
    
    HTML --> H_Cache["Cache-Control: no-cache<br/>ETag: 'abc123'<br/><br/>✓ Always validates<br/>✓ 304 if unchanged<br/>✓ Fresh structure"]
    
    CSS --> C_Cache["Cache-Control:<br/>max-age=31536000, immutable<br/><br/>✓ Cached 1 year<br/>✓ Hash in filename<br/>✓ Never revalidates"]
    
    JS --> J_Cache["Cache-Control:<br/>max-age=31536000, immutable<br/><br/>✓ Cached 1 year<br/>✓ Hash in filename<br/>✓ Never revalidates"]
    
    ProductImg --> P_Cache["Cache-Control:<br/>max-age=86400<br/><br/>✓ Cached 24 hours<br/>✓ Content-addressed URL<br/>✓ Rare updates OK"]
    
    API --> A_Cache["Cache-Control: no-cache<br/>ETag: 'xyz789'<br/><br/>✓ Always validates<br/>✓ Fresh prices<br/>✓ 304 if no change"]
    
    H_Cache --> Result["Result:<br/>First visit: 3-5s<br/>Return visit: <500ms<br/><br/>50-100 images from cache<br/>Only HTML + API hit network"]
    C_Cache --> Result
    J_Cache --> Result
    P_Cache --> Result
    A_Cache --> Result

Amazon uses different caching strategies per resource type: immutable for versioned CSS/JS (1 year), moderate for product images (24 hours), and validation-based for HTML/API (always fresh). This hybrid approach delivers fast loads (images from cache) while maintaining data accuracy (prices from server).

Real-World Examples

Company: Amazon

System: E-commerce product pages

Implementation: Amazon uses aggressive client caching for product images with content-addressed URLs. Each image has a hash in its URL (e.g., /images/I/51Ga5GuElyL._AC_SL1500_.jpg), and the Cache-Control header is set to max-age=31536000 (1 year). This means once a user views a product, the images are cached locally for a year. If Amazon updates a product image, the URL changes, forcing browsers to fetch the new version. For HTML pages, Amazon uses Cache-Control: no-cache with ETags: the page structure is cached, but the browser validates on every visit to ensure users see current prices and inventory. This hybrid approach gives them fast page loads (images from cache) while maintaining data freshness (prices from server).

Interesting detail: Amazon’s product pages load 50-100 images. Without client caching, a returning user would download 5-10MB of images every visit. With caching, subsequent visits load in under 500ms because only the HTML and API calls hit the network. This is critical for mobile users on slow connections.

Company: Spotify

System: Web Player

Implementation: Spotify’s web player uses service workers to implement a sophisticated caching strategy. The application shell (HTML, CSS, JavaScript) is served from cache first: the service worker returns the cached version immediately and updates it in the background, which is in effect stale-while-revalidate. Audio chunks use a network-first strategy with fallback to cache: Spotify tries to stream fresh audio but falls back to cached chunks if the network is slow or unavailable. This enables seamless playback even when network quality degrades. Spotify also uses Cache-Control: immutable for versioned assets; their JavaScript bundles have hashes in filenames (player.f3a2b1.js), so they can be cached aggressively without staleness concerns.

Interesting detail: Spotify’s service worker implements custom cache eviction logic. Instead of LRU, it prioritizes recently played songs and the user’s playlists. This means if you listen to a song once, it might get evicted quickly, but songs in your “Liked Songs” stay cached longer. This domain-specific eviction policy improves cache hit rates for the content users care about most.

Interview Essentials

Mid-Level

Mid-level engineers should understand HTTP cache headers and their effects. Explain the difference between max-age, no-cache, and no-store. Describe how ETag validation works: the browser sends If-None-Match with the cached ETag, and the server responds with 304 if the content hasn’t changed. Discuss the trade-off between aggressive caching (long max-age) and content freshness. Know that versioned URLs (e.g., app.a3f2b1.js) enable aggressive caching without staleness risk. Be able to design a caching strategy for a simple website: long max-age for static assets, no-cache with ETag for HTML pages.

Senior

Senior engineers should design caching strategies for complex applications with multiple content types. Explain when to use immutable and why it’s important for versioned assets. Discuss cache invalidation strategies: versioned URLs, cache busting query parameters, and their trade-offs. Understand service workers and their caching strategies (cache-first, network-first, stale-while-revalidate). Know the performance characteristics: memory cache (~0ms), disk cache (~10-50ms), 304 responses (~50ms). Discuss privacy implications of client caching and when to use no-store. Be able to debug caching issues: why is the browser serving stale content? Why isn’t the cache being used?

Staff+

Staff+ engineers should design caching architectures that balance performance, freshness, and operational complexity across multiple systems. Discuss how client caching interacts with CDN caching—what happens when both layers cache the same resource? Explain cache coherence problems: if a user has a resource cached with max-age=3600, how do you force an update before expiration? (Answer: you can’t directly; you need versioned URLs or cache busting.) Design a migration strategy for changing cache policies on a high-traffic site. Understand browser cache implementation details: how do different browsers handle cache eviction? What are the storage limits? Discuss the trade-offs between service worker complexity and reliability—service workers can break your entire site if they have bugs. Know how to measure cache effectiveness: hit rate, byte hit rate, latency improvements.

Common Interview Questions

How does ETag validation work, and when would you use it instead of Last-Modified?

What’s the difference between no-cache and no-store? When would you use each?

How would you design a caching strategy for a news website with frequently updated articles?

Why do we use versioned URLs (e.g., app.a3f2b1.js) instead of just setting a short max-age?

What happens if you set Cache-Control: max-age=31536000 on your HTML page?

How would you force all users to get a new version of a JavaScript file that’s currently cached?

What are the privacy implications of aggressive client caching?

Red Flags to Avoid

Not understanding the difference between no-cache and no-store (this is a fundamental misunderstanding)

Suggesting to cache sensitive data (authentication tokens, payment info) in the browser

Not knowing that you can’t force cache invalidation—once a resource is cached with max-age, it’s cached until expiration

Thinking client caching alone is sufficient for a high-traffic site (you need CDN caching too)

Not considering cache invalidation strategy when designing a caching policy

Confusing browser caching with server-side caching or CDN caching

Key Takeaways

Client caching is the fastest caching layer (memory cache ~0ms, disk cache ~10-50ms) but requires careful management to balance performance with freshness. Use HTTP cache headers to control caching behavior.

Use Cache-Control: max-age=31536000, immutable for versioned static assets (CSS, JS with hashes). Use Cache-Control: no-cache with ETag for dynamic content. Use Cache-Control: no-store for sensitive data.

ETag validation enables efficient revalidation: the browser sends If-None-Match with the cached ETag, and the server responds with 304 Not Modified if the content hasn’t changed, saving bandwidth and latency.

Versioned URLs (e.g., app.a3f2b1.js) solve the cache invalidation problem: when content changes, the URL changes, forcing browsers to fetch the new version. This enables aggressive caching without staleness risk.

Client caching is single-user and limited by browser storage (50MB-1GB). For shared resources and first-time visitors, CDN caching is critical. Client caching complements, not replaces, server-side and CDN caching.