Deployment Stamps for Availability: Multi-Region Guide

intermediate 28 min read Updated 2026-02-11

TL;DR

Deployment stamps (also called scale units or cells) are self-contained, identical copies of your entire application stack deployed independently to serve a subset of users or tenants. Instead of scaling one massive deployment, you replicate the entire stack—databases, services, caches—creating isolated islands that limit blast radius and enable linear scaling. Think of it as franchising your infrastructure: each location is complete and independent, rather than building one giant restaurant that serves everyone.

Cheat Sheet: Deploy multiple identical, isolated copies of your full stack. Each stamp serves a bounded set of users. Route traffic via global load balancer. Limits blast radius to one stamp, enables geographic distribution, and provides near-linear scalability.

The Analogy

Imagine McDonald’s scaling strategy. Instead of building one enormous restaurant in Kansas City that serves the entire Midwest with 500 cash registers and a parking lot for 10,000 cars, they build hundreds of identical restaurants across different cities. Each location has the same kitchen equipment, menu, and staff structure—just serving a different neighborhood. If one restaurant’s fryer breaks, only that location’s customers are affected. If demand grows in Dallas, they open more Dallas locations rather than making the Kansas City mega-restaurant even bigger. Deployment stamps work the same way: you replicate your entire application stack into independent units, each serving a bounded set of customers, rather than trying to scale one monolithic deployment infinitely.

Why This Matters in Interviews

Deployment stamps come up when discussing multi-tenant SaaS architectures, global availability strategies, or how to scale beyond single-region limits. Interviewers want to see that you understand the difference between scaling up (bigger instances) and scaling out (more stamps), and that you can articulate when the operational complexity of managing multiple stamps is worth the isolation and availability benefits. This pattern is particularly relevant for B2B SaaS discussions where customer data isolation and SLA guarantees matter. Strong candidates explain the routing layer, blast radius containment, and the operational trade-offs of managing N identical deployments versus one large one.


Core Concept

Deployment stamps represent a fundamental shift in how we think about scaling distributed systems. Rather than continuously enlarging a single deployment—adding more servers, bigger databases, wider load balancers—you create multiple complete, independent copies of your entire application stack. Each stamp is a self-contained unit containing all the components needed to serve requests: web servers, application servers, databases, caches, message queues, and storage. The key insight is that each stamp serves a bounded subset of your total user base, typically determined by tenant assignment, geographic region, or capacity limits.

This pattern emerged from the operational reality that scaling a single deployment has practical limits. Beyond a certain size, deployments become fragile—a configuration error affects millions of users, database migrations take days, and the blast radius of any failure is catastrophic. Companies like Microsoft Azure, Salesforce, and Slack adopted stamps to break through these scaling ceilings while simultaneously improving availability. When one stamp fails, only its assigned users are affected, not your entire customer base.

The architecture requires a global routing layer that directs each request to the appropriate stamp based on tenant ID, user location, or other sharding keys. This router becomes your single point of coordination, while the stamps themselves remain completely independent. The pattern trades operational complexity—you’re now managing multiple production environments—for improved isolation, availability, and the ability to scale almost linearly by adding more stamps.

Deployment Stamps Architecture Overview

graph TB
    subgraph Global Layer
        Router["Global Router<br/><i>Traffic Director</i>"]
        Registry[("Routing Registry<br/><i>Tenant → Stamp Mapping</i>")]
    end
    
    subgraph Stamp 1: US-East
        LB1["Load Balancer"]
        App1["App Servers<br/><i>5-10 instances</i>"]
        DB1[("Database<br/><i>Primary + Replicas</i>")]
        Cache1[("Redis Cache")]
    end
    
    subgraph Stamp 2: US-West
        LB2["Load Balancer"]
        App2["App Servers<br/><i>5-10 instances</i>"]
        DB2[("Database<br/><i>Primary + Replicas</i>")]
        Cache2[("Redis Cache")]
    end
    
    subgraph Stamp 3: EU-West
        LB3["Load Balancer"]
        App3["App Servers<br/><i>5-10 instances</i>"]
        DB3[("Database<br/><i>Primary + Replicas</i>")]
        Cache3[("Redis Cache")]
    end
    
    User1["👤 Tenant A Users"] --"1. Request"--> Router
    User2["👤 Tenant B Users"] --"1. Request"--> Router
    User3["👤 Tenant C Users"] --"1. Request"--> Router
    
    Router --"2. Lookup tenant"--> Registry
    Router --"3. Route to Stamp 1"--> LB1
    Router --"3. Route to Stamp 2"--> LB2
    Router --"3. Route to Stamp 3"--> LB3
    
    LB1 --> App1
    App1 --> DB1
    App1 --> Cache1
    
    LB2 --> App2
    App2 --> DB2
    App2 --> Cache2
    
    LB3 --> App3
    App3 --> DB3
    App3 --> Cache3

Each stamp is a complete, isolated copy of the application stack. The global router directs tenants to their assigned stamp, while stamps operate independently with no shared infrastructure. If Stamp 2 fails, only Tenant B users are affected.

How It Works

Step 1: Define the Stamp Boundary

First, you determine what constitutes one complete stamp. This includes every component needed to serve requests independently: application servers, databases (with full schema), caches, message queues, blob storage, and any supporting services. The stamp must be entirely self-contained—no shared databases or services between stamps. For a typical three-tier web application, one stamp might include: 5-10 application servers, a primary database with 2 read replicas, a Redis cluster with 3 nodes, and dedicated object storage buckets. You define the maximum capacity of one stamp, typically based on database limits or connection pool constraints. For example, you might design each stamp to handle 10,000 active tenants or 50,000 requests per second.
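The stamp boundary can be captured as a single canonical specification that every stamp is deployed from. A minimal Python sketch (the `StampSpec` name and field values are illustrative, not from any real system):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class StampSpec:
    """Everything one self-contained stamp needs, plus its capacity bound."""
    app_servers: int   # e.g. 5-10 instances
    db_replicas: int   # read replicas behind one primary
    cache_nodes: int   # Redis cluster size
    max_tenants: int   # hard capacity limit for this stamp design
    max_rps: int       # designed peak requests/second

# One canonical spec shared by every stamp -- stamps differ only in
# environment-specific values (region, CIDR, DNS), never in shape.
CANONICAL_SPEC = StampSpec(app_servers=8, db_replicas=2, cache_nodes=3,
                           max_tenants=10_000, max_rps=50_000)
```

Freezing the dataclass makes the point explicit: a stamp's composition is fixed at design time, and capacity growth happens by adding stamps, not by mutating one.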

Step 2: Deploy Multiple Identical Stamps

You provision multiple stamps across different availability zones or regions using infrastructure-as-code (Terraform, CloudFormation, Pulumi). Each stamp gets identical configuration except for environment-specific values like region, network CIDR blocks, and DNS names. Critically, you use the same deployment pipeline and configuration templates for all stamps—they’re truly identical copies. You might start with 3 stamps (us-east-1, us-west-2, eu-west-1) and add more as demand grows. Each stamp is named (stamp-001, stamp-002) and registered in a central registry that tracks its location, capacity, and health status.
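The "identical except for environment-specific values" rule can be sketched as a template-rendering step that your IaC pipeline might feed into Terraform or CloudFormation. This is a hypothetical illustration (the config keys and CIDR scheme are assumptions, not a real provider's schema):

```python
# Identical base config for every stamp; only region, name, and CIDR vary.
BASE_CONFIG = {
    "app_server_count": 8,
    "db_replicas": 2,
    "cache_nodes": 3,
}

REGIONS = ["us-east-1", "us-west-2", "eu-west-1"]

def render_stamp_configs(regions):
    configs = []
    for i, region in enumerate(regions, start=1):
        cfg = dict(BASE_CONFIG)          # same base for every stamp
        cfg["name"] = f"stamp-{i:03d}"   # stamp-001, stamp-002, ...
        cfg["region"] = region
        cfg["cidr"] = f"10.{i}.0.0/16"   # non-overlapping network per stamp
        configs.append(cfg)
    return configs

configs = render_stamp_configs(REGIONS)
```

Because every stamp is rendered from one base, configuration drift between stamps is a pipeline bug rather than an operational habit.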

Step 3: Implement Global Routing

You deploy a global routing layer that sits in front of all stamps and directs incoming requests to the correct stamp. This router maintains a mapping of tenants/users to stamps, typically stored in a highly available, globally replicated database (DynamoDB Global Tables, Cosmos DB, or Spanner). When a request arrives with a tenant ID or user identifier, the router looks up which stamp serves that tenant and proxies the request there. The router can be implemented as a global load balancer (AWS Global Accelerator, Azure Front Door), API gateway, or custom proxy service. The routing decision is cached aggressively to avoid lookup latency on every request.
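The lookup-then-cache behavior of the router can be sketched in a few lines of Python. The registry here is a plain dict standing in for a globally replicated store like DynamoDB Global Tables (the class and method names are illustrative):

```python
import time

class StampRouter:
    """Resolve tenant -> stamp with a short-TTL cache over the registry."""
    def __init__(self, registry_lookup, ttl_seconds=60):
        self._lookup = registry_lookup   # stand-in for a registry query
        self._ttl = ttl_seconds
        self._cache = {}                 # tenant_id -> (stamp, expires_at)

    def route(self, tenant_id):
        entry = self._cache.get(tenant_id)
        if entry and entry[1] > time.monotonic():
            return entry[0]              # cache hit: no registry round-trip
        stamp = self._lookup(tenant_id)  # cache miss: consult the registry
        self._cache[tenant_id] = (stamp, time.monotonic() + self._ttl)
        return stamp

registry = {"acme": "stamp-us-east-1", "globex": "stamp-eu-west-1"}
router = StampRouter(registry.__getitem__)
router.route("acme")   # -> "stamp-us-east-1" (first call hits the registry)
```

The short TTL is the knob that trades routing staleness during tenant migrations against registry load.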

Step 4: Assign Tenants to Stamps

As new customers sign up, you assign them to a stamp using a placement algorithm. The simplest approaches are round-robin across healthy stamps or assigning each new tenant to the stamp with the most available capacity. More sophisticated strategies consider geographic proximity (assign European customers to EU stamps), data residency requirements, or customer tier (enterprise customers get dedicated stamps). You store this assignment in the routing registry and ensure it’s replicated globally. Once assigned, a tenant typically stays on that stamp permanently, though you can implement migration capabilities for rebalancing.
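A placement function combining the strategies above might prefer same-region stamps and break ties by free capacity. A minimal sketch under those assumptions (stamp records are plain dicts here):

```python
def place_tenant(tenant, stamps):
    """Assign a new tenant: prefer same-region stamps with room,
    then pick the one with the most free capacity."""
    candidates = [s for s in stamps
                  if s["region"] == tenant["region"]
                  and s["tenants"] < s["max_tenants"]]
    if not candidates:  # no regional stamp has room: fall back to any stamp
        candidates = [s for s in stamps if s["tenants"] < s["max_tenants"]]
    if not candidates:
        raise RuntimeError("all stamps full: provision a new stamp")
    return max(candidates,
               key=lambda s: s["max_tenants"] - s["tenants"])["name"]

stamps = [
    {"name": "stamp-001", "region": "us", "tenants": 140, "max_tenants": 150},
    {"name": "stamp-002", "region": "us", "tenants": 90,  "max_tenants": 150},
    {"name": "stamp-003", "region": "eu", "tenants": 60,  "max_tenants": 150},
]
place_tenant({"region": "us"}, stamps)   # -> "stamp-002" (most free US capacity)
```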

Step 5: Monitor and Scale Stamps

You monitor each stamp independently for capacity utilization, error rates, and latency. When a stamp approaches its capacity limit (typically 70-80% of maximum), you provision a new stamp and start assigning new tenants to it. Existing tenants remain on their current stamp—you’re scaling by adding capacity for new customers, not by moving existing ones. If a stamp fails health checks, the router marks it unhealthy and stops routing new requests there, while you work to restore it or migrate its tenants to other stamps.
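The provisioning trigger can be expressed as a simple threshold check that subtracts a growth buffer from the utilization trigger, so the new stamp is ready before the old one fills. A sketch using the worked numbers from the capacity-planning section (the buffer here is expressed in days of tenant growth; all defaults are illustrative):

```python
def needs_new_stamp(tenant_count, max_tenants, utilization_trigger=0.75,
                    growth_per_day=5, buffer_days=2):
    """True when a stamp is close enough to its trigger point that
    provisioning a new stamp should start now."""
    trigger = int(max_tenants * utilization_trigger)          # 150 * 0.75 -> 112
    alert_threshold = trigger - growth_per_day * buffer_days  # 112 - 10 = 102
    return tenant_count >= alert_threshold

needs_new_stamp(102, 150)  # True: start provisioning
needs_new_stamp(95, 150)   # False: still headroom
```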

Step 6: Handle Stamp Failures

When a stamp becomes unavailable, only the tenants assigned to that stamp are affected—this is the blast radius containment benefit. Your monitoring detects the failure, alerts on-call engineers, and the router automatically stops sending traffic there. For critical customers, you might implement automatic failover to a standby stamp, though this requires replicating their data across stamps (which somewhat defeats the isolation purpose). More commonly, you focus on rapid stamp recovery—since stamps are identical and deployed via IaC, you can redeploy a failed stamp quickly.
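The router's failure handling reduces to a small decision: serve the tenant's stamp if healthy, otherwise fall back to a standby (if data replication makes one possible) or fail only that stamp's tenants. A hypothetical sketch:

```python
def healthy_route(tenant_stamp, stamp_health, standby=None):
    """Route to the tenant's stamp unless it is failing health checks.
    Returns the standby stamp if one exists, else None (serve an error
    page to this stamp's tenants only -- the blast radius stays bounded)."""
    if stamp_health.get(tenant_stamp, False):
        return tenant_stamp
    return standby

health = {"stamp-001": True, "stamp-002": False, "stamp-003": True}
healthy_route("stamp-001", health)                         # -> "stamp-001"
healthy_route("stamp-002", health)                         # -> None
healthy_route("stamp-002", health, standby="stamp-003")    # -> "stamp-003"
```

Note the asymmetry the section describes: the `standby` path requires cross-stamp data replication, while the `None` path preserves full isolation at the cost of downtime for one stamp's tenants.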

Request Flow Through Stamp Architecture

sequenceDiagram
    participant User as 👤 User<br/>(acme.app.com)
    participant DNS as DNS/CDN
    participant Router as Global Router
    participant Registry as Routing Registry
    participant Cache as Router Cache
    participant Stamp as Stamp-US-East-1
    participant DB as Stamp Database
    
    User->>DNS: 1. Resolve acme.app.com
    DNS->>Router: 2. Route to nearest router
    
    Router->>Cache: 3. Check cached mapping<br/>for tenant "acme"
    
    alt Cache Hit
        Cache-->>Router: Stamp-US-East-1
    else Cache Miss
        Router->>Registry: 4. Query tenant mapping
        Registry-->>Router: Stamp-US-East-1
        Router->>Cache: 5. Cache result (TTL: 60s)
    end
    
    Router->>Stamp: 6. Proxy request to stamp
    Stamp->>DB: 7. Query tenant data
    DB-->>Stamp: 8. Return data
    Stamp-->>Router: 9. Response
    Router-->>User: 10. Response
    
    Note over Router,Cache: Cached routing decisions<br/>avoid registry lookup<br/>on subsequent requests

The router caches tenant-to-stamp mappings to minimize latency. On cache miss, it queries the routing registry and caches the result. This keeps routing overhead under 1-2ms for cached lookups while maintaining flexibility for tenant migrations.

Key Principles

Principle 1: Complete Isolation Between Stamps

Each stamp must be entirely self-contained with no shared infrastructure between stamps except the global router. This means separate databases, separate caches, separate message queues—everything. The isolation is what limits blast radius and enables independent operations. When you’re tempted to share a database or service between stamps “for efficiency,” you’re undermining the core value of the pattern. Example: Slack runs separate Elasticsearch clusters per stamp rather than one giant shared cluster, so a search index corruption affects only one stamp’s users.

Principle 2: Stamps Are Identical, Not Customized

Every stamp should be deployed from the same infrastructure-as-code templates with identical configurations (except environment-specific values). This uniformity is crucial for operational sanity—you can’t manage 50 stamps if each is a unique snowflake. Any feature or configuration change should be deployed to all stamps simultaneously or in a controlled rollout. Example: Microsoft Azure’s deployment stamps use ARM templates that are identical across all stamps, with only region and capacity parameters varying.

Principle 3: Design for Bounded Capacity Per Stamp

Each stamp has a maximum capacity defined by its weakest component (usually the database). You must design stamps to operate comfortably within these bounds and provision new stamps before hitting limits. This requires accurate capacity planning and monitoring. Don’t try to squeeze infinite growth from one stamp—that defeats the purpose. Example: If your database can handle 100,000 connections and each tenant averages 50 connections, design each stamp for 1,500 tenants maximum (leaving 25% headroom), then add a new stamp.

Principle 4: Routing Layer Must Be Highly Available

The global router becomes a critical dependency—if it fails, all stamps become unreachable even if they’re healthy. Invest heavily in making the routing layer simple, fast, and redundant. Use managed services with built-in HA (global load balancers, DNS-based routing) rather than building custom solutions. Cache routing decisions aggressively to minimize lookup latency. Example: Stripe’s routing layer uses GeoDNS with health checks and cached tenant-to-stamp mappings at the edge, so routing decisions happen in microseconds without database lookups.

Principle 5: Optimize for Operational Simplicity

The operational burden of managing N stamps is real—deployments, monitoring, incident response all multiply by N. Invest in automation, unified observability, and tooling that treats stamps as a fleet rather than individual snowflakes. Your deployment pipeline should be able to roll out changes to all stamps with one command. Your monitoring should aggregate metrics across stamps while allowing drill-down to individual stamps. Example: Netflix’s Spinnaker deployment system can orchestrate blue-green deployments across hundreds of regional stamps simultaneously, with automatic rollback if any stamp shows elevated errors.


Deep Dive

Types / Variants

Geographic Stamps

Each stamp serves a specific geographic region (US-East, EU-West, APAC). Tenants are assigned to stamps based on their primary location or data residency requirements. This variant optimizes for latency (users connect to nearby stamps) and compliance (EU data stays in EU stamps). When to use: Global applications with latency-sensitive workloads or strict data residency requirements. Pros: Reduced latency, compliance with data sovereignty laws, natural disaster isolation. Cons: Uneven load distribution (US might need 10 stamps while APAC needs 2), complexity in handling users who travel. Example: Salesforce deploys stamps in each major region (NA1, EU5, AP3) and assigns customers to stamps based on their contract’s data residency requirements.

Tenant-Based Stamps

Tenants are assigned to stamps based on capacity or tier, regardless of geography. Each stamp serves a mix of tenants up to its capacity limit. High-value enterprise customers might get dedicated stamps, while smaller customers share multi-tenant stamps. When to use: SaaS applications where isolation and SLA guarantees matter more than geographic proximity. Pros: Even capacity utilization, ability to offer premium “dedicated stamp” tiers, easier load balancing. Cons: Higher latency for geographically distant users, more complex routing logic. Example: Slack assigns enterprise customers to dedicated stamps while smaller teams share multi-tenant stamps, allowing them to offer 99.99% SLA guarantees to enterprises.

Hybrid Stamps (Geographic + Tenant)

Combines both approaches: stamps are deployed in multiple regions, and within each region, tenants are distributed across multiple stamps based on capacity. This provides both geographic optimization and capacity scaling. When to use: Large-scale global SaaS with diverse customer base. Pros: Best of both worlds—low latency and capacity scaling. Cons: Most complex routing logic, highest operational overhead. Example: Microsoft 365 uses this approach with regional stamp clusters (North America has 20+ stamps, Europe has 15+) where tenants are assigned to stamps within their region based on capacity and tier.

Active-Active Stamps with Replication

Tenants’ data is replicated across multiple stamps, and requests can be served by any stamp. This variant sacrifices some isolation for higher availability and disaster recovery. When to use: Mission-critical systems where even single-stamp downtime is unacceptable. Pros: Zero downtime during stamp failures, true active-active DR. Cons: Significantly higher complexity, data consistency challenges, higher infrastructure costs. Example: Stripe’s payment processing replicates critical transaction data across multiple stamps so payment requests can be processed even if the primary stamp fails.

Ephemeral Stamps

Short-lived stamps created for specific purposes (testing, demos, temporary capacity) and destroyed when no longer needed. When to use: Development/testing environments, handling temporary traffic spikes, customer demos. Pros: Cost-efficient (pay only when needed), clean test isolation, rapid provisioning. Cons: Requires sophisticated automation, not suitable for production data. Example: GitHub creates ephemeral stamps for large CI/CD workloads, spinning up complete environments for major open-source projects’ test suites and tearing them down after completion.

Geographic vs Tenant-Based Stamp Distribution

graph TB
    subgraph Geographic Stamps
        G_Router["Global Router<br/><i>GeoDNS-based</i>"]
        
        subgraph US Region
            G_US1["Stamp US-1<br/>Tenants: A,B,C"]
            G_US2["Stamp US-2<br/>Tenants: D,E,F"]
        end
        
        subgraph EU Region
            G_EU1["Stamp EU-1<br/>Tenants: G,H,I"]
        end
        
        subgraph APAC Region
            G_AP1["Stamp APAC-1<br/>Tenants: J,K"]
        end
        
        G_Router --"Low latency<br/>Data residency"--> G_US1
        G_Router --> G_US2
        G_Router --> G_EU1
        G_Router --> G_AP1
    end
    
    subgraph Tenant-Based Stamps
        T_Router["Global Router<br/><i>Tenant ID-based</i>"]
        
        T_Ent1["Enterprise Stamp 1<br/>Tenant: MegaCorp<br/><i>Dedicated, 99.99% SLA</i>"]
        T_Ent2["Enterprise Stamp 2<br/>Tenant: BigCo<br/><i>Dedicated, 99.99% SLA</i>"]
        T_Shared1["Shared Stamp 1<br/>Tenants: 150 SMBs<br/><i>Multi-tenant, 99.9% SLA</i>"]
        T_Shared2["Shared Stamp 2<br/>Tenants: 150 SMBs<br/><i>Multi-tenant, 99.9% SLA</i>"]
        
        T_Router --"Premium tier"--> T_Ent1
        T_Router --"Premium tier"--> T_Ent2
        T_Router --"Standard tier"--> T_Shared1
        T_Router --"Standard tier"--> T_Shared2
    end

Geographic stamps optimize for latency and data residency by placing stamps near users. Tenant-based stamps optimize for isolation and SLA tiers, allowing premium customers to get dedicated stamps while smaller customers share multi-tenant stamps.

Trade-offs

Isolation vs. Resource Efficiency

Option A (Strong Isolation): Each stamp has completely separate infrastructure with no sharing. Every stamp runs its own database cluster, cache cluster, and services. Option B (Shared Services): Stamps share some infrastructure like databases (with separate schemas) or centralized logging/monitoring. Decision Framework: Choose strong isolation when blast radius containment and security isolation are paramount (B2B SaaS, healthcare, finance). Choose shared services when cost optimization matters more than isolation and you have strong multi-tenancy controls. The hybrid approach—isolate data plane (databases, caches) but share control plane (monitoring, logging)—works well for most cases. Example: Shopify uses strong isolation for merchant data (separate databases per stamp) but shares Kafka clusters for event streaming across stamps with strict topic-level isolation.

Stamp Size vs. Stamp Count

Option A (Fewer Large Stamps): Deploy 10 large stamps, each handling 10,000 tenants with powerful instances and large databases. Option B (Many Small Stamps): Deploy 100 small stamps, each handling 1,000 tenants with modest instances. Decision Framework: Larger stamps are more resource-efficient (better utilization, fewer management endpoints) but have larger blast radius and longer recovery times. Smaller stamps provide finer-grained isolation and faster recovery but increase operational complexity. The sweet spot is typically stamps sized to handle 1-2 hours of recovery time if rebuilt from scratch—if your stamp takes 6 hours to restore, it’s too big. Example: Atlassian’s Jira Cloud uses medium-sized stamps (2,000-3,000 tenants each) balancing operational overhead with acceptable recovery times.

Static vs. Dynamic Tenant Assignment

Option A (Static Assignment): Tenants are permanently assigned to stamps at signup and never move. Option B (Dynamic Rebalancing): Tenants can be migrated between stamps to balance load or consolidate underutilized stamps. Decision Framework: Static assignment is simpler and avoids migration complexity but can lead to uneven load distribution over time. Dynamic rebalancing optimizes resource utilization but requires sophisticated migration tooling and causes temporary disruption. Start with static assignment and add migration capabilities only when you have clear evidence of imbalance problems. Example: AWS RDS uses static assignment for most customers but offers migration tools for customers who outgrow their stamp’s capacity or need to move regions.

Synchronous vs. Asynchronous Routing

Option A (Synchronous Lookup): Router queries the tenant-to-stamp mapping database on every request to determine routing. Option B (Cached/Embedded Routing): Routing decisions are cached at the edge or embedded in client tokens. Decision Framework: Synchronous lookup is always accurate but adds latency (5-20ms) and creates a dependency on the routing database. Cached routing is fast (sub-millisecond) but can route to wrong stamps during migrations or failures. Use cached routing with short TTLs (30-60 seconds) and fallback to synchronous lookup on errors. Example: Cloudflare’s Workers platform embeds stamp routing information in JWT tokens issued at authentication, eliminating per-request lookups while maintaining security.
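Embedding the routing decision in a signed token can be sketched with nothing more than HMAC. This is not Cloudflare's actual mechanism, just a minimal illustration of the idea: the stamp is a claim in a token issued at authentication, so the edge verifies a signature instead of querying the registry per request.

```python
import base64, hashlib, hmac, json

SECRET = b"rotate-me"  # hypothetical signing key

def issue_token(tenant_id, stamp):
    """At login, embed the tenant's assigned stamp in a signed token."""
    claims = base64.urlsafe_b64encode(
        json.dumps({"tenant": tenant_id, "stamp": stamp}).encode())
    sig = base64.urlsafe_b64encode(
        hmac.new(SECRET, claims, hashlib.sha256).digest())
    return (claims + b"." + sig).decode()

def route_from_token(token):
    """Edge-side routing: verify the signature, read the stamp claim."""
    claims, sig = token.encode().split(b".")
    expected = base64.urlsafe_b64encode(
        hmac.new(SECRET, claims, hashlib.sha256).digest())
    if not hmac.compare_digest(sig, expected):
        raise ValueError("bad signature: fall back to registry lookup")
    return json.loads(base64.urlsafe_b64decode(claims))["stamp"]

token = issue_token("acme", "stamp-us-east-1")
route_from_token(token)   # -> "stamp-us-east-1", no registry round-trip
```

The trade-off the section names shows up here directly: tokens outlive migrations, so token lifetime plays the same role as a cache TTL.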

Uniform vs. Tiered Stamps

Option A (Uniform Stamps): All stamps have identical capacity and configuration. Any tenant can be assigned to any stamp. Option B (Tiered Stamps): Different stamp sizes for different customer tiers (small/medium/large) or use cases. Decision Framework: Uniform stamps simplify operations and capacity planning—you have one stamp design to manage. Tiered stamps allow cost optimization (don’t over-provision for small customers) and performance isolation (enterprise customers get beefier stamps). The operational complexity of managing multiple stamp types is significant; only introduce tiers when you have clear business justification. Example: Datadog uses tiered stamps with different retention periods and query performance characteristics for different pricing tiers, allowing them to optimize infrastructure costs per customer segment.

Common Pitfalls

Pitfall 1: Shared Database “Just for Metadata”

Why it happens: Teams deploy stamps with separate application tiers but share a central database for “just user accounts” or “just configuration data.” This seems efficient and avoids data synchronization complexity. Why it’s dangerous: That shared database becomes a single point of failure that can take down all stamps simultaneously. It also becomes a scaling bottleneck and eliminates the blast radius containment benefit. How to avoid: Replicate all data needed for stamp operation into each stamp, even if it means data duplication. Use eventual consistency and background sync jobs to keep shared data (like user profiles) synchronized across stamps. Only truly global, read-only reference data (like country codes) should be shared. Real example: A fintech company shared their user authentication database across stamps “for consistency.” When that database had a slow query incident, all 15 stamps became unavailable simultaneously, affecting 100% of customers instead of the 7% that would have been affected with proper isolation.

Pitfall 2: Inconsistent Stamp Configurations

Why it happens: Over time, stamps diverge as engineers make “quick fixes” or “temporary changes” directly in production. One stamp gets a performance patch, another gets a different version of a library, and suddenly you’re managing 20 unique snowflakes. Why it’s dangerous: Configuration drift makes debugging impossible (“it works in stamp-003 but fails in stamp-007”), prevents confident deployments, and creates security vulnerabilities when patches aren’t applied uniformly. How to avoid: Enforce infrastructure-as-code for all stamp configurations with no exceptions for manual changes. Use configuration management tools (Ansible, Chef, Puppet) to detect and remediate drift. Implement automated testing that validates all stamps have identical configurations. Make it easier to deploy a change to all stamps than to change one stamp manually. Real example: Netflix discovered that 30% of their regional stamps had different JVM garbage collection settings due to manual tuning, causing inconsistent performance characteristics that made capacity planning nearly impossible.

Pitfall 3: Undersizing the Routing Layer

Why it happens: Teams focus on stamp capacity and treat the router as an afterthought, deploying it on modest infrastructure since “it’s just doing lookups.” Why it’s dangerous: The router handles 100% of your traffic and becomes a bottleneck if undersized. Even worse, router failures make all stamps unreachable, so a small infrastructure investment in the router can take down your entire service. How to avoid: Over-provision the routing layer significantly—it should be able to handle 3-5x your peak traffic. Use managed, globally distributed services (CloudFront, Cloudflare, Azure Front Door) rather than self-hosted routers. Implement aggressive caching of routing decisions with fallback mechanisms. Monitor router latency and error rates as your most critical metrics. Real example: A SaaS company ran their custom routing service on 3 small instances “to save costs.” During a traffic spike, the routers became CPU-bound, causing 30-second latencies and cascading failures across all stamps despite the stamps themselves being healthy.

Pitfall 4: No Tenant Migration Strategy

Why it happens: Teams assume static tenant assignment will work forever and don’t build migration capabilities. When they need to move tenants (stamp decommissioning, rebalancing, customer requests), they have no tooling. Why it’s dangerous: Without migration capabilities, you can’t decommission old stamps, rebalance load, or respond to customer data residency requests. You’re stuck with whatever initial assignment you made. How to avoid: Build tenant migration tooling from day one, even if you don’t plan to use it immediately. The tooling should handle: exporting tenant data, importing to new stamp, updating routing registry, validating migration success, and rolling back if needed. Test migrations regularly in staging environments. Real example: Dropbox had to build emergency migration tooling when they needed to move customers off aging stamps, resulting in a 6-month project that could have been avoided with upfront investment in migration capabilities.
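The migration flow the pitfall describes (export, import, validate, flip routing, roll back on failure) can be sketched end to end. The `Stamp` class here is an in-memory stand-in for a stamp's data plane, purely for illustration:

```python
class Stamp:
    """Minimal in-memory stand-in for one stamp's data plane."""
    def __init__(self, name):
        self.name, self.data = name, {}
    def export_tenant(self, t): return dict(self.data[t])
    def import_tenant(self, t, snap): self.data[t] = dict(snap)
    def validate_tenant(self, t, snap): return self.data.get(t) == snap
    def delete_tenant(self, t): self.data.pop(t, None)

def migrate_tenant(tenant_id, src, dst, registry):
    """Export -> import -> validate -> flip routing -> clean up."""
    snapshot = src.export_tenant(tenant_id)           # 1. export from source
    dst.import_tenant(tenant_id, snapshot)            # 2. import into target
    if not dst.validate_tenant(tenant_id, snapshot):  # 3. verify before flipping
        dst.delete_tenant(tenant_id)                  #    rollback; source untouched
        return False
    registry[tenant_id] = dst.name                    # 4. update routing registry
    src.delete_tenant(tenant_id)                      # 5. clean up only after flip
    return True
```

The ordering is the point: the source copy is deleted only after validation succeeds and the registry is updated, so a failure at any earlier step leaves the tenant fully served from the source stamp.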

Pitfall 5: Ignoring Cross-Stamp Operations

Why it happens: Teams design stamps for single-tenant operations but forget about cross-tenant features like global search, analytics, or admin dashboards that need data from all stamps. Why it’s dangerous: You end up with features that work within a stamp but break across stamps, or you’re forced to add cross-stamp queries that violate isolation principles and create performance bottlenecks. How to avoid: Design cross-stamp operations from the start using event streaming or data replication to a central analytics store. Use eventual consistency models for global features—they don’t need real-time accuracy. For admin operations, build tools that can query multiple stamps in parallel and aggregate results. Real example: Zendesk’s initial stamp design didn’t account for global reporting, forcing them to add a separate analytics pipeline that streams events from all stamps to a central data warehouse, adding significant complexity they could have designed in from the start.
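The parallel fan-out for admin operations is straightforward to sketch: query every stamp concurrently and merge the partial results, accepting eventual-consistency semantics. Each "stamp" below is just a callable standing in for a per-stamp API client:

```python
from concurrent.futures import ThreadPoolExecutor

def global_query(stamps, query):
    """Fan a read-only admin query out to every stamp in parallel
    and merge the per-stamp results."""
    with ThreadPoolExecutor(max_workers=len(stamps)) as pool:
        results = pool.map(lambda s: s(query), stamps)  # preserves stamp order
    merged = []
    for partial in results:
        merged.extend(partial)
    return merged

stamps = [lambda q: [f"stamp1:{q}"], lambda q: [f"stamp2:{q}"]]
global_query(stamps, "active_tenants")
# -> ["stamp1:active_tenants", "stamp2:active_tenants"]
```

A production version would add per-stamp timeouts and partial-result handling, since one slow stamp should not block the global view.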

Blast Radius: Isolated Stamps vs Shared Database

graph LR
    subgraph Correct: Isolated Stamps
        C_Router["Router"]
        
        subgraph C_Stamp1[Stamp 1]
            C_App1["App Servers"]
            C_DB1[("Database")]
        end
        
        subgraph C_Stamp2[Stamp 2]
            C_App2["App Servers"]
            C_DB2[("Database")]
        end
        
        subgraph C_Stamp3[Stamp 3]
            C_App3["App Servers"]
            C_DB3[("Database")]
        end
        
        C_Router --> C_App1
        C_Router --> C_App2
        C_Router --> C_App3
        C_App1 --> C_DB1
        C_App2 --> C_DB2
        C_App3 --> C_DB3
        
        C_DB2 -."❌ Failure<br/>affects only<br/>Stamp 2 users".-> C_DB2
    end
    
    subgraph Wrong: Shared Database
        W_Router["Router"]
        
        subgraph W_Stamp1[Stamp 1]
            W_App1["App Servers"]
        end
        
        subgraph W_Stamp2[Stamp 2]
            W_App2["App Servers"]
        end
        
        subgraph W_Stamp3[Stamp 3]
            W_App3["App Servers"]
        end
        
        W_Shared[("Shared Database<br/><i>Single Point of Failure</i>")]
        
        W_Router --> W_App1
        W_Router --> W_App2
        W_Router --> W_App3
        W_App1 --> W_Shared
        W_App2 --> W_Shared
        W_App3 --> W_Shared
        
        W_Shared -."💥 Failure<br/>affects ALL<br/>stamps".-> W_Shared
    end

Sharing a database between stamps defeats the isolation benefit. In the correct architecture, a database failure affects only one stamp’s users (33% blast radius). With a shared database, all stamps fail simultaneously (100% blast radius), eliminating the primary value of stamps.


Math & Calculations

Stamp Capacity Planning

Determining the right size for each stamp requires calculating the bottleneck resource. Let’s work through a realistic example for a SaaS application.

Given Variables:

  • Database connection limit: 5,000 connections (PostgreSQL max_connections)
  • Average connections per active tenant: 25 connections (web servers + background workers)
  • Peak concurrent users per tenant: 50 users
  • Application server capacity: 100 requests/second per server
  • Target peak load per tenant: 10 requests/second
  • Desired headroom: 25% (operate at 75% max capacity)

Calculation 1: Database-Limited Capacity

Maximum tenants per stamp based on database connections:

Max tenants = (Connection limit × (1 - Headroom)) / Connections per tenant
Max tenants = (5,000 × 0.75) / 25
Max tenants = 3,750 / 25 = 150 tenants

Calculation 2: Application Server Capacity

Required application servers for 150 tenants at peak load:

Total peak RPS = Tenants × Peak RPS per tenant
Total peak RPS = 150 × 10 = 1,500 requests/second

Servers needed = Total peak RPS / Server capacity
Servers needed = 1,500 / 100 = 15 servers

With 25% headroom: 15 / 0.75 = 20 servers

Calculation 3: Stamp Cost Analysis

Monthly infrastructure cost per stamp:

Application servers: 20 × $200/month = $4,000
Database (primary + 2 replicas): 3 × $800/month = $2,400
Cache cluster (3 nodes): 3 × $150/month = $450
Load balancers: 2 × $100/month = $200
Storage (1TB): $100/month
Networking: $300/month

Total per stamp: $7,450/month
Cost per tenant: $7,450 / 150 = $49.67/month

Calculation 4: Scaling Decision Point

When to provision a new stamp:

Current utilization threshold: 75% of capacity
Trigger point = 150 Ă— 0.75 = 112 tenants (rounded down)

Provisioning lead time: 2 hours (automated deployment)
Safety buffer: 2 days (covers provisioning, validation, and sign-off)
Tenant growth rate: 5 tenants/day

Alert threshold = Trigger point - (Growth rate Ă— Safety buffer in days)
Alert threshold = 112 - (5 Ă— 2) = 102 tenants

Start provisioning new stamp when current stamp reaches 102 tenants
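The trigger and alert thresholds reduce to a one-line formula. This sketch uses the same illustrative growth rate, and the two-day buffer is an assumption of the example, not a standard:

```python
# Alert-threshold sketch for the scaling decision point above.
def alert_threshold(capacity: int, utilization_trigger: float,
                    growth_per_day: float, buffer_days: float) -> int:
    trigger = int(capacity * utilization_trigger)       # 150 * 0.75 -> 112
    return trigger - int(growth_per_day * buffer_days)  # 112 - 10 -> 102

print(alert_threshold(150, 0.75, 5, 2))  # 102
```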

Calculation 5: Geographic Distribution

For a global service with 10,000 total tenants:

Tenants per stamp: 150
Total stamps needed: 10,000 / 150 = 66.7 → 67 stamps (rounded up)

Geographic distribution (based on customer locations):
North America (50%): 67 Ă— 0.50 = 34 stamps
Europe (30%): 67 Ă— 0.30 = 20 stamps
Asia-Pacific (15%): 67 Ă— 0.15 = 10 stamps
Other regions (5%): 67 Ă— 0.05 = 3 stamps

Total infrastructure cost: 67 Ă— $7,450 = $499,150/month
Revenue needed (holding infrastructure to 40% of revenue): $499,150 / 0.40 = $1,247,875/month
Minimum ARPU required: $1,247,875 / 10,000 = $124.79/tenant/month
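Putting the fleet math together, a rough sizing sketch. All inputs are the example's assumptions, and the per-region rounding may need a final adjustment so regions sum to the stamp total:

```python
# Illustrative fleet-sizing sketch for the geographic distribution above.
import math

TENANTS, PER_STAMP, STAMP_COST = 10_000, 150, 7_450

stamps = math.ceil(TENANTS / PER_STAMP)   # 67 stamps
monthly_cost = stamps * STAMP_COST        # $499,150/month

# Regional split by customer location share (rounded per region).
regions = {"NA": 0.50, "EU": 0.30, "APAC": 0.15, "Other": 0.05}
split = {r: round(stamps * share) for r, share in regions.items()}

print(stamps, monthly_cost, split)
```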

Key Insight: The database connection limit is typically the bottleneck in stamp sizing. In this example, even though we could theoretically fit more tenants based on compute capacity, the database connections limit us to 150 tenants per stamp. This is why many companies invest in connection pooling (PgBouncer, ProxySQL) to increase effective tenant density per stamp.


Real-World Examples

Microsoft Azure (Azure Resource Manager)

Azure’s control plane uses deployment stamps extensively, with each stamp called a “scale unit” serving a subset of Azure subscriptions. Each scale unit contains a complete copy of the Azure Resource Manager service, including API servers, databases, and orchestration engines. When you create Azure resources, your subscription is assigned to a specific scale unit based on region and capacity. Microsoft runs hundreds of scale units globally, with each unit designed to handle approximately 50,000 active subscriptions. The interesting detail: Azure’s scale units are versioned, allowing them to run different versions of the control plane simultaneously during upgrades. This enables zero-downtime deployments—new subscriptions go to updated scale units while old ones remain on previous versions until migrated. When a scale unit experiences issues, only the subscriptions in that unit are affected, limiting blast radius to typically less than 1% of total Azure customers. This architecture enabled Azure to scale from thousands to millions of subscriptions without fundamental redesign.

Slack (Workspace Isolation)

Slack assigns each workspace (team) to a specific deployment stamp, which they call “shards.” Each shard is a complete Slack deployment including MySQL databases, Redis caches, application servers, and Elasticsearch clusters, deployed across multiple availability zones within a region. Large enterprise customers often get dedicated shards, while smaller teams share multi-tenant shards with capacity limits (typically 100-200 workspaces per shard). The routing happens at the workspace subdomain level—when you access yourcompany.slack.com, DNS and load balancers route you to the specific shard hosting your workspace. The interesting detail: Slack’s shards are sized based on message volume rather than user count, because some workspaces have thousands of users but low activity, while others have hundreds of users posting thousands of messages daily. They monitor message throughput per shard and rebalance by migrating workspaces between shards during maintenance windows. This stamp architecture allowed Slack to maintain 99.99% availability even as they scaled to millions of workspaces, because individual shard failures affect only a small percentage of customers.

GitHub (Regional Stamps for GitHub Actions)

GitHub Actions uses deployment stamps for its CI/CD runner infrastructure. Each stamp contains compute clusters, job queues, artifact storage, and orchestration services deployed in specific regions (US-East, US-West, EU-West, APAC). When you trigger a workflow, GitHub’s routing layer assigns it to a stamp based on your repository’s primary region and current stamp capacity. Each stamp is designed to handle approximately 50,000 concurrent jobs. The interesting detail: GitHub uses “ephemeral stamps” for large open-source projects that generate massive CI workloads. When the Linux kernel or Kubernetes projects trigger builds, GitHub spins up dedicated temporary stamps with additional capacity, runs the workload, and tears down the stamp afterward. This prevents large OSS projects from overwhelming shared stamps and affecting smaller projects. The stamp architecture also enables GitHub to offer region-specific runners for customers with data residency requirements—your CI jobs and artifacts never leave your specified region because they’re processed entirely within that region’s stamp.

Stripe (Payment Processing Stamps)

Stripe’s payment processing infrastructure uses deployment stamps with active-active replication for high availability. Each stamp contains API servers, payment processing engines, fraud detection systems, and databases deployed across multiple availability zones. Stripe runs stamps in every major region (US, EU, APAC, etc.) and assigns merchants to stamps based on their primary business location. The interesting detail: Stripe replicates critical payment state across multiple stamps using consensus protocols (similar to Raft), so if a merchant’s primary stamp fails, their payment requests can be processed by a secondary stamp without data loss. This is more complex than typical stamp isolation but necessary for payment processing, where even seconds of downtime means lost revenue. Stripe’s routing layer uses GeoDNS to direct merchants to their nearest stamp for low latency, but can automatically fail over to distant stamps if the primary is unhealthy. This architecture enabled Stripe to achieve 99.999% availability for payment processing while handling massive global transaction volumes, because stamp failures are isolated and automatically recovered through replication.

Tenant Migration Process Between Stamps

sequenceDiagram
    participant Admin as Migration Tool
    participant Registry as Routing Registry
    participant Source as Source Stamp
    participant Dest as Destination Stamp
    participant Router as Global Router
    
    Admin->>Source: 1. Take snapshot of<br/>Tenant X data
    Source-->>Admin: Snapshot complete
    
    Admin->>Dest: 2. Import snapshot to<br/>destination stamp
    Dest-->>Admin: Import complete
    
    Admin->>Registry: 3. Enable dual-write mode<br/>for Tenant X
    Note over Source,Dest: Both stamps receive writes<br/>Reads still from source
    
    loop Dual-Write Period (24-48 hours)
        Router->>Source: Write operations
        Router->>Dest: Write operations (replicated)
        Router->>Source: Read operations
    end
    
    Admin->>Dest: 4. Validate data consistency<br/>(checksums, row counts)
    Dest-->>Admin: âś“ Validation passed
    
    Admin->>Registry: 5. Update routing:<br/>Tenant X → Destination Stamp
    Registry-->>Router: Routing updated
    
    Note over Router,Dest: Traffic now flows to<br/>destination stamp
    
    Admin->>Source: 6. Monitor for 24 hours
    
    alt Migration Successful
        Admin->>Source: 7. Delete Tenant X data
        Source-->>Admin: Cleanup complete
    else Errors Detected
        Admin->>Registry: Rollback: Tenant X → Source
        Registry-->>Router: Routing reverted
    end

Zero-downtime tenant migration uses dual-write mode where both stamps receive updates while reads come from the source. After validation, routing is updated to the destination stamp. The process includes rollback capability if errors are detected during the monitoring period.
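The runbook in the diagram can be sketched as orchestration code. The `Stamp` and `Registry` classes below are in-memory stand-ins for illustration, not a real control plane API; a production tool would make network calls and persist migration state:

```python
# Minimal, self-contained sketch of the dual-write migration flow above.
class Stamp:
    def __init__(self, name):
        self.name, self.data = name, {}
    def snapshot(self, tenant):
        return dict(self.data.get(tenant, {}))
    def import_snapshot(self, tenant, snap):
        self.data[tenant] = dict(snap)

class Registry:
    def __init__(self):
        self.routes, self.dual = {}, set()
    def enable_dual_write(self, tenant):
        self.dual.add(tenant)
    def route(self, tenant, stamp):
        self.routes[tenant] = stamp.name
        self.dual.discard(tenant)

def migrate_tenant(tenant, registry, source, dest, validate):
    snap = source.snapshot(tenant)             # 1. export tenant data
    dest.import_snapshot(tenant, snap)         # 2. bulk import into destination
    registry.enable_dual_write(tenant)         # 3. both stamps receive writes
    if not validate(source, dest, tenant):     # 4. checksums / row counts
        registry.route(tenant, source)         # rollback: revert routing to source
        return False
    registry.route(tenant, dest)               # 5. cutover to destination
    return True                                # 6-7. soak + cleanup happen out of band

src, dst, reg = Stamp("stamp-a"), Stamp("stamp-b"), Registry()
src.data["acme"] = {"rows": 42}
ok = migrate_tenant("acme", reg, src, dst,
                    lambda s, d, t: s.snapshot(t) == d.snapshot(t))
print(ok, reg.routes["acme"])
```

The rollback path mirrors the `alt` branch in the diagram: any validation failure reverts routing to the source stamp before the cutover ever happens.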


Interview Expectations

Mid-Level

What you should know: Explain deployment stamps as independent copies of your application stack that serve different subsets of users. Describe the basic architecture: multiple stamps, a global router, and tenant-to-stamp assignment. Understand the primary benefit is blast radius containment—when one stamp fails, only its users are affected. Be able to discuss when stamps make sense (multi-tenant SaaS, global applications) versus when they’re overkill (single-tenant applications, small user bases). Explain the trade-off between operational complexity (managing N stamps) and availability benefits.

Bonus points: Discuss specific routing strategies (GeoDNS, API gateway with lookup, embedded tokens). Mention capacity planning—how you determine stamp size and when to provision new stamps. Describe how you’d handle stamp failures (monitoring, alerting, failover). Reference real companies using stamps (Azure, Slack, Stripe) and explain why they chose this pattern. Discuss the difference between stamps and traditional sharding—stamps replicate the entire stack, while sharding typically splits just the database.

Senior

What you should know: Design a complete stamp architecture including routing layer, stamp provisioning, tenant assignment, and failure handling. Explain the different stamp variants (geographic, tenant-based, hybrid) and when to use each. Discuss operational challenges: configuration management, deployment orchestration, monitoring across stamps, and incident response. Analyze the cost implications—stamps are more expensive than monolithic deployments due to resource duplication, but provide better isolation. Explain how to handle cross-stamp operations like global search or analytics without violating isolation principles. Discuss migration strategies for moving tenants between stamps.

Bonus points: Quantify stamp capacity with calculations (database connections, compute capacity, storage). Discuss advanced routing strategies like weighted routing for gradual migrations or A/B testing. Explain how to implement zero-downtime deployments across stamps (blue-green per stamp, canary rollouts). Describe observability strategies for stamp fleets (aggregated metrics, per-stamp drill-down, anomaly detection). Discuss security implications—stamps provide natural security boundaries for tenant isolation. Explain how stamps interact with other patterns like circuit breakers, rate limiting, and chaos engineering. Mention specific technologies (Terraform for IaC, Spinnaker for orchestration, Prometheus for monitoring).

Staff+

What you should know: Architect stamp systems that balance isolation, cost, and operational complexity for specific business contexts. Discuss the evolution from monolithic to stamp-based architectures—when to make the transition, how to migrate incrementally, and how to avoid premature optimization. Analyze second-order effects: how stamps affect development velocity (testing becomes more complex), organizational structure (who owns stamp operations?), and business model (can you offer tiered pricing based on stamp isolation?). Design hybrid approaches that combine stamps with other patterns (multi-region active-active, edge computing, serverless). Discuss the long-term operational model—how to maintain consistency across hundreds of stamps over years.

Distinguishing signals: Propose novel stamp architectures for specific domains (IoT, gaming, financial services) with domain-specific constraints. Discuss the organizational and process changes needed to operate stamps successfully—this isn’t just a technical pattern, it requires changes to deployment processes, on-call rotations, and incident response. Analyze when stamps are the wrong choice and propose alternatives (better sharding, improved caching, vertical scaling). Discuss the economics of stamps at scale—at what point does the operational overhead outweigh the availability benefits? Explain how emerging technologies (Kubernetes, service mesh, edge computing) change the stamp calculus. Share specific war stories about stamp migrations, failures, or scaling challenges and the lessons learned.

Common Interview Questions

Q1: When would you choose deployment stamps over traditional database sharding?

60-second answer: Choose stamps when you need complete isolation of the entire application stack, not just the database. Stamps provide blast radius containment for all components (app servers, caches, queues), while sharding only isolates data. Use stamps for multi-tenant SaaS where tenant isolation matters for security, compliance, or SLA guarantees. Use sharding when you’re scaling a single-tenant system and just need to distribute data.

2-minute answer: The key difference is scope of isolation. Database sharding splits your data across multiple databases but typically shares application servers, caches, and other infrastructure. This is efficient but means a bug in the application code or a cache failure affects all shards. Deployment stamps replicate the entire stack, so a bug or failure in one stamp doesn’t affect others. Choose stamps when: (1) You’re building multi-tenant SaaS and need strong tenant isolation for security or compliance, (2) You need to offer different SLA tiers (enterprise customers get dedicated stamps), (3) You have geographic data residency requirements, or (4) Your blast radius from failures is unacceptably large. Choose sharding when you’re scaling a single-tenant system or when operational complexity of managing multiple stamps outweighs the isolation benefits. Many systems use both—stamps for tenant isolation and sharding within each stamp for data scaling.

Red flags: Saying stamps and sharding are the same thing. Not mentioning operational complexity. Claiming stamps are always better without discussing trade-offs.

Q2: How do you handle cross-stamp operations like global search or analytics?

60-second answer: Use eventual consistency and data replication. Each stamp streams events or data changes to a central analytics store (data warehouse, Elasticsearch cluster) that aggregates data across all stamps. Global search queries hit this central store, not individual stamps. Accept that results might be slightly stale (seconds to minutes) rather than real-time.

2-minute answer: Cross-stamp operations violate the isolation principle, so you need to design them carefully. The most common approach is event streaming: each stamp publishes events (user actions, data changes) to a central message bus (Kafka, Kinesis) that feeds a global analytics store. This store is read-only and eventually consistent—it might lag behind stamps by seconds or minutes. Global search, reporting, and admin dashboards query this central store, not individual stamps. For operations that need real-time data (like admin tools to look up a specific user), implement a scatter-gather pattern: the router queries all stamps in parallel, aggregates results, and returns them. This is expensive so you rate-limit these operations. For features that absolutely need cross-stamp real-time consistency (rare), you might need to rethink whether stamps are the right pattern. Example: Slack’s global search indexes messages from all shards into a central Elasticsearch cluster with 1-2 minute lag, which is acceptable for search use cases.

Red flags: Suggesting synchronous queries across all stamps for every request. Not mentioning eventual consistency. Proposing shared databases between stamps.
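The scatter-gather fallback described in the answer above might look like this in outline. The per-stamp query here is just a callable over toy data; real stamps would be HTTP calls with per-stamp timeouts and rate limiting:

```python
# Hedged scatter-gather sketch for real-time cross-stamp lookups (admin tools).
from concurrent.futures import ThreadPoolExecutor

def scatter_gather(stamps, query, timeout=2.0):
    """Fan a query out to every stamp in parallel and merge the results."""
    with ThreadPoolExecutor(max_workers=max(1, len(stamps))) as pool:
        futures = [pool.submit(query, s) for s in stamps]
        results = []
        for fut in futures:
            try:
                results.extend(fut.result(timeout=timeout))
            except Exception:
                pass  # a slow or failed stamp degrades results, not the whole call
        return results

# toy stamps: each "stamp" is just a list of user records here
stamps = [[{"user": "a"}], [{"user": "b"}], []]
print(scatter_gather(stamps, lambda s: s))
```

Swallowing per-stamp failures is deliberate: partial results are usually acceptable for admin lookups, whereas one unhealthy stamp failing the entire query would reintroduce a shared failure mode.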

Q3: How do you handle tenant migration between stamps?

60-second answer: Build migration tooling that: (1) Exports tenant data from source stamp, (2) Imports to destination stamp, (3) Runs in dual-write mode where both stamps receive updates, (4) Validates data consistency, (5) Updates routing to point to new stamp, (6) Monitors for errors and can rollback. The migration happens during a maintenance window or gradually with zero downtime using dual-write.

2-minute answer: Tenant migration is complex but necessary for rebalancing or decommissioning stamps. The process: First, identify the tenant to migrate and provision capacity in the destination stamp. Second, take a snapshot of the tenant’s data in the source stamp and import it to the destination stamp—this is the bulk transfer. Third, enter dual-write mode where the application writes to both stamps simultaneously while reads still come from the source. This ensures no data loss during migration. Fourth, validate that destination stamp has complete, consistent data by comparing checksums or running validation queries. Fifth, update the routing registry to point the tenant to the destination stamp—this is the cutover. Sixth, monitor the tenant’s traffic on the new stamp for errors or performance issues. If problems occur, rollback by updating routing to point back to source. Finally, after a soak period (24-48 hours), delete the tenant’s data from the source stamp. For zero-downtime migrations, the dual-write period might last hours or days. For acceptable downtime, you can skip dual-write and just have a brief outage during cutover. Example: Dropbox’s stamp migration tool runs dual-write for 24 hours, validates 100% data consistency, then cuts over during low-traffic hours with automatic rollback if error rates spike.

Red flags: Not mentioning dual-write or validation steps. Suggesting migrations are simple. Not discussing rollback procedures.

Q4: What’s the right size for a deployment stamp?

60-second answer: Size stamps based on your recovery time objective (RTO). If you can rebuild or restore a stamp in 1-2 hours, that’s your blast radius during failures. Calculate capacity based on your bottleneck resource (usually database connections or storage). A common pattern is 100-500 tenants per stamp for B2B SaaS, but it varies widely based on tenant size and activity.

2-minute answer: Stamp sizing involves multiple factors. First, calculate your bottleneck resource—usually database connections, storage capacity, or compute throughput. For example, if your database supports 5,000 connections and each tenant uses 25 connections, you can fit 200 tenants per stamp before headroom (150 with a 25% reserve). Second, consider recovery time—if a stamp fails, how long to restore it? Larger stamps take longer to restore from backups or rebuild. A good rule of thumb: size stamps so they can be restored in 1-2 hours, which is your effective RTO. Third, consider blast radius—how many customers can you afford to impact during a stamp failure? If 5% is acceptable, you need at least 20 stamps. Fourth, factor in cost—smaller stamps have more overhead (more load balancers, more management endpoints) but better isolation. The sweet spot for many B2B SaaS companies is 100-500 tenants per stamp, but high-volume B2C might have stamps serving millions of users. Start with conservative sizing and adjust based on operational experience. Monitor your bottleneck metrics (database connections, CPU, storage) and provision new stamps when you hit 70-75% of capacity.

Red flags: Giving a specific number without context. Not mentioning recovery time or blast radius. Ignoring the bottleneck resource.
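The sizing reasoning above—bottleneck capacity plus a blast-radius floor—condenses into a quick check. The inputs are illustrative, not recommendations:

```python
# Minimum stamp count: the larger of the capacity bound and the blast-radius bound.
import math

def min_stamps(total_tenants: int, tenants_per_stamp: int,
               max_blast_radius: float) -> int:
    by_capacity = math.ceil(total_tenants / tenants_per_stamp)
    by_blast = math.ceil(1 / max_blast_radius)  # e.g. 5% max impact -> >= 20 stamps
    return max(by_capacity, by_blast)

print(min_stamps(10_000, 150, 0.05))  # 67: capacity dominates at this scale
print(min_stamps(1_000, 150, 0.05))   # 20: blast radius dominates early on
```

Early in a company's life the blast-radius floor often dominates (few tenants, but each outage still can't hit more than a few percent of them); at scale the capacity bound takes over.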

Q5: How does the routing layer work in a stamp architecture?

60-second answer: The router maintains a mapping of tenants/users to stamps in a highly available database. When a request arrives, the router extracts the tenant ID (from subdomain, header, or token), looks up which stamp serves that tenant, and proxies the request there. The lookup is cached aggressively to avoid latency. The router is typically a global load balancer or API gateway.

2-minute answer: The routing layer has several components. First, a routing registry—a globally replicated, highly available database (DynamoDB, Cosmos DB, Spanner) that stores tenant-to-stamp mappings. This registry is updated when tenants are assigned to stamps or migrated. Second, the router service itself—this can be a global load balancer (AWS Global Accelerator, Azure Front Door), API gateway (Kong, Ambassador), or custom proxy service. When a request arrives, the router extracts the tenant identifier (from subdomain like acme.yourapp.com, from a header, or from a JWT token), queries the routing registry for the stamp assignment, and proxies the request to that stamp’s load balancer. Third, aggressive caching—routing decisions are cached at multiple layers (CDN edge, router memory, client tokens) with TTLs of 30-300 seconds to avoid database lookups on every request. Fourth, health checking—the router monitors stamp health and stops routing to unhealthy stamps. Fifth, fallback logic—if the primary routing registry is unavailable, the router can use cached mappings or fallback to a secondary registry. The router is your most critical component—it must be over-provisioned, globally distributed, and have multiple layers of redundancy. Example: Stripe embeds stamp routing information in JWT tokens issued at authentication, so most requests don’t require a routing lookup at all—the token says “this user belongs to stamp-us-east-1” and the router trusts that.

Red flags: Not mentioning caching. Suggesting the router queries the database on every request. Not discussing router availability and redundancy.
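A minimal sketch of the cached routing lookup described in the answer above. `registry_lookup` stands in for the globally replicated registry, and the TTL mirrors the 30-300 second range mentioned; a production router would also need negative caching and health-aware fallback:

```python
# Routing lookup with a short-TTL in-memory cache, per the description above.
import time

class CachedRouter:
    def __init__(self, registry_lookup, ttl: float = 60.0):
        self.lookup, self.ttl, self.cache = registry_lookup, ttl, {}

    def stamp_for(self, tenant_id: str) -> str:
        hit = self.cache.get(tenant_id)
        if hit is not None and time.monotonic() - hit[1] < self.ttl:
            return hit[0]                        # fresh cached decision
        stamp = self.lookup(tenant_id)           # registry query on miss/expiry
        self.cache[tenant_id] = (stamp, time.monotonic())
        return stamp

router = CachedRouter(lambda tenant: "stamp-us-east-1")
print(router.stamp_for("acme"))  # stamp-us-east-1
```

The short TTL is the key trade-off: long enough that most requests skip the registry, short enough that a tenant migration's routing change propagates within seconds.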

Red Flags to Avoid

Red Flag 1: “Stamps are just horizontal scaling”

Why it’s wrong: Horizontal scaling typically means adding more instances of the same service behind a load balancer—you’re scaling one deployment. Stamps are fundamentally different: you’re replicating the entire application stack into multiple independent deployments. Horizontal scaling shares infrastructure (same database, same cache), while stamps isolate everything. The operational model is completely different—horizontal scaling is about adding capacity to one system, stamps are about managing multiple independent systems.

What to say instead: “Stamps are about replicating the entire stack for isolation, not just adding capacity. Unlike horizontal scaling where you add servers to one deployment, stamps create multiple complete, independent deployments. This provides blast radius containment and tenant isolation that horizontal scaling can’t achieve, but at the cost of operational complexity.”

Red Flag 2: “We’ll share the database between stamps for efficiency”

Why it’s wrong: Sharing a database between stamps defeats the primary purpose of stamps—isolation and blast radius containment. If stamps share a database, a database failure or slow query affects all stamps simultaneously. You’ve added all the operational complexity of managing multiple stamps without getting the availability benefits. Shared databases also become scaling bottlenecks and single points of failure.

What to say instead: “Each stamp must have its own database to maintain isolation. Yes, this means data duplication and higher costs, but that’s the trade-off for blast radius containment. If cost is a concern, we might need to reconsider whether stamps are the right pattern, or we could use larger stamps to reduce overhead. The only acceptable sharing is read-only reference data that doesn’t change frequently.”

Red Flag 3: “We’ll migrate tenants between stamps automatically based on load”

Why it’s wrong: Automatic tenant migration sounds elegant but is extremely complex and risky. Migrations involve data movement, dual-write periods, validation, and potential rollback—all of which can fail. Doing this automatically without human oversight is a recipe for data loss or corruption. Most companies that use stamps keep tenant assignments static and only migrate manually during planned maintenance windows.

What to say instead: “Tenant migration should be a manual, carefully orchestrated process, not automatic. We’ll build migration tooling and test it thoroughly, but migrations will be triggered by humans during maintenance windows. We’ll design stamps with enough capacity headroom that we rarely need to migrate for load balancing. Automatic migration adds significant complexity and risk that usually isn’t worth it.”

Red Flag 4: “Stamps are too complex; we’ll just scale vertically”

Why it’s wrong: While it’s true that stamps add operational complexity, dismissing them entirely means you’re accepting that a single failure can take down your entire service. For multi-tenant SaaS, especially B2B where customers expect high availability and data isolation, the complexity is often justified. Vertical scaling has hard limits—eventually you can’t buy a bigger server. The question isn’t whether stamps are complex, but whether the complexity is justified for your availability and isolation requirements.

What to say instead: “Stamps do add operational complexity, so we need to evaluate whether that’s justified for our requirements. For a B2B SaaS with enterprise customers expecting 99.99% uptime and data isolation, stamps make sense despite the complexity. For a B2C app with less stringent requirements, we might start with a simpler architecture and adopt stamps later when we’ve proven the need. The key is matching the architecture to the business requirements, not avoiding complexity for its own sake.”

Red Flag 5: “We’ll use microservices instead of stamps for isolation”

Why it’s wrong: Microservices and stamps solve different problems. Microservices decompose your application into smaller services for development velocity and team autonomy—they’re about organizational scaling. Stamps replicate your entire application for availability and tenant isolation—they’re about operational scaling. You can have microservices within each stamp, or monoliths within each stamp. They’re orthogonal concerns, not alternatives.

What to say instead: “Microservices and stamps address different concerns. Microservices help with development and team scaling by decomposing the application. Stamps help with availability and tenant isolation by replicating the entire deployment. We could have a microservices architecture where each stamp contains all the microservices, or a monolithic architecture where each stamp is one monolith. The choice of microservices vs. monolith is separate from the choice of stamps vs. single deployment.”


Key Takeaways

  • Deployment stamps replicate your entire application stack into multiple independent copies (stamps), each serving a bounded subset of users or tenants. This provides blast radius containment—when one stamp fails, only its assigned users are affected, not your entire customer base.

  • The routing layer is your most critical component and must be over-provisioned and highly available. It maintains tenant-to-stamp mappings and directs each request to the correct stamp. Use managed global load balancers or API gateways rather than custom solutions, and cache routing decisions aggressively.

  • Stamps trade operational complexity for availability and isolation. Managing N identical deployments is harder than managing one deployment, but the isolation benefits are significant for multi-tenant SaaS. Only adopt stamps when the availability and isolation requirements justify the operational overhead.

  • Size stamps based on recovery time, not just capacity. If a stamp fails, you need to restore it quickly. Design stamps that can be rebuilt or restored in 1-2 hours, which typically means 100-500 tenants per stamp for B2B SaaS. Your bottleneck resource (usually database connections) determines maximum capacity.

  • Complete isolation is non-negotiable—no shared databases or services between stamps except the routing layer. Any shared infrastructure defeats the blast radius containment benefit and creates single points of failure. Use event streaming and eventual consistency for cross-stamp operations like analytics rather than shared databases.

Prerequisites: Load Balancing • Database Sharding • Multi-Tenancy Patterns • Infrastructure as Code

Related Patterns: Circuit Breakers • Bulkhead Pattern • Chaos Engineering • Blue-Green Deployment

Next Steps: Geographic Distribution • Multi-Region Active-Active • Disaster Recovery • Capacity Planning