Database Replication: Master-Slave & Multi-Master
After this topic, you will be able to:
- Compare master-slave vs master-master replication trade-offs
- Evaluate replication strategies for read scalability and fault tolerance
- Assess replication lag implications for application consistency requirements
TL;DR
Replication copies database data across multiple servers to improve read scalability, fault tolerance, and availability. Master-slave replication separates reads from writes by directing writes to a master and reads to replicas, while master-master allows writes on multiple nodes with conflict resolution. The fundamental trade-off is between consistency (synchronous replication) and performance/availability (asynchronous replication), with replication lag being the key operational challenge.
Cheat Sheet: Master-slave = writes to one, reads from many (Instagram’s feed). Master-master = writes to multiple with conflict resolution (rare in RDBMS). Async replication = fast but eventual consistency. Sync replication = consistent but slower. Replication lag = time for replica to catch up to master.
The Problem It Solves
Single-database architectures hit three critical walls as systems scale. First, read traffic overwhelms the database—when Instagram users scroll their feeds, millions of SELECT queries hammer the same database, creating a read bottleneck that no amount of connection pooling can solve. Second, hardware failures become inevitable at scale, and a single database represents a catastrophic single point of failure. When that database goes down, your entire application stops serving traffic. Third, geographic distribution becomes impractical—users in Singapore experience 200ms latency querying a database in Virginia, making real-time applications feel sluggish.
The traditional solution of vertical scaling (bigger servers) hits economic and physical limits quickly. A database server with 1TB of RAM and 96 cores costs disproportionately more than four servers with 256GB and 24 cores each, yet still provides no redundancy. Replication solves these problems by maintaining synchronized copies of your data across multiple servers, enabling horizontal read scaling, automatic failover, and geographic distribution without application-level complexity.
Solution Overview
Replication maintains multiple synchronized copies of your database across different servers. One or more servers accept writes (masters), while other servers (replicas or slaves) receive a continuous stream of changes from the master and apply them locally. Applications can then distribute read queries across replicas, multiplying read capacity linearly with each replica added.
The replication process works through a transaction log—every write operation (INSERT, UPDATE, DELETE) is recorded in the master’s write-ahead log (WAL in PostgreSQL, binlog in MySQL). Replicas continuously pull or receive these log entries and replay them against their local data, maintaining an eventually-consistent copy. This architecture separates the write path (master only) from the read path (master plus all replicas), allowing you to scale reads independently of writes.
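The log-shipping mechanism can be sketched in a few lines. This is a deliberately simplified in-memory model (the `Master` and `Replica` classes and their fields are hypothetical, not a real database API), but it shows the two essentials: every write is recorded in an ordered log before being applied, and a replica tracks the last log sequence number (LSN) it has replayed.

```python
# Minimal in-memory sketch of log-shipping replication (illustrative only).
# Each log entry carries a log sequence number (LSN) that fixes ordering.

class Master:
    def __init__(self):
        self.data = {}
        self.log = []          # transaction log: list of (lsn, key, value)
        self.next_lsn = 1

    def write(self, key, value):
        # Record the change in the log, then apply it locally (write-ahead).
        self.log.append((self.next_lsn, key, value))
        self.data[key] = value
        self.next_lsn += 1

class Replica:
    def __init__(self):
        self.data = {}
        self.applied_lsn = 0   # replication position: last LSN applied

    def catch_up(self, master):
        # Pull entries past our position and replay them in LSN order.
        for lsn, key, value in master.log:
            if lsn > self.applied_lsn:
                self.data[key] = value
                self.applied_lsn = lsn

master = Master()
master.write("user:12345:last_login", "2024-01-01T00:00:00Z")

replica = Replica()
replica.catch_up(master)
print(replica.data["user:12345:last_login"])  # same value as on the master
```

Because replay is driven purely by the LSN position, a replica that disconnects can reconnect later and resume from `applied_lsn`—which is exactly how real replicas recover after a network blip.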
The critical design choice is synchronous versus asynchronous replication. Synchronous replication waits for replicas to acknowledge each write before confirming success to the application, guaranteeing consistency but adding latency. Asynchronous replication confirms writes immediately and replicates in the background, providing better performance but introducing replication lag—the time window where replicas are behind the master. Most production systems use asynchronous replication for performance, accepting eventual consistency as a trade-off.
Master-Slave Replication Architecture
graph LR
subgraph Application Layer
App["Application<br/><i>Web Server</i>"]
end
subgraph Database Layer
Master["Master DB<br/><i>Accepts Writes</i>"]
Replica1["Replica 1<br/><i>Read Only</i>"]
Replica2["Replica 2<br/><i>Read Only</i>"]
Replica3["Replica 3<br/><i>Read Only</i>"]
end
App --"1. Write Queries<br/>(INSERT/UPDATE/DELETE)"--> Master
App --"2. Read Queries<br/>(SELECT)"--> Replica1
App --"2. Read Queries<br/>(SELECT)"--> Replica2
App --"2. Read Queries<br/>(SELECT)"--> Replica3
Master --"3. Transaction Log<br/>(Binlog/WAL)"--> Replica1
Master --"3. Transaction Log<br/>(Binlog/WAL)"--> Replica2
Master --"3. Transaction Log<br/>(Binlog/WAL)"--> Replica3
Master-slave replication separates write and read paths. All writes go to the master, which replicates changes via transaction logs to read replicas. This architecture provides 4x read capacity (1 master + 3 replicas) while maintaining a single source of truth for writes.
How It Works
Step 1: Master Receives Write
When an application executes UPDATE users SET last_login = NOW() WHERE id = 12345, the master database processes this transaction. Before confirming success, it writes the operation to its transaction log (binlog in MySQL, WAL in PostgreSQL). This log entry contains the exact change: table name, affected rows, old values, new values, and a log sequence number (LSN) that establishes ordering.
Step 2: Log Transmission
Replicas maintain a persistent connection to the master and continuously request new log entries. In asynchronous replication, the master immediately returns success to the application after writing to its own log, then transmits log entries to replicas in the background. In synchronous replication, the master waits for at least one replica to acknowledge receiving and persisting the log entry before confirming success. Semi-synchronous replication (available in MySQL via a plugin, though not its default) waits for acknowledgment that the replica received the entry, but not that it applied it.
Step 3: Replica Applies Changes
Each replica maintains a replication position—the LSN of the last log entry it applied. The replica reads incoming log entries sequentially and replays them against its local data. For the example above, the replica executes the same UPDATE statement. This replay is single-threaded in many configurations (though MySQL 5.7+ supports multi-threaded replica appliers, and PostgreSQL 16+ supports parallel apply for logical replication), which is why replicas can lag behind a high-write-volume master.
Step 4: Read Distribution
Applications connect to replicas for read queries. A connection pool might maintain connections to one master and three replicas, routing all writes to the master and distributing reads across all four servers. This provides 4x read capacity. However, the application must handle replication lag—a user who updates their profile might immediately query a replica that hasn’t received the change yet, seeing stale data.
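The routing described in this step can be sketched as a small application-side router. The class and server names here are hypothetical placeholders (real deployments usually delegate this to a proxy such as ProxySQL or PgBouncer plus a router), but the split is the same: writes always hit the master, reads round-robin across the pool.

```python
import itertools

# Illustrative read/write router (names are hypothetical): writes always go
# to the master; reads round-robin across the master plus its replicas.
class ReplicationRouter:
    def __init__(self, master, replicas):
        self.master = master
        self._read_pool = itertools.cycle([master] + replicas)

    def route(self, sql):
        # Crude classification: only SELECTs may be served from the read pool.
        if sql.lstrip().upper().startswith("SELECT"):
            return next(self._read_pool)
        return self.master

router = ReplicationRouter("master-db", ["replica-1", "replica-2", "replica-3"])
print(router.route("UPDATE users SET last_login = NOW() WHERE id = 12345"))  # master-db
print(router.route("SELECT * FROM users WHERE id = 12345"))  # master-db (first slot of the read cycle)
```

Real routers also need to handle transactions (everything inside a transaction must go to the master) and the replication-lag caveat described above—naive SQL-prefix sniffing like this is the minimum viable version.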
Step 5: Failover (When Master Dies)
When the master fails, a replica must be promoted to master. In manual failover, an operator identifies the replica with the most recent data (highest LSN), promotes it, and updates application configuration to point to the new master. In automatic failover (using tools like Patroni for PostgreSQL or Orchestrator for MySQL), a consensus system detects master failure, promotes the most up-to-date replica, and updates DNS or a proxy layer. The critical challenge is split-brain—ensuring the old master doesn’t come back online and accept writes, creating divergent data.
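The promotion logic reduces to two ordered steps: fence first, promote second. The sketch below is hypothetical (real tools like Patroni coordinate this through a consensus store), but it captures why the ordering matters—promoting before fencing is how split-brain happens.

```python
# Hypothetical failover sketch: fence the failed master so it can never
# accept writes again, then promote the most up-to-date replica (highest LSN).

def failover(replicas, old_master):
    # Step 1: fence the old master FIRST to prevent split-brain.
    old_master["writable"] = False
    # Step 2: promote the replica that has applied the most log entries,
    # minimizing data loss from unreplicated writes.
    new_master = max(replicas, key=lambda r: r["applied_lsn"])
    new_master["writable"] = True
    return new_master

old_master = {"name": "db-0", "writable": True}
replicas = [
    {"name": "db-1", "applied_lsn": 998, "writable": False},
    {"name": "db-2", "applied_lsn": 1000, "writable": False},  # most current
    {"name": "db-3", "applied_lsn": 995, "writable": False},
]
promoted = failover(replicas, old_master)
print(promoted["name"])  # db-2
```

In asynchronous replication even the best replica may trail the failed master (here, anything the master logged past LSN 1000 is lost)—which is the data-loss window the trade-offs section below quantifies.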
Replication Flow: Write to Read
sequenceDiagram
participant App as Application
participant Master as Master DB
participant Log as Transaction Log<br/>(Binlog/WAL)
participant Replica as Replica DB
participant User as User Query
App->>Master: 1. UPDATE users SET last_login=NOW()<br/>WHERE id=12345
Master->>Log: 2. Write to transaction log<br/>(LSN: 1000)
Master->>App: 3. Confirm success<br/>(async mode)
Note over Master,Replica: Asynchronous Replication
Log->>Replica: 4. Stream log entry<br/>(LSN: 1000)
Replica->>Replica: 5. Apply UPDATE locally<br/>(single-threaded replay)
Note over Replica: Replication Lag: 50-200ms
User->>Replica: 6. SELECT * FROM users<br/>WHERE id=12345
Replica->>User: 7. Return data<br/>(may be stale if lag exists)
The replication process shows how writes flow from application to master, get logged, and asynchronously replicate to replicas. The key challenge is replication lag—the time between step 3 (write confirmed) and step 5 (replica applies change), during which reads may return stale data.
Automatic Failover Process
stateDiagram-v2
[*] --> NormalOperation: System starts
NormalOperation: Master accepts writes<br/>Replicas serve reads
NormalOperation --> MasterFailure: Master crashes/network partition
MasterFailure: Consensus system detects failure<br/>(3-5 second timeout)
MasterFailure --> ReplicaSelection: Health check fails
ReplicaSelection: Select replica with highest LSN<br/>(most up-to-date data)
ReplicaSelection --> Promotion: Replica chosen
Promotion: Promote replica to master<br/>Enable write mode<br/>Fence old master (prevent split-brain)
Promotion --> ConfigUpdate: Promotion complete
ConfigUpdate: Update DNS/proxy<br/>Point apps to new master<br/>Reconfigure remaining replicas
ConfigUpdate --> NormalOperation: Failover complete (10-30 sec)
state MasterFailure {
[*] --> DetectFailure
DetectFailure --> VerifyFailure: Heartbeat timeout
VerifyFailure --> ConsensusReached: Multiple nodes agree
}
state Promotion {
[*] --> StopOldMaster
StopOldMaster --> EnableWrites: Fencing complete
EnableWrites --> [*]
}
Automatic failover detects master failure through consensus, promotes the most up-to-date replica, and updates routing configuration. The critical challenge is preventing split-brain by fencing the old master. Total failover time is typically 10-30 seconds, during which writes are unavailable.
Variants
Master-Slave (Primary-Replica) Replication
One master accepts all writes, multiple replicas serve reads. This is the most common pattern, used by Instagram for their feed database, Dropbox for metadata storage, and virtually every high-traffic web application. Writes are serialized through the master, providing strong consistency for write operations. Reads scale horizontally by adding replicas.
When to use: Read-heavy workloads where reads outnumber writes 10:1 or more. Social media feeds, content delivery, e-commerce product catalogs.
Pros: Simple to reason about, no write conflicts, proven at massive scale.
Cons: Master is a write bottleneck and single point of failure. Replication lag affects read consistency.
Master-Master (Multi-Master) Replication
Multiple masters accept writes simultaneously, replicating changes to each other. Each master can serve both reads and writes. When two masters receive conflicting writes (updating the same row), conflict resolution determines which write wins—typically last-write-wins based on timestamp, or application-defined resolution logic.
When to use: Rare in RDBMS due to conflict complexity. Useful for geographically distributed writes where network partitions are common, or when write availability is more critical than consistency. Some CRM systems use this for offline-capable sales tools.
Pros: No single write bottleneck, survives master failures without promotion, enables multi-region writes.
Cons: Complex conflict resolution, potential data loss from conflicts, difficult to reason about consistency. Most teams avoid this pattern in RDBMS, preferring it for NoSQL databases designed for eventual consistency.
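To make the data-loss hazard concrete, here is a minimal last-write-wins (LWW) resolver—one common conflict-resolution strategy, sketched with made-up values. Note that the "losing" write vanishes silently, which is exactly why the pattern is treated with caution in RDBMS.

```python
# Last-write-wins (LWW) conflict resolution, as used by some multi-master
# systems: each write carries a timestamp, and the newest timestamp wins.
# Hazard: the losing write disappears without any error being raised.

def resolve_lww(local, remote):
    # Each value is (payload, timestamp); ties broken on payload so the
    # resolution is deterministic on every node.
    return max(local, remote, key=lambda v: (v[1], v[0]))

# Two masters accept conflicting updates to the same row concurrently:
write_on_master_a = ("alice@new.example", 1700000005)
write_on_master_b = ("alice@stale.example", 1700000003)

winner = resolve_lww(write_on_master_a, write_on_master_b)
print(winner[0])  # alice@new.example -- master B's write is silently lost
```

Deterministic tie-breaking matters: if two nodes resolved the same conflict differently, their data would diverge permanently even after replication caught up.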
Synchronous vs Asynchronous Replication
Synchronous replication waits for replica acknowledgment before confirming writes. Asynchronous replication confirms writes immediately and replicates in the background. Semi-synchronous replication (available in MySQL) waits for at least one replica to acknowledge receiving the log entry, but not that it has applied the change.
When to use synchronous: Financial transactions, inventory management, any system where reading stale data causes business problems. Expect 2-5ms additional write latency per synchronous replica.
When to use asynchronous: High-throughput systems, social media, analytics, content delivery. Accept eventual consistency for performance.
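In PostgreSQL, for example, this choice is a configuration setting rather than an application change. The fragment below is illustrative only (server names are placeholders; check the documentation for your version before relying on it):

```ini
# postgresql.conf -- illustrative settings, not a production recipe.

# Asynchronous replication (the default): leave the standby list empty;
# a commit waits only for the local WAL flush.
synchronous_standby_names = ''

# Synchronous replication: require acknowledgment from at least one of the
# listed standbys before a commit returns to the client.
# synchronous_standby_names = 'ANY 1 (replica1, replica2)'
# synchronous_commit = remote_apply   # strictest: standby must apply the
#                                     # change, not merely flush it to disk
```

The `synchronous_commit` levels trade durability for latency step by step (local flush, remote write, remote flush, remote apply), which maps directly onto the sync/semi-sync/async spectrum described above.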
Cascading Replication
Replicas replicate from other replicas instead of all connecting to the master. The master replicates to 2-3 primary replicas, which each replicate to 2-3 secondary replicas. This reduces load on the master’s network and CPU.
When to use: When you need 10+ replicas and the master’s network becomes saturated sending replication traffic. Instagram uses this for their feed database with dozens of replicas per shard.
Pros: Reduces master load, enables massive replica counts.
Cons: Increases replication lag (secondary replicas lag behind primary replicas), more complex topology.
Replication Topology Comparison
graph TB
subgraph Master-Slave Simple
MS_Master["Master<br/><i>All Writes</i>"]
MS_R1["Replica 1"]
MS_R2["Replica 2"]
MS_R3["Replica 3"]
MS_Master --> MS_R1
MS_Master --> MS_R2
MS_Master --> MS_R3
end
subgraph Cascading Replication
C_Master["Master<br/><i>All Writes</i>"]
C_P1["Primary<br/>Replica 1"]
C_P2["Primary<br/>Replica 2"]
C_S1["Secondary<br/>Replica 1"]
C_S2["Secondary<br/>Replica 2"]
C_S3["Secondary<br/>Replica 3"]
C_S4["Secondary<br/>Replica 4"]
C_Master --> C_P1
C_Master --> C_P2
C_P1 --> C_S1
C_P1 --> C_S2
C_P2 --> C_S3
C_P2 --> C_S4
end
subgraph Master-Master Rare
MM_M1["Master 1<br/><i>Writes + Reads</i>"]
MM_M2["Master 2<br/><i>Writes + Reads</i>"]
MM_M1 <--"Bi-directional<br/>replication<br/>(conflict resolution)"--> MM_M2
end
Three replication topologies: Master-slave (simple, most common), cascading replication (reduces master load for 10+ replicas, used by Instagram), and master-master (rare in RDBMS due to conflict resolution complexity). Cascading increases lag but scales to dozens of replicas.
Trade-offs
Consistency vs Performance
Synchronous replication: Every write waits for replica acknowledgment, guaranteeing replicas are never behind. Adds 2-10ms latency per write depending on network distance. Writes fail if replicas are unavailable. Choose this when reading stale data causes business problems—financial balances, inventory counts, booking systems.
Asynchronous replication: Writes complete immediately, replicas catch up in the background. Provides maximum write throughput and availability. Replicas lag behind by milliseconds to seconds (or minutes during load spikes). Choose this when eventual consistency is acceptable—social media feeds, analytics, content delivery.
Decision framework: Can your application tolerate reading data that’s 100ms old? If yes, use async. If no, use sync or semi-sync. Most systems use async and handle staleness at the application layer.
Availability vs Consistency (During Failures)
Automatic failover: Promotes a replica to master within seconds when the master fails, maintaining write availability. Risks split-brain if the old master comes back online, potentially losing data if the promoted replica was behind. Choose this for consumer-facing applications where downtime is unacceptable.
Manual failover: Requires human intervention to promote a replica, taking minutes to hours. Eliminates split-brain risk and allows careful data verification. Choose this for internal systems or when data integrity is more critical than availability.
Decision framework: What’s the cost of 5 minutes of write downtime versus the cost of losing 10 seconds of data? Consumer apps choose automatic failover; financial systems choose manual.
Read Scalability vs Operational Complexity
More replicas: Each replica multiplies read capacity. Instagram runs 10+ replicas per master for their feed database. However, each replica increases replication lag (master CPU spends more time sending updates), monitoring complexity, and backup/maintenance overhead.
Fewer replicas: Simpler operations, lower replication lag, but limited read scaling. Most teams start with 2-3 replicas (one master, two replicas) and add more only when read traffic demands it.
Decision framework: Add replicas when read query latency exceeds SLA or CPU utilization on existing replicas exceeds 70%. Don’t add replicas preemptively—they have real operational cost.
Synchronous vs Asynchronous Replication Timing
sequenceDiagram
participant App
participant Master
participant Replica
rect rgb(255, 200, 200)
Note over App,Replica: Synchronous Replication (Consistent but Slow)
App->>Master: Write Request
Master->>Master: Write to local log
Master->>Replica: Send log entry
Replica->>Replica: Persist log entry
Replica->>Master: ACK received
Master->>App: Confirm success<br/>(+2-10ms latency)
Note over App,Replica: Replica guaranteed consistent<br/>Write fails if replica unavailable
end
rect rgb(200, 255, 200)
Note over App,Replica: Asynchronous Replication (Fast but Eventual)
App->>Master: Write Request
Master->>Master: Write to local log
Master->>App: Confirm success immediately
Note over Master,Replica: Background replication
Master->>Replica: Send log entry (async)
Replica->>Replica: Apply when received
Note over App,Replica: Replica eventually consistent<br/>50-500ms replication lag
end
Synchronous replication waits for replica acknowledgment before confirming writes, guaranteeing consistency but adding latency. Asynchronous replication confirms immediately and replicates in the background, providing better performance but introducing replication lag. Most production systems choose async for performance.
When to Use (and When Not To)
Use Master-Slave Replication When:
Your read traffic significantly exceeds write traffic (10:1 ratio or higher). E-commerce product catalogs, social media feeds, content management systems, and analytics dashboards all fit this pattern. You need fault tolerance but can tolerate brief write downtime during failover. Your data model has clear read/write separation—users read frequently but write occasionally.
You need geographic distribution for read latency but writes can be centralized. Dropbox replicates metadata databases to regional replicas so file listings load quickly, but all writes go to the primary region.
Avoid Replication When:
Writes dominate your workload. If you’re processing high-frequency sensor data or financial tick data with minimal reads, replication adds overhead without benefit. Consider time-series databases or write-optimized storage instead.
You need strong consistency for all reads. If your application cannot tolerate any replication lag (real-time trading systems, collaborative editing), replication introduces complexity. Use synchronous replication with performance penalties, or reconsider your consistency requirements.
Your dataset is small enough for a single server. If your entire database fits in memory on one server and handles your traffic comfortably, replication adds operational complexity without benefit. Instagram started with a single database; they added replication only when read traffic exceeded single-server capacity.
Use Master-Master Replication When:
You need active-active multi-region writes and can handle conflict resolution. This is rare in RDBMS. Most teams use master-slave with regional masters and cross-region replication instead. Consider NoSQL databases like Cassandra or DynamoDB for true multi-master scenarios.
Anti-Patterns:
- Using replicas as backups. Replication propagates corruption and accidental deletes instantly; take point-in-time backups separately.
- Assuming replicas are always consistent. Always design applications to handle replication lag.
- Routing every post-write read to the master to avoid staleness—this defeats the purpose of replication. Use targeted read-after-write consistency patterns instead.
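One targeted read-after-write pattern: after a user writes, pin that user's reads to the master for a short grace period longer than typical replication lag, then fall back to replicas. The sketch below is illustrative (the pin window and helper names are assumptions; in production the last-write timestamp usually lives in the user's session or a cookie, not a process-local dict):

```python
import time

# Read-after-write pinning sketch (hypothetical helper names).
PIN_SECONDS = 1.0          # assumed upper bound on replication lag
last_write_at = {}         # user_id -> timestamp of that user's last write

def record_write(user_id, now=None):
    last_write_at[user_id] = now if now is not None else time.time()

def choose_read_target(user_id, now=None):
    now = now if now is not None else time.time()
    if now - last_write_at.get(user_id, 0.0) < PIN_SECONDS:
        return "master"    # their own write may not have replicated yet
    return "replica"

record_write("user-42", now=100.0)
print(choose_read_target("user-42", now=100.2))  # master (inside pin window)
print(choose_read_target("user-42", now=105.0))  # replica (lag window passed)
```

The key property: only the writing user pays the master-read cost, and only briefly—everyone else's reads keep scaling across replicas.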
Real-World Examples
Instagram: Feed Database Replication
Instagram’s feed database uses master-slave replication with 10+ replicas per master to handle billions of feed reads daily. When you open Instagram, your feed query hits a read replica, not the master. They use asynchronous replication to maximize write throughput for new posts, accepting that users might not see a friend’s post for 100-200ms after it’s published. Their interesting twist: they use cascading replication where the master replicates to 3 primary replicas, which each replicate to 3-4 secondary replicas, reducing network load on the master. They monitor replication lag closely and remove replicas from the read pool if lag exceeds 1 second, preventing users from seeing severely stale data.
Dropbox: Metadata Database with Regional Replicas
Dropbox replicates their metadata database (which stores file paths, permissions, and sharing info) to regional replicas across US, Europe, and Asia. All writes go to the primary master in the US, but reads are served from the nearest regional replica. This reduces file listing latency from 200ms to 20ms for international users. They use semi-synchronous replication to at least one replica to prevent data loss during master failures, accepting the 2-3ms write latency penalty. When a user shares a file, they write to the master and immediately read from the same master to display the updated sharing settings, avoiding replication lag issues for read-after-write scenarios.
GitHub: MySQL Replication for Repository Metadata
GitHub uses MySQL master-slave replication for repository metadata (stars, forks, issues). They run 5-6 replicas per master and distribute read queries using a connection proxy that tracks replication lag. When lag exceeds 500ms on a replica, the proxy stops sending queries to it until it catches up. During their 2018 incident, a network partition caused replicas to lag by 40+ minutes, and they had to pause all writes to let replicas catch up before resuming service. This taught them to implement lag-aware read routing and better monitoring of replication health as a critical operational metric.
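Lag-aware routing of the kind described above reduces to a simple filter over reported lag. This sketch is illustrative (the threshold and data shape are assumptions, and real proxies obtain lag from replication heartbeats rather than a static dict):

```python
# Lag-aware read pool (illustrative): replicas reporting lag over a
# threshold are removed from rotation until they catch up.

MAX_LAG_MS = 500

def healthy_read_pool(replica_lag_ms):
    # replica_lag_ms: mapping of replica name -> current lag behind the
    # master, in milliseconds.
    return [name for name, lag in replica_lag_ms.items() if lag <= MAX_LAG_MS]

replica_lag = {"replica-1": 40, "replica-2": 900, "replica-3": 120}
print(healthy_read_pool(replica_lag))  # ['replica-1', 'replica-3']
```

One operational wrinkle worth noting: if every replica exceeds the threshold at once (as in the incident above), the pool empties, so production routers need a fallback policy—degrade to the master, or serve stale reads deliberately—rather than returning no servers at all.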
Interview Essentials
Mid-Level
Explain master-slave replication architecture clearly: one master accepts writes, multiple replicas serve reads, replication happens via transaction log. Understand the difference between synchronous and asynchronous replication and when to use each. Know that replication lag exists and can cause users to read stale data. Be able to describe a basic failover process: detect master failure, promote replica, update application configuration. Understand that replication provides read scaling and fault tolerance but doesn’t solve write scaling (that requires sharding). Calculate read capacity: if one database handles 1000 QPS and you add 3 replicas, you can handle ~4000 read QPS (assuming even distribution).
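The capacity arithmetic from that last point, written out (assuming perfectly even distribution and identical hardware—real systems fall somewhat short of this ideal):

```python
# Back-of-envelope read capacity for a master plus N replicas.

def read_capacity_qps(per_server_qps, replica_count, master_serves_reads=True):
    servers = replica_count + (1 if master_serves_reads else 0)
    return per_server_qps * servers

print(read_capacity_qps(1000, replica_count=3))  # 4000
```

Excluding the master from the read pool (a common choice, to reserve its capacity for writes) drops this to 3000 QPS—worth stating explicitly in an interview, since "4x from 3 replicas" silently assumes the master also serves reads.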
Senior
Design a complete replication strategy for a specific use case, justifying synchronous vs asynchronous choice based on consistency requirements and performance needs. Explain replication lag implications deeply: why it happens (single-threaded replay, network delays, write bursts), how to monitor it (replication lag metrics, LSN tracking), and how to handle it in application code (read-after-write consistency, session stickiness, lag-aware routing). Describe failover mechanisms in detail: manual vs automatic, split-brain prevention, data loss scenarios during async replication failover. Understand when master-master replication makes sense (rarely in RDBMS) and why conflict resolution is complex. Discuss operational concerns: monitoring replication health, handling replica failures, backup strategies that complement replication. Be ready to combine replication with sharding: each shard has its own master and replicas.
Staff+
Architect replication strategies for multi-region systems, balancing latency, consistency, and cost. Explain why most companies avoid master-master in RDBMS despite its theoretical benefits—conflict resolution complexity, debugging difficulty, and operational burden outweigh benefits. Design sophisticated failover systems: consensus-based leader election (using etcd or ZooKeeper), automatic promotion with fencing to prevent split-brain, zero-downtime failover using connection proxies. Discuss replication lag at scale: Instagram’s cascading replication to reduce master load, lag-aware load balancing, degraded-mode operation when lag exceeds thresholds. Understand replication’s interaction with other patterns: combining with sharding for horizontal scaling, using replicas for analytics workloads, cross-region replication for disaster recovery. Explain why replication alone doesn’t provide backup—corruption and deletes replicate instantly—and how to architect proper backup strategies alongside replication. Discuss emerging patterns: logical replication for heterogeneous systems, change data capture (CDC) for event streaming, using replicas for zero-downtime migrations.
Common Interview Questions
How do you handle replication lag in your application? (Answer: Read-after-write consistency by reading from master after writes, session stickiness to same replica, lag-aware routing that removes slow replicas, or accepting eventual consistency and designing UI accordingly)
What happens if the master fails during a write? (Answer: In async replication, the write might be lost if not replicated yet. In sync replication, the write fails but no data is lost. This is why financial systems use sync replication despite performance cost)
How do you prevent split-brain during failover? (Answer: Fencing mechanisms that prevent old master from accepting writes, consensus systems that ensure only one master exists, STONITH—shoot the other node in the head—in extreme cases)
Why not use master-master replication everywhere? (Answer: Conflict resolution is complex, last-write-wins loses data, application-defined resolution is hard to get right, and debugging is a nightmare. Most teams use master-slave and accept the write bottleneck)
How do you scale writes if replication only scales reads? (Answer: Sharding—partition data across multiple masters, each with their own replicas. See Sharding for details)
Red Flags to Avoid
Saying replication provides strong consistency without mentioning synchronous vs asynchronous trade-offs
Claiming replicas can be used as backups without discussing corruption propagation
Not understanding replication lag or assuming replicas are always up-to-date
Suggesting master-master for RDBMS without acknowledging conflict resolution complexity
Not knowing how failover works or what split-brain means
Confusing replication (copying entire database) with sharding (partitioning data)
Key Takeaways
Replication copies data across multiple servers to provide read scaling (horizontal) and fault tolerance, but doesn’t solve write scaling—that requires sharding. Master-slave is the dominant pattern: writes go to one master, reads distribute across replicas.
The fundamental trade-off is synchronous (consistent but slow) vs asynchronous (fast but eventual consistency) replication. Most production systems use async and handle replication lag at the application layer through read-after-write consistency patterns.
Replication lag—the time for replicas to catch up to the master—is the key operational challenge. Monitor it closely, remove slow replicas from rotation, and design applications to tolerate 100-500ms of staleness.
Failover is complex: automatic failover provides availability but risks split-brain and data loss; manual failover is safer but slower. Use fencing mechanisms and consensus systems to prevent split-brain in automatic failover.
Master-master replication is rare in RDBMS due to conflict resolution complexity. Most teams use master-slave with regional masters instead. Reserve multi-master for NoSQL databases designed for eventual consistency.