Monolithic Persistence Anti-Pattern: Polyglot Storage
After this topic, you will be able to:
- Identify single-database bottlenecks in growing systems
- Evaluate trade-offs between vertical scaling, read replicas, and sharding
- Recommend database decomposition strategies for microservices
- Justify polyglot persistence decisions based on access patterns
TL;DR
Monolithic persistence is the anti-pattern of using a single database to serve all application components, creating a bottleneck as systems scale. This approach couples unrelated domains, limits technology choices, and creates a single point of failure. The solution involves decomposing the database using strategies like database-per-service, read replicas, sharding, and polyglot persistence—choosing the right database technology for each workload’s access patterns.
Cheat Sheet: Single database → bottleneck. Solution: Database-per-service (microservices), read replicas (read-heavy), sharding (horizontal scale), polyglot persistence (workload-specific). Migration: identify bounded contexts → extract reads → CDC for sync → migrate writes → eventual consistency.
The Problem It Solves
When Airbnb started, a single MySQL database served everything: user profiles, listings, bookings, reviews, payments, and search. This worked fine at 10,000 listings. At 1 million listings with global traffic, that single database became the chokepoint for the entire platform. Every feature team competed for database resources. A slow analytics query could tank booking performance. Schema changes required coordinating across 15 teams. The database CPU hit 90% during peak hours, and vertical scaling (bigger machines) only delayed the inevitable.
Monolithic persistence creates three fundamental problems. First, it couples unrelated domains—your payment transactions share resources with image metadata queries, even though they have completely different performance characteristics. Second, it forces a one-size-fits-all technology choice—you’re stuck with relational databases even when your search workload needs full-text indexing or your session data needs sub-millisecond access. Third, it creates organizational bottlenecks—every team needs database admin approval for schema changes, and deployments become coordinated nightmares because everyone touches the same database.
The real pain emerges at scale. When your database handles 100,000 queries per second across 50 different tables serving 20 microservices, you can’t optimize for any specific workload. Read-heavy services suffer because write-heavy services lock tables. Time-series data (logs, metrics) bloats your transactional database. The database becomes the single point of failure—when it goes down, your entire platform dies, even services that could theoretically operate independently.
Monolithic Database Bottleneck
graph TB
subgraph "Monolithic Architecture"
UserApp["User App"]
DriverApp["Driver App"]
PricingService["Pricing Service"]
PaymentService["Payment Service"]
AnalyticsService["Analytics Service"]
MonoDB[("Single PostgreSQL DB<br/><i>CPU: 95%</i><br/><i>10M queries/day</i>")]
UserApp --"1. Read profiles"--> MonoDB
DriverApp --"2. Write locations<br/>(every 2s)"--> MonoDB
PricingService --"3. Calculate surge"--> MonoDB
PaymentService --"4. Process payments<br/>(ACID required)"--> MonoDB
AnalyticsService --"5. Scan millions of rows"--> MonoDB
end
Bottleneck["❌ Bottlenecks<br/>• Lock contention<br/>• Mixed workloads<br/>• Single point of failure<br/>• Schema coordination"]
MonoDB -.-> Bottleneck
A single database serving all services creates resource contention, lock conflicts, and becomes the system’s single point of failure. Different workloads (location updates, payments, analytics) compete for the same database resources despite having completely different performance characteristics.
Solution Overview
The solution is database decomposition: breaking the monolithic database into multiple specialized data stores, each optimized for its workload. This doesn’t mean blindly creating a database per microservice—that’s how you end up with a distributed monolith. Instead, you strategically separate concerns based on access patterns, consistency requirements, and team boundaries.
The core strategies are:
- Database-per-service for microservices architectures, where each bounded context owns its data and exposes it only through APIs.
- Read replicas for read-heavy workloads, offloading queries to separate instances while writes go to the primary.
- Sharding for horizontal scaling, partitioning data across multiple databases by a shard key (user ID, region, tenant).
- Polyglot persistence for workload-specific optimization—PostgreSQL for transactions, Elasticsearch for search, Redis for caching, Cassandra for time-series data.
The key insight: persistence should match your domain boundaries and access patterns, not be a shared resource. When Netflix moved from a monolithic Oracle database to microservices with Cassandra, DynamoDB, and MySQL, they didn’t just split the database—they chose each technology based on the service’s specific needs. Their viewing history service uses Cassandra for write-heavy, time-series data. Their billing service uses MySQL for ACID transactions. Their recommendation engine uses a graph database for relationship queries.
Database Decomposition Strategy
graph LR
subgraph "Decomposed Architecture"
subgraph "User Service"
UserAPI["User API"]
UserDB[("PostgreSQL<br/><i>ACID transactions</i>")]
UserAPI --> UserDB
end
subgraph "Location Service"
LocationAPI["Location API"]
LocationDB[("Cassandra<br/><i>Write-optimized</i>")]
LocationAPI --> LocationDB
end
subgraph "Payment Service"
PaymentAPI["Payment API"]
PaymentDB[("PostgreSQL<br/><i>Strong consistency</i>")]
PaymentAPI --> PaymentDB
end
subgraph "Search Service"
SearchAPI["Search API"]
SearchDB[("Elasticsearch<br/><i>Full-text search</i>")]
SearchAPI --> SearchDB
end
subgraph "Analytics Service"
AnalyticsAPI["Analytics API"]
Warehouse[("Snowflake<br/><i>Data warehouse</i>")]
AnalyticsAPI --> Warehouse
end
end
Client["Client Apps"] --"API calls"--> UserAPI
Client --"API calls"--> LocationAPI
Client --"API calls"--> PaymentAPI
Client --"API calls"--> SearchAPI
LocationAPI --"Events"--> SearchAPI
UserAPI --"Events"--> SearchAPI
Polyglot persistence strategy where each service owns its database and chooses the technology that fits its access patterns. Services communicate through APIs and events, never direct database access, enabling independent scaling and technology choices.
Migration Strategy
Migrating from monolithic persistence is a multi-phase journey that takes months, not weeks. Rushing it creates data inconsistencies and outages. Here’s the proven path:
Phase 1: Identify Bounded Contexts (2-4 weeks). Map your database tables to business domains. At Airbnb, this meant grouping tables: users, profiles, preferences → User Service; listings, amenities, photos → Listing Service; bookings, reservations, cancellations → Booking Service. Look for natural seams where tables rarely join across boundaries. If you’re constantly joining orders with inventory with shipping, those might belong together despite seeming like separate domains.
Phase 2: Extract Read Models (4-8 weeks). Start with reads because they’re safer—you’re not changing the source of truth yet. Create read replicas or materialized views for each service. Implement Change Data Capture (CDC) using tools like Debezium to stream changes from the monolithic database to service-specific read stores. The Booking Service gets a read-only copy of user data it needs, synced in near real-time. This phase proves your CDC pipeline works and lets you optimize read queries without touching writes.
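To make the CDC step concrete, here is a minimal sketch of a service-local read store that applies Debezium-style change events. The simplified `op`/`before`/`after` envelope is an assumption (real Debezium payloads carry additional metadata), and the class and field names are illustrative:

```python
import json

class UserReadStore:
    """Service-local, read-only copy of user rows, kept in sync via CDC."""

    def __init__(self):
        self.users = {}  # user_id -> row; stands in for the service's read DB

    def apply(self, raw_event: str) -> None:
        event = json.loads(raw_event)
        op = event["op"]
        if op in ("c", "u"):        # create / update: upsert the new row image
            row = event["after"]
            self.users[row["id"]] = row
        elif op == "d":             # delete: drop the old row image
            self.users.pop(event["before"]["id"], None)

store = UserReadStore()
store.apply('{"op": "c", "before": null, "after": {"id": 1, "email": "a@example.com"}}')
store.apply('{"op": "u", "before": {"id": 1}, "after": {"id": 1, "email": "b@example.com"}}')
print(store.users[1]["email"])  # b@example.com
```

Because the read store is an idempotent upsert of row images, replaying the same event stream always converges to the same state, which is what makes this phase safe to build and rebuild.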
Phase 3: Implement Dual Writes (4-8 weeks). Now tackle writes, but don’t cut over immediately. Write to both the monolithic database and the new service database simultaneously. Use the Strangler Fig pattern—new code writes to the new database, old code still writes to the monolith. Monitor for data drift between the two. This is your safety net. If the new database has issues, you can fall back to the monolith without data loss.
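A dual-write wrapper might look like the following sketch, where plain dicts stand in for the two databases and all names are hypothetical. The monolith stays the source of truth, the write to the new store is best-effort, and a drift check surfaces disagreements:

```python
import logging

class DualWriter:
    """Write to the monolith (source of truth) and the new service DB."""

    def __init__(self, monolith, service_db):
        self.monolith = monolith      # stands in for the legacy database
        self.service_db = service_db  # stands in for the new database

    def write(self, key, value):
        self.monolith[key] = value        # source of truth: failures propagate
        try:
            self.service_db[key] = value  # best-effort: log, don't fail the request
        except Exception:
            logging.exception("dual-write to new DB failed for key %s", key)

    def drift(self):
        """Keys whose values disagree between the two stores."""
        return {k for k in self.monolith
                if self.service_db.get(k) != self.monolith[k]}

mono, svc = {}, {}
writer = DualWriter(mono, svc)
writer.write("user:1", {"name": "Ada"})
print(writer.drift())  # set() -- the stores agree
```

The asymmetry is deliberate: a failed write to the new database must never fail the user request while the monolith is still authoritative, and the `drift()` report is what you monitor before trusting the new store.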
Phase 4: Migrate Write Traffic (2-4 weeks per service). Gradually shift write traffic to the new database. Start with 1% of writes, then 10%, 50%, 100%. Use feature flags to control the rollout. Keep dual-writing during this phase so you can roll back instantly. Monitor error rates, latency, and data consistency metrics obsessively.
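Percentage-based rollout is typically implemented by hashing a stable identifier into a fixed bucket, so the same entity always routes the same way and raising the percentage only ever adds entities. A sketch (the function name is illustrative):

```python
import hashlib

def routes_to_new_db(entity_id: str, rollout_percent: int) -> bool:
    """Deterministically bucket an entity into [0, 100).
    The same entity always lands in the same bucket, so going from
    10% to 50% never flips an already-migrated entity back."""
    digest = hashlib.sha256(entity_id.encode()).hexdigest()
    bucket = int(digest, 16) % 100
    return bucket < rollout_percent

# At 0% nothing routes to the new DB; at 100% everything does.
print(routes_to_new_db("user:42", 0))    # False
print(routes_to_new_db("user:42", 100))  # True
```

Hashing rather than random sampling matters here: a random coin flip per request would send the same user's writes to different databases on consecutive requests, which defeats the consistency monitoring this phase depends on.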
Phase 5: Handle Cross-Service Data (ongoing). This is where most teams stumble. When the Booking Service needs user email addresses, it can’t just JOIN anymore. Options: API calls (synchronous, adds latency), event-driven sync (eventual consistency), data duplication (denormalize what you need). Choose based on consistency requirements. For displaying a user’s name on a booking, eventual consistency is fine—replicate the name into the Booking Service’s database. For charging a credit card, you need strong consistency—make a synchronous API call to the Payment Service.
Rollback Considerations: Keep the monolithic database as the source of truth for 3-6 months after migration. Continue dual-writing in the background. If you discover data corruption or consistency issues, you can revert to the monolith. Only decommission the old database after you’ve proven the new system handles peak load, edge cases, and disaster recovery scenarios.
Four-Phase Migration Journey
graph TB
Start(["Monolithic Database"])
Phase1["Phase 1: Identify Bounded Contexts<br/><i>2-4 weeks</i><br/>Map tables to domains"]
Phase2["Phase 2: Extract Read Models<br/><i>4-8 weeks</i><br/>CDC + Read replicas"]
Phase3["Phase 3: Dual Writes<br/><i>4-8 weeks</i><br/>Write to both databases"]
Phase4["Phase 4: Migrate Writes<br/><i>2-4 weeks per service</i><br/>Gradual traffic shift"]
Complete(["Decomposed Databases"])
Start --> Phase1
Phase1 --> Phase2
Phase2 --> Phase3
Phase3 --> Phase4
Phase4 --> Complete
Phase2 -."Rollback: Stop CDC".-> Start
Phase3 -."Rollback: Stop dual writes".-> Phase2
Phase4 -."Rollback: Route to monolith".-> Phase3
Safety["Safety Net:<br/>Keep monolith 3-6 months<br/>Monitor data consistency<br/>Feature flags for rollback"]
Phase4 -.-> Safety
Migration is a gradual, multi-phase process with rollback points at each stage. The key is maintaining the monolithic database as a safety net while progressively moving reads, then writes, to new service-specific databases. Each phase includes monitoring and validation before proceeding.
How It Works
Let’s walk through how monolithic persistence becomes a bottleneck and how decomposition solves it, using a ride-sharing platform like Uber as an example.
Step 1: The Monolithic Starting Point. Initially, one PostgreSQL database serves everything: users, drivers, rides, payments, ratings, locations. The Rider App queries user profiles. The Driver App updates locations every 2 seconds. The Pricing Service calculates surge pricing. The Payment Service processes transactions. All hitting the same database. At 10,000 rides per day, this works fine. At 10 million rides per day across 50 cities, the database CPU is pegged at 95%, and query latency spikes to 500ms during peak hours.
Step 2: Identify the Bottlenecks. Profile the workload. You discover: Location updates (write-heavy, high frequency, low consistency needs) consume 60% of database writes. Payment transactions (write-heavy, ACID required) need strong consistency but are only 5% of traffic. Ride history queries (read-heavy, analytical) scan millions of rows. User profile reads (read-heavy, cacheable) hit the same tables as payment writes, causing lock contention. These workloads have nothing in common—they shouldn’t share a database.
Step 3: Apply Read Replicas for Read-Heavy Services. The Ride History Service doesn’t need real-time data—showing a ride from 10 seconds ago is fine. Create read replicas and route all history queries there. This offloads 40% of read traffic from the primary database. The replicas can lag by a few seconds without impacting user experience. Suddenly, the primary database’s read load drops, and write performance improves because there’s less lock contention.
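Read/write splitting can be as simple as a router that classifies queries: writes go to the primary, reads rotate across replicas. A toy sketch with strings standing in for real connections (the SELECT-prefix check is deliberately naive; real routers also pin reads-after-writes to the primary):

```python
import itertools

class ReplicaRouter:
    """Send writes to the primary, spread reads round-robin over replicas."""

    def __init__(self, primary, replicas):
        self.primary = primary
        self._replicas = itertools.cycle(replicas)

    def connection_for(self, query: str):
        # Naive classification: anything that isn't a SELECT hits the primary.
        if query.lstrip().upper().startswith("SELECT"):
            return next(self._replicas)
        return self.primary

router = ReplicaRouter("primary", ["replica-1", "replica-2"])
print(router.connection_for("SELECT * FROM rides WHERE user_id = 7"))  # replica-1
print(router.connection_for("UPDATE rides SET status = 'done'"))       # primary
```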
Step 4: Shard Write-Heavy Services. Location updates are killing write throughput. Shard the locations table by driver_id across 10 database instances. Each driver’s location updates go to a specific shard (driver_id % 10). Now you have roughly 10× the write capacity. The Location Service owns these shards and exposes a “get driver location” API. Other services never query the location shards directly—they call the API.
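The shard routing just described fits in a few lines; here connection strings stand in for real database handles, and the insert is left as a comment:

```python
NUM_SHARDS = 10

def shard_for(driver_id: int) -> int:
    """Every update for a given driver lands on the same shard."""
    return driver_id % NUM_SHARDS

# A pool of connections, one per shard (strings stand in for real handles).
shards = [f"locations-shard-{i}" for i in range(NUM_SHARDS)]

def write_location(driver_id: int, lat: float, lon: float) -> str:
    conn = shards[shard_for(driver_id)]
    # In a real system: conn.execute("INSERT INTO locations ...", ...)
    return conn

print(write_location(12345, 37.77, -122.42))  # locations-shard-5
```

The weakness of plain modulo is also visible here: changing NUM_SHARDS changes `shard_for` for almost every driver, which is why rebalancing under modulo sharding is painful.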
Step 5: Polyglot Persistence for Specialized Workloads. The Payment Service needs ACID transactions—keep it on PostgreSQL. The Location Service needs fast writes and doesn’t need complex queries—migrate to Cassandra for better write throughput. The Surge Pricing Service needs to analyze real-time location data—stream location updates to a Redis cache for sub-millisecond access. The Ride History Service needs to query across millions of rows—move historical data (older than 30 days) to a data warehouse like Snowflake.
Step 6: Handle Cross-Service Data Needs. The Ride Service needs to show driver names and ratings when displaying active rides. Instead of joining across services, denormalize: when a driver updates their profile, publish a DriverProfileUpdated event. The Ride Service subscribes and updates its local copy of driver names. This is eventual consistency—the name might be stale for a few seconds, but that’s acceptable for this use case. For payment processing, where you need the current credit card status, make a synchronous API call to the Payment Service.
Cross-Service Data Access Patterns
sequenceDiagram
participant Client
participant RideService
participant DriverService
participant PaymentService
participant RideDB
participant DriverDB
participant PaymentDB
participant EventBus
Note over RideService,EventBus: Pattern 1: Eventual Consistency (Denormalization)
DriverService->>EventBus: Publish DriverProfileUpdated event
EventBus->>RideService: Subscribe to event
RideService->>RideDB: Update local copy of driver name
Client->>RideService: GET /rides/active
RideService->>RideDB: Query with denormalized driver name
RideService->>Client: Return ride with driver info
Note over RideService,PaymentDB: Pattern 2: Synchronous API Call (Strong Consistency)
Client->>RideService: POST /rides/complete
RideService->>PaymentService: POST /payments/charge
PaymentService->>PaymentDB: Process transaction (ACID)
PaymentDB-->>PaymentService: Success
PaymentService-->>RideService: Payment confirmed
RideService->>RideDB: Update ride status
RideService->>Client: Ride completed
Two strategies for cross-service data access: (1) Eventual consistency with event-driven denormalization for non-critical data like driver names, accepting seconds of staleness; (2) Synchronous API calls for critical operations like payments that require strong consistency and immediate validation.
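Pattern 1 above (event-driven denormalization) can be sketched with an in-process bus standing in for Kafka or similar; topic and field names are illustrative:

```python
class EventBus:
    """Minimal in-process pub/sub standing in for a real message broker."""

    def __init__(self):
        self._subscribers = {}

    def subscribe(self, topic, handler):
        self._subscribers.setdefault(topic, []).append(handler)

    def publish(self, topic, event):
        for handler in self._subscribers.get(topic, []):
            handler(event)

class RideService:
    """Keeps a denormalized, eventually consistent copy of driver names."""

    def __init__(self, bus):
        self.driver_names = {}  # local copy; may be seconds stale
        bus.subscribe("DriverProfileUpdated", self.on_driver_updated)

    def on_driver_updated(self, event):
        self.driver_names[event["driver_id"]] = event["name"]

bus = EventBus()
rides = RideService(bus)
bus.publish("DriverProfileUpdated", {"driver_id": 7, "name": "Grace"})
print(rides.driver_names[7])  # Grace
```

After the subscription is in place, the Ride Service answers "whose ride is this?" entirely from its own store, with no runtime dependency on the Driver Service.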
Variants
Database-per-Service (Microservices Pattern): Each microservice owns its database schema and data. The User Service has a users_db, the Order Service has an orders_db. Services communicate only through APIs or events, never direct database access. Use this when you have clear bounded contexts and need team autonomy. The trade-off: you lose the ability to JOIN across services and must handle distributed transactions. Netflix uses this extensively—over 500 microservices, each with its own data store.
CQRS (Command Query Responsibility Segregation): Separate your read and write models entirely. Writes go to a normalized, transactional database optimized for consistency. Reads come from denormalized, eventually-consistent read models optimized for query performance. Use this for systems with vastly different read and write patterns—e-commerce product catalogs (millions of reads, few writes) or financial trading platforms (complex writes, simple reads). The trade-off: increased complexity and eventual consistency. Amazon uses CQRS for its product catalog—writes update the source of truth, and changes then asynchronously populate search indexes and recommendation engines.
Sharding with Consistent Hashing: Partition data across multiple databases using a shard key. Instagram shards user data by user_id—each user’s photos, posts, and followers live on a specific shard. Use this when you have a natural partition key and need horizontal scale. The trade-off: cross-shard queries become expensive, and rebalancing shards is complex. Instagram’s sharding strategy handles billions of photos by ensuring each user’s data is co-located.
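To show why consistent hashing beats modulo for rebalancing, here is a toy hash ring with virtual nodes (shard names and the vnode count are illustrative, and the whole thing runs in memory). Adding a fourth shard moves only the keys the new shard captures, roughly a quarter of them, whereas switching from `id % 3` to `id % 4` would reshuffle most keys:

```python
import bisect
import hashlib

class HashRing:
    """Consistent hashing with virtual nodes: each shard owns many points
    on the ring, and a key belongs to the first point at or after its hash."""

    def __init__(self, shards, vnodes=100):
        self._ring = []
        for shard in shards:
            for v in range(vnodes):
                self._ring.append((self._hash(f"{shard}:{v}"), shard))
        self._ring.sort()
        self._points = [p for p, _ in self._ring]

    @staticmethod
    def _hash(key: str) -> int:
        return int(hashlib.sha256(key.encode()).hexdigest(), 16)

    def shard_for(self, key: str) -> str:
        i = bisect.bisect(self._points, self._hash(key)) % len(self._ring)
        return self._ring[i][1]

ring = HashRing(["shard-a", "shard-b", "shard-c"])
before = {f"user:{i}": ring.shard_for(f"user:{i}") for i in range(1000)}

bigger = HashRing(["shard-a", "shard-b", "shard-c", "shard-d"])
moved = sum(1 for key, shard in before.items() if bigger.shard_for(key) != shard)
print(moved)  # typically around a quarter of the 1000 keys
```

Note that every key that moves lands on the new shard: existing vnodes keep their positions, so the new shard only steals keys, and no key shuffles between old shards.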
Polyglot Persistence: Use different database technologies for different services based on access patterns. PostgreSQL for transactional data, MongoDB for document storage, Redis for caching, Elasticsearch for search, Cassandra for time-series. Use this when workloads have fundamentally different characteristics. The trade-off: operational complexity—you need expertise in multiple databases. Uber uses PostgreSQL for trip transactions, Redis for driver locations, and Cassandra for historical trip data.
CQRS: Separating Read and Write Models
graph LR
subgraph "Write Side (Command)"
WriteAPI["Write API<br/><i>Commands</i>"]
WriteDB[("PostgreSQL<br/><i>Normalized</i><br/><i>ACID transactions</i>")]
WriteAPI --"1. INSERT/UPDATE"--> WriteDB
end
subgraph "Event Stream"
CDC["Change Data Capture<br/><i>Debezium</i>"]
EventBus["Event Bus<br/><i>Kafka</i>"]
WriteDB --"2. Capture changes"--> CDC
CDC --"3. Publish events"--> EventBus
end
subgraph "Read Side (Query)"
ReadAPI["Read API<br/><i>Queries</i>"]
SearchIndex[("Elasticsearch<br/><i>Full-text search</i>")]
Cache[("Redis<br/><i>Hot data cache</i>")]
ReadDB[("PostgreSQL Replica<br/><i>Denormalized views</i>")]
EventBus --"4. Update"--> SearchIndex
EventBus --"4. Update"--> Cache
EventBus --"4. Update"--> ReadDB
ReadAPI --"5. Query"--> SearchIndex
ReadAPI --"5. Query"--> Cache
ReadAPI --"5. Query"--> ReadDB
end
Client["Client"] --"Write commands"--> WriteAPI
Client --"Read queries"--> ReadAPI
CQRS separates write operations (optimized for consistency and normalization) from read operations (optimized for query performance with denormalization). Changes flow asynchronously through CDC and event streams to multiple read models, each optimized for specific query patterns. Used when read and write patterns differ dramatically.
Sharding Strategy with Consistent Hashing
graph TB
Client["Client Request<br/><i>user_id: 12345</i>"]
Router["Shard Router<br/><i>hash(user_id) % num_shards</i>"]
Client --"1. Request"--> Router
subgraph "Shard Cluster"
Shard0[("Shard 0<br/><i>user_id % 4 = 0</i><br/>Users: 0,4,8...")]
Shard1[("Shard 1<br/><i>user_id % 4 = 1</i><br/>Users: 1,5,9...")]
Shard2[("Shard 2<br/><i>user_id % 4 = 2</i><br/>Users: 2,6,10...")]
Shard3[("Shard 3<br/><i>user_id % 4 = 3</i><br/>Users: 3,7,11...")]
end
Router --"2. Route to Shard 1<br/>(12345 % 4 = 1)"--> Shard1
Benefits["✓ Benefits<br/>• Horizontal write scaling<br/>• Data co-location<br/>• Independent failures"]
Tradeoffs["✗ Trade-offs<br/>• Cross-shard queries expensive<br/>• Rebalancing complex<br/>• Need consistent shard key"]
Shard1 -.-> Benefits
Shard1 -.-> Tradeoffs
Sharding partitions data across multiple databases using a shard key (typically user_id or tenant_id). Each shard handles a subset of users, enabling horizontal write scaling. The router shown uses simple modulo hashing (hash(user_id) % num_shards) to determine which shard owns each user’s data, ensuring all of a user’s data is co-located for efficient queries. Production systems usually prefer consistent hashing, because changing the shard count under modulo hashing forces most keys to relocate.
Trade-offs
Consistency vs. Availability: Monolithic databases give you strong consistency—every read sees the latest write because it’s all in one place. Decomposed databases force you to choose between consistency and availability (CAP theorem). If the User Service is down, can the Order Service still process orders without user data? Decision criteria: If you’re building a banking system, choose consistency (synchronous calls, distributed transactions). If you’re building a social feed, choose availability (eventual consistency, denormalize data).
Simplicity vs. Scalability: A single database is operationally simple—one backup strategy, one monitoring dashboard, one schema migration process. Decomposed databases scale horizontally but multiply operational complexity—10 databases mean 10 backup strategies, 10 monitoring dashboards, 10 schema migration processes. Decision criteria: If you’re under 10,000 QPS, stick with the monolith and scale vertically. If you’re hitting database limits and vertical scaling is too expensive, decompose strategically.
JOINs vs. API Calls: Monolithic databases let you JOIN across tables in a single query—fast and simple. Decomposed databases require API calls or event-driven sync to access related data—slower and more complex. Decision criteria: If your queries frequently JOIN across domains (orders + users + inventory), consider whether those domains should share a database. If JOINs are rare or can be replaced with denormalization, decompose.
Transaction Guarantees vs. Service Autonomy: Monolithic databases give you ACID transactions across all tables. Decomposed databases require distributed transactions (2PC, Saga pattern) or eventual consistency. Decision criteria: If you need atomic updates across multiple entities (transfer money between accounts), keep them in the same database or use a distributed transaction coordinator. If eventual consistency is acceptable (update user profile, then refresh search index), use events and accept temporary inconsistency.
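The Saga pattern mentioned above can be sketched as a sequence of steps, each paired with a compensating action that undoes it if a later step fails. All names here are illustrative, and a real saga would persist its progress so compensation survives process crashes:

```python
class SagaStep:
    """One saga step: a forward action plus a compensating action."""

    def __init__(self, action, compensate):
        self.action = action
        self.compensate = compensate

def run_saga(steps):
    """Execute steps in order; on failure, compensate completed steps
    in reverse. This gives eventual consistency, not ACID atomicity."""
    completed = []
    try:
        for step in steps:
            step.action()
            completed.append(step)
        return True
    except Exception:
        for step in reversed(completed):
            step.compensate()
        return False

log = []

def debit_account():
    log.append("debit A")

def undo_debit():
    log.append("undo debit A")

def credit_account():
    raise RuntimeError("Payment Service unavailable")  # simulated failure

saga = [SagaStep(debit_account, undo_debit),
        SagaStep(credit_account, lambda: log.append("undo credit B"))]
result = run_saga(saga)
print(result)  # False
print(log)     # ['debit A', 'undo debit A']
```

Note that only completed steps are compensated: the failed credit never happened, so only the debit is undone. The system passes through an intermediate state (money debited, not yet credited), which is exactly the consistency trade-off sagas accept.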
When to Use (and When Not To)
Use monolithic persistence when you’re in the early stages of a product (pre-product-market-fit), your team is small (under 10 engineers), and your traffic is predictable (under 10,000 QPS). The operational simplicity and development speed of a single database outweigh the scalability limitations. Premature decomposition is worse than premature optimization—you’ll spend months building distributed systems infrastructure when you should be validating product ideas.
Decompose your database when you hit these signals: Database CPU consistently above 70% and vertical scaling is becoming prohibitively expensive (you’re already on the largest instance type). Schema changes require coordinating across multiple teams, slowing down development velocity. Different services have conflicting performance requirements—your analytics queries are slowing down transactional writes. You’re experiencing lock contention where unrelated workloads block each other. You need to scale teams independently—10 teams can’t all deploy safely when they share a database.
Avoid decomposition if: You don’t have clear bounded contexts—if your services are tightly coupled and constantly need data from each other, splitting the database will create a distributed monolith. You lack the operational maturity to run multiple databases—if you struggle to monitor and backup one database, 10 databases will be a nightmare. Your queries heavily rely on complex JOINs across domains—rewriting these as API calls or event-driven sync might be more expensive than the scalability gains.
Real-World Examples
Company: Airbnb
System: Listings and Booking Platform
Implementation: Started with a single MySQL database serving all services. As they scaled to millions of listings, they decomposed into service-specific databases: User Service (PostgreSQL for profiles), Listing Service (MySQL for listing data), Search Service (Elasticsearch for full-text search), Booking Service (PostgreSQL for transactions), and Pricing Service (Redis for real-time calculations). They used CDC (Change Data Capture) to sync data between services and implemented the Saga pattern for distributed transactions like booking flows that touch multiple services.
Interesting detail: Airbnb’s migration took over 18 months and required building a custom data consistency framework. They kept dual-writing to the monolithic database for 6 months after migration as a safety net. The key learning: they decomposed based on team boundaries and access patterns, not just service boundaries. Some services still share databases when they’re owned by the same team and have similar consistency requirements.
Company: Netflix
System: Streaming Platform
Implementation: Migrated from a monolithic Oracle database to polyglot persistence with over 500 microservices. Viewing history uses Cassandra (write-heavy, time-series data). User profiles use MySQL (ACID transactions). Recommendations use a graph database (relationship queries). Search uses Elasticsearch (full-text indexing). Session state uses Redis (sub-millisecond access). Each service chooses the database that fits its access patterns, not a one-size-fits-all approach.
Interesting detail: Netflix’s chaos engineering practice (Chaos Monkey) was born from their database decomposition. With hundreds of databases, they needed to ensure services could handle database failures gracefully. They deliberately kill database instances in production to verify that services degrade gracefully and don’t cascade failures. This forced them to implement circuit breakers, fallbacks, and eventual consistency patterns.
Interview Essentials
Mid-Level
Explain the difference between a monolithic database and database-per-service. Describe how read replicas help with read-heavy workloads. Walk through a simple sharding strategy using a user ID as the shard key. Understand the trade-off between JOINs (monolithic) and API calls (decomposed). Be able to identify when a single database is becoming a bottleneck (high CPU, slow queries, lock contention).
Senior
Design a migration strategy from a monolithic database to microservices with database-per-service. Explain CQRS and when to use it. Discuss consistency models (strong vs. eventual) and how to handle cross-service transactions using the Saga pattern. Describe polyglot persistence and justify database choices for different workloads (transactional, analytical, search, caching). Calculate the cost trade-offs between vertical scaling (bigger machines) and horizontal scaling (sharding, read replicas).
Staff+
Architect a complete database decomposition strategy for a large-scale system, including bounded context identification, CDC implementation, dual-write phases, and rollback plans. Discuss the organizational challenges of database decomposition—how to coordinate across teams, handle schema migrations, and maintain data quality. Explain how to measure success (latency improvements, cost reduction, deployment velocity). Debate the trade-offs between database-per-service and shared databases for tightly coupled services. Describe how to handle data consistency in a distributed system with hundreds of services.
Common Interview Questions
How do you decide when to split a monolithic database? (Answer: When you hit scalability limits, team coordination bottlenecks, or conflicting performance requirements. Look for natural bounded contexts and measure the cost of vertical scaling vs. decomposition.)
How do you handle transactions across multiple databases? (Answer: Saga pattern for eventual consistency, 2PC for strong consistency, or keep transactional entities in the same database.)
What’s the difference between sharding and read replicas? (Answer: Sharding partitions data across databases for write scalability. Read replicas duplicate data for read scalability. Use sharding when writes are the bottleneck, replicas when reads are.)
How do you maintain data consistency when services have their own databases? (Answer: Event-driven architecture with CDC, eventual consistency, denormalization, or synchronous API calls for strong consistency needs.)
Red Flags to Avoid
Suggesting database-per-service without considering the operational complexity and team maturity required to manage multiple databases.
Ignoring the CAP theorem—claiming you can have strong consistency, high availability, and partition tolerance simultaneously in a distributed database system.
Recommending complex distributed transactions (2PC) without understanding the performance and availability trade-offs.
Not discussing rollback strategies or dual-write phases during migration—treating database decomposition as a one-way door.
Focusing only on technical solutions without addressing organizational challenges—database decomposition requires team coordination and clear ownership boundaries.
Key Takeaways
Monolithic persistence creates bottlenecks at scale by coupling unrelated workloads, forcing one-size-fits-all technology choices, and creating organizational coordination overhead. The database becomes the single point of failure and the limiting factor for team velocity.
Decomposition strategies depend on your bottleneck: read replicas for read-heavy workloads, sharding for write scalability, database-per-service for team autonomy, and polyglot persistence for workload-specific optimization. Choose based on access patterns and consistency requirements, not architecture trends.
Migration is a multi-phase journey: identify bounded contexts, extract read models with CDC, implement dual writes, gradually shift write traffic, and handle cross-service data needs with eventual consistency or API calls. Keep the monolithic database as a safety net for 3-6 months.
The fundamental trade-off is simplicity vs. scalability. A single database is operationally simple but doesn’t scale horizontally. Decomposed databases scale but require distributed systems expertise, eventual consistency patterns, and sophisticated monitoring. Don’t decompose prematurely—wait until you have clear signals (high database CPU, team coordination bottlenecks, conflicting performance requirements).
Real-world success requires organizational alignment, not just technical solutions. Airbnb’s 18-month migration and Netflix’s chaos engineering practice show that database decomposition is as much about team boundaries, ownership models, and operational maturity as it is about database technology.