Publisher-Subscriber Pattern: Pub/Sub Architecture

Intermediate | 10 min read | Updated 2026-02-11

After this topic, you will be able to:

  • Implement the pub-sub pattern for event broadcasting
  • Design topic hierarchies and subscription filters
  • Compare pub-sub vs point-to-point messaging trade-offs

TL;DR

The Publisher/Subscriber (pub-sub) pattern decouples message producers from consumers by introducing an intermediary message broker. Publishers send messages to named topics without knowing who will receive them, while subscribers express interest in topics and receive all matching messages. This enables one-to-many broadcasting, dynamic scaling of consumers, and loose coupling between system components.

Cheat Sheet: Publishers → Topics → Subscribers | Fan-out: 1 message → N consumers | Topics organize messages by category | Subscribers filter by interest | Async, fire-and-forget delivery | Trade-off: eventual consistency for scalability
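The core mechanics in the TL;DR can be sketched as a toy in-process broker. This is an illustration only (all names are made up), not any real product's API: real brokers add persistence, acknowledgments, retries, and network delivery.

```python
from collections import defaultdict
from typing import Callable, Dict, List

class Broker:
    """Minimal in-memory pub-sub broker sketch (illustration only)."""

    def __init__(self) -> None:
        self._subs: Dict[str, List[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[dict], None]) -> None:
        self._subs[topic].append(handler)

    def publish(self, topic: str, message: dict) -> None:
        # Fan-out: every subscriber of the topic gets its own copy.
        for handler in self._subs[topic]:
            handler(dict(message))  # copy, so subscribers stay independent

broker = Broker()
received = []
broker.subscribe("order.placed", lambda m: received.append(("inventory", m)))
broker.subscribe("order.placed", lambda m: received.append(("shipping", m)))
broker.publish("order.placed", {"orderId": 123})
# received now holds one copy per subscriber, and the publisher never
# learned who (if anyone) was listening.
```

Note that the publisher's only dependency is the broker and a topic name: adding a third subscriber requires no change to the publish call.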

The Problem It Solves

Traditional point-to-point messaging creates tight coupling between senders and receivers. When a service needs to notify multiple downstream systems about an event—like a user registration, order placement, or sensor reading—direct API calls force the sender to know every recipient’s address, handle their failures independently, and scale proportionally with the number of consumers. This becomes unmaintainable at scale.

Consider an e-commerce order service that needs to notify inventory management, shipping, analytics, fraud detection, and email systems when an order is placed. With direct calls, the order service must maintain five separate integrations, retry logic for each, and block until all five respond. Adding a sixth consumer requires code changes to the order service. The coupling is so tight that a slow analytics service can delay order confirmations to customers.

The pub-sub pattern solves this by introducing a message broker that acts as an intermediary. Publishers announce events to topics without knowing or caring who subscribes. Subscribers register interest in topics and receive messages asynchronously. This decoupling enables independent scaling, failure isolation, and dynamic addition of consumers without touching publisher code. When Netflix’s recommendation engine needs to react to viewing events, it subscribes to the viewing topic—the video player doesn’t need to know the recommendation engine exists.

Solution Overview

The pub-sub pattern introduces three core components: publishers that emit messages, topics that categorize messages, and subscribers that consume messages from topics of interest. The message broker sits in the middle, managing topic routing and subscriber delivery.

Publishers send messages to named topics (like “user.registered” or “order.placed”) without addressing specific recipients. The broker receives these messages and fans them out to all active subscribers of that topic. Each subscriber receives its own copy of the message, processed independently and asynchronously. If a subscriber is slow or fails, it doesn’t affect other subscribers or the publisher.

This architecture enables powerful patterns: multiple teams can independently build services that react to the same events, new consumers can be added without publisher changes, and services can scale their subscriber instances based on message volume. Google Cloud Pub/Sub processes over 500 million messages per second using this model, supporting everything from IoT sensor data to YouTube video processing pipelines.

The pattern supports both push (broker delivers to subscriber endpoints) and pull (subscribers fetch messages) delivery models, topic hierarchies for filtering (“orders.*” matches “orders.placed” and “orders.cancelled”), and various delivery guarantees from at-most-once to exactly-once semantics.
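The wildcard matching just described ("orders.*" matching "orders.placed" and "orders.cancelled") can be sketched as a segment-wise comparison. This is a simplification: real brokers have their own syntax (MQTT, for example, uses "+" for one level and "#" for all remaining levels).

```python
def topic_matches(pattern: str, topic: str) -> bool:
    """Segment-wise wildcard match on a hierarchical topic name.
    Here '*' matches exactly one segment (a simplification)."""
    p_parts = pattern.split(".")
    t_parts = topic.split(".")
    if len(p_parts) != len(t_parts):
        return False
    return all(p == "*" or p == t for p, t in zip(p_parts, t_parts))

topic_matches("orders.*", "orders.placed")       # True
topic_matches("orders.*", "orders.cancelled")    # True
topic_matches("orders.*", "payments.processed")  # False
```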

Publisher/Subscriber Architecture with Fan-Out

graph LR
    Publisher1["Order Service<br/><i>Publisher</i>"]
    Publisher2["Payment Service<br/><i>Publisher</i>"]
    
    subgraph Message Broker
        TopicOrders["Topic: orders.placed"]
        TopicPayments["Topic: payments.processed"]
    end
    
    Inventory["Inventory Service<br/><i>Subscriber</i>"]
    Shipping["Shipping Service<br/><i>Subscriber</i>"]
    Analytics["Analytics Service<br/><i>Subscriber</i>"]
    Email["Email Service<br/><i>Subscriber</i>"]
    Fraud["Fraud Detection<br/><i>Subscriber</i>"]
    
    Publisher1 --"1. Publish event"--> TopicOrders
    Publisher2 --"1. Publish event"--> TopicPayments
    
    TopicOrders --"2a. Fan-out copy 1"--> Inventory
    TopicOrders --"2b. Fan-out copy 2"--> Shipping
    TopicOrders --"2c. Fan-out copy 3"--> Analytics
    TopicOrders --"2d. Fan-out copy 4"--> Email
    
    TopicPayments --"2a. Fan-out"--> Analytics
    TopicPayments --"2b. Fan-out"--> Fraud
    
    Inventory --"3. Ack"--> TopicOrders
    Shipping --"3. Ack"--> TopicOrders
    Analytics --"3. Ack"--> TopicOrders
    Email --"3. Ack"--> TopicOrders

The broker receives messages from publishers and creates independent copies for each subscriber. When Order Service publishes to ‘orders.placed’, four subscribers receive their own copy asynchronously. Publishers never wait for subscribers, and slow subscribers don’t affect others.

How It Works

Step 1: Topic Creation and Subscription Setup Before messages flow, topics must be created and subscribers must register their interest. A topic is a named channel like “payment.processed” or “sensor.temperature”. Subscribers create subscriptions to topics, optionally specifying filters (“only messages where region=us-west”) or delivery preferences (push vs pull). In Google Cloud Pub/Sub, you might create a subscription “analytics-team-payments” on the “payment.processed” topic with a push endpoint at “https://analytics.example.com/webhook”.

Step 2: Message Publication When an event occurs, the publisher constructs a message containing the event data (typically JSON) and optional attributes for filtering. The publisher calls the broker’s publish API with the topic name and message payload. The broker acknowledges receipt immediately—the publisher’s job is done. It doesn’t wait for subscribers to process the message or even know if any subscribers exist. This fire-and-forget model keeps publishers fast and decoupled.
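Step 2 can be sketched as the envelope a publisher might construct before calling the broker's publish API. The field names here are illustrative assumptions, loosely modeled on common broker message formats, not any specific product's schema.

```python
import json
import time
import uuid

def build_message(event: dict, **attributes: str) -> dict:
    """Sketch of a message envelope: a unique id (useful for dedup
    downstream), the payload as bytes, and attributes the broker can
    filter on. All field names are illustrative."""
    return {
        "message_id": str(uuid.uuid4()),
        "publish_time": time.time(),
        "data": json.dumps(event).encode("utf-8"),
        "attributes": attributes,
    }

msg = build_message({"orderId": 123, "total": 59.99},
                    type="order.placed", region="us-west")
# The publisher hands msg to the broker and moves on; the ack it gets
# back only means the broker *accepted* the message, nothing more.
```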

Step 3: Message Routing and Fan-Out The broker examines the topic and identifies all active subscriptions. It creates a copy of the message for each subscription, applying any subscription filters. If three services subscribe to “payment.processed”, the broker creates three independent message copies. Each copy is tracked separately—if one subscriber fails to process its copy, the others are unaffected.

Step 4: Message Delivery For push subscriptions, the broker HTTP POSTs the message to the subscriber’s endpoint. For pull subscriptions, the message sits in a queue until the subscriber fetches it. Subscribers process the message and send an acknowledgment (ack) back to the broker. If no ack arrives within a timeout (typically 10-600 seconds), the broker redelivers the message. This ensures at-least-once delivery—messages may be delivered multiple times, so subscribers must be idempotent.
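Because at-least-once delivery means redelivery is possible, a subscriber can track processed message ids so a duplicate is a harmless no-op. A minimal sketch (in-memory for illustration; production systems persist the seen-set, e.g. as a unique-key insert in the subscriber's own database):

```python
class IdempotentHandler:
    """Subscriber sketch for at-least-once delivery: remember processed
    message ids so redeliveries don't repeat side effects."""

    def __init__(self) -> None:
        self.seen: set = set()
        self.applied: list = []

    def handle(self, message: dict) -> bool:
        msg_id = message["message_id"]
        if msg_id in self.seen:
            return True  # duplicate: ack again, skip the side effect
        self.applied.append(message["data"])  # the actual work
        self.seen.add(msg_id)
        return True  # returning True stands in for sending the ack

handler = IdempotentHandler()
msg = {"message_id": "m-1", "data": "orderId=123"}
handler.handle(msg)
handler.handle(msg)  # broker redelivered after a lost ack: no-op
# The side effect ran exactly once even though delivery happened twice.
```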

Step 5: Acknowledgment and Cleanup Once all subscriptions acknowledge a message, the broker can delete it. If a subscription never acks, the message is retried with exponential backoff and eventually moved to a dead-letter queue for manual inspection. This prevents poison messages from blocking the queue while ensuring no data loss.
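The retry-then-dead-letter flow in Step 5 can be sketched as a broker-side loop. The function and parameter names are hypothetical; the `sleep` argument is injectable only so the example runs instantly.

```python
import time

def deliver_with_retries(message, handler, max_attempts=3,
                         base_delay=1.0, dead_letters=None, sleep=time.sleep):
    """Broker-side sketch: retry with exponential backoff, then park
    the poison message in a dead-letter queue."""
    for attempt in range(max_attempts):
        try:
            handler(message)
            return True                           # subscriber acked
        except Exception:
            if attempt < max_attempts - 1:
                sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
    if dead_letters is not None:
        dead_letters.append(message)              # for manual inspection
    return False

def always_fails(message):
    raise RuntimeError("subscriber down")

dlq = []
deliver_with_retries({"id": 1}, always_fails,
                     dead_letters=dlq, sleep=lambda s: None)
# After 3 failed attempts the message lands in dlq instead of blocking
# the queue forever.
```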

Example: YouTube Video Upload When a creator uploads a video to YouTube, the upload service publishes a “video.uploaded” message containing the video ID and metadata. Multiple subscribers react independently: the transcoding service pulls the message and starts generating different resolutions, the thumbnail generator creates preview images, the content moderation service scans for policy violations, and the recommendation engine updates creator statistics. Each service scales independently—if transcoding is slow, it doesn’t delay thumbnail generation. The upload service never knew these downstream systems existed.

Message Lifecycle: Publish to Acknowledgment

sequenceDiagram
    participant P as Publisher<br/>(Order Service)
    participant B as Message Broker
    participant S1 as Subscriber 1<br/>(Inventory)
    participant S2 as Subscriber 2<br/>(Shipping)
    participant DLQ as Dead Letter Queue
    
    Note over P,S2: Step 1: Publication
    P->>B: 1. Publish(topic="orders.placed", data={orderId:123})
    B-->>P: 2. Ack (message accepted)
    Note over P: Publisher done - fire and forget
    
    Note over B,S2: Step 2: Fan-Out & Delivery
    B->>B: 3. Create copies for each subscription
    B->>S1: 4a. Deliver message copy 1
    B->>S2: 4b. Deliver message copy 2 (parallel)
    
    Note over S1,S2: Step 3: Processing
    S1->>S1: 5a. Process message
    S2->>S2: 5b. Process message (slower)
    
    Note over S1,DLQ: Step 4: Acknowledgment
    S1-->>B: 6a. Ack (success)
    Note over B,S1: Message deleted for S1
    
    Note over S2,DLQ: Step 5: Retry on Failure
    S2--xB: 6b. No ack (timeout)
    B->>S2: 7. Redeliver (attempt 2)
    S2--xB: No ack (still failing)
    B->>S2: 8. Redeliver (attempt 3)
    S2--xB: No ack (max retries)
    B->>DLQ: 9. Move to dead letter queue
    Note over DLQ: Manual inspection needed

The complete message flow from publication through delivery, acknowledgment, and retry handling. Publishers receive immediate acknowledgment and don’t wait for subscribers. Failed messages are retried with exponential backoff before moving to a dead-letter queue.

Handling 1M Followers: Partitioned Fan-Out Strategy

graph TB
    Creator["Content Creator<br/><i>Publishes post</i>"]
    
    subgraph Message Broker
        Topic["Topic: user.123.posted"]
        P1["Partition 0<br/><i>Followers 0-249K</i>"]
        P2["Partition 1<br/><i>Followers 250K-499K</i>"]
        P3["Partition 2<br/><i>Followers 500K-749K</i>"]
        P4["Partition 3<br/><i>Followers 750K-999K</i>"]
    end
    
    subgraph Feed Service - Consumer Group
        C1["Consumer 1<br/><i>Reads P0</i>"]
        C2["Consumer 2<br/><i>Reads P1</i>"]
        C3["Consumer 3<br/><i>Reads P2</i>"]
        C4["Consumer 4<br/><i>Reads P3</i>"]
    end
    
    subgraph Fan-Out Workers
        W1["Worker 1<br/><i>Writes 250K feeds</i>"]
        W2["Worker 2<br/><i>Writes 250K feeds</i>"]
        W3["Worker 3<br/><i>Writes 250K feeds</i>"]
        W4["Worker 4<br/><i>Writes 250K feeds</i>"]
    end
    
    Cache[("Redis Cache<br/><i>User feeds</i>")]
    
    Creator --"1. POST /api/post"--> Topic
    Topic --"2a. Route by hash"--> P1
    Topic --"2b. Route by hash"--> P2
    Topic --"2c. Route by hash"--> P3
    Topic --"2d. Route by hash"--> P4
    
    P1 --"3a. Pull"--> C1
    P2 --"3b. Pull"--> C2
    P3 --"3c. Pull"--> C3
    P4 --"3d. Pull"--> C4
    
    C1 --"4a. Batch process"--> W1
    C2 --"4b. Batch process"--> W2
    C3 --"4c. Batch process"--> W3
    C4 --"4d. Batch process"--> W4
    
    W1 & W2 & W3 & W4 --"5. Write to feeds<br/>(parallel)"--> Cache
    
    Note["Strategy: Partition followers into buckets,<br/>process each partition in parallel.<br/>Total time: ~10 seconds for 1M followers<br/>(vs 16 minutes sequential)"]

To handle 1 million followers, partition the follower list into buckets (e.g., 4 partitions of 250K each). Each partition is processed by a dedicated consumer in parallel. This reduces fan-out time from O(n) sequential to O(n/p) parallel, where p is the number of partitions.
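The bucketing step in this strategy can be sketched with a stable hash. The numbers and names are illustrative; the key property is that the same follower always maps to the same partition, so each consumer owns a fixed slice of the work.

```python
import hashlib
from collections import defaultdict

def partition_for(follower_id: str, num_partitions: int = 4) -> int:
    """Stable hash so a given follower always lands in the same partition
    (Python's built-in hash() is salted per process, hence hashlib)."""
    digest = hashlib.md5(follower_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_partitions

buckets = defaultdict(list)
for i in range(1000):
    buckets[partition_for(f"user-{i}")].append(f"user-{i}")
# Four buckets of roughly equal size; one consumer per bucket writes
# its slice of follower feeds in parallel with the others.
```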

Variants

Topic-Based Routing The standard pub-sub model where messages are routed purely by topic name. Subscribers choose topics, and all messages on that topic are delivered. This is simple and efficient but requires publishers and subscribers to agree on topic naming conventions. Use this when message categories are well-defined and relatively static. Google Cloud Pub/Sub and AWS SNS use this model. Pros: simple, fast, easy to reason about. Cons: limited filtering flexibility, topic proliferation if you need fine-grained routing.

Content-Based Routing Subscribers specify filters based on message content or attributes, not just topic names. A subscriber might request “all payment messages where amount > 1000 and country = ‘US’”. The broker evaluates filters against each message and delivers only matches. This reduces unnecessary message delivery and processing but requires more broker CPU for filter evaluation. AWS EventBridge supports this with pattern matching on event JSON. Pros: precise targeting, fewer wasted messages. Cons: complex filter syntax, broker performance overhead, harder to debug routing.
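Content-based routing can be sketched as per-subscription predicate evaluation. The filter syntax here (plain Python predicates) is a deliberate simplification of what brokers like EventBridge actually accept; subscription names and fields are made up.

```python
subscriptions = {
    # subscription name -> predicate over message attributes (illustrative)
    "high-value-payments": lambda m: m["amount"] > 1000,
    "us-compliance": lambda m: m["country"] == "US",
}

def route(message: dict, subs: dict) -> list:
    """Broker-side sketch: evaluate each subscription's filter against
    the message and deliver only to the matches."""
    return [name for name, predicate in subs.items() if predicate(message)]

route({"amount": 500, "country": "US"}, subscriptions)   # ["us-compliance"]
route({"amount": 2000, "country": "UK"}, subscriptions)  # ["high-value-payments"]
```

This is where the broker CPU cost mentioned above comes from: every message is checked against every subscription's filter.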

Hierarchical Topics Topics are organized in a tree structure like “orders/placed/us-west” and subscribers can use wildcards (“orders/placed/*” or “orders/*/us-west”). This combines the simplicity of topic-based routing with some filtering flexibility. MQTT brokers use this extensively for IoT scenarios where devices publish to hierarchical paths. Pros: intuitive organization, efficient wildcard matching. Cons: topic hierarchy must be designed upfront, limited to path-based filtering.

Durable vs Ephemeral Subscriptions Durable subscriptions persist messages even when the subscriber is offline—messages accumulate until the subscriber reconnects. Ephemeral subscriptions only deliver messages while the subscriber is actively connected; offline periods mean lost messages. Use durable for critical workflows where every message matters (payment processing), ephemeral for real-time updates where stale data has no value (stock prices, live sports scores). Pros (durable): guaranteed delivery, survives restarts. Cons (durable): storage costs, backlog management. Pros (ephemeral): no storage overhead, always fresh data. Cons (ephemeral): message loss during downtime.

Topic Routing Variants: Topic-Based vs Content-Based

graph TB
    subgraph Topic-Based Routing
        P1["Publisher"]
        T1["Topic: payments"]
        S1["Subscriber A<br/><i>All payments</i>"]
        S2["Subscriber B<br/><i>All payments</i>"]
        
        P1 --"amount=500<br/>country=US"--> T1
        P1 --"amount=2000<br/>country=UK"--> T1
        T1 --"Delivers ALL"--> S1
        T1 --"Delivers ALL"--> S2
    end
    
    subgraph Content-Based Routing
        P2["Publisher"]
        T2["Topic: payments"]
        S3["Subscriber C<br/><i>Filter: amount>1000</i>"]
        S4["Subscriber D<br/><i>Filter: country=US</i>"]
        
        P2 --"amount=500<br/>country=US"--> T2
        P2 --"amount=2000<br/>country=UK"--> T2
        T2 --"❌ Filtered out<br/>(amount≤1000)"--> S3
        T2 --"✓ Delivers<br/>(amount>1000)"--> S3
        T2 --"✓ Delivers<br/>(country=US)"--> S4
        T2 --"❌ Filtered out<br/>(country≠US)"--> S4
    end
    
    subgraph Hierarchical Topics
        P3["Publisher"]
        T3["orders/placed/us-west"]
        T4["orders/placed/us-east"]
        T5["orders/cancelled/us-west"]
        S5["Subscriber E<br/><i>orders/placed/*</i>"]
        S6["Subscriber F<br/><i>orders/*/us-west</i>"]
        
        P3 --> T3
        P3 --> T4
        P3 --> T5
        T3 --"✓ Match"--> S5
        T4 --"✓ Match"--> S5
        T5 --"❌ No match"--> S5
        T3 --"✓ Match"--> S6
        T4 --"❌ No match"--> S6
        T5 --"✓ Match"--> S6
    end

Three routing approaches: Topic-based delivers all messages to all subscribers (simple but may waste processing). Content-based filters messages by attributes (precise but requires broker CPU). Hierarchical uses wildcard patterns on topic paths (balances simplicity and flexibility).

Trade-offs

Delivery Guarantees: At-Most-Once vs At-Least-Once vs Exactly-Once At-most-once delivers each message zero or one time (fire-and-forget, no retries). Fast and simple but risks data loss. At-least-once retries until acknowledged, guaranteeing delivery but possibly duplicating messages. Requires idempotent subscribers. Exactly-once uses distributed transactions or deduplication to deliver each message precisely once. Expensive and complex. Decision: Use at-least-once for most scenarios (99% of cases), at-most-once for non-critical telemetry, exactly-once only when duplicates are catastrophic (financial transactions) and you can afford the performance cost.

Push vs Pull Delivery Push: broker calls subscriber endpoints via HTTP. Low latency, simple subscriber code, but requires publicly accessible endpoints and the broker controls delivery rate. Pull: subscribers fetch messages on demand. Subscribers control throughput and can batch requests, but adds polling overhead and latency. Decision: Use push for low-latency webhooks and serverless functions, pull for batch processing and when subscribers need flow control. Google Cloud Pub/Sub supports both; AWS SNS is push-only.

Ordering: Unordered vs Ordered Delivery Unordered delivery processes messages in parallel for maximum throughput. Ordered delivery (per partition/key) processes messages sequentially, preserving order but limiting parallelism. Decision: Use unordered unless order matters for correctness (e.g., account balance updates must apply in sequence). If you need ordering, partition by entity ID (user ID, order ID) so unrelated entities can still process in parallel. Kafka and AWS Kinesis support per-partition ordering.

Coupling: Tight vs Loose Schema Evolution Pub-sub decouples deployment but not data contracts. Publishers and subscribers must agree on message schemas. Tight coupling (shared schema registry, breaking changes require coordinated deploys) ensures correctness but reduces agility. Loose coupling (schema versioning, backward compatibility rules) enables independent deploys but risks runtime errors. Decision: Use a schema registry (Confluent Schema Registry, AWS Glue) with compatibility checks. Require backward compatibility for publishers, forward compatibility for subscribers. This lets publishers add fields without breaking old subscribers, and subscribers tolerate unknown fields from new publishers.
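The compatibility rules above can be sketched from the subscriber's side: read known fields, default the ones an older publisher may omit, and silently ignore unknown ones. Field names are hypothetical.

```python
def parse_order(raw: dict) -> dict:
    """Forward-compatible subscriber sketch: tolerate both older and
    newer message versions. All field names are illustrative."""
    return {
        "order_id": raw["order_id"],             # present in every version
        "currency": raw.get("currency", "USD"),  # added later; default for old messages
    }

parse_order({"order_id": 1})  # v1 message, before 'currency' existed
parse_order({"order_id": 2, "currency": "EUR", "promo_code": "X"})  # newer publisher
```

The publisher side of the bargain is backward compatibility: it may add optional fields like `promo_code` but never remove or retype `order_id`.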

Delivery Guarantees: At-Most-Once vs At-Least-Once vs Exactly-Once

graph TB
    subgraph AMO["At-Most-Once: Fire and Forget"]
        P1["Publisher"] --"1. Send message"--> B1["Broker"]
        B1 --"2. Deliver (no retry)"--> S1["Subscriber"]
        S1 --"❌ Crash before processing"--> Lost1["Message Lost"]
        Note1["✓ Fast, simple<br/>✓ No duplicates<br/>❌ May lose messages"]
    end

    subgraph ALO["At-Least-Once: Retry Until Ack"]
        P2["Publisher"] --"1. Send message"--> B2["Broker"]
        B2 --"2. Deliver"--> S2["Subscriber"]
        S2 --"3. Process"--> S2
        S2 --"❌ Crash before ack"--> B2
        B2 --"4. Redeliver"--> S2
        S2 --"5. Process again (duplicate)"--> S2
        S2 --"6. Ack"--> B2
        Note2["✓ No message loss<br/>✓ Good performance<br/>❌ Duplicates possible<br/>⚠️ Requires idempotent subscribers"]
    end

    subgraph EO["Exactly-Once: Distributed Transaction"]
        P3["Publisher"] --"1. Send with txn ID"--> B3["Broker"]
        B3 --"2. Check dedup table"--> Dedup[("Deduplication<br/>Store")]
        B3 --"3. Deliver with txn ID"--> S3["Subscriber"]
        S3 --"4. Process + write txn ID"--> S3DB[("Subscriber<br/>DB")]
        S3 --"5. Ack"--> B3
        B3 --"6. Mark complete"--> Dedup
        Note3["✓ No loss or duplicates<br/>❌ Expensive (10x cost)<br/>❌ Complex coordination<br/>❌ Lower throughput"]
    end

    Decision{"Which to choose?"}
    Decision --"Non-critical telemetry<br/>(metrics, logs)"--> AMO
    Decision --"Most use cases<br/>(99% of scenarios)"--> ALO
    Decision --"Financial transactions<br/>(payments, billing)"--> EO

At-most-once is fast but may lose messages. At-least-once guarantees delivery but may duplicate (requires idempotent subscribers). Exactly-once prevents duplicates using distributed transactions but costs 10x more. Choose based on business impact of loss vs duplication.

When to Use (and When Not To)

Use pub-sub when:

  • One event needs to trigger multiple independent actions (order placed → update inventory, charge card, send email, log analytics)
  • You need to add new consumers without modifying producers (new team wants to subscribe to existing events)
  • Consumers scale independently from producers (video transcoding needs 100 workers, thumbnail generation needs 10)
  • Producers shouldn’t wait for consumer processing (user registration should return immediately, not wait for welcome email to send)
  • You need to replay events for new consumers (new recommendation model wants to process last month’s viewing history)

Avoid pub-sub when:

  • You need request-response semantics (“create order and return order ID”). Use RPC or REST instead. Pub-sub is fire-and-forget.
  • You need strict ordering across all messages (“process all user actions in exact sequence”). Pub-sub parallelizes by default. Use a single-partition queue or event sourcing instead.
  • Message volume is low and latency is critical (“notify user’s browser of new message”). The broker adds latency. Use WebSockets or server-sent events.
  • Only one consumer should process each message (“assign task to one worker”). Use a work queue pattern (see message-queue) instead.
  • You need synchronous error handling (“if payment fails, abort order creation”). Pub-sub is async; errors surface later. Use transactions or sagas.

Anti-patterns:

  • Using pub-sub for RPC by creating request and response topics. This is complex and fragile. Use gRPC or REST.
  • Publishing massive messages (>1MB). Pub-sub is for notifications, not data transfer. Publish a reference (S3 URL) instead.
  • Creating a topic per entity instance (“order-12345-events”). This doesn’t scale. Use one topic with entity ID in the message.
  • Synchronous publishing in critical paths without timeouts. If the broker is slow, your API is slow. Publish asynchronously or use circuit breakers.

Real-World Examples

Netflix: Viewing Event Processing

When you watch a show on Netflix, the player publishes viewing events (play, pause, stop, progress) to a Kafka topic. Over 30 subscriber services consume these events: the recommendation engine updates your taste profile, the billing system tracks watch time for analytics, the continue watching feature updates your progress, the trending algorithm counts views, and A/B testing frameworks log experiment exposures. Each subscriber processes 10+ million events per second independently. The video player doesn’t know these systems exist and never waits for them.

Interesting detail: Netflix uses Kafka’s consumer groups to scale subscribers horizontally. The recommendation engine runs 500 consumer instances, each processing a partition of the viewing topic. When they deploy a new recommendation model, they create a new consumer group that replays the last 24 hours of viewing events to warm up the model before switching traffic.

Uber: Trip Lifecycle Events

Uber’s trip service publishes events to topics like “trip.requested”, “trip.matched”, “trip.started”, and “trip.completed”. Subscribers include the driver app (shows new trip requests), the rider app (shows driver location), the pricing service (calculates surge), the dispatch system (finds nearby drivers), the fraud detection service (flags suspicious patterns), and the data warehouse (logs for analytics). Each subscriber maintains its own view of trip state. The trip service is a thin event publisher with no knowledge of downstream consumers.

Interesting detail: Uber uses content-based routing to deliver trip events only to relevant geographic regions. A trip in San Francisco doesn’t fan out to servers in Mumbai. This reduces cross-region traffic by 90% and improves latency. They achieve this by publishing to region-specific topics (“trip.us-west.completed”) and using a topic router that maps events to regions based on pickup coordinates.

Google Cloud: Cloud Logging and Monitoring

Google Cloud Platform services publish logs and metrics to Cloud Pub/Sub topics. When a Compute Engine VM crashes, it publishes a “compute.instance.crashed” event. Subscribers include Cloud Monitoring (triggers alerts), Cloud Logging (stores logs in BigQuery), incident management tools (creates tickets), and customer-defined Cloud Functions (custom automation). Customers can add their own subscribers without Google’s involvement. This architecture processes 500+ million messages per second across all GCP services.

Interesting detail: Google uses exactly-once delivery for billing-related events (“compute.instance.billed”) to prevent double-charging customers, but at-least-once for most monitoring events where duplicates are harmless. This hybrid approach optimizes cost: exactly-once requires distributed transactions and is 10x more expensive than at-least-once. They make the trade-off per topic based on business impact.


Interview Essentials

Mid-Level

Explain the difference between pub-sub and point-to-point messaging. When would you choose each?

How does pub-sub achieve loose coupling? What are the trade-offs?

Walk through what happens when a publisher sends a message to a topic with three subscribers.

What is fan-out? Give an example where you’d need it.

How do you handle a slow subscriber that can’t keep up with message volume?

Senior

Design a pub-sub system for a social media platform where users post updates and followers receive them. How do you handle 1 million followers?

Compare at-least-once vs exactly-once delivery. When is the complexity of exactly-once worth it?

How would you implement message filtering so subscribers only receive relevant messages?

A subscriber is processing duplicate messages. Walk through possible causes and solutions.

Design a dead-letter queue strategy. What goes in the DLQ? How do you monitor and replay it?

How do you version message schemas without breaking existing subscribers?

Staff+

You’re seeing 10-second delivery latency in a pub-sub system handling 100K msg/sec. Debug this systematically.

Design a multi-region pub-sub system with active-active publishers and subscribers. How do you handle network partitions?

A critical subscriber is down for 2 hours. The message backlog is 10 million messages. How do you recover without overwhelming the subscriber?

Compare building on Kafka vs Google Cloud Pub/Sub vs AWS SNS/SQS for a new microservices platform. What are the architectural implications?

Design a pub-sub system that guarantees message ordering within a user’s session but allows parallel processing across users.

How would you implement a time-travel feature that lets subscribers replay messages from arbitrary points in the past?

Common Interview Questions

Why use pub-sub instead of direct API calls between services?

How does pub-sub improve system scalability and resilience?

What happens if a subscriber is offline when a message is published?

Can you guarantee message ordering in pub-sub?

How do you prevent message loss in pub-sub systems?

Red Flags to Avoid

Confusing pub-sub with request-response patterns. Pub-sub is fire-and-forget, not RPC.

Claiming pub-sub guarantees ordering without mentioning partitioning or single-consumer constraints.

Not understanding at-least-once delivery implications. Subscribers must be idempotent.

Ignoring backpressure. What happens when subscribers can’t keep up with publishers?

Designing synchronous pub-sub. The whole point is async decoupling.

Not considering message size limits. Most brokers cap messages at 1-10MB.

Forgetting about monitoring. How do you detect stuck subscribers or growing backlogs?


Key Takeaways

Pub-sub decouples publishers from subscribers through topics, enabling one-to-many broadcasting without tight coupling. Publishers don’t know who subscribes; subscribers don’t know who publishes.

The pattern trades consistency for scalability and resilience. Messages are delivered asynchronously with at-least-once semantics by default, requiring idempotent subscribers.

Fan-out is the killer feature: one message published to a topic is delivered to all subscribers independently. This enables parallel processing and independent scaling of consumers.

Choose pub-sub for event broadcasting (order placed → multiple reactions), avoid it for request-response (create order → return order ID). Use RPC or REST for synchronous workflows.

Real-world systems like Netflix and Uber process millions of events per second using pub-sub, with subscribers ranging from real-time apps to batch analytics pipelines. The pattern scales from startups to hyperscale.