Long Polling vs WebSockets vs SSE: When to Use Each
After this topic, you will be able to:
- Compare long polling, WebSockets, and Server-Sent Events for real-time communication requirements
- Evaluate the trade-offs between these approaches in terms of latency, overhead, and implementation complexity
- Recommend the appropriate real-time communication strategy based on use case requirements (unidirectional vs bidirectional, message frequency, client capabilities)
TL;DR
Long polling, WebSockets, and Server-Sent Events (SSE) are three approaches for real-time server-to-client communication, each with distinct trade-offs. Long polling simulates real-time by holding HTTP requests open until data is available, WebSockets establish persistent bidirectional connections for low-latency two-way communication, and SSE provides efficient unidirectional streaming from server to client. Choose long polling for broad compatibility with simple requirements, WebSockets for bidirectional real-time features like chat or gaming, and SSE for server-push scenarios like live feeds or notifications.
Cheat Sheet: Long Polling = repeated HTTP requests held open | WebSockets = persistent bidirectional TCP connection | SSE = unidirectional HTTP stream with auto-reconnect | Latency: WebSockets < SSE < Long Polling | Overhead: Long Polling > WebSockets ≈ SSE
Context
Modern applications demand real-time updates—think Slack messages appearing instantly, live sports scores updating without refresh, or collaborative documents showing changes as teammates type. The traditional request-response HTTP model falls short here because clients must repeatedly ask “anything new?” which wastes bandwidth and introduces latency. Engineers face a critical architectural choice: how do we push data from server to client efficiently?
This decision impacts user experience, infrastructure costs, and system complexity. A chat application choosing the wrong approach might burn through server connections, a stock ticker might deliver stale prices, or a notification system might drain mobile batteries. In system design interviews, this question surfaces constantly: “How would you implement real-time updates for [feature X]?” Interviewers expect you to understand not just what these technologies do, but when each shines and when each becomes a liability.
The choice between long polling, WebSockets, and SSE isn’t about finding the “best” technology—it’s about matching communication patterns to requirements. Does your system need bidirectional communication or just server-to-client push? How many concurrent connections must you support? What about mobile clients with flaky networks? These constraints drive the decision, and understanding the trade-offs separates engineers who can build scalable real-time systems from those who create performance bottlenecks.
Evolution from Traditional HTTP to Real-Time Communication
```mermaid
graph LR
    subgraph TH["Traditional HTTP"]
        C1["Client"] --"1. Request data"--> S1["Server"]
        S1 --"2. Response"--> C1
        C1 --"3. Request again"--> S1
        S1 --"4. Response"--> C1
    end
    subgraph RT["Real-Time Push"]
        C2["Client"] --"1. Establish connection"--> S2["Server"]
        S2 --"2. Push update"--> C2
        S2 --"3. Push update"--> C2
        S2 --"4. Push update"--> C2
    end
    TH -."Inefficient:<br/>repeated requests,<br/>wasted bandwidth".-> RT
```
Traditional HTTP requires clients to repeatedly poll for updates, wasting bandwidth and introducing latency. Real-time communication patterns establish persistent connections allowing servers to push updates immediately when data changes.
Side-by-Side Comparison
Feature Comparison Matrix
| Dimension | Long Polling | WebSockets | Server-Sent Events |
|---|---|---|---|
| Communication Pattern | Simulated push via repeated requests | Full-duplex bidirectional | Unidirectional server-to-client |
| Protocol | HTTP/HTTPS | WebSocket protocol (ws://, wss://) over TCP | HTTP/HTTPS (EventSource API) |
| Connection Model | New connection per request | Single persistent connection | Single persistent connection |
| Latency | High (request overhead + server hold time) | Very low (~1-2ms after connection) | Low (no request overhead) |
| Overhead | High (HTTP headers on every request) | Low (minimal framing, ~2 bytes/message) | Low (text-based streaming) |
| Browser Support | Universal (any HTTP client) | IE10+, all modern browsers | All modern browsers (no IE) |
| Mobile Battery | Poor (constant reconnections) | Good (persistent connection) | Good (persistent connection) |
| Firewall/Proxy | Excellent (standard HTTP) | Can be blocked by corporate proxies | Excellent (standard HTTP) |
| Automatic Reconnection | Manual implementation required | Manual implementation required | Built-in (EventSource handles it) |
| Message Format | Any (JSON, binary, etc.) | Text or binary frames | Text only (UTF-8) |
| Scalability | Poor (many concurrent connections) | Moderate (stateful connections) | Moderate (stateful connections) |
| Load Balancing | Easy (stateless requests) | Complex (sticky sessions required) | Complex (sticky sessions required) |
When Each Approach Excels
Long Polling works best when you need maximum compatibility with legacy systems, corporate networks with restrictive firewalls, or when real-time requirements are loose (updates every few seconds acceptable). Example: A dashboard showing system metrics that refresh every 5 seconds doesn’t need sub-second latency, and long polling’s simplicity outweighs its overhead.
WebSockets dominate when you need low-latency bidirectional communication—chat applications, multiplayer games, collaborative editing, or trading platforms. Discord uses WebSockets for message delivery because users both send and receive messages constantly, and latency must stay under 100ms for natural conversation flow. The persistent connection eliminates HTTP handshake overhead on every message.
Server-Sent Events shine for unidirectional server push scenarios where clients primarily consume data: live sports scores, stock tickers, social media feeds, or server-sent notifications. Twitter’s live timeline updates use SSE because tweets flow from server to client, automatic reconnection handles network blips gracefully, and the EventSource API requires minimal client code compared to WebSocket connection management.
Communication Patterns: Long Polling vs WebSockets vs SSE
```mermaid
graph TB
    subgraph LP["Long Polling"]
        LP_C["Client"] --"1. HTTP Request<br/>(held open)"--> LP_S["Server"]
        LP_S --"2. Response when<br/>data available"--> LP_C
        LP_C --"3. Immediate<br/>new request"--> LP_S
        LP_S --"4. Response"--> LP_C
    end
    subgraph WS["WebSockets"]
        WS_C["Client"] --"1. HTTP Upgrade<br/>Request"--> WS_S["Server"]
        WS_S --"2. 101 Switching<br/>Protocols"--> WS_C
        WS_C <--"3. Bidirectional<br/>messages (2-14 bytes overhead)"--> WS_S
    end
    subgraph SSE["Server-Sent Events"]
        SSE_C["Client<br/>(EventSource)"] --"1. HTTP Request<br/>Accept: text/event-stream"--> SSE_S["Server"]
        SSE_S --"2. Keep connection open"--> SSE_C
        SSE_S --"3. Stream events<br/>(data: ...)"--> SSE_C
        SSE_S --"4. Auto-reconnect<br/>on disconnect"--> SSE_C
    end
```
Long polling simulates real-time through repeated HTTP requests held open until data arrives. WebSockets establish a persistent bidirectional connection after an upgrade handshake. SSE maintains a unidirectional HTTP stream with built-in reconnection logic.
Deep Analysis
Long Polling: The Compatibility Champion
Long polling emerged as a clever hack to simulate server push over HTTP. The client sends a request, but instead of responding immediately, the server holds the connection open until new data arrives or a timeout expires (typically 30-60 seconds). When data arrives or timeout occurs, the server responds, and the client immediately sends another request. This creates a near-continuous connection using standard HTTP.
The appeal lies in simplicity and compatibility. Any HTTP client works—no special protocols, no browser API requirements, no proxy configuration headaches. For systems serving diverse clients (mobile apps, web browsers, IoT devices), long polling guarantees universal support. Implementation is straightforward: a while loop on the server holding requests, standard HTTP endpoints, and familiar request-response patterns.
However, the overhead is brutal at scale. Each request carries full HTTP headers (often 500-800 bytes), cookies, and authentication tokens. A system with 100,000 concurrent users means 100,000 open connections, each consuming server resources. When timeouts expire simultaneously, thundering herds of reconnection requests can overwhelm servers. Netflix abandoned long polling for their real-time updates because the connection churn created unpredictable load spikes that complicated capacity planning.
Latency suffers too. Even with connections held open, the request-response cycle adds 50-200ms compared to WebSockets. For applications where every millisecond counts—trading platforms, gaming, live collaboration—this latency is unacceptable. Long polling also complicates error handling: distinguishing between network failures, timeouts, and actual server errors requires careful client-side logic.
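The hold-open mechanic is easiest to see in code. A minimal sketch using Python's asyncio to stand in for a web framework's request handler (the `long_poll_handler` name and dict-based response are illustrative, not any specific framework's API):

```python
import asyncio

async def long_poll_handler(new_data: asyncio.Event, queue: list, timeout: float = 30.0):
    """Hold the 'request' open until data arrives or the timeout expires."""
    try:
        # Block until the producer signals new data, up to `timeout` seconds.
        await asyncio.wait_for(new_data.wait(), timeout)
        new_data.clear()
        return {"status": 200, "data": list(queue)}  # respond with pending updates
    except asyncio.TimeoutError:
        # Timeout: respond empty so the client immediately re-polls.
        return {"status": 204, "data": []}

async def demo():
    event, queue = asyncio.Event(), []

    async def producer():
        await asyncio.sleep(0.1)        # data becomes available after 100 ms
        queue.append("price update")
        event.set()

    asyncio.create_task(producer())
    # The "request" stays open ~100 ms instead of returning empty immediately.
    return await long_poll_handler(event, queue, timeout=5.0)

result = asyncio.run(demo())
```

In a real deployment, the timeout response is what triggers the thundering-herd reconnection pattern described above: every client whose request timed out re-issues a full HTTP request at roughly the same moment.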
WebSockets: The Real-Time Powerhouse
WebSockets provide true bidirectional communication over a single TCP connection. The magic happens during the upgrade handshake: the client sends an HTTP request with Upgrade: websocket and Connection: Upgrade headers. If the server supports WebSockets, it responds with HTTP 101 Switching Protocols, and the connection transforms into a WebSocket connection. From that point, both sides can send messages freely without HTTP overhead.
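The accept token in that 101 response is not negotiated; the server derives it from the client's key by appending a fixed GUID defined in RFC 6455, SHA-1 hashing the result, and base64-encoding the digest. A short sketch, checked against the key/accept pair that appears in the handshake diagram later in this section:

```python
import base64
import hashlib

# Fixed GUID defined in RFC 6455; every conforming server uses this exact string.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"

def websocket_accept(sec_websocket_key: str) -> str:
    """Compute the Sec-WebSocket-Accept header value from the client's key."""
    digest = hashlib.sha1((sec_websocket_key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")

accept = websocket_accept("x3JJHMbDL1EzLkh9GBhXDw==")
# accept == "HSmrc0sMlYUkAGmm5OPpG2HaGWk="
```

This proves to the client that the server actually speaks the WebSocket protocol rather than being an HTTP server that happened to echo the headers back.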
The performance gains are dramatic. After the initial handshake, messages carry only 2-14 bytes of framing overhead (depending on payload size and masking). A chat message that would require 800+ bytes with HTTP headers now needs perhaps 50 bytes total. Latency drops to single-digit milliseconds because there’s no request-response cycle—messages flow immediately in both directions. Slack’s message delivery achieves sub-100ms latency globally because WebSockets eliminate the round-trip overhead of HTTP.
Bidirectional communication enables entirely new interaction patterns. In collaborative editing (Google Docs, Figma), every keystroke flows to the server, which broadcasts changes to other users, all over the same connection. Multiplayer games send player positions, actions, and game state updates continuously. Trading platforms stream live prices while simultaneously accepting orders. This symmetry is WebSockets’ killer feature.
The challenges emerge at scale. WebSocket connections are stateful—each connection pins to a specific server instance, complicating load balancing. You need sticky sessions (session affinity) to ensure clients reconnect to the same server, or you need a message broker (Redis Pub/Sub, RabbitMQ) to route messages between servers. Discord handles millions of concurrent WebSocket connections by sharding users across server clusters and using a message bus to coordinate cross-shard communication.
Connection management requires careful engineering. Unlike HTTP, where failed requests are obvious, WebSocket failures can be silent—connections might appear open while network issues prevent message delivery. You need heartbeat/ping-pong mechanisms to detect dead connections, exponential backoff for reconnection attempts, and graceful degradation when connections fail. Mobile clients face additional challenges: iOS and Android aggressively kill background connections to save battery, requiring apps to reconnect frequently.
Security considerations differ from HTTP too. WebSocket connections don’t automatically include CORS protections, so you must validate origins explicitly. Authentication typically happens during the upgrade handshake (passing tokens in headers or query parameters), but you must also handle token expiration and re-authentication over long-lived connections.
Server-Sent Events: The Streaming Specialist
SSE provides an elegant middle ground: unidirectional server-to-client streaming over standard HTTP. The client creates an EventSource object pointing to an endpoint, and the server responds with Content-Type: text/event-stream. The connection stays open, and the server writes events as text: one or more lines prefixed with data:, with each event terminated by a blank line (two consecutive newlines). The browser's EventSource API handles connection management, automatic reconnection (with a retry interval the server can tune via the retry: field), and event parsing.
The developer experience is remarkably clean. Client code is often just three lines: create EventSource, attach event listener, handle messages. The browser manages reconnection automatically—if the connection drops, EventSource reconnects and includes a Last-Event-ID header so the server can resume from the last received event. This built-in resilience makes SSE ideal for unreliable networks.
SSE works over standard HTTP, so it inherits HTTP’s infrastructure compatibility. It works through proxies, benefits from HTTP/2 multiplexing (multiple SSE streams over one TCP connection), and integrates with existing authentication and authorization mechanisms. For server-push scenarios, SSE’s simplicity often beats WebSockets’ complexity. A notification system that pushes alerts to users needs only server-to-client communication—WebSockets’ bidirectional capability is overkill.
The limitations are clear: SSE is text-only (UTF-8), so binary data requires base64 encoding. It’s unidirectional—if clients need to send data, they must use separate HTTP requests. Browser support excludes Internet Explorer (though polyfills exist). Connection limits per domain (typically 6) can be restrictive, though HTTP/2 alleviates this.
For use cases like live dashboards, activity feeds, or progress updates, SSE hits the sweet spot. Stripe uses SSE for their dashboard’s live transaction feed because transactions flow from server to client, automatic reconnection handles network blips during payment processing, and the simplicity reduces client-side bugs compared to WebSocket connection management.
WebSocket Upgrade Handshake and Message Flow
```mermaid
sequenceDiagram
    participant Client
    participant Server
    participant App as Application Logic
    Note over Client,Server: Initial Handshake
    Client->>Server: GET /chat HTTP/1.1<br/>Upgrade: websocket<br/>Connection: Upgrade<br/>Sec-WebSocket-Key: x3JJHMbDL1EzLkh9GBhXDw==
    Server->>Server: Validate upgrade request
    Server->>Client: HTTP/1.1 101 Switching Protocols<br/>Upgrade: websocket<br/>Connection: Upgrade<br/>Sec-WebSocket-Accept: HSmrc0sMlYUkAGmm5OPpG2HaGWk=
    Note over Client,Server: Connection Established - Now WebSocket Protocol
    Client->>Server: WebSocket Frame<br/>{"type": "message", "text": "Hello"}<br/>(~50 bytes total)
    Server->>App: Process message
    App->>Server: Broadcast to recipients
    Server->>Client: WebSocket Frame<br/>{"type": "message", "text": "Hello"}<br/>(~50 bytes total)
    Note over Client,Server: Bidirectional - No Request/Response Cycle
    Server->>Client: WebSocket Frame<br/>{"type": "notification", "count": 5}
    Client->>Server: WebSocket Frame<br/>{"type": "typing", "user": "Alice"}
    Note over Client,Server: Heartbeat / Keep-Alive
    Client->>Server: Ping Frame
    Server->>Client: Pong Frame
    Note over Client,Server: Connection Close
    Client->>Server: Close Frame (code: 1000)
    Server->>Client: Close Frame (code: 1000)
```
WebSocket connections begin with an HTTP upgrade handshake, then switch to the WebSocket protocol for low-overhead bidirectional messaging. After the handshake, messages flow freely in both directions with only 2-14 bytes of framing overhead, compared to 500-800 bytes for HTTP headers.
Scaling Real-Time Connections
Scaling real-time connections presents unique challenges because both WebSockets and SSE maintain stateful connections that pin clients to specific servers. A traditional stateless HTTP service can scale horizontally by adding servers behind a load balancer—any server can handle any request. With persistent connections, you lose this flexibility.
The primary challenge is connection distribution. Load balancers must implement sticky sessions (session affinity) to route clients to the same server instance. This works but creates uneven load distribution—if one server handles many active connections while another sits idle, you can’t easily rebalance. WhatsApp solves this by implementing connection draining: when a server becomes overloaded, it stops accepting new connections and gradually closes existing ones with reconnection hints, allowing clients to redistribute across the cluster.
Memory consumption becomes critical at scale. Each WebSocket or SSE connection consumes memory for buffers, state tracking, and application data. A server with 64GB RAM might handle 100,000 concurrent connections (640KB per connection), but that leaves little room for application logic. Discord optimizes by using lightweight connection handlers written in Elixir (BEAM VM), which can maintain millions of connections per server through efficient process scheduling.
Message routing across servers requires a pub/sub layer. When a user sends a chat message, their WebSocket connects to Server A, but recipients might connect to Servers B, C, and D. You need a message broker (Redis Pub/Sub, NATS, Kafka) to broadcast messages across servers. Slack uses a combination of Redis for low-latency message routing and Kafka for durable message storage, allowing servers to coordinate without direct connections.
Connection limits hit at multiple layers. Operating systems limit file descriptors (each connection is a file descriptor), typically defaulting to 1024 but tunable to 100,000+. TCP port exhaustion can occur with many outbound connections. Browsers limit concurrent connections per domain (6 for HTTP/1.1; HTTP/2 multiplexes many streams, commonly 100 or more, over a single connection). Cloud load balancers often have connection limits (AWS ALB supports 500,000 concurrent connections per load balancer). Planning capacity requires understanding limits at every layer.
Health checks and graceful shutdown become complex. Traditional HTTP health checks don’t reflect WebSocket connection health—a server might pass health checks while all WebSocket connections are dead. You need application-level health metrics tracking active connections, message throughput, and error rates. Graceful shutdown must close connections cleanly, giving clients time to reconnect elsewhere rather than abruptly dropping thousands of connections.
Cost implications differ from stateless services. Auto-scaling based on CPU or memory doesn’t work well because connection count drives resource usage, not request rate. A server might run at 20% CPU while maxed out on connections. You need custom scaling metrics based on connection count per server. Additionally, persistent connections mean you can’t scale down as aggressively—closing a server terminates all its connections, potentially impacting thousands of users simultaneously.
Scaling WebSocket Connections Across Multiple Servers
```mermaid
graph TB
    subgraph CL["Client Layer"]
        C1["User A<br/>WebSocket"]
        C2["User B<br/>WebSocket"]
        C3["User C<br/>WebSocket"]
        C4["User D<br/>WebSocket"]
    end
    LB["Load Balancer<br/><i>Sticky Sessions Enabled</i>"]
    subgraph SC["Server Cluster"]
        WS1["WebSocket Server 1<br/><i>25K connections</i>"]
        WS2["WebSocket Server 2<br/><i>30K connections</i>"]
        WS3["WebSocket Server 3<br/><i>20K connections</i>"]
    end
    subgraph MR["Message Routing Layer"]
        Redis["Redis Pub/Sub<br/><i>Message Broker</i>"]
        Kafka["Kafka<br/><i>Durable Storage</i>"]
    end
    DB[("PostgreSQL<br/>Message History")]
    C1 & C2 --"WebSocket connections"--> LB
    C3 & C4 --"WebSocket connections"--> LB
    LB --"Route to same server<br/>(sticky session)"--> WS1
    LB --> WS2
    LB --> WS3
    WS1 --"1. Publish message"--> Redis
    WS2 --"2. Subscribe to channels"--> Redis
    WS3 --"3. Receive broadcasts"--> Redis
    WS1 & WS2 & WS3 --"Persist messages"--> Kafka
    Kafka --"Archive"--> DB
    Redis -."Cross-server<br/>message routing".-> WS1
    Redis -."Cross-server<br/>message routing".-> WS2
    Redis -."Cross-server<br/>message routing".-> WS3
```
Scaling WebSocket connections requires sticky sessions to route clients to the same server, plus a message routing layer (Redis Pub/Sub, Kafka) to coordinate communication across servers. When User A on Server 1 sends a message to User D on Server 3, the message flows through the pub/sub layer to reach the correct server for broadcast.
Decision Framework
Choosing the Right Approach
Start by mapping your communication pattern. If you need bidirectional communication where both client and server send messages frequently, WebSockets are your only real option. Chat applications, collaborative editing, multiplayer games, and real-time trading platforms all require bidirectional flow. Long polling and SSE can’t efficiently handle client-to-server messages without separate HTTP requests, which defeats the purpose.
For unidirectional server-to-client push, evaluate message frequency and latency requirements. If updates arrive every few seconds and 1-2 second latency is acceptable, long polling’s simplicity might outweigh its overhead—especially if you need maximum compatibility or operate in restrictive network environments. If you need sub-second latency and frequent updates (multiple per second), SSE provides better efficiency with simpler implementation than WebSockets.
Consider your client environment. Building for modern web browsers? SSE’s EventSource API offers the easiest implementation with automatic reconnection. Supporting mobile apps? WebSockets provide better battery life than long polling’s constant reconnections, but you’ll need robust reconnection logic. Dealing with corporate networks or legacy systems? Long polling’s standard HTTP ensures it works everywhere.
Scale and infrastructure matter significantly. If you’re building a prototype or serving thousands of users, any approach works—choose based on implementation simplicity. At hundreds of thousands of concurrent connections, you need WebSockets or SSE with proper infrastructure (load balancers with sticky sessions, pub/sub message routing, connection pooling). Long polling’s overhead becomes prohibitive.
For mobile applications, battery life is paramount. Long polling’s constant reconnections drain batteries quickly. WebSockets and SSE maintain persistent connections more efficiently, but you must handle mobile OS connection killing (iOS and Android terminate background connections). Implement exponential backoff, respect system battery optimization modes, and consider reducing message frequency when on cellular networks.
Security and compliance requirements influence the choice. WebSockets require explicit origin validation and careful authentication handling. SSE and long polling inherit HTTP’s security model, making them easier to secure correctly. If your organization has strict security policies or compliance requirements, the simpler security model of SSE/long polling might be worth the performance trade-off.
Decision Tree
Need bidirectional communication? → WebSockets (only viable option)
Unidirectional server-to-client only:
- Updates < 1/second, need maximum compatibility → Long Polling
- Frequent updates, modern browsers, want simplicity → SSE
- Frequent updates, need binary data or IE support → WebSockets
Scale considerations:
- < 10,000 concurrent connections → Any approach works, choose simplest
- 10,000 - 100,000 connections → WebSockets or SSE with proper infrastructure
- > 100,000 connections → WebSockets with clustering and message routing
Mobile-first application:
- Bidirectional needed → WebSockets with robust reconnection logic
- Server push only → SSE with mobile-optimized reconnection
- Avoid → Long polling (battery drain)
Restrictive network environment:
- Corporate firewalls, proxies → Long Polling or SSE (standard HTTP)
- Open internet → WebSockets (best performance)
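The branching above can be captured in a small helper function; the names and thresholds mirror this article's tree rather than any established API:

```python
def choose_transport(bidirectional: bool, msgs_per_sec: float,
                     max_compatibility: bool = False,
                     needs_binary: bool = False) -> str:
    """Mirror the decision tree: bidirectional traffic forces WebSockets;
    otherwise pick by frequency, compatibility, and payload requirements."""
    if bidirectional:
        return "WebSockets"          # only option for two-way flow
    if msgs_per_sec < 1 and max_compatibility:
        return "Long Polling"        # loose latency, restrictive networks
    if needs_binary:
        return "WebSockets"          # SSE is UTF-8 text only
    return "SSE"                     # unidirectional push, simplest to run

choose_transport(bidirectional=True, msgs_per_sec=10)     # "WebSockets"
choose_transport(False, 0.2, max_compatibility=True)      # "Long Polling"
choose_transport(False, 5)                                # "SSE"
```

A real decision would also weigh the scale and mobile branches of the tree, which depend on infrastructure rather than on the message pattern alone.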
Decision Tree: Choosing the Right Real-Time Technology
```mermaid
graph TD
    Start["Real-Time Communication<br/>Requirement"]
    Start --> Bidirectional{"Need bidirectional<br/>communication?<br/>(client ↔ server)"}
    Bidirectional -->|Yes| WS["✓ WebSockets<br/><i>Chat, gaming, collaboration</i>"]
    Bidirectional -->|No| Frequency{"Message frequency<br/>& latency needs?"}
    Frequency -->|"< 1 msg/sec<br/>loose latency"| Compat{"Maximum<br/>compatibility<br/>required?"}
    Compat -->|Yes| LP["✓ Long Polling<br/><i>Legacy systems, firewalls</i>"]
    Compat -->|No| SSE1["✓ SSE<br/><i>Simple, auto-reconnect</i>"]
    Frequency -->|"Multiple msgs/sec<br/>low latency"| Binary{"Need binary data<br/>or IE support?"}
    Binary -->|Yes| WS2["✓ WebSockets<br/><i>Binary frames supported</i>"]
    Binary -->|No| SSE2["✓ SSE<br/><i>Efficient text streaming</i>"]
    Start --> Scale{"Expected<br/>concurrent<br/>connections?"}
    Scale -->|"< 10K"| Simple["✓ Any approach<br/><i>Choose simplest</i>"]
    Scale -->|"10K - 100K"| Infra["✓ WebSockets or SSE<br/><i>Need proper infrastructure</i>"]
    Scale -->|"> 100K"| Cluster["✓ WebSockets<br/><i>Clustering + message routing</i>"]
    Start --> Mobile{"Mobile-first<br/>application?"}
    Mobile -->|"Bidirectional"| WSMobile["✓ WebSockets<br/><i>Robust reconnection logic</i>"]
    Mobile -->|"Server push only"| SSEMobile["✓ SSE<br/><i>Mobile-optimized reconnect</i>"]
    Mobile -->|"Avoid"| LPMobile["✗ Long Polling<br/><i>Battery drain</i>"]
```
Choose your real-time communication approach based on communication pattern (bidirectional vs unidirectional), message frequency and latency requirements, scale expectations, and client environment. WebSockets excel at bidirectional high-frequency communication, SSE provides simple unidirectional streaming with auto-reconnect, and long polling offers maximum compatibility at the cost of efficiency.
Real-World Examples
Slack: WebSockets for Real-Time Messaging
Slack built their real-time messaging infrastructure on WebSockets to achieve sub-100ms message delivery globally. When a user sends a message, it flows over their WebSocket connection to Slack’s edge servers, which validate and persist the message before broadcasting to recipients. Recipients receive messages over their WebSocket connections within milliseconds.
The interesting challenge Slack faced was scaling to millions of concurrent connections. They implemented a multi-tier architecture: edge servers handle WebSocket connections and message routing, while backend services handle business logic and persistence. Edge servers connect to a message bus (Kafka) that coordinates message delivery across the cluster. When a user’s connection moves between servers (due to load balancing or server restarts), Slack’s client library automatically reconnects and resynchronizes missed messages using sequence numbers.
Slack also implements intelligent backoff and degradation. If a client’s connection becomes unstable (frequent disconnects, high latency), the system switches to a polling mode temporarily to maintain functionality while network conditions improve. This hybrid approach ensures reliability across diverse network conditions—from stable office WiFi to flaky mobile connections.
Discord: Massive-Scale WebSocket Infrastructure
Discord handles over 4 million concurrent WebSocket connections during peak hours, delivering voice, video, and text communication with minimal latency. Their architecture uses Elixir (running on the BEAM VM) for connection handling because BEAM’s lightweight processes can maintain millions of connections per server with low memory overhead.
Discord’s gateway servers maintain WebSocket connections and handle message routing. They implement sophisticated connection management: heartbeat intervals detect dead connections, session resumption allows clients to reconnect without losing state, and rate limiting prevents abuse. Each gateway server can handle 50,000+ concurrent connections while maintaining sub-50ms message delivery.
The scaling strategy involves sharding users across gateway clusters. Each cluster handles a subset of users, and a message routing layer (using Redis and Kafka) coordinates cross-cluster communication. When a user sends a message to a channel, the gateway publishes it to the message bus, which delivers it to all relevant gateways for broadcast to connected users. This architecture allows Discord to scale horizontally by adding gateway clusters without redesigning the core system.
Stripe: SSE for Dashboard Updates
Stripe uses Server-Sent Events to power their dashboard’s live transaction feed. When merchants view their dashboard, an SSE connection streams new transactions, disputes, and events as they occur. This provides real-time visibility into business activity without requiring merchants to refresh the page.
Stripe chose SSE over WebSockets because the communication is purely server-to-client—merchants consume events but don’t send data over the same connection (API calls use separate HTTP requests). SSE’s automatic reconnection handles network interruptions gracefully, which is critical during payment processing when merchants need continuous visibility. The EventSource API’s simplicity reduced client-side bugs compared to managing WebSocket connections manually.
The implementation uses HTTP/2 multiplexing to maintain multiple SSE streams per connection efficiently. Each dashboard component (transactions, customers, events) can have its own SSE stream without exhausting connection limits. Stripe’s edge infrastructure includes connection pooling and message buffering to handle reconnection scenarios—if a client disconnects briefly, the server buffers recent events and replays them upon reconnection, ensuring no events are missed during network blips.
Interview Essentials
Mid-Level
Explain how long polling works and why it’s less efficient than WebSockets for real-time updates
Describe the WebSocket upgrade handshake and what happens after the connection is established
Compare the latency and overhead characteristics of long polling vs. WebSockets vs. SSE
Explain when you would choose SSE over WebSockets for a notification system
Describe how automatic reconnection works in SSE and why it’s valuable
Senior
Design a chat system handling 100,000 concurrent users—justify your choice between WebSockets and alternatives
Explain the load balancing challenges with WebSocket connections and how to address them with sticky sessions and message routing
Discuss the trade-offs between WebSockets and SSE for a live dashboard showing system metrics
Describe how you would implement graceful degradation from WebSockets to long polling when connections fail
Explain the memory and connection limit considerations when scaling WebSocket servers to millions of connections
Staff+
Design a global real-time messaging platform (like Slack) with multi-region deployment—discuss connection routing, message consistency, and failover strategies
Analyze the cost implications of maintaining millions of persistent connections vs. polling-based approaches at scale
Discuss how HTTP/2 and HTTP/3 (QUIC) impact the WebSocket vs. SSE decision for modern applications
Design a hybrid approach that uses WebSockets for active users and degrades to SSE or polling for inactive users to optimize resource usage
Explain how you would implement connection draining and zero-downtime deployments for a WebSocket-based system serving millions of users
Common Interview Questions
When would you choose long polling over WebSockets?
How do WebSockets maintain low latency compared to HTTP polling?
What are the main differences between WebSockets and Server-Sent Events?
How would you handle WebSocket connection failures and reconnection?
Explain the scaling challenges with persistent connections and how to address them
Why might SSE be preferable to WebSockets for a notification system?
How do you implement authentication and authorization for WebSocket connections?
What happens to WebSocket connections during server deployments?
How would you monitor and debug issues with thousands of concurrent WebSocket connections?
Explain the battery and mobile network implications of different real-time communication approaches
Red Flags to Avoid
Claiming WebSockets are always better than alternatives without considering use case requirements
Not understanding that WebSockets require sticky sessions or message routing for load balancing
Ignoring the automatic reconnection benefits of SSE’s EventSource API
Proposing long polling for high-frequency, low-latency requirements where it’s clearly inappropriate
Not considering mobile battery implications when choosing between approaches
Failing to discuss connection management, heartbeats, and failure detection for persistent connections
Overlooking the unidirectional nature of SSE when bidirectional communication is required
Not understanding the HTTP upgrade handshake process for WebSockets
Ignoring browser compatibility and corporate firewall issues that might block WebSockets
Proposing WebSockets without discussing the infrastructure complexity (message routing, connection limits, graceful shutdown)
Key Takeaways
Match communication pattern to technology: Use WebSockets for bidirectional real-time communication (chat, gaming, collaboration), SSE for unidirectional server-to-client push (notifications, feeds, dashboards), and long polling only when maximum compatibility is required or real-time requirements are loose (updates every few seconds).
Latency and overhead trade-offs: WebSockets provide the lowest latency (~1-2ms) and minimal overhead (2-14 bytes per message) after connection establishment, SSE offers low latency with automatic reconnection over standard HTTP, while long polling has high overhead (full HTTP headers per request) and higher latency (50-200ms) due to request-response cycles.
Scaling persistent connections requires different infrastructure: Both WebSockets and SSE need sticky sessions for load balancing, message routing layers (Redis Pub/Sub, Kafka) for cross-server communication, and careful connection limit management. Plan for 640KB-1MB memory per connection and implement connection draining for graceful scaling.
SSE’s automatic reconnection is undervalued: The EventSource API handles reconnection (with a server-configurable retry interval) and Last-Event-ID resumption automatically, making SSE significantly simpler to implement correctly than WebSockets for server-push scenarios. This built-in resilience is critical for unreliable networks.
Mobile and battery considerations matter: Long polling’s constant reconnections drain mobile batteries quickly. WebSockets and SSE maintain persistent connections more efficiently but require robust reconnection logic to handle iOS and Android killing background connections. Implement exponential backoff and respect system battery optimization modes.