Asynchronous Processing in System Design
After completing this topic, you will be able to:
- Differentiate between synchronous and asynchronous communication patterns in distributed systems
- Analyze the trade-offs between coupling, latency, and reliability when choosing async patterns
- Identify scenarios where asynchronous processing provides architectural benefits
- Compare temporal decoupling strategies across different system architectures
TL;DR
Asynchronous processing decouples components in time, allowing producers and consumers to operate independently without blocking. This pattern reduces request latency, improves fault isolation, and enables systems to handle variable workloads gracefully. Core approaches include message queues for reliable delivery, task queues for background work, and stream processing for real-time event handling.
Cheat Sheet: Async = temporal decoupling. Use when operations are expensive (>100ms), failure-prone, or variable in timing. Trade immediate consistency for scalability and resilience.
Why This Matters
In synchronous systems, every request waits for its entire chain of operations to complete before responding. When Netflix encodes a newly uploaded video, that process takes minutes—but the user’s upload request returns in seconds. When Uber matches a rider with a driver, dozens of services coordinate asynchronously to prevent any single slow service from blocking the entire flow. This is asynchronous processing: the architectural pattern that separates when work is requested from when work is performed.
Interviewers care deeply about asynchronous processing because it reveals whether you understand distributed systems at a fundamental level. A junior engineer might design every API call as a synchronous request-response. An experienced engineer recognizes when to break that coupling, weighing the trade-offs between consistency, latency, and operational complexity. The difference shows up in every major system design question: designing Instagram’s feed generation, building Stripe’s payment processing, or architecting Twitter’s notification system all hinge on choosing the right async patterns.
The stakes are concrete. Synchronous architectures create cascading failures—one slow database query ties up API server threads, which exhausts connection pools, which pushes timeouts all the way up to the load balancer. Asynchronous architectures isolate failures: a slow video encoder doesn’t prevent users from uploading files. They also unlock horizontal scaling: instead of vertically scaling a monolithic service to handle peak load, you can add async workers elastically. Companies like Shopify process millions of background jobs daily using async patterns, turning what would be a scaling nightmare into a manageable queue-depth metric.
Synchronous vs Asynchronous Request Flow
graph LR
subgraph Synchronous Flow
U1[User] --"1. Upload video"--> API1[API Server]
API1 --"2. Encode video (5 min)"--> Encoder1[Video Encoder]
Encoder1 --"3. Store"--> S3_1[(S3)]
S3_1 --"4. Success"--> API1
API1 --"5. Response (5 min later)"--> U1
end
subgraph Asynchronous Flow
U2[User] --"1. Upload video"--> API2[API Server]
API2 --"2. Store raw"--> S3_2[(S3)]
API2 --"3. Enqueue job"--> Q[Message Queue]
API2 --"4. Immediate response"--> U2
W[Worker Pool] --"5. Consume job"--> Q
W --"6. Encode (5 min)"--> Encoder2[Video Encoder]
Encoder2 --"7. Store encoded"--> S3_3[(S3)]
end
Synchronous processing blocks the user for 5 minutes while encoding completes. Asynchronous processing returns immediately by queuing the work, allowing the user to continue while workers process the video in the background. This pattern reduces perceived latency from minutes to milliseconds.
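The asynchronous flow above can be sketched in a few lines. This is a minimal in-process stand-in using Python’s standard-library `queue` and `threading` modules; the handler and worker names are illustrative, and a real deployment would use a durable broker such as SQS or RabbitMQ rather than an in-memory queue.

```python
import queue
import threading
import time

# In-process stand-in for a message broker (illustrative only).
encode_jobs: "queue.Queue[dict]" = queue.Queue()
results = {}

def handle_upload(video_id: str, raw_bytes: bytes) -> dict:
    """API handler: store the raw file, enqueue the slow work, return fast."""
    # (writing raw_bytes to object storage omitted)
    encode_jobs.put({"video_id": video_id})
    return {"status": "accepted", "video_id": video_id}  # returns immediately

def encoding_worker() -> None:
    """Background worker: consumes jobs at its own pace."""
    while True:
        job = encode_jobs.get()
        if job is None:  # shutdown sentinel
            break
        time.sleep(0.01)  # stand-in for minutes of encoding work
        results[job["video_id"]] = "encoded"
        encode_jobs.task_done()

worker = threading.Thread(target=encoding_worker, daemon=True)
worker.start()

response = handle_upload("vid-123", b"...")    # returns without waiting
encode_jobs.join()                             # (demo only) wait for the worker
print(response["status"], results["vid-123"])  # accepted encoded
```

The key property is that `handle_upload` returns before any encoding happens; the queue is the only coupling point between the API and the worker.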
The Landscape
The asynchronous processing landscape divides into three major territories, each solving different problems with distinct trade-offs.
Message queues provide reliable, ordered delivery of discrete messages between services. Amazon SQS, RabbitMQ, and Apache Kafka (in queue mode) excel at decoupling microservices. When Shopify’s checkout service needs to trigger inventory updates, email notifications, and analytics events, it publishes messages to queues rather than calling each service synchronously. The checkout API returns instantly; workers process messages at their own pace. Message queues guarantee delivery (at-least-once or exactly-once semantics) and provide natural back pressure through queue depth monitoring.
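The at-least-once guarantee mentioned above works roughly as sketched below: a received message becomes invisible rather than deleted, and is redelivered if the consumer never acknowledges it. This is a toy model of SQS-style visibility semantics, not any real client API; all names are illustrative.

```python
import collections

class AtLeastOnceQueue:
    """Toy queue with ack semantics: a message stays 'in flight' until
    acknowledged, and is redelivered otherwise. Illustrative only."""
    def __init__(self):
        self._ready = collections.deque()
        self._in_flight = {}
        self._next_id = 0

    def send(self, body):
        self._next_id += 1
        self._ready.append((self._next_id, body))

    def receive(self):
        if not self._ready:
            return None
        msg_id, body = self._ready.popleft()
        self._in_flight[msg_id] = body  # invisible, but NOT deleted
        return msg_id, body

    def ack(self, msg_id):
        self._in_flight.pop(msg_id)     # only now is the message gone

    def redeliver_expired(self):
        """Simulate a visibility timeout: unacked messages become visible again."""
        for msg_id, body in list(self._in_flight.items()):
            del self._in_flight[msg_id]
            self._ready.append((msg_id, body))

q = AtLeastOnceQueue()
q.send("charge order 42")
msg_id, body = q.receive()
# consumer crashes here, before acking...
q.redeliver_expired()
msg_id2, body2 = q.receive()   # the same message comes back: at-least-once
q.ack(msg_id2)
print(body2)  # charge order 42
```

This redelivery behavior is exactly why async consumers must be idempotent, a point revisited later in this topic.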
Task queues specialize in background job execution with features like delayed execution, retries, and priority scheduling. Celery, Sidekiq, and AWS Step Functions manage long-running workflows that don’t fit the request-response model. When LinkedIn generates a user’s weekly activity digest, that computation runs asynchronously: scheduled daily, retried on failure, and prioritized below real-time features. Task queues add workflow orchestration—chaining jobs, handling timeouts, and managing state across multi-step processes.
Stream processing handles continuous flows of events for real-time analytics and stateful computations. Apache Kafka, AWS Kinesis, and Apache Flink process millions of events per second with low latency. When Netflix tracks viewing behavior to update recommendations in real time, stream processors aggregate events, maintain windowed state, and trigger actions without the overhead of individual message acknowledgments. Streams trade the simplicity of discrete messages for throughput and temporal semantics.
Each territory overlaps at the edges—Kafka serves as both a message queue and a stream platform—but the core use cases differ. Understanding which tool fits which problem is the foundation of async architecture.
Async Processing Landscape: Three Territories
graph TB
subgraph MQGroup["Message Queues<br/>Reliable Discrete Delivery"]
MQ1["Amazon SQS<br/><i>Simple, managed</i>"]
MQ2["RabbitMQ<br/><i>Advanced routing</i>"]
MQ3["Kafka<br/><i>High throughput</i>"]
MQUse["Use: Service decoupling<br/>Order processing<br/>Event notifications"]
end
subgraph TQGroup["Task Queues<br/>Workflow Orchestration"]
TQ1["Celery<br/><i>Python ecosystem</i>"]
TQ2["Sidekiq<br/><i>Ruby/Redis</i>"]
TQ3["Step Functions<br/><i>AWS managed</i>"]
TQUse["Use: Background jobs<br/>Multi-step workflows<br/>Scheduled tasks"]
end
subgraph SPGroup["Stream Processing<br/>Real-time Event Flows"]
SP1["Kafka Streams<br/><i>Stateful processing</i>"]
SP2["AWS Kinesis<br/><i>Managed streams</i>"]
SP3["Apache Flink<br/><i>Complex analytics</i>"]
SPUse["Use: Real-time analytics<br/>Event sourcing<br/>Windowed aggregation"]
end
MQ1 & MQ2 & MQ3 -.-> MQUse
TQ1 & TQ2 & TQ3 -.-> TQUse
SP1 & SP2 & SP3 -.-> SPUse
The async landscape divides into three territories based on processing model. Message queues handle discrete messages with acknowledgment. Task queues add workflow orchestration for multi-step processes. Stream processing handles continuous event flows with temporal semantics. Tools like Kafka span multiple territories.
Key Areas
Temporal Decoupling and Communication Patterns
The fundamental distinction between synchronous and asynchronous communication shapes every architectural decision. In synchronous systems, the caller blocks until the callee responds—think HTTP request-response or RPC calls. The caller’s thread waits, consuming resources and creating tight coupling. If the callee is slow or unavailable, the caller suffers immediately.
Asynchronous communication breaks this temporal coupling. The producer sends a message and continues immediately; the consumer processes it later, possibly seconds or even hours afterward. This decoupling provides three critical benefits: the producer doesn’t wait for slow operations, the consumer can batch work for efficiency, and failures in one component don’t immediately cascade to others. When Stripe processes a payment, the API returns a payment intent immediately while fraud checks, bank authorizations, and ledger updates happen asynchronously. Users see instant feedback; the system maintains correctness through eventual consistency.
The trade-off is complexity. Synchronous systems have simple mental models: call a function, get a result. Asynchronous systems require tracking work through queues, handling partial failures, and reasoning about eventual consistency. Interviewers want to see you articulate this trade-off clearly: async isn’t always better, but it’s essential for scalable, resilient systems.
Decoupling Dimensions: Spatial, Temporal, and Failure Isolation
Asynchronism provides three types of decoupling, each solving different problems. Spatial decoupling means producers and consumers don’t need to know each other’s location or even existence. Netflix’s video upload service publishes an “upload complete” event; dozens of downstream services (encoding, thumbnail generation, metadata extraction) subscribe without the upload service knowing they exist. This enables independent deployment and team autonomy.
Temporal decoupling means producers and consumers don’t need to be active simultaneously. When GitHub Actions queues a CI job, the runner might not start for minutes if all workers are busy. The job waits in the queue; the user’s push operation completed immediately. This enables elastic scaling: add workers during peak hours, scale down at night.
Failure isolation prevents cascading failures. If Uber’s notification service crashes, ride requests still succeed—notifications queue up and deliver when the service recovers. Synchronous architectures would fail the entire ride request if any downstream service failed. Async architectures contain failures, turning hard dependencies into soft ones.
These decoupling dimensions aren’t free. They introduce eventual consistency, require idempotent operations (see Idempotent Operations for retry safety), and complicate debugging. But for systems operating at scale, the trade-off is usually worth it.
Queue-Based vs Stream-Based Processing
The choice between queue-based and stream-based async processing depends on your data characteristics and processing requirements. Queue-based systems (see Message Queues for implementation details) treat each message as an independent unit of work. Messages are acknowledged individually, retried on failure, and processed in roughly FIFO order. This model fits discrete tasks: sending an email, processing a payment, resizing an image. The mental model is simple: put work in a queue, workers take items and process them.
Stream-based systems treat data as a continuous, ordered flow. Events are never “consumed” in the traditional sense—they remain in the stream for a retention period, and multiple consumers can read the same events at different offsets. This model fits real-time analytics, event sourcing, and stateful processing. When Spotify tracks song plays to update “Wrapped” statistics, stream processors maintain running aggregates over time windows. The mental model is more complex but enables powerful temporal operations.
The boundary blurs in practice. Kafka can act as a message queue (with consumer groups and offset management) or a stream platform (with stream processing frameworks). The key distinction is your processing model: discrete tasks favor queues, continuous aggregation favors streams. Interviewers often probe this boundary to see if you understand the fundamental difference or just know tool names.
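The offset-based reading that distinguishes streams from queues can be shown with a toy append-only log. Everything here is illustrative; real systems like Kafka add partitioning, retention policies, and consumer-group coordination on top of this core idea.

```python
class TinyLog:
    """Stream sketch: an append-only log. Consumers track their own offsets,
    and events are retained rather than deleted. Illustrative only."""
    def __init__(self):
        self.events = []

    def append(self, event):
        self.events.append(event)

    def read(self, offset, max_events=10):
        batch = self.events[offset : offset + max_events]
        return batch, offset + len(batch)

log = TinyLog()
for song in ["song-a", "song-b", "song-c"]:
    log.append({"play": song})

# Two independent consumers read the SAME events at different positions.
analytics_batch, analytics_offset = log.read(offset=0)
royalty_batch, royalty_offset = log.read(offset=2)

print(len(analytics_batch), len(royalty_batch))  # 3 1
# A queue would have deleted events once the first consumer acked them;
# the log lets both consumers (and any future replay) see every event.
```

Contrast this with the queue sketches earlier: here `read` never removes anything, which is what makes replay and multiple independent consumers possible.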
Back Pressure and Flow Control
Asynchronous systems must handle producers that generate work faster than consumers can process it. Without flow control, queues grow unbounded, memory exhausts, and systems crash. Back pressure mechanisms (see Back Pressure for detailed strategies) signal producers to slow down when consumers are overwhelmed.
The simplest form is queue depth monitoring: if a queue exceeds a threshold, producers receive errors or delays. Amazon SQS exposes queue depth as a CloudWatch metric; autoscaling can add workers when depth grows. More sophisticated systems use credit-based flow control (like TCP’s sliding window) or reactive streams (like Akka Streams’ demand signaling).
The challenge is balancing responsiveness with stability. Aggressive back pressure prevents overload but can cascade upstream, turning a slow consumer into a system-wide slowdown. Lenient back pressure allows temporary bursts but risks queue overflow during sustained load. Production systems typically combine multiple strategies: queue limits for safety, autoscaling for elasticity, and circuit breakers to prevent cascading failures.
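The simplest of these mechanisms, a bounded queue that rejects work when full, looks like this using Python’s standard library (the limit and names are illustrative):

```python
import queue

# Bounded queue: the most basic back-pressure mechanism. When the buffer
# fills, the producer gets an immediate error instead of growing memory
# without bound.
jobs = queue.Queue(maxsize=3)

accepted, rejected = 0, 0
for i in range(10):
    try:
        jobs.put_nowait(f"job-{i}")   # fail fast rather than block
        accepted += 1
    except queue.Full:
        rejected += 1                 # surface upstream as HTTP 429 / retry-later

print(accepted, rejected)  # 3 7
```

The choice between `put_nowait` (reject), `put(block=True)` (stall the producer), and `put(timeout=...)` (bounded wait) is exactly the responsiveness-versus-stability trade-off described above.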
Task Orchestration and Workflow Management
Many real-world operations require multiple async steps with complex dependencies. When Airbnb processes a booking, it must charge the guest, notify the host, update calendars, trigger cleaning schedules, and send confirmation emails—some steps depend on others, some can run in parallel, and all must handle failures gracefully.
Task queues (see Task Queues for worker pool mechanics) provide orchestration primitives: job chaining, conditional execution, timeouts, and retry policies. AWS Step Functions models workflows as state machines with explicit error handling. Apache Airflow uses directed acyclic graphs (DAGs) to express dependencies. These tools turn complex multi-step processes into manageable, observable workflows.
The trade-off is operational complexity. Simple async patterns (fire-and-forget messages) are easy to reason about but hard to debug when things go wrong. Orchestrated workflows provide visibility and control but require learning new abstractions and managing additional infrastructure. Senior engineers choose the right level of orchestration for their use case: simple queues for independent tasks, full workflow engines for complex multi-step processes.
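The DAG model these orchestrators share can be sketched with Python’s standard-library `graphlib`. The booking steps below are illustrative, and a real engine would dispatch each ready step to async workers instead of running it inline.

```python
from graphlib import TopologicalSorter

# Each step maps to the set of steps that must complete before it runs,
# mirroring the Airbnb booking example (step names are illustrative).
dag = {
    "charge_guest": set(),
    "notify_host": {"charge_guest"},
    "update_calendar": {"charge_guest"},
    "schedule_cleaning": {"update_calendar"},
    "send_confirmation": {"notify_host", "update_calendar"},
}

executed = []
ts = TopologicalSorter(dag)
ts.prepare()
while ts.is_active():
    ready = list(ts.get_ready())   # steps in one batch could run in parallel
    for step in ready:
        executed.append(step)      # stand-in for dispatching to a worker
        ts.done(step)

print(executed[0])  # charge_guest
```

Every batch returned by `get_ready()` is a set of mutually independent steps, which is where the parallelism in tools like Step Functions and Airflow comes from.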
Three Dimensions of Decoupling
graph TB
subgraph Spatial["Spatial Decoupling<br/>Location Independence"]
Producer1["Upload Service"] --"Publishes event"--> Topic1[Event Topic]
Topic1 -."Subscribe".-> C1[Encoder]
Topic1 -."Subscribe".-> C2[Thumbnail Gen]
Topic1 -."Subscribe".-> C3[Metadata Extract]
Note1["Producer doesn't know<br/>consumers exist"]
end
subgraph Temporal["Temporal Decoupling<br/>Time Independence"]
Producer2["CI Trigger<br/><i>t=0</i>"] --"Enqueue job"--> Queue[Job Queue]
Queue --"Process later<br/><i>t=300s</i>"--> Worker["Worker<br/><i>starts when available</i>"]
Note2["Producer and consumer<br/>don't run simultaneously"]
end
subgraph Failure["Failure Isolation<br/>Fault Independence"]
API[Ride Request API] --"1. Create ride"--> DB[(Database)]
API --"2. Queue notification"--> NQ[Notification Queue]
NQ -."3. Process".-> NS["Notification Service<br/><i>CRASHED</i>"]
API --"4. Success response"--> User[User]
Note3["Ride succeeds even if<br/>notifications fail"]
end
Async processing provides three types of decoupling. Spatial decoupling enables independent deployment—services don’t know about each other. Temporal decoupling allows elastic scaling—components operate at different times. Failure isolation prevents cascades—one component’s failure doesn’t propagate to others.
Queue-Based vs Stream-Based Processing Models
graph LR
subgraph QueueModel["Queue-Based Model<br/>Discrete Task Processing"]
P1[Producer] --"Send message"--> Q["Queue<br/><i>FIFO</i>"]
Q --"Consume & ACK"--> C1[Consumer 1]
Q --"Consume & ACK"--> C2[Consumer 2]
C1 --"Delete after ACK"--> Q
QChar["✓ Message deleted after ACK<br/>✓ Each message processed once<br/>✓ Simple mental model<br/>Use: Email sending, image resize"]
end
subgraph StreamModel["Stream-Based Model<br/>Continuous Event Flow"]
P2[Producer] --"Append event"--> S["Stream<br/><i>Ordered log</i>"]
S --"Read offset 100"--> CG1["Consumer Group A<br/><i>Real-time analytics</i>"]
S --"Read offset 50"--> CG2["Consumer Group B<br/><i>Batch processing</i>"]
S -."Retained for 7 days".-> Retention[Retention Policy]
SChar["✓ Events retained, not deleted<br/>✓ Multiple consumers at different offsets<br/>✓ Replay capability<br/>Use: Event sourcing, real-time aggregation"]
end
QChar -.-> Q
SChar -.-> S
Queue-based systems treat messages as discrete units that are consumed and deleted. Stream-based systems treat events as an ordered log that multiple consumers can read at different positions. Queues fit independent tasks; streams fit continuous processing and event replay scenarios.
How Things Connect
The async processing landscape forms a progression from simple to complex, with each layer building on the previous one. At the foundation, message queues provide basic temporal decoupling: producers send messages, consumers process them independently. This solves the immediate problem of blocking operations but doesn’t address workflow complexity or real-time processing.
Task queues add orchestration on top of message queues, managing job lifecycles, retries, and dependencies. They’re essentially message queues with workflow awareness. When you need to chain multiple async operations or handle timeouts, task queues provide the primitives. But they still treat each job as a discrete unit—they don’t handle continuous event streams.
Stream processing represents a different paradigm: instead of discrete messages, you process continuous flows with temporal semantics. Streams enable stateful operations (aggregating events over time windows) and real-time analytics that queues can’t efficiently support. But streams are harder to reason about and require different mental models.
Cutting across all three layers are cross-cutting concerns: back pressure prevents overload regardless of whether you’re using queues or streams. Idempotency ensures correctness when messages are retried, whether in a task queue or a stream processor. These concerns aren’t separate technologies—they’re design principles that apply to any async system.
In interviews, demonstrating this layered understanding separates strong candidates from weak ones. Weak candidates know tool names (“I’d use Kafka”). Strong candidates explain the progression: “We’ll start with SQS for simple async tasks, add Step Functions when we need workflow orchestration, and consider Kinesis if we need real-time aggregation. The choice depends on our latency requirements and processing model.”
Real-World Context
Netflix: Async Video Processing Pipeline
Netflix’s video encoding pipeline is a masterclass in async architecture. When a studio uploads a new movie, the upload service immediately returns success and publishes an event to a message queue. Downstream workers pick up the event and trigger a complex workflow: video validation, multiple encoding passes (4K, 1080p, 720p, mobile), thumbnail generation, subtitle processing, and CDN distribution. Each step runs asynchronously, with failures retried and progress tracked.
The key insight: encoding a 2-hour movie takes hours, but users see “upload complete” in seconds. The async architecture decouples user-facing operations from expensive backend processing. Netflix can scale encoding workers independently based on queue depth, adding capacity during peak upload times. If an encoding worker crashes, the job retries automatically with no user impact. This pattern handles billions of encoding jobs annually with minimal manual intervention.
Netflix’s architecture combines multiple async patterns: message queues for job distribution, task orchestration for multi-step workflows, and stream processing for real-time quality monitoring. The system tracks every encoding job’s state, provides observability into queue depths and worker utilization, and automatically scales based on demand. This is async processing at production scale.
Spotify: Event-Driven Architecture
Spotify’s backend is fundamentally event-driven, with hundreds of microservices communicating primarily through async events rather than synchronous API calls. When a user plays a song, that action generates events consumed by multiple services: the recommendation engine updates user preferences, the analytics pipeline tracks listening patterns, the social graph notifies followers, and the royalty system records plays for artist payments.
The architecture uses Kafka as the central nervous system, with services publishing events to topics and subscribing to events they care about. This provides spatial decoupling—services don’t know about each other—and temporal decoupling—consumers process events at their own pace. During peak hours (evening in major markets), Spotify processes millions of events per second without any service blocking on another.
The challenge is maintaining consistency. When a user’s subscription lapses, multiple services must react: the playback service restricts features, the recommendation engine adjusts, and the billing system updates. Spotify uses event sourcing patterns and eventual consistency, accepting that different services might have slightly stale views of user state. The trade-off enables massive scale: Spotify serves 500+ million users with an architecture that would collapse under synchronous coupling.
Both examples illustrate the same principle: async processing isn’t a single technology, it’s an architectural philosophy. You choose specific tools (queues, streams, orchestrators) based on your requirements, but the underlying pattern—decoupling producers from consumers in time—remains constant.
Netflix Video Processing Pipeline Architecture
graph LR
User[Studio User] --"1. Upload movie"--> Upload[Upload Service]
Upload --"2. Store raw"--> S3_Raw[("S3 Raw<br/>Bucket")]
Upload --"3. Publish event"--> EventBus[Event Bus]
Upload --"4. Immediate success"--> User
EventBus --"5. Trigger workflow"--> Orchestrator[Step Functions<br/>Orchestrator]
Orchestrator --"6. Validate"--> Validator[Validation Worker]
Validator --"Result"--> Orchestrator
Orchestrator --"7. Encode jobs"--> EncodeQueue[Encoding Queue]
EncodeQueue --> E1["Encoder<br/>4K"]
EncodeQueue --> E2["Encoder<br/>1080p"]
EncodeQueue --> E3["Encoder<br/>720p"]
E1 & E2 & E3 --"8. Store encoded"--> S3_Encoded[("S3 Encoded<br/>Bucket")]
Orchestrator --"9. Parallel tasks"--> ThumbQueue[Thumbnail Queue]
Orchestrator --"9. Parallel tasks"--> SubQueue[Subtitle Queue]
ThumbQueue --> TWorker[Thumbnail Worker]
SubQueue --> SWorker[Subtitle Worker]
TWorker & SWorker --"10. Store assets"--> S3_Assets[("S3 Assets<br/>Bucket")]
S3_Encoded & S3_Assets --"11. Distribute"--> CDN[CloudFront CDN]
Orchestrator --"12. Complete event"--> EventBus
Netflix’s video pipeline demonstrates production-scale async architecture. The upload service returns immediately while a complex workflow processes the video asynchronously. Step Functions orchestrates parallel encoding at multiple resolutions, thumbnail generation, and subtitle processing. Workers scale independently based on queue depth, processing billions of jobs annually without blocking user uploads.
Interview Essentials
Mid-Level
Mid-level candidates should articulate the basic synchronous vs asynchronous trade-off clearly. Explain that async processing decouples components in time, allowing producers to continue without waiting for consumers. Describe a concrete scenario: “When a user uploads a profile photo, we return success immediately and process the image resize asynchronously. This prevents slow image processing from blocking the upload API.”
Demonstrate awareness of the three main async patterns: message queues for reliable delivery, task queues for background jobs, and streams for real-time processing. You don’t need deep implementation knowledge, but you should know when to use each. For example: “I’d use a message queue like SQS for order processing because we need guaranteed delivery and can tolerate some latency. For real-time analytics, I’d consider Kinesis because we need to process events as they arrive.”
Show you understand the consistency trade-off. Async systems are eventually consistent—acknowledge this explicitly and explain how you’d handle it. “Users might see a stale follower count for a few seconds after someone follows them, but that’s acceptable for this use case.” Interviewers want to see you reason about trade-offs, not just list technologies.
Senior
Senior candidates should demonstrate deep understanding of decoupling dimensions and their implications. Explain spatial, temporal, and failure isolation with concrete examples: “By using a message queue between our API and notification service, we achieve failure isolation—if notifications are down, API requests still succeed. We also get temporal decoupling—notifications can batch for efficiency without blocking the API.”
Discuss back pressure and flow control proactively. “As queue depth grows, we’ll need back pressure mechanisms. We could rate-limit producers, auto-scale consumers, or implement priority queues to shed low-priority work under load.” This shows operational maturity—you’re thinking about what happens when things go wrong.
Compare async patterns with nuance. Don’t just say “I’d use Kafka”—explain why Kafka over SQS, or why a task queue over a message queue. “Kafka gives us event replay and multiple consumers, which matters for this use case because we need both real-time processing and batch analytics on the same events. SQS would require duplicating events.” Senior engineers choose tools based on requirements, not popularity.
Address idempotency and retry semantics. “Since we’re using at-least-once delivery, our consumers must be idempotent. We’ll use a deduplication table to track processed message IDs.” This demonstrates you’ve built async systems in production and dealt with the edge cases.
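The deduplication-table approach described above can be sketched as follows. The in-memory set stands in for a persistent store such as Redis or DynamoDB, and all names are illustrative.

```python
processed_ids = set()      # stand-in for a durable dedup table
balance = {"acct-1": 0}

def apply_payment(message_id: str, account: str, amount: int) -> bool:
    """Idempotent consumer: a redelivered message must not double-charge."""
    if message_id in processed_ids:
        return False       # duplicate delivery, safely ignored
    balance[account] += amount
    processed_ids.add(message_id)
    return True

# At-least-once delivery means the same message can arrive twice.
first = apply_payment("msg-7", "acct-1", 100)
second = apply_payment("msg-7", "acct-1", 100)   # retry / duplicate

print(balance["acct-1"])  # 100, not 200
```

In production the check-and-record must be atomic with the side effect (one transaction, or a conditional write), otherwise a crash between the two steps reintroduces the duplicate.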
Staff+
Staff-plus candidates should discuss async architecture at the system level, not just component level. Explain how async patterns enable organizational scaling: “By decoupling services through events, teams can deploy independently. The payment team doesn’t need to coordinate with the notification team—they just publish payment events and the notification team subscribes.”
Analyze trade-offs quantitatively. “Moving to async processing will reduce P99 latency from 2s to 200ms for the API, but we’ll need to handle eventual consistency. Based on our traffic patterns (10K req/s peak), we’ll need about 50 workers to keep queue depth under 1000 messages. At $0.10/hour per worker, that’s roughly $3,600/month—well worth it for the latency improvement.”
Discuss evolution and migration strategies. “We can’t move everything async at once. I’d start with the slowest, most failure-prone operations—image processing and external API calls. We’ll use the strangler fig pattern: new requests go async, existing code paths remain synchronous until we’ve validated the async approach. We’ll need feature flags to roll back quickly if issues arise.”
Address observability and debugging. “Async systems are harder to debug because requests don’t have single stack traces. We’ll need distributed tracing (with correlation IDs across queue boundaries), queue depth monitoring, and dead letter queues for failed messages. I’d also implement shadow mode initially—process messages async but still do the work synchronously, comparing results to validate correctness.”
Show awareness of anti-patterns. “One mistake I’ve seen is using async for everything. Synchronous calls are simpler and appropriate for fast, reliable operations. Async adds complexity—only use it when the benefits (decoupling, latency, failure isolation) justify the cost.”
Common Interview Questions
“When would you choose asynchronous processing over synchronous?” — Look for: operations that are slow (>100ms), failure-prone (external APIs), or variable in timing (batch processing). Mention the consistency trade-off explicitly.
“How do you handle failures in async systems?” — Discuss retries with exponential backoff, dead letter queues for poison messages, and idempotency to handle duplicate processing. Show you’ve dealt with production failures.
“What’s the difference between a message queue and a stream?” — Queues are for discrete tasks with acknowledgment; streams are for continuous event flows with replay. Kafka can do both, but the processing model differs.
“How do you prevent async queues from growing unbounded?” — Back pressure mechanisms: rate limiting producers, auto-scaling consumers, priority queues, and circuit breakers. Mention monitoring queue depth as a key metric.
“Design an async notification system for 1M users” — This tests your ability to apply async patterns. Discuss fan-out strategies, batching, priority queues for urgent notifications, and handling delivery failures.
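The retry and dead-letter behavior raised in the failure-handling question above can be sketched as follows (illustrative names; a real worker would actually sleep between attempts rather than just compute the delay):

```python
import random

dead_letters = []

def process_with_retries(message, handler, max_retries=3):
    """Retry with exponential backoff and jitter; park poison messages
    in a dead-letter queue instead of retrying forever."""
    for attempt in range(max_retries + 1):
        try:
            return handler(message)
        except Exception:
            if attempt == max_retries:
                dead_letters.append(message)  # poison message: needs a human
                return None
            # exponential backoff with jitter to avoid retry storms
            delay = (2 ** attempt) + random.uniform(0, 1)
            # time.sleep(delay) in a real worker
            _ = delay

def always_fails(msg):
    raise RuntimeError("downstream unavailable")

process_with_retries({"id": 1}, always_fails)
print(len(dead_letters))  # 1
```

Capping retries and routing failures to a dead-letter queue keeps one poison message from blocking the queue or burning worker capacity indefinitely.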
Red Flags to Avoid
Using async for everything without considering the complexity cost — shows lack of judgment about trade-offs
Not mentioning eventual consistency or idempotency — indicates no production async experience
Confusing message queues with task queues or streams — shows superficial tool knowledge without understanding use cases
Ignoring failure scenarios and retry logic — async systems fail differently than sync; you must address this
Not discussing observability and debugging — async systems are harder to debug; ignoring this suggests inexperience
Choosing tools based on popularity rather than requirements — “I’d use Kafka because everyone uses it” without explaining why
Key Takeaways
Asynchronous processing decouples producers and consumers in time, enabling independent scaling, failure isolation, and reduced latency for expensive operations. The trade-off is eventual consistency and increased complexity.
Three main patterns serve different needs: message queues for reliable discrete tasks, task queues for orchestrated workflows, and stream processing for real-time event flows. Choose based on your processing model, not tool popularity.
Async systems require explicit handling of failures, retries, idempotency, and back pressure. These aren’t optional features—they’re fundamental to correctness and stability at scale.
Real-world systems like Netflix and Spotify combine multiple async patterns, using queues for job distribution, orchestrators for workflows, and streams for analytics. The architecture evolves based on scale and requirements.
In interviews, demonstrate understanding of trade-offs: when async helps (slow operations, failure isolation, variable load) and when it hurts (added complexity, eventual consistency, debugging difficulty). Strong candidates choose patterns based on requirements, not trends.