RPC vs REST: Remote Procedure Call Explained
After this topic, you will be able to:
- Explain the RPC paradigm and how it abstracts remote calls as local function calls
- Analyze the trade-offs between RPC and REST for internal vs external APIs
- Distinguish between different RPC implementations (XML-RPC, JSON-RPC, gRPC) and their use cases
TL;DR
RPC (Remote Procedure Call) is a communication protocol that allows a client to execute a procedure on a remote server as if it were a local function call, abstracting away network complexity. Unlike REST’s resource-oriented approach, RPC is action-oriented and focuses on exposing behaviors. It’s widely used for internal microservice communication at companies like Google, Netflix, and Uber due to its performance benefits and strong typing, though it trades flexibility for tight coupling between client and server.
Mental Model
Think of RPC like making a phone call to a colleague in another office. You dial their extension (the procedure name), speak your request (parameters), wait for them to do the work, and get a response—all without worrying about how the phone system routes your call through the building’s infrastructure. From your perspective, it feels like they’re sitting right next to you. The phone system (RPC framework) handles marshaling your voice into electrical signals, routing them through the network, and converting them back to sound on the other end. You don’t think about TCP packets or network protocols; you just make the call and get an answer. This is exactly how RPC abstracts remote computation: the complexity of network communication disappears behind a familiar function call interface.
Why This Matters
In system design interviews, RPC is a critical decision point that reveals your understanding of API design trade-offs. Interviewers want to see if you know when to choose RPC over REST—typically for internal service-to-service communication where performance and type safety matter more than flexibility. Companies like Google built their entire infrastructure on RPC (with gRPC), while Netflix uses it extensively for low-latency microservice calls. Understanding RPC demonstrates you can make pragmatic architectural choices: REST for public APIs where clients are diverse and unknown, RPC for internal systems where you control both ends and need maximum efficiency. The ability to articulate this trade-off—and explain why Stripe uses REST externally but RPC internally—is what separates mid-level engineers from senior architects.
Core Concept
Remote Procedure Call (RPC) is a protocol that enables a program to execute a procedure (function) on a remote server as if it were a local call. The fundamental insight of RPC is that network communication can be hidden behind a familiar programming abstraction: the function call. When you write result = calculateTax(order), you shouldn’t need to know whether calculateTax runs on your machine or on a server in another datacenter. RPC frameworks handle the messy details of serialization, network transport, error handling, and deserialization, presenting a clean interface that looks identical to local code. This abstraction emerged in the 1980s with Sun RPC and has evolved through XML-RPC, JSON-RPC, Apache Thrift, and modern gRPC. The core principle remains unchanged: make distributed computing feel like local computing.
How It Works
The RPC lifecycle involves six distinct steps that transform a local-looking function call into a network request and back. First, the client invokes what appears to be a local function—for example, getUserProfile(userId). Second, the client stub (a generated proxy object) intercepts this call and marshals the procedure name and arguments into a binary or text format. This marshaling process serializes complex data structures into a format suitable for network transmission. Third, the client communication module (typically the OS network stack) sends this serialized message over the network using TCP or HTTP/2. Fourth, the server communication module receives the incoming packets and passes them to the server stub. Fifth, the server stub unmarshals the message, extracting the procedure name and arguments, then invokes the actual server-side function with those parameters. The server executes the business logic and returns a result. Sixth, the response travels back through the same pipeline in reverse: server stub marshals the result, network transport sends it back, client stub unmarshals it, and the original caller receives what looks like a normal function return value. This entire round-trip might take 10-50ms, but from the developer’s perspective, it’s just a function call.
Complete RPC Call Flow: Six-Step Process
graph LR
Client["Client Code<br/><i>Payment Service</i>"]
ClientStub["Client Stub<br/><i>Generated Proxy</i>"]
ClientNet["Client Network<br/><i>TCP/HTTP2</i>"]
ServerNet["Server Network<br/><i>TCP/HTTP2</i>"]
ServerStub["Server Stub<br/><i>Generated Handler</i>"]
Server["Server Code<br/><i>Inventory Service</i>"]
Client --"1. checkStock(itemId, qty)"--> ClientStub
ClientStub --"2. Marshal to binary<br/>{method: 'checkStock', params: {...}}"--> ClientNet
ClientNet --"3. Send over network<br/>TCP connection"--> ServerNet
ServerNet --"4. Receive packets"--> ServerStub
ServerStub --"5. Unmarshal & invoke<br/>checkStock('SKU-12345', 2)"--> Server
Server --"6. Return result<br/>{available: true, qty: 150}"--> ServerStub
ServerStub --"Marshal response"--> ServerNet
ServerNet --"Send response"--> ClientNet
ClientNet --"Unmarshal"--> ClientStub
ClientStub --"Return to caller"--> Client
The RPC lifecycle transforms a local-looking function call into a network request through six steps: client invocation, marshaling, network transport, server reception, unmarshaling, and execution. The entire round-trip typically takes 10-50ms, but appears as a simple function call to the developer.
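The six-step lifecycle above can be sketched in a few lines of Python. This is a hypothetical, self-contained illustration: the "network" (step 3) is replaced by a direct function call, and the names (`InventoryClientStub`, `server_stub`, `checkStock`) are invented for the example, not taken from any real framework.

```python
import json

def check_stock(item_id: str, quantity: int) -> dict:
    """Server-side business logic: pretend SKU-12345 has 150 units on hand."""
    stock = {"SKU-12345": 150}
    on_hand = stock.get(item_id, 0)
    return {"available": on_hand >= quantity, "quantity": on_hand}

HANDLERS = {"checkStock": check_stock}

def server_stub(raw: bytes) -> bytes:
    """Steps 4-5: receive the message, unmarshal it, invoke the real function."""
    msg = json.loads(raw)
    result = HANDLERS[msg["method"]](**msg["params"])
    return json.dumps({"result": result}).encode()  # step 6: marshal the response

class InventoryClientStub:
    """Step 2: intercept the call and marshal it into a wire message."""
    def __init__(self, transport):
        self.transport = transport  # step 3 would be TCP/HTTP2; here it's a function

    def check_stock(self, item_id: str, quantity: int) -> dict:
        wire = json.dumps({"method": "checkStock",
                           "params": {"item_id": item_id, "quantity": quantity}})
        response = self.transport(wire.encode())      # send request, receive reply
        return json.loads(response)["result"]         # unmarshal for the caller

inventory = InventoryClientStub(transport=server_stub)
print(inventory.check_stock("SKU-12345", 2))  # → {'available': True, 'quantity': 150}
```

From the caller's perspective the last line reads like a local call; everything between the two stubs is the machinery a real RPC framework generates for you.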
Key Principles
principle: Location Transparency
explanation: The calling code should not need to know whether a function executes locally or remotely. This abstraction allows developers to refactor monoliths into microservices without rewriting business logic—just change the implementation from local to remote.
example: At Netflix, when they split their monolith, existing code calling getRecommendations() continued working unchanged. The function signature stayed identical; only the implementation changed from in-process to cross-service RPC.
principle: Strong Typing and Code Generation
explanation: RPC frameworks typically use Interface Definition Languages (IDLs) like Protocol Buffers or Thrift to define service contracts. Compilers generate type-safe client and server code, catching errors at compile time rather than runtime. This is a massive advantage over REST, where you’re passing untyped JSON and discovering schema mismatches in production.
example: Google’s internal services define APIs in Protocol Buffers. When a service changes its API, clients get compile-time errors immediately, preventing the ‘field renamed, production broke’ disasters common with REST APIs.
principle: Performance Through Binary Protocols
explanation: RPC frameworks often use compact binary serialization (Protobuf, Thrift, Avro) instead of verbose JSON. Combined with HTTP/2 multiplexing and connection reuse, this reduces latency and bandwidth consumption by 5-10x compared to REST.
example: Uber’s microservices communicate via TChannel (their RPC framework). Switching from JSON/REST to binary RPC reduced their average service-to-service latency from 50ms to 5ms, enabling them to chain more services in a single user request without exceeding their 200ms SLA.
principle: Tight Coupling by Design
explanation: RPC intentionally couples clients to server implementations. The client needs the server’s interface definition to generate stubs. This is a feature, not a bug, for internal systems where you control both sides and want compile-time safety. It becomes a liability for public APIs where you can’t control or even know all clients.
example: Stripe’s internal payment processing uses RPC between services they own, but exposes REST APIs to external developers. They can evolve internal RPC contracts with coordinated deploys, but must maintain REST API backward compatibility for years.
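To make the strong-typing principle concrete, here is a rough sketch of what IDL code generation buys you: typed request and response classes instead of free-form JSON dicts. The class and field names are invented for this example; real generated code (from Protobuf or Thrift) looks different but serves the same purpose.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CheckStockRequest:
    item_id: str
    quantity: int

@dataclass(frozen=True)
class CheckStockResponse:
    available: bool
    quantity: int

def check_stock(req: CheckStockRequest) -> CheckStockResponse:
    # Stand-in for a generated client stub; the signature *is* the contract.
    return CheckStockResponse(available=req.quantity <= 150, quantity=150)

resp = check_stock(CheckStockRequest(item_id="SKU-12345", quantity=2))
print(resp.available, resp.quantity)  # → True 150
# A typo like resp.quantty, or CheckStockRequest(quantity="2", ...), is flagged
# by a static type checker before deploy; with untyped JSON it surfaces at runtime.
```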
RPC vs REST: When to Use Each
Philosophical Differences
REST and RPC represent fundamentally different mental models for distributed systems. REST is resource-oriented: it models your system as a collection of resources (nouns) that clients manipulate through standard HTTP verbs (GET, POST, PUT, DELETE). A REST API for a banking system exposes resources like /accounts/123 and /transactions/456. RPC is action-oriented: it models your system as a collection of procedures (verbs) that clients invoke. An RPC API for the same system exposes procedures like transferMoney(fromAccount, toAccount, amount) and getAccountBalance(accountId). REST says ‘here are the things in my system; tell me what you want to do with them.’ RPC says ‘here are the operations I support; tell me which one you want.’ This philosophical difference cascades into every design decision.
Technical Trade-Offs
REST’s resource model maps naturally to HTTP semantics, making it cacheable by default. A GET /users/123 request can be cached by CDNs, reverse proxies, and browsers without any custom logic. RPC calls, even read-only ones, typically use POST requests (since they’re sending procedure parameters in the body), which breaks HTTP caching. You can work around this with custom cache headers, but you’re fighting the framework. However, RPC wins on performance for internal systems: binary serialization is 5-10x more compact than JSON, and frameworks like gRPC use HTTP/2 multiplexing to send multiple requests over a single connection. REST’s text-based JSON over HTTP/1.1 incurs significant overhead. For type safety, RPC is unmatched—your IDE autocompletes remote function calls and catches type errors before you commit code. REST APIs require runtime validation and manual deserialization, leading to the classic ‘undefined is not a function’ errors when the API changes.
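The compactness gap is easy to demonstrate. The sketch below is not the actual Protobuf wire format; it just shows the core idea that when both sides share a schema, field names never travel on the wire, while JSON repeats them in every message.

```python
import json
import struct

# A typical service-to-service payload: user, item, quantity, price in cents.
record = {"user_id": 1234567, "item_id": 89001, "quantity": 2, "price_cents": 1999}

json_bytes = json.dumps(record).encode()   # self-describing text: names + values
binary_bytes = struct.pack("<IIHI",        # schema known to both sides: values only
                           record["user_id"], record["item_id"],
                           record["quantity"], record["price_cents"])

print(len(json_bytes), len(binary_bytes))  # → 74 14
```

Here the binary encoding is roughly 5x smaller, consistent with the 5-10x figure above; real IDL formats add small field tags but also varint-compress integers.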
Decision Criteria
Choose RPC for internal microservices where you control both client and server, need low latency, and want strong typing. Google, Uber, Netflix, and Twitter all use RPC internally. Choose REST for public APIs where clients are diverse (web browsers, mobile apps, third-party developers), you need HTTP caching, and you want loose coupling to evolve the API independently. Stripe, GitHub, and Twilio expose REST APIs externally. The pattern is clear: RPC inside the firewall, REST at the boundary. A hybrid approach is common: Airbnb uses Thrift RPC between backend services but exposes REST APIs to mobile apps. This gives them performance and type safety internally while maintaining flexibility and compatibility externally. The key insight is that RPC and REST aren’t competing solutions—they solve different problems for different contexts.
RPC vs REST: Architectural Decision Matrix
graph TB
Start["API Design Decision"]
Q1{"Who are the clients?"}
Q2{"Performance requirements?"}
Q3{"Control both ends?"}
Q4{"Need HTTP caching?"}
Start --> Q1
Q1 -->|Internal services| Q2
Q1 -->|External/public| Q4
Q2 -->|Low latency critical<br/>< 10ms| RPC1["✓ Use RPC<br/><i>gRPC, Thrift</i>"]
Q2 -->|Standard latency<br/>> 50ms| Q3
Q3 -->|Yes, coordinated deploys| RPC2["✓ Use RPC<br/><i>Type safety matters</i>"]
Q3 -->|No, independent teams| REST1["✓ Use REST<br/><i>Loose coupling needed</i>"]
Q4 -->|Yes, read-heavy| REST2["✓ Use REST<br/><i>CDN/browser caching</i>"]
Q4 -->|No| Q5{"Client diversity?"}
Q5 -->|High<br/>web, mobile, 3rd party| REST3["✓ Use REST<br/><i>Flexibility needed</i>"]
Q5 -->|Low<br/>known clients| Hybrid["✓ Hybrid<br/><i>REST external, RPC internal</i>"]
subgraph Examples
Ex1["Google: RPC everywhere<br/><i>Monorepo, full control</i>"]
Ex2["Netflix: Hybrid<br/><i>REST for mobile, gRPC internal</i>"]
Ex3["Stripe: Hybrid<br/><i>REST public API, RPC internal</i>"]
end
Decision tree for choosing between RPC and REST based on client type, performance needs, and control. Most production systems use a hybrid approach: RPC for internal microservices where performance and type safety matter, REST for external APIs where flexibility and compatibility are critical.
How It Works
Step-by-Step Flow
Let’s trace a complete RPC call from a payment service to an inventory service checking if an item is in stock.
Step 1: Client Code Invocation — The payment service calls inventoryService.checkStock(itemId: 'SKU-12345', quantity: 2). This looks like a normal function call in the code.
Step 2: Client Stub Marshaling — The generated client stub intercepts this call. It serializes the method name (‘checkStock’), parameter names (‘itemId’, ‘quantity’), and values (‘SKU-12345’, 2) into a binary Protocol Buffer message or JSON payload. It also adds metadata like request ID, timeout, and authentication tokens.
Step 3: Network Transport — The client’s RPC library opens a TCP connection (or reuses an existing one) to the inventory service’s address (discovered via service registry). It sends the serialized message over this connection. If using gRPC, this happens over HTTP/2, allowing multiple concurrent requests on one connection.
Step 4: Server Reception — The inventory service’s RPC server receives the incoming bytes. Its communication module handles TCP reassembly and HTTP/2 framing, then passes the complete message to the server stub.
Step 5: Server Stub Unmarshaling — The server stub deserializes the message, extracting the method name and parameters. It validates the request (authentication, rate limiting) and invokes the actual business logic: InventoryService.checkStock('SKU-12345', 2).
Step 6: Response Path — The inventory service executes the query, finds the item has 150 units in stock, and returns {available: true, quantity: 150}. The server stub marshals this response, the network sends it back, the client stub unmarshals it, and the original payment service code receives the result—all in 10-20ms. If anything fails (network timeout, server crash, invalid parameters), the RPC framework throws an exception that looks like a local error, maintaining the abstraction.
Payment Service to Inventory Service: Real Request Flow
sequenceDiagram
participant PS as Payment Service<br/>(Client)
participant CS as Client Stub<br/>(Generated)
participant Net as Network<br/>(HTTP/2)
participant SS as Server Stub<br/>(Generated)
participant IS as Inventory Service<br/>(Server)
participant DB as Database
PS->>CS: checkStock('SKU-12345', 2)
Note over CS: Serialize to Protobuf<br/>Add metadata (requestID, auth)
CS->>Net: POST /InventoryService/checkStock<br/>Body: binary payload
Note over Net: Reuse existing TCP connection<br/>HTTP/2 multiplexing
Net->>SS: Receive binary message
Note over SS: Deserialize Protobuf<br/>Validate auth & rate limits
SS->>IS: checkStock('SKU-12345', 2)
IS->>DB: SELECT quantity FROM inventory<br/>WHERE sku='SKU-12345'
DB-->>IS: quantity: 150
IS-->>SS: {available: true, quantity: 150}
Note over SS: Serialize response to Protobuf
SS-->>Net: HTTP 200 OK<br/>Body: binary payload
Net-->>CS: Response received
Note over CS: Deserialize Protobuf<br/>Extract result
CS-->>PS: {available: true, quantity: 150}
Note over PS: Total time: 10-20ms
A complete RPC request showing how a payment service checks inventory stock. The client and server stubs handle all serialization, while the network layer manages connection reuse and HTTP/2 multiplexing. The business logic remains simple despite the underlying complexity.
Error Handling
RPC error handling is more complex than local calls because networks are unreliable. When you call a local function, it either returns a value or throws an exception—two outcomes. With RPC, you have at least five outcomes: success, explicit server error (business logic failure), timeout (request sent but no response), connection failure (couldn’t reach server), or partial failure (request succeeded but response lost). Modern RPC frameworks expose these as different exception types. gRPC has status codes like DEADLINE_EXCEEDED, UNAVAILABLE, and INTERNAL. The client must decide: should I retry? Was the operation idempotent? Did the server execute my request before timing out? This is why companies like Uber implement automatic retries with exponential backoff and idempotency keys for RPC calls. The framework can’t solve this—it requires application-level design.
RPC Error Scenarios: Five Possible Outcomes
stateDiagram-v2
[*] --> ClientInvoke: Client calls RPC
ClientInvoke --> RequestSent: Stub marshals & sends
RequestSent --> Success: Server processes & responds
RequestSent --> Timeout: No response within deadline
RequestSent --> ConnectionFail: Network unreachable
RequestSent --> ServerError: Server returns error
RequestSent --> PartialFailure: Request succeeded, response lost
Success --> [*]: Return result
Timeout --> RetryDecision: Was operation idempotent?
RetryDecision --> Retry: Yes, safe to retry
RetryDecision --> Fail: No, cannot retry
Retry --> RequestSent: Exponential backoff
Fail --> [*]: Throw DEADLINE_EXCEEDED
ConnectionFail --> CircuitBreaker: Check failure rate
CircuitBreaker --> Retry: Circuit closed
CircuitBreaker --> [*]: Circuit open, fail fast
ServerError --> [*]: Throw INTERNAL/INVALID_ARGUMENT
PartialFailure --> IdempotencyCheck: Check idempotency key
IdempotencyCheck --> [*]: Deduplicate or retry
note right of Timeout
Unknown state:
Did server execute?
Need idempotency keys
end note
note right of PartialFailure
Most dangerous:
Money transferred but
client thinks it failed
end note
RPC calls have five possible outcomes, unlike local calls which only succeed or throw exceptions. Timeouts and partial failures are particularly dangerous because the client doesn’t know if the server executed the operation. Production systems must implement idempotency keys, retry logic with exponential backoff, and circuit breakers to handle these scenarios safely.
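The retry-with-idempotency pattern described above can be sketched as follows. Everything here is hypothetical: `Unavailable` stands in for a transport error like gRPC’s UNAVAILABLE status, and the flaky server is simulated so the example runs on its own.

```python
import time
import uuid

class Unavailable(Exception):
    """Stand-in for a transport-level failure (e.g. gRPC UNAVAILABLE)."""

def call_with_retries(rpc, request, max_attempts=4, base_delay=0.01):
    """Retry an idempotent RPC with exponential backoff.

    The idempotency key lets the server deduplicate: even if a request
    succeeded but the response was lost, a retry won't execute it twice.
    """
    request = dict(request, idempotency_key=str(uuid.uuid4()))
    for attempt in range(max_attempts):
        try:
            return rpc(request)
        except Unavailable:
            if attempt == max_attempts - 1:
                raise
            time.sleep(base_delay * 2 ** attempt)  # 10ms, 20ms, 40ms, ...

# --- usage against a simulated flaky server ---
seen = {}          # idempotency_key -> cached response
calls = {"n": 0}   # how many times the "server" actually ran

def flaky_transfer(request):
    key = request["idempotency_key"]
    if key in seen:              # duplicate retry: return cached result
        return seen[key]
    calls["n"] += 1
    if calls["n"] < 3:           # first two attempts fail at the transport
        raise Unavailable()
    seen[key] = {"status": "ok"}
    return seen[key]

print(call_with_retries(flaky_transfer, {"amount": 100}))  # → {'status': 'ok'}
```

The key design point is that retries are only safe because the operation carries an idempotency key; without it, a retry after a lost response could transfer money twice.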
Versioning Challenges
RPC’s tight coupling creates versioning nightmares. When you update a service’s API, all clients must update their generated stubs. In a monorepo with coordinated deploys (Google’s model), this works: update the proto file, regenerate all stubs, deploy everything together. In a polyrepo with independent services (most companies), it’s painful. You must maintain backward compatibility: add new fields as optional, never remove fields, use field numbers carefully. Protocol Buffers and Thrift support this through field numbering—old clients ignore new fields, new servers provide defaults for missing fields. But you can’t change field types or reorder parameters without breaking clients. This is why REST’s loose coupling is valuable for public APIs: you can add new JSON fields without breaking existing clients. For RPC, the solution is either strict API governance (Google’s approach) or versioned endpoints (v1.PaymentService, v2.PaymentService), which fragments your client base.
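The backward-compatibility rules above (old clients ignore new fields, new servers default missing ones) can be mimicked with plain dicts. This is a simplified sketch of Protobuf-style semantics, not its actual wire behavior; the field names are invented.

```python
# A hypothetical v2 of a payment API added an optional 'currency' field.
V2_DEFAULTS = {"currency": "USD"}

def v2_server_decode(payload: dict) -> dict:
    """A v2 server reading a request from a v1 client: missing new
    fields fall back to defaults instead of failing."""
    return {**V2_DEFAULTS, **payload}

def v1_client_decode(response: dict) -> dict:
    """A v1 client reading a v2 response: it simply ignores fields it
    doesn't know about (its schema only contains 'amount')."""
    known_fields = {"amount"}
    return {k: v for k, v in response.items() if k in known_fields}

print(v2_server_decode({"amount": 1999}))                     # v1 request, v2 server
print(v1_client_decode({"amount": 1999, "currency": "EUR"}))  # v2 response, v1 client
```

What this cannot model is why changing a field’s type or reusing its number breaks compatibility: the binary wire format identifies fields by number and type, so redefining either silently corrupts decoding for old clients.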
Common Misconceptions
misconception: RPC is always faster than REST
why_wrong: Many candidates assume RPC’s binary protocols automatically make it faster, ignoring network latency and serialization costs.
truth: RPC is faster for internal, high-throughput communication where serialization overhead dominates (thousands of small requests per second). For a single request over the public internet, the 100ms network latency dwarfs the 1ms difference between JSON and Protobuf serialization. REST can actually be faster for cacheable read operations because HTTP caching is free. The real RPC performance win comes from HTTP/2 multiplexing and connection reuse in microservice meshes, not just binary encoding.
misconception: RPC and REST are mutually exclusive choices
why_wrong: Candidates often frame this as ‘pick one for your entire system,’ missing that most companies use both strategically.
truth: Production systems use RPC internally and REST externally. Netflix uses gRPC between backend services but REST for their public API. This hybrid approach gives you performance and type safety where you control both ends, and flexibility and compatibility where you don’t. The decision isn’t ‘RPC vs REST’—it’s ‘RPC for this use case, REST for that one.’
misconception: RPC is a legacy technology replaced by REST
why_wrong: The rise of REST in the 2000s led many to believe RPC was obsolete, but the opposite happened.
truth: RPC is more popular than ever in modern microservices. Google open-sourced gRPC in 2015, and it’s now used by Netflix, Uber, Square, Dropbox, and thousands of companies. The difference is that modern RPC (gRPC, Thrift) learned from REST’s mistakes: they use HTTP/2, support streaming, and have better tooling. Old RPC (CORBA, SOAP) was clunky and proprietary. New RPC is fast, open-source, and designed for cloud-native systems.
misconception: RPC calls should look exactly like local function calls
why_wrong: The original RPC vision was complete transparency, but this creates dangerous assumptions about reliability and latency.
truth: Network calls should be explicit, not hidden. Modern RPC frameworks make remote calls look similar to local calls for convenience, but good practice is to make them distinguishable—use async/await, explicit timeout parameters, or naming conventions like remoteInventoryService.checkStock(). This prevents developers from treating network calls like local calls and forgetting to handle timeouts, retries, and partial failures. The abstraction should reduce boilerplate, not hide reality.
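One way to keep remote calls distinguishable is to force every call site to state a deadline. The sketch below is hypothetical (the `remote_` naming convention and simulated latency are invented for illustration); it shows how an explicit timeout turns "the network is slow" into an error the caller must handle.

```python
import asyncio

async def remote_check_stock(item_id: str, *, simulated_latency: float) -> dict:
    """The 'remote_' prefix and awaited call signal a network hop, not local work."""
    await asyncio.sleep(simulated_latency)  # stands in for the network round-trip
    return {"available": True, "quantity": 150}

async def main() -> dict:
    # A fast call completes within its deadline...
    result = await asyncio.wait_for(
        remote_check_stock("SKU-12345", simulated_latency=0.01), timeout=0.2)
    # ...while a slow one surfaces as an explicit TimeoutError at the call site,
    # instead of silently hanging the way a hidden "local-looking" call would.
    try:
        await asyncio.wait_for(
            remote_check_stock("SKU-12345", simulated_latency=0.5), timeout=0.05)
    except asyncio.TimeoutError:
        result["timed_out_call_handled"] = True
    return result

print(asyncio.run(main()))
```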
misconception: You need RPC frameworks to implement RPC
why_wrong: Candidates think RPC requires gRPC or Thrift, missing that RPC is a pattern, not a specific technology.
truth: Any request-response protocol can implement RPC. You can build RPC over HTTP+JSON (JSON-RPC), WebSockets, or even message queues. Early web services used XML-RPC over HTTP. The framework (gRPC, Thrift) just provides tooling: code generation, efficient serialization, connection pooling. You’re trading implementation effort for features. For a simple internal service, a hand-rolled HTTP+JSON RPC might be fine. For a high-scale system, gRPC’s optimizations matter.
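To illustrate that RPC is a pattern rather than a framework, here is a minimal server-side dispatcher for the JSON-RPC 2.0 envelope (happy path only — a real implementation also needs the spec’s error objects and batch handling):

```python
import json

def handle_jsonrpc(raw: str, methods: dict) -> str:
    """Minimal JSON-RPC 2.0 dispatch: parse the envelope, call the method,
    echo the request id back so the client can match the response."""
    req = json.loads(raw)
    result = methods[req["method"]](**req["params"])
    return json.dumps({"jsonrpc": "2.0", "result": result, "id": req["id"]})

methods = {"add": lambda a, b: a + b}
request = json.dumps({"jsonrpc": "2.0", "method": "add",
                      "params": {"a": 2, "b": 3}, "id": 1})
print(handle_jsonrpc(request, methods))  # → {"jsonrpc": "2.0", "result": 5, "id": 1}
```

Wrap this in any HTTP handler and you have a working RPC endpoint; frameworks like gRPC add code generation, binary encoding, and connection management on top of exactly this request-response core.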
Real-World Usage
Google: gRPC
Google invented Protocol Buffers in 2001 and used internal RPC frameworks (Stubby) for all service-to-service communication. In 2015, they open-sourced gRPC, a modern RPC framework built on HTTP/2 and Protobuf. Today, Google processes over 10 billion gRPC calls per second internally. Every Google service—Search, Gmail, YouTube, Maps—uses gRPC. The key insight: when you control both client and server and need microsecond-level efficiency, RPC’s tight coupling becomes an advantage. Google’s monorepo and automated tooling make coordinated API changes trivial.
Uber: TChannel
Uber built TChannel, their own RPC framework, to handle 40+ microservices making millions of calls per second. They needed sub-10ms latency for the dispatch system (matching riders to drivers) and chose RPC over REST because JSON serialization was too slow. TChannel uses Thrift for schemas and implements automatic retries, circuit breaking, and distributed tracing. When a rider requests a ride, the mobile app makes one REST call to Uber’s API gateway, which triggers 20+ internal RPC calls (user service, payment service, dispatch service, etc.) that must complete in under 200ms total.
Netflix: Zuul
Netflix uses gRPC for internal microservice communication but REST for their public API. Their API gateway (Zuul) translates incoming REST requests from mobile apps into internal gRPC calls. This architecture gives them the best of both worlds: mobile developers use familiar REST, while backend services get RPC’s performance and type safety. Netflix processes over 2 billion API requests per day, with each request fanning out to 5-10 internal services via gRPC. They chose RPC internally because their microservices are written in Java, and gRPC’s generated stubs integrate seamlessly with their codebase.
Netflix Hybrid Architecture: REST External, gRPC Internal
graph TB
subgraph External Clients
Mobile["Mobile App<br/><i>iOS/Android</i>"]
Web["Web Browser<br/><i>React SPA</i>"]
end
subgraph API Gateway Layer
Zuul["Zuul Gateway<br/><i>REST → gRPC translator</i>"]
end
subgraph Internal Microservices
User["User Service<br/><i>gRPC</i>"]
Rec["Recommendation<br/><i>gRPC</i>"]
Video["Video Metadata<br/><i>gRPC</i>"]
Auth["Auth Service<br/><i>gRPC</i>"]
Billing["Billing Service<br/><i>gRPC</i>"]
end
subgraph Data Layer
Cache[("Redis Cache")]
DB[("Cassandra DB")]
end
Mobile --"1. GET /api/home<br/>REST/JSON"--> Zuul
Web --"REST/JSON"--> Zuul
Zuul --"2. getUserProfile()<br/>gRPC binary"--> User
Zuul --"3. getRecommendations()<br/>gRPC binary"--> Rec
Zuul --"4. getVideoMetadata()<br/>gRPC binary"--> Video
Zuul --"5. validateToken()<br/>gRPC binary"--> Auth
User --> Cache
Rec --> DB
Video --> DB
Auth --> Cache
Zuul --"6. Aggregate results<br/>Return JSON"--> Mobile
note1["Why REST externally?<br/>• Can't control client updates<br/>• HTTP caching for CDN<br/>• Familiar to developers"]
note2["Why gRPC internally?<br/>• Sub-10ms latency needed<br/>• Type safety across 500+ services<br/>• HTTP/2 multiplexing<br/>• 2B requests/day"]
Netflix’s hybrid architecture uses REST for external APIs (mobile/web clients) and gRPC for internal microservices. A single user request triggers 5-10 internal gRPC calls that must complete in under 200ms. This pattern—REST at the boundary, RPC inside—is common at companies like Uber, Airbnb, and Stripe.
Stripe: Internal RPC
Stripe exposes REST APIs to customers but uses RPC internally between their payment processing services. Their fraud detection system makes synchronous RPC calls to check transactions in real-time, requiring sub-50ms latency. They use a custom RPC framework built on HTTP+JSON (not gRPC) because they needed fine-grained control over serialization and error handling. The key lesson: you don’t need to adopt gRPC to get RPC benefits. Stripe’s approach shows that a well-designed HTTP+JSON RPC can deliver performance and type safety if you control both ends.
Interview Essentials
Mid-Level
At the mid-level, demonstrate you understand RPC’s core abstraction and when to use it. Explain that RPC makes remote calls look like local function calls, hiding network complexity. Contrast it with REST: ‘RPC is action-oriented (call this function), REST is resource-oriented (manipulate this resource).’ Describe the basic flow: client stub marshals parameters, network sends them, server stub unmarshals and invokes the function, response travels back. Mention that RPC is common for internal microservices (Google, Uber) because it’s faster and type-safe, while REST is better for public APIs because it’s flexible and cacheable. If asked to design a system, suggest RPC for low-latency internal communication and REST for external APIs. Show you know the trade-off exists, even if you can’t articulate every nuance.
Senior
Senior engineers must articulate the subtle trade-offs and demonstrate production experience. Explain why RPC’s tight coupling is a feature for internal systems: ‘When I control both client and server, I want compile-time safety. If I change an API, I want all callers to break at build time, not in production.’ Discuss versioning strategies: Protocol Buffers’ field numbering for backward compatibility, or versioned service endpoints. Mention specific frameworks (gRPC, Thrift, Avro) and their differences: gRPC uses HTTP/2 and Protobuf, Thrift is language-agnostic with pluggable transports. Describe error handling: ‘RPC calls can timeout, fail partially, or succeed but lose the response. I need idempotency keys and retry logic.’ In system design, proactively suggest RPC for internal services and justify it: ‘For the payment-to-inventory call, I’d use gRPC because we need sub-10ms latency and strong typing. For the mobile-to-backend call, REST because we can’t control client updates.’ Show you’ve made this decision in production.
Staff+
Staff+ engineers must demonstrate deep expertise and architectural judgment. Discuss RPC’s evolution: ‘Early RPC (CORBA, SOAP) failed because it was too complex and proprietary. Modern RPC (gRPC) succeeded by embracing HTTP/2, open standards, and cloud-native patterns.’ Explain when RPC’s disadvantages outweigh benefits: ‘For a public API with unknown clients, RPC’s tight coupling is a liability. I can’t force third-party developers to regenerate stubs when I change my API. REST’s loose coupling and HTTP caching are worth the performance cost.’ Describe advanced patterns: ‘At scale, we used gRPC streaming for real-time data pipelines, avoiding the overhead of establishing new connections for each message. We also implemented custom load balancing in the gRPC client to route requests based on server load, not just round-robin.’ Mention observability: ‘RPC frameworks need distributed tracing (OpenTelemetry) to debug cross-service calls. Without it, you can’t tell if a 500ms request was slow because of network latency, server processing, or cascading failures.’ Show you’ve operated RPC at scale and understand the operational challenges, not just the happy path.
Common Interview Questions
When would you choose RPC over REST? (Answer: Internal microservices where you control both ends, need low latency, and want type safety. REST for public APIs where clients are diverse and you need loose coupling.)
How does RPC handle versioning? (Answer: Protocol Buffers use field numbering for backward compatibility. Add new fields as optional, never remove fields. Alternatively, version your service endpoints: v1.PaymentService, v2.PaymentService.)
What happens if an RPC call times out? (Answer: The client doesn’t know if the server received the request, processed it, or failed. You need idempotency keys and retry logic. Modern frameworks expose timeout errors as specific exception types like DEADLINE_EXCEEDED.)
Why do companies like Google use RPC internally but REST externally? (Answer: RPC gives performance and type safety for internal systems where coordinated deploys are feasible. REST provides flexibility and compatibility for external APIs where you can’t control clients.)
How does gRPC achieve better performance than REST? (Answer: Binary Protobuf serialization is 5-10x more compact than JSON. HTTP/2 multiplexing allows multiple requests on one connection. Connection reuse eliminates TCP handshake overhead. Combined, this reduces latency from 50ms to 5ms for typical microservice calls.)
Red Flags to Avoid
Saying ‘RPC is always better than REST’ without acknowledging trade-offs (shows lack of nuance)
Not knowing any specific RPC framework (gRPC, Thrift, Avro) or how they differ (shows no hands-on experience)
Ignoring error handling and timeouts (treating RPC like local calls is dangerous)
Claiming RPC is ‘legacy’ or ‘replaced by REST’ (shows outdated knowledge—RPC is thriving in modern systems)
Not mentioning real companies that use RPC (Google, Uber, Netflix) or understanding why they chose it (shows lack of industry awareness)
Key Takeaways
RPC abstracts remote calls as local function calls, hiding network complexity behind familiar programming interfaces. The client stub marshals parameters, the network transports them, and the server stub unmarshals and invokes the actual function—all transparently to the caller.
Use RPC for internal microservices, REST for public APIs. RPC’s tight coupling and binary protocols deliver performance and type safety when you control both client and server. REST’s loose coupling and HTTP caching provide flexibility when clients are diverse and unknown. Most companies (Google, Netflix, Uber) use both strategically.
RPC’s main advantages are performance, type safety, and code generation. Binary serialization (Protobuf, Thrift) is 5-10x more compact than JSON. Generated client stubs catch errors at compile time. HTTP/2 multiplexing reduces latency. These benefits matter most for high-throughput internal communication.
RPC’s main disadvantages are tight coupling and versioning challenges. Clients depend on server interface definitions and must regenerate stubs when APIs change. Coordinated deploys are required unless you carefully maintain backward compatibility through optional fields and field numbering.
Modern RPC (gRPC, Thrift) is thriving in cloud-native systems. Despite REST’s popularity for web APIs, RPC dominates internal microservice communication at scale. Google processes 10 billion gRPC calls per second. Understanding when and why to use RPC—not just how it works—is critical for senior system design interviews.