The architectural impact of RPC in distributed systems

It has become increasingly important for software architects to understand the mechanics behind remote procedure call (RPC), particularly its role in distributed architectures.

The remote procedure call is, in many ways, an evolution of the well-established modular programming paradigm. Although developers do not use it universally, RPC has been widely adopted, particularly in leading-edge, cloud-based application development. Because many forward-looking companies have embraced distributed software systems, it's critical that RPC strategies are designed to support the complexity that distribution brings. Rather than looking at RPC as an extension of the past, IT architects should approach it as an on-ramp to the future.

Let's take a look at some of the basics of RPC, the issues that architectural styles like microservices can create for RPC-based mechanisms, and six techniques software teams can adopt to avoid issues related to RPC in distributed systems.

What is RPC?

The RPC concept is relatively simple: A remote software element, such as a microservice, is called in the same way that a local procedure would be called. To do this, a software-based mechanism is used to invoke a remote process or service using a "translated" version of the code that performs a local call. 
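To make the "translated" call concrete, here is a minimal in-process sketch. It is not a real RPC framework: the `Stub` class, the `fake_transport` function and the `add` procedure are all hypothetical names chosen for illustration, and a plain function call stands in for the network.

```python
import json

# Hypothetical "remote" procedure; in a real system it would live in another process.
def add(a, b):
    return a + b

REMOTE_PROCEDURES = {"add": add}

def fake_transport(request_bytes):
    """Stand-in for the network: decode the request, run it, encode the reply."""
    request = json.loads(request_bytes.decode("utf-8"))
    result = REMOTE_PROCEDURES[request["method"]](*request["args"])
    return json.dumps({"result": result}).encode("utf-8")

class Stub:
    """Client-side proxy that turns a method call into a marshalled request."""

    def __init__(self, transport):
        self._transport = transport

    def __getattr__(self, method_name):
        # Only reached for names that are not real attributes, such as "add".
        def remote_call(*args):
            request = json.dumps({"method": method_name, "args": list(args)})
            reply = self._transport(request.encode("utf-8"))
            return json.loads(reply.decode("utf-8"))["result"]
        return remote_call

service = Stub(fake_transport)
print(service.add(2, 3))  # reads like a local call, but goes through the transport
```

The calling code never sees the serialization step, which is both the appeal of RPC and the reason its costs are easy to overlook.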

By emulating local procedures, RPC allows familiar practices to be used in the unfamiliar world of distributed computing. However, it can disguise problems unless you account for the differences between the local procedures of old, RPC as used in web-facing applications, and the services and microservices of today.

Challenges of RPC in distributed systems

The difficulties that arise when introducing RPC within a distributed software system relate primarily to state management and the underlying call logic. For instance, translating a local procedure call to an RPC without considering state requirements can easily break an application. In a distributed architecture like microservices, a breakage like this can also trigger cascading failures that spread to the services that depend on the broken one.

Local procedures are typically stateful -- they store state data internally, and use that stored information to execute call sequences and process responses. However, microservices are typically configured to be stateless, with nothing stored internally between calls. Any RPC implementation related to microservices must account for the fact that it will likely need to either retrieve state from an abstracted source or adopt a mechanism that allows it to recognize state behavior.
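One way to picture "retrieving state from an abstracted source" is the sketch below. It assumes a shopping-cart service, and the `STATE_STORE` dictionary is a hypothetical stand-in for an external database or cache; neither comes from a specific product.

```python
# Hypothetical external state store; a dict stands in for a database or cache.
STATE_STORE = {}

def add_to_cart(session_id, item):
    """Stateless handler: load state, apply the change, write it back.

    Nothing is kept on the handler between calls, so any instance of the
    service could process the next request for the same session.
    """
    cart = list(STATE_STORE.get(session_id, []))
    cart.append(item)
    STATE_STORE[session_id] = cart
    return cart

add_to_cart("session-1", "book")
print(add_to_cart("session-1", "pen"))
```

Because the handler carries no memory of its own, it can be scaled out or replaced without losing the caller's progress.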

Another notable issue with RPC in distributed systems is the difference between synchronous and asynchronous calls. A synchronous call is one where the caller blocks until it receives a response before moving on to other application processes. In an asynchronous call, the caller moves on to other work and either checks periodically for the response or is notified when it arrives. With microservices, a team will need to make sure that the specific RPC mechanism is adequately prepared to handle asynchronous calls, such as by providing stand-in data that a calling service can use to continue operations while it waits for its expected response.
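The stand-in data idea can be sketched with Python's `asyncio`. The `fetch_price` coroutine, its simulated delay and the shape of the placeholder record are all assumptions made for this example.

```python
import asyncio

async def fetch_price(item, delay):
    """Hypothetical remote call; the sleep simulates network and service time."""
    await asyncio.sleep(delay)
    return {"item": item, "price": 9.99, "stale": False}

async def price_or_placeholder(item, delay, timeout=0.05):
    """Return the real reply if it arrives in time, otherwise stand-in data."""
    try:
        return await asyncio.wait_for(fetch_price(item, delay), timeout)
    except asyncio.TimeoutError:
        # Stand-in data lets the caller continue while the answer is unknown.
        return {"item": item, "price": None, "stale": True}

fast = asyncio.run(price_or_placeholder("book", delay=0.0))
slow = asyncio.run(price_or_placeholder("book", delay=1.0))
print(fast["stale"], slow["stale"])
```

Marking the placeholder as stale is a design choice: it lets downstream code tell a real answer from a provisional one.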

The choice between synchronous and asynchronous calls also affects scalability. When services are called synchronously, the caller is stuck waiting until it receives its expected response. A service that's designed to be scalable but is only ever called synchronously by a single parent service will not scale in practice, because the blocked caller can never issue a second call while the first is outstanding.

Idempotency and RPC

Idempotency, the property that repeating a request has the same effect as making it once, plays an important role in RESTful API development and presents a particular caveat in RPC. By convention, RPC over HTTP limits its inventory of HTTP methods to GET (an idempotent method) and POST (a non-idempotent method). Developers using RPC in the midst of RESTful API development will need to pick one of these two methods and stick with it. Where POST is used, take special care that a retried or duplicated request cannot apply the same change twice and destabilize the call logic.
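A common way to make a POST-style operation safe to retry is an idempotency key supplied by the caller. The sketch below assumes a withdrawal operation; `post_withdraw`, `PROCESSED` and `BALANCE` are hypothetical names, and in-memory dictionaries stand in for durable storage.

```python
import uuid

PROCESSED = {}                 # idempotency key -> stored response (hypothetical cache)
BALANCE = {"acct-1": 100}      # hypothetical account data

def post_withdraw(idempotency_key, account, amount):
    """Apply the withdrawal once per key; replay the stored reply on retries."""
    if idempotency_key in PROCESSED:
        return PROCESSED[idempotency_key]
    BALANCE[account] -= amount
    response = {"account": account, "balance": BALANCE[account]}
    PROCESSED[idempotency_key] = response
    return response

key = str(uuid.uuid4())        # generated once by the caller, reused on retry
first = post_withdraw(key, "acct-1", 30)
retry = post_withdraw(key, "acct-1", 30)
print(first == retry, BALANCE["acct-1"])
```

The retry returns the original response without touching the balance a second time.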

Six essential strategies for RPC in distributed systems

Given the issues detailed above, developers need a strategy that can help them deal with challenges related to state and call process management. Here are six strategies your team can use to keep these problems to a minimum.

Implement asynchronous communication

All individual microservices, containers and other software components should be called asynchronously, which means that those making the calls should expect that they might not receive an immediate response to a request. This is essential for cloud-native operations, where synchrony will impact a developer's ability to scale or replace a microservice when needed. Keep in mind, however, that mixing synchronous and asynchronous behaviors within a single RPC implementation is a potential invitation to mismatched communications and recurring message failures.
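As a rough illustration of what asynchronous calling buys, the sketch below issues three calls concurrently with `asyncio.gather`. The service names and the 0.1-second delays are invented for the example.

```python
import asyncio

async def call_service(name, delay):
    """Hypothetical remote call; the sleep simulates the service's response time."""
    await asyncio.sleep(delay)
    return f"{name}: ok"

async def fan_out():
    # All three calls are in flight at once; no call waits for another to finish.
    return await asyncio.gather(
        call_service("inventory", 0.1),
        call_service("pricing", 0.1),
        call_service("shipping", 0.1),
    )

results = asyncio.run(fan_out())
print(results)
```

Issued this way, the total wait is roughly the longest single delay rather than the sum of all three, and `gather` returns the results in the order the calls were listed.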

Go beyond REST and HTTP

REST over HTTP is not the only way to implement RPC. If your applications are not primarily about web-style information exchange, you may want to consider options such as JSON-RPC, a lightweight JSON-based protocol that is not tied to a particular transport, or gRPC, which runs over HTTP/2 and uses Protocol Buffers encoding by default. These options offer developers more functionality and flexibility for the high-performance, service-to-service communication that is common in enterprise-level environments.
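For a sense of what JSON-RPC looks like on the wire, here is a small handler that follows the JSON-RPC 2.0 message layout. The `METHODS` table and the `subtract` method are hypothetical, and the sketch covers only a successful call and the "method not found" error, not batching or notifications.

```python
import json

METHODS = {"subtract": lambda a, b: a - b}  # hypothetical method table

def handle(raw_request):
    """Dispatch one JSON-RPC 2.0 request with positional params."""
    request = json.loads(raw_request)
    method = METHODS.get(request["method"])
    if method is None:
        reply = {
            "jsonrpc": "2.0",
            "error": {"code": -32601, "message": "Method not found"},
            "id": request.get("id"),
        }
    else:
        reply = {
            "jsonrpc": "2.0",
            "result": method(*request["params"]),
            "id": request.get("id"),
        }
    return json.dumps(reply)

request = json.dumps(
    {"jsonrpc": "2.0", "method": "subtract", "params": [42, 23], "id": 1}
)
print(handle(request))
```

The `id` field is what lets a caller match a reply to its request, which matters once several calls are outstanding at the same time.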

Inspect underlying communication mechanisms

RPC can run via TCP or UDP, and the two choices behave and perform differently. TCP is stateful and will recover lost packets, but it introduces latency. UDP is faster and handles scaling and load balancing better, but it requires some application-level response to packet loss.
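An application-level response to packet loss usually means a timeout and a retry. The sketch below simulates that without real sockets: `LossyChannel` is a hypothetical stand-in for a datagram transport that drops a fixed number of sends, and a `None` reply represents a timeout.

```python
class LossyChannel:
    """Simulated datagram channel that drops the first `drops` sends."""

    def __init__(self, drops):
        self.drops = drops
        self.attempts = 0

    def send(self, payload):
        self.attempts += 1
        if self.attempts <= self.drops:
            return None  # packet lost: no reply arrives before the timeout
        return b"ack:" + payload

def call_with_retry(channel, payload, max_attempts=3):
    """Resend until a reply arrives or the attempt budget is spent."""
    for _ in range(max_attempts):
        reply = channel.send(payload)
        if reply is not None:
            return reply
    raise TimeoutError(f"no reply after {max_attempts} attempts")

channel = LossyChannel(drops=2)
print(call_with_retry(channel, b"ping"), channel.attempts)
```

Retries like this are only safe when the remote operation can tolerate being delivered more than once.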

Get strict with state control

Avoid having both the client and the server maintain state and handle errors. If the server side has to maintain state, use context handles as the strategy. If there is no true "server side" because the RPC links something to a service or microservice element, consider back-end state control to preserve scalability.
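The context-handle approach can be sketched as follows: the server owns the state, and the client holds only an opaque token that it passes back on each call. `CounterServer` and its methods are hypothetical names for illustration.

```python
import uuid

class CounterServer:
    """Server-side state lives here; clients only ever see an opaque handle."""

    def __init__(self):
        self._contexts = {}

    def open_context(self):
        handle = uuid.uuid4().hex       # opaque token, meaningless to the client
        self._contexts[handle] = {"count": 0}
        return handle

    def increment(self, handle):
        context = self._contexts[handle]  # KeyError if the handle is unknown or closed
        context["count"] += 1
        return context["count"]

    def close_context(self, handle):
        self._contexts.pop(handle, None)  # release the server-side state

server = CounterServer()
handle = server.open_context()
server.increment(handle)
print(server.increment(handle))
server.close_context(handle)
```

Keeping a single owner for the state means there is only one place where it can drift out of date.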

Prepare for version rollbacks

If RPC APIs are expected to change in a functional way with some regularity, ensure that API changes are backward compatible where possible. Some API changes may be too profound to support different versions between called and calling elements. Design APIs such that new versions are developed as supersets of the old ones, which makes it easier to make them backward compatible.
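A superset-style change might look like the sketch below, where a second version of a handler adds an optional field with a default. The `create_user` handlers and the `locale` field are invented for the example.

```python
def create_user_v1(request):
    """Original handler: only `name` is understood."""
    return {"name": request["name"]}

def create_user_v2(request):
    """Superset of v1: adds an optional `locale` field with a default value."""
    return {
        "name": request["name"],
        "locale": request.get("locale", "en-US"),
    }

old_client_request = {"name": "Ada"}                     # sent by a v1 caller
new_client_request = {"name": "Ada", "locale": "fr-FR"}  # sent by a v2 caller

print(create_user_v2(old_client_request))
print(create_user_v2(new_client_request))
```

Because the new field is optional, callers written against the old version keep working unchanged, and every field they relied on is still present in the reply.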

Practice good coding hygiene

Finally, perform regular thread cleanup and garbage collection as required by the specifications for your team's RPC implementation. This is more likely to be an issue with explicit client-to-server RPC exchanges than with microservices. However, each RPC implementation -- and sometimes each language you use -- will have its own rules that developers must follow to avoid misplacing or duplicating resources. If it's likely that the implementation will be populated with multiple languages, try to establish and enforce a consistent approach to resource management upfront to avoid confusion down the line.
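In Python, one consistent approach to resource management is to wrap every connection in a context manager so that cleanup runs even when a call fails. The `Channel` class below is a hypothetical stand-in for whatever connection object your RPC implementation provides.

```python
from contextlib import contextmanager

class Channel:
    """Hypothetical RPC channel that tracks how many instances are still open."""
    open_channels = 0

    def __init__(self):
        Channel.open_channels += 1

    def call(self, method):
        if method == "explode":
            raise RuntimeError("remote call failed")
        return f"{method}: ok"

    def close(self):
        Channel.open_channels -= 1

@contextmanager
def open_channel():
    channel = Channel()
    try:
        yield channel
    finally:
        channel.close()  # runs on success and on failure

with open_channel() as channel:
    print(channel.call("ping"))

try:
    with open_channel() as channel:
        channel.call("explode")
except RuntimeError:
    pass

print(Channel.open_channels)  # no channels are left open after either block
```

Other languages have their own equivalents, which is why it helps to agree on the pattern before several languages share one implementation.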
