Synchronous vs asynchronous communications: A complete guide

Synchronous execution requires parties or components to work simultaneously in real time, while asynchronous communications don't need systems to wait for a reply.

All software systems are underpinned by fundamental choices that shape key characteristics of the system's design.

One such foundational element is how different components within a system communicate with each other. Synchronous and asynchronous communication are the two predominant choices in this area.  

Software architects and developers must understand the differences between synchronous vs. asynchronous communications and how they apply to program execution and systems design. A synchronous system is one where two or more components communicate directly and wait for a response before continuing execution. In an asynchronous system, the design assumes that a response will come later and communication often occurs through indirect message passing.

Explore the details of synchronous and asynchronous communications -- including their behavior in hardware, cloud and microservices -- as well as some scenarios that illustrate how these two communication approaches work.

What is synchronous communication?

In synchronous communication, once a communication has been initiated, the sender waits for a receiver to respond before continuing the execution of the program. The system in such a model moves in a lock-step style, where the sequence of events and execution -- whether successful or failed -- is coupled and chronologically deterministic.

A real-time customer support chat is a common example of synchronous communication. Both the support specialist and customer are actively engaged in the same session, exchanging messages in real time. The chat flow is sequential, real-time and predictable. This ensures immediate feedback and consistency, but can introduce latency since each side waits for the other's response.

Synchronous communication typically uses request-response protocols and mechanisms that hold a connection open for the duration of the exchange, such as HTTP or gRPC calls over TCP. Each request-response cycle consumes active system resources -- including network connections and threads -- until completion. While this design simplifies coordination and ensures predictable execution, it also lets performance bottlenecks and failures cascade to callers when one of the services becomes slow or unavailable.
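The lock-step behavior described above can be sketched in a few lines of Python. This is a minimal illustration, not a real service: `inventory_service`, `pricing_service` and `handle_order` are hypothetical names, and `time.sleep` stands in for network and processing time.

```python
import time


def inventory_service(item):
    """Hypothetical downstream service; the sleep simulates its latency."""
    time.sleep(0.05)
    return 3  # units in stock


def pricing_service(item):
    """Second hypothetical downstream service."""
    time.sleep(0.05)
    return 9.99


def handle_order(item):
    # Each call blocks the caller until the callee returns, so the steps
    # run strictly in sequence and their latencies add up.
    stock = inventory_service(item)
    price = pricing_service(item)
    return {"item": item, "in_stock": stock > 0, "price": price}


start = time.perf_counter()
result = handle_order("widget")
elapsed = time.perf_counter() - start
print(result, f"took {elapsed:.2f}s")
```

Because the caller is blocked for the whole chain, a slowdown in either simulated service shows up directly in the caller's response time, which is the cascading effect the text describes.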

What is asynchronous communication?

In asynchronous communication, the components operate independently, so the order in which messages are sent and received is not guaranteed. Because execution is not chronologically consistent, the flow design must account for messages that arrive late or out of order. Each component processes the message at its own pace, often through a message queue, event bus or notification system.

One example of this communication pattern would be an email sent between departments. The message might wait some time before anyone reads it, and the sender carries on with other work in the meantime. The two parties in an asynchronous exchange do not interact in real time. In fact, either party might be completely unaware of who they are interacting with and when the next response will arrive. Asynchronous communication is particularly valuable for reporting and alerts, such as a manufacturing application that monitors the temperature of an industrial furnace, continually transmitting status updates and automatically sending alerts.

These two forms of data transmission can be easily understood in terms of human communication, but are significantly more challenging for architects and developers to apply in software design, especially when systems must operate under strict adherence to an SLA.

Under the hood, asynchronous communication uses mechanisms such as message queues -- e.g., RabbitMQ, Kafka and AWS SQS -- publish and subscribe architectures, and event-driven systems. This architecture improves scalability and resilience. The tradeoff is eventual consistency. The system might take some time to reach a consistent state across components.
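As a rough sketch of the queue-based pattern, the following Python example uses the standard library's in-process `queue.Queue` as a stand-in for a real broker such as RabbitMQ or SQS. The furnace readings, the `consumer` function and the `None` stop sentinel are assumptions made for illustration.

```python
import queue
import threading
import time

messages = queue.Queue()  # in-process stand-in for a message broker
processed = []


def consumer():
    # The consumer works through messages at its own pace.
    while True:
        msg = messages.get()
        if msg is None:  # sentinel value: no more messages
            break
        time.sleep(0.01)  # simulated processing time
        processed.append(msg)


worker = threading.Thread(target=consumer)
worker.start()

# The producer enqueues each reading and moves on without waiting for a reply.
for reading in ("temp=810C", "temp=815C", "ALERT temp=905C"):
    messages.put(reading)
print("producer finished enqueuing")

messages.put(None)  # ask the consumer to stop once the queue is drained
worker.join()
print(processed)
```

Between the last `put` and the `join`, the `processed` list may lag behind what the producer has sent. That gap is a small-scale version of the eventual consistency tradeoff.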

Comparing synchronous vs asynchronous communication

Synchronous and asynchronous methods each have their potential benefits and drawbacks, but choosing the correct method depends on the application's purpose.

Synchronous communication is simpler in design but carries the risk of spreading failures across services. To mitigate that risk, the architect typically needs to implement sophisticated service discovery and application load balancing among microservices.

On the other hand, asynchronous communication trades architectural simplicity and data consistency for resilience and scalability. Asynchronous designs often provide better control over failures than synchronous setups. Consider starting with a synchronous system to optimize for speed of evolution and then switch to asynchronous communications once the microservices architecture grows.

Both synchronous and asynchronous communication patterns have their place in modern system design. The key is to match the approach to the interaction model, performance needs and failure tolerance.

Synchronous and asynchronous communications are not competing paradigms; they are complementary design approaches. Each has its strengths depending on whether the system prioritizes simplicity or scalability. In practice, most modern architectures use a mix of both: synchronous communication for user-facing, immediate-response needs and asynchronous communication for background or distributed processes. The most effective architectures use both patterns contextually, optimizing for speed where responsiveness matters and for decoupling where resilience and elasticity are more important.

Best practices for synchronous and asynchronous communication

Consistency in inter-service communication is one of the main challenges in a distributed architecture, such as microservices. There are three approaches to address this challenge. Communications between services in a microservices architecture can be:

  • Decentralized and synchronous. Each service handles control flow and makes direct synchronous calls to other services. This is simpler to design but introduces strong coupling and potential latency issues. Failures and performance degradation can cascade.
  • Choreographed and asynchronous. Services communicate through events published to a message queue or broker. This enables scalability and decoupling, but the resulting design is harder to debug and trace.
  • Centralized orchestration. A hybrid approach where a central orchestrator manages workflows using both synchronous and asynchronous interactions. Services focus on their tasks, while orchestration logic remains external and configurable. This promotes loose coupling and centralization of the flow. However, it introduces a single point of failure that needs to be scaled effectively.

In the first two approaches, no single place describes the system's overall behavior. Business flow logic is either embedded inside the services or in the event bindings between the producers and consumers. In a decentralized and synchronous communications pattern, each service receives flow control, makes subsequent synchronous calls to other services and passes control to the next service. In choreographed and asynchronous service communications, the service publishes events to a central message queue that distributes those events.
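To make the choreographed case concrete, here is a minimal in-process sketch. The `EventBus` class, the topic names and the two services are hypothetical; a real system would use a broker, and `drain` only simulates its delivery loop.

```python
from collections import defaultdict, deque


class EventBus:
    """Toy stand-in for a message broker (illustrative only)."""

    def __init__(self):
        self._handlers = defaultdict(list)
        self._pending = deque()

    def subscribe(self, topic, handler):
        self._handlers[topic].append(handler)

    def publish(self, topic, payload):
        # Publishing only enqueues the event; the publisher does not wait
        # for any consumer to handle it.
        self._pending.append((topic, payload))

    def drain(self):
        # Simulates the broker delivering queued events to subscribers.
        while self._pending:
            topic, payload = self._pending.popleft()
            for handler in self._handlers[topic]:
                handler(payload)


bus = EventBus()
log = []


def payment_service(order):
    log.append(f"charged {order['id']}")
    bus.publish("payment.completed", order)


def shipping_service(order):
    log.append(f"shipped {order['id']}")


# The business flow exists only in these bindings between topics and handlers.
bus.subscribe("order.placed", payment_service)
bus.subscribe("payment.completed", shipping_service)

bus.publish("order.placed", {"id": 42})
bus.drain()
print(log)
```

Nothing in this sketch states the order-to-shipment flow in one place. It has to be reconstructed from the subscriptions, which is why this style tends to be harder to trace.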

In centralized orchestration, business workflow knowledge is in a centralized location, and services focus on their individual responsibilities. The orchestrator sequences the various service calls based on a defined workflow. That sequence is not embedded within the participating services.

To enable both synchronous and asynchronous communication between microservices, keep flow sequencing separate from individual services. Service-based flows are difficult to decouple. Instead, design an architecture that supports both asynchronous and synchronous communication. Then, allow the orchestrator to switch the communication pattern for the specific service, as in the figure below.

Figure: Orchestrator service

This approach enables a simple, decoupled architecture that is easy to read. It also helps support interoperability between protocols and payload transformation between services.
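One way to picture an orchestrator that chooses the communication pattern per service is the sketch below. It is an assumption-laden illustration rather than a reference design: the step functions, the `WORKFLOW` table and the `"sync"`/`"async"` mode labels are all hypothetical, and a thread pool stands in for asynchronous dispatch.

```python
from concurrent.futures import ThreadPoolExecutor


def reserve_stock(order):
    return f"reserved {order}"


def charge_card(order):
    return f"charged {order}"


def send_email(order):
    return f"emailed {order}"


# The workflow lives outside the services; switching a step's mode is a
# one-word change here and requires no change to the service itself.
WORKFLOW = [
    ("reserve_stock", reserve_stock, "sync"),
    ("charge_card", charge_card, "sync"),
    ("send_email", send_email, "async"),
]


def run_workflow(order, workflow=WORKFLOW):
    results, pending = {}, {}
    with ThreadPoolExecutor(max_workers=2) as pool:
        for name, step, mode in workflow:
            if mode == "sync":
                results[name] = step(order)  # wait before the next step
            else:
                pending[name] = pool.submit(step, order)  # do not wait
        # Collect background results before reporting the outcome.
        for name, future in pending.items():
            results[name] = future.result()
    return results


print(run_workflow("order-1"))
```

The services stay unaware of sequencing, which keeps the flow definition in one configurable place.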

Synchronous vs. asynchronous communication considerations

Several issues can arise with both synchronous and asynchronous communication processes -- all of which can significantly affect the performance of an application system. These challenges are often exacerbated in distributed systems, particularly when it comes to concurrency, workflow and component tracking.

Clock skew

Clock skew is a situation where the clock signal reaches linked digital components at different times, which significantly affects a synchronous system's performance. It is particularly problematic in densely designed systems that host large numbers of components.

Clock skew is even more damaging in asynchronous communication. It is a challenge to ensure each module and constituent component's clock remains synchronized with the others. Read-and-write storage operations are likely to occur within milliseconds of each other. Without clock synchronization, I/O operations can be recorded or applied in the wrong order.
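A small worked example shows how a few milliseconds of skew can reorder operations. The node names, offsets and the `Event` class below are invented for illustration; `true_time_ms` represents the real order, which a running system cannot observe directly.

```python
from dataclasses import dataclass


@dataclass
class Event:
    node: str
    action: str
    true_time_ms: int     # when the operation really happened
    clock_offset_ms: int  # how far this node's clock is off

    @property
    def local_timestamp_ms(self):
        # The timestamp the node would actually record.
        return self.true_time_ms + self.clock_offset_ms


events = [
    Event("node-a", "write x=1", true_time_ms=1000, clock_offset_ms=8),
    Event("node-b", "write x=2", true_time_ms=1003, clock_offset_ms=-4),
]

by_true = [e.action for e in sorted(events, key=lambda e: e.true_time_ms)]
by_local = [e.action for e in sorted(events, key=lambda e: e.local_timestamp_ms)]

print(by_true)   # ['write x=1', 'write x=2']
print(by_local)  # ['write x=2', 'write x=1']
```

The writes happened 3 ms apart, but node-a's clock runs 8 ms fast and node-b's runs 4 ms slow. The recorded timestamps are therefore 1008 and 999, and sorting by them reverses the real order.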

Data storage and integrity

Cloud data storage, especially cloud backup for on-premises systems, can put primary and backup data in different locations. Remote synchronous replication requires that each write be committed at both the primary and backup storage locations before it is acknowledged.

Asynchronous replication, while faster, introduces lag between copies, which can affect data accuracy during recovery or analytics. Another challenge is the need to correlate multiple data streams that encompass both synchronous and asynchronous collection methods, which is particularly present in data mining and streaming analytics.
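The difference between the two replication modes can be sketched with a toy in-memory store. `ReplicatedStore` and its methods are hypothetical, and dictionaries stand in for the primary and backup locations.

```python
from collections import deque


class ReplicatedStore:
    """Toy model of a primary with one backup copy (illustrative only)."""

    def __init__(self, synchronous):
        self.synchronous = synchronous
        self.primary = {}
        self.backup = {}
        self._backlog = deque()

    def write(self, key, value):
        self.primary[key] = value
        if self.synchronous:
            # Acknowledge only after both copies hold the value.
            self.backup[key] = value
        else:
            # Acknowledge immediately; the backup catches up later.
            self._backlog.append((key, value))
        return "ack"

    def replicate_pending(self):
        # Background step that applies queued writes to the backup.
        while self._backlog:
            key, value = self._backlog.popleft()
            self.backup[key] = value


sync_store = ReplicatedStore(synchronous=True)
sync_store.write("invoice", 100)
print(sync_store.backup)   # backup already has the write

async_store = ReplicatedStore(synchronous=False)
async_store.write("invoice", 100)
print(async_store.backup)  # backup is still empty: replication lag
async_store.replicate_pending()
print(async_store.backup)  # backup has caught up
```

If the primary failed before `replicate_pending` ran, recovery from the backup would miss the acknowledged write. That is the accuracy risk described above.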

Tracking and observability

In most monolithic application architectures, the system's behavior is relatively evident from the app design. However, when the underlying architecture consists of distributed services, it becomes more challenging to track the flow of communication. A correlation ID or tracing ID, along with centralized logging and tracing frameworks, is essential because it helps maintain visibility and accountability across service boundaries.
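As a minimal sketch of the correlation ID idea, the example below uses Python's standard `logging` and `contextvars` modules to stamp every log line with the ID of the request being handled. The `handle_request` and `charge` functions, the logger name and the log format are assumptions. In a real deployment the ID would usually arrive in a request header and be forwarded to downstream services.

```python
import contextvars
import io
import logging
import uuid

# Holds the ID of the request currently being processed.
correlation_id = contextvars.ContextVar("correlation_id", default="-")


class CorrelationFilter(logging.Filter):
    def filter(self, record):
        # Attach the current ID so the formatter can include it.
        record.correlation_id = correlation_id.get()
        return True


stream = io.StringIO()  # stands in for a centralized log sink
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(correlation_id)s %(message)s"))
handler.addFilter(CorrelationFilter())

logger = logging.getLogger("orders")
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False


def charge(order_id):
    logger.info("charging order %s", order_id)


def handle_request(order_id, incoming_id=None):
    # Reuse the caller's ID when one is supplied; otherwise start a new trace.
    cid = incoming_id or uuid.uuid4().hex
    token = correlation_id.set(cid)
    try:
        logger.info("received order %s", order_id)
        charge(order_id)
    finally:
        correlation_id.reset(token)
    return cid


handle_request(7, incoming_id="abc123")
print(stream.getvalue())
```

Every line logged while a request is in flight carries the same ID, so searching the central log for that ID reconstructs the path the request took.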

Priyank Gupta is a polyglot technologist who is well versed in the craft of building distributed systems that operate at scale. He is an active open source contributor and speaker who loves to solve difficult business challenges using technology at scale.
