Date: 2025-04-07

Rapid growth is a blessing and a curse for any application. It provides both increased revenue and increased technical challenges. To mitigate these challenges, developers should consider cloud design patterns.

Design patterns that govern cloud-based applications often aren't discussed until companies reach a certain scale. There are countless design patterns to choose from, but one of the biggest challenges is dealing with scale once it becomes necessary. Several design patterns help workloads scale by making a cloud-based application more fault-tolerant and more resistant to problems caused by increased traffic.

Review these five cloud design patterns to help developers better handle unexpected increases in throughput:

1. Bulkhead
2. Retry
3. Circuit breaker
4. Queue-based load leveling (QBLL)
5. Throttling

1. Bulkhead

Named after the partitions of a ship that help isolate flooding, the bulkhead pattern prevents a single failure within an application from cascading into a total failure. While implementing this pattern isn't always obvious, it is typically found in applications that can operate under degraded performance.

An application that implements the bulkhead pattern is built with resiliency in mind. While not all operations are possible when email or caching layers go down, the application can still function with enough foresight and communication to the end user.

The bulkhead pattern ensures application functionality by isolating different parts of the app. With isolated application sections that can operate independently of one another, subsystem failures can safely reduce the application's overall functionality without shutting everything down.

A good example of the bulkhead pattern in action is any application that can operate in offline mode. While most cloud-based applications require an external API to reach their full potential, fault-tolerant clients can operate without the cloud by relying on cached resources and other workarounds to ensure the client is marginally usable.
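One way to picture the isolation described above is a small wrapper that gives each dependency its own limited pool of slots and a fallback. The following is a minimal Python sketch, not a definitive implementation. The `Bulkhead` class, the `send_email` and `load_profile` functions, and the slot counts are all hypothetical names and values chosen for illustration.

```python
import threading


class Bulkhead:
    """Caps concurrent calls into one dependency and contains its failures."""

    def __init__(self, name, max_concurrent):
        self.name = name
        self._slots = threading.BoundedSemaphore(max_concurrent)

    def call(self, func, fallback):
        # If every slot is busy, degrade immediately instead of queuing up
        # callers and starving the rest of the application.
        if not self._slots.acquire(blocking=False):
            return fallback()
        try:
            return func()
        except Exception:
            # Simplification for this sketch: any error counts as a
            # subsystem failure and yields the degraded result.
            return fallback()
        finally:
            self._slots.release()


def send_email():
    """Hypothetical email call that simulates an outage."""
    raise ConnectionError("mail server unreachable")


def load_profile():
    """Hypothetical call into a healthy subsystem."""
    return {"user": "alice"}


# Each subsystem gets its own compartment, so the email outage cannot
# consume capacity reserved for profile lookups.
email_bulkhead = Bulkhead("email", max_concurrent=2)
profile_bulkhead = Bulkhead("profile", max_concurrent=10)

email_status = email_bulkhead.call(send_email, fallback=lambda: "email queued for later")
profile = profile_bulkhead.call(load_profile, fallback=lambda: {"user": "unknown"})
print(email_status)
print(profile)
```

In this sketch the email failure produces a degraded but usable result, while the profile lookup is unaffected. A production system would likely catch narrower exception types and report the degraded state to the end user.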

2. Retry

In many applications, failure is a final state. However, in more resilient services, a failed request can potentially be resent.

The retry pattern, a common cloud design pattern for third-party interactions, encourages applications to expect failures. Processes that implement the retry pattern create fault-tolerant systems that require minimal long-term maintenance. These processes are implemented with the ability to retry failed operations safely.

The retry pattern enables failed requests to be resent a limited number of times to encourage success. It is often seen in webhook implementations. When one service tries to send a webhook to another service, that request can do one of two things:

Succeed. If it succeeds, then the operation is completed.
Fail. If it fails, the sending service can resend the webhook a limited number of times until the request is successful.

To avoid overloading the target system, many webhook implementations use exponential backoff, increasing the delay between successive requests to give a faulty destination time to recover before the sender gives up. These delays often introduce some randomness in the timing -- called jitter -- to prevent synchronized retries from further overwhelming the system.

The retry pattern only works when both the sender and receiver know that failed requests can be resent. In the webhook example, a unique identifier for each webhook is often provided. The receiver can then validate that a request is never processed more than once. This avoids duplicates even when errors on the sender's side cause it to resend data that was already delivered.
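The webhook flow above can be sketched in Python with a sender that retries using exponential backoff and jitter, and a receiver that deduplicates by webhook ID. This is an illustrative sketch under stated assumptions: `send_with_retry`, `WebhookReceiver`, `flaky_send`, the `"id"` payload field, and the delay values are hypothetical and not taken from any specific webhook provider.

```python
import random
import time


def send_with_retry(send, payload, max_attempts=5, base_delay=0.5,
                    max_delay=30.0, sleep=time.sleep):
    """Call send(payload), retrying on ConnectionError with backoff and jitter."""
    for attempt in range(max_attempts):
        try:
            return send(payload)
        except ConnectionError:
            if attempt == max_attempts - 1:
                raise  # retries exhausted: surface the failure to the caller
            # Exponential backoff: the upper bound doubles with each attempt,
            # capped at max_delay.
            backoff = min(max_delay, base_delay * 2 ** attempt)
            # Jitter: wait a random time in [0, backoff] so that many senders
            # do not retry in lockstep.
            sleep(random.uniform(0, backoff))


class WebhookReceiver:
    """Processes each webhook ID at most once, so resends are harmless."""

    def __init__(self):
        self._seen_ids = set()
        self.processed = []

    def handle(self, payload):
        if payload["id"] in self._seen_ids:
            return "duplicate ignored"
        self._seen_ids.add(payload["id"])
        self.processed.append(payload)
        return "processed"


receiver = WebhookReceiver()
attempts = []


def flaky_send(payload):
    """Simulated transport: the first two deliveries fail."""
    attempts.append(payload["id"])
    if len(attempts) <= 2:
        raise ConnectionError("destination unavailable")
    return receiver.handle(payload)


result = send_with_retry(flaky_send, {"id": "evt-1"}, base_delay=0.01)
print(result, len(attempts))  # processed 3
```

The `sleep` parameter is injectable so the delays can be observed or skipped in tests. Retrying only on `ConnectionError` is a deliberate simplification; a real sender would decide which failures are safe to retry.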

3. Circuit breaker

Dealing with scale can be an incredibly nuanced problem in cloud-based applications, especially for processes with unpredictable performance. The circuit breaker pattern prevents processes from "running away" by cutting them short before they consume more resources than necessary.

The circuit breaker pattern can halt any request that takes too long to complete. Imagine a webpage that generates a report from several different data sources. In a typical scenario, this operation might take only a few seconds. However, querying the back end might take much longer in rare circumstances, which ties up valuable resources. A circuit breaker could halt the execution of any report that takes more than 10 seconds to generate, which prevents long-running queries from monopolizing application resources.

More modern implementations can take this further by pairing failure thresholds with dynamic recovery times, enabling more resiliency in systems with unpredictable resource constraints, such as serverless environments.
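The failure-threshold and recovery-time idea can be sketched as a small state machine. The Python sketch below is one possible reading rather than a definitive implementation. It assumes the protected call enforces its own time limit and raises an exception such as `TimeoutError` when it exceeds it, because the breaker here does not forcibly interrupt running code. The `CircuitBreaker` and `CircuitOpenError` names, the threshold, and the fixed recovery period are hypothetical.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker rejects a call without running it."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, recovery_seconds=30.0,
                 clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.recovery_seconds = recovery_seconds
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the circuit is closed

    def call(self, func):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.recovery_seconds:
                # Open: fail fast so the struggling back end gets no new work.
                raise CircuitOpenError("circuit open; request rejected")
            # Recovery time has elapsed: let one trial call through (half-open).
        try:
            result = func()
        except Exception:
            self._failures += 1
            # A failed trial call, or too many consecutive failures,
            # opens the circuit and restarts the recovery timer.
            if self._opened_at is not None or self._failures >= self.failure_threshold:
                self._opened_at = self._clock()
            raise
        # Success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result


def slow_report():
    """Hypothetical report query that exceeds its own 10-second limit."""
    raise TimeoutError("report exceeded 10 seconds")


breaker = CircuitBreaker(failure_threshold=2, recovery_seconds=30.0)
for _ in range(2):
    try:
        breaker.call(slow_report)
    except TimeoutError:
        pass

try:
    breaker.call(lambda: "fast report")
except CircuitOpenError as error:
    print(error)  # circuit open; request rejected
```

After two consecutive timeouts, the third call is rejected immediately, even though it would have succeeded. Once the recovery period passes, a single successful trial call closes the circuit again. A dynamic recovery time could be approximated by lengthening `recovery_seconds` after each failed trial.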

4. Queue-based load leveling

QBLL is a common cloud design pattern that uses queues to execute requests. Rather than performing several complex operations at request time -- which adds latency to user-exposed functionality -- the application places these operations in a queue. The queue then executes a limited number of operations within a given time period. This design pattern is valuable in systems where many operations do not need to show immediate results, such as sending emails or calculating aggregate values.

The queue-based load leveling pattern organizes requests into a queue to manage execution. Consider an API endpoint that must make retroactive changes to a large data set whenever it is executed. While this endpoint was built with a certain threshold of traffic in mind, a large burst in requests or rapid growth in users could negatively affect the application's latency. By offloading this functionality to a QBLL system, the application infrastructure can more easily withstand the increased throughput by processing a fixed number of operations at a time.
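The idea of accepting work quickly and processing a fixed number of operations at a time can be sketched with Python's standard `queue` module. This is a simplified, in-process illustration. A real deployment would more likely use a durable message queue and separate worker processes, and the `LoadLeveler` class, its method names, and the batch size are hypothetical.

```python
import queue


class LoadLeveler:
    """Accepts tasks immediately and processes them in bounded batches."""

    def __init__(self):
        self._queue = queue.Queue()

    def submit(self, task):
        """Request-time path: enqueue the task and return right away."""
        self._queue.put(task)

    def process_batch(self, handler, max_tasks):
        """Worker path: run at most max_tasks queued tasks, in arrival order."""
        processed = 0
        while processed < max_tasks:
            try:
                task = self._queue.get_nowait()
            except queue.Empty:
                break  # nothing left to do in this batch
            handler(task)
            processed += 1
        return processed

    def pending(self):
        return self._queue.qsize()


leveler = LoadLeveler()
for record_id in range(250):  # a burst of 250 requests arrives at once
    leveler.submit(record_id)

updated = []
first = leveler.process_batch(updated.append, max_tasks=100)
print(first, leveler.pending())  # 100 150
```

The burst is absorbed by the queue, and the worker handles only 100 tasks per batch regardless of how many arrive. The remaining 150 tasks wait for later batches instead of adding latency to the request path.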

5. Throttling

An alternative design pattern to QBLL is the throttling pattern, which addresses the noisy neighbor problem. While the QBLL pattern offloads excess workloads to a queue for more manageable processing, the throttling pattern sets and enforces limits on how frequently a single client can use a service or endpoint, which keeps one noisy neighbor from negatively impacting the system for everyone. The throttling pattern can also supplement the QBLL pattern. Together, they enable the managed processing of excess workloads and ensure the queue doesn't fill up.

The throttling pattern enforces limits on how many times a client uses a service or endpoint. Looking back at the QBLL example, imagine the API endpoint could originally handle about 100 requests per minute. After the heavy work is offloaded to a queue, the API can support a maximum throughput of about 10,000 requests per minute. That is a huge jump from 100, but the queue can still only process 100 requests per minute without any noticeable impact on the end user. This means that 1,000 API requests could take 10 minutes to process, and 10,000 API requests could take 100 minutes, or nearly two hours.

In a system with evenly distributed requests, every user experiences slower processing equally. But if a single user sends all 10,000 requests, all other users wait nearly two hours before their workloads start. A throttling scheme that limits each user to 1,000 requests per minute ensures that no single user can monopolize application resources at the expense of any other user.
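A per-client limit like the one described above can be sketched as a fixed-window counter. The Python below is a minimal illustration, not a production rate limiter. The `Throttle` class, the client IDs, and the limit of three requests per 60-second window are arbitrary placeholders, and the manual clock exists only to make the example deterministic.

```python
import time


class Throttle:
    """Fixed-window limiter: each client gets `limit` requests per window."""

    def __init__(self, limit, window_seconds=60.0, clock=time.monotonic):
        self.limit = limit
        self.window_seconds = window_seconds
        self._clock = clock
        self._windows = {}  # client_id -> (window_start, request_count)

    def allow(self, client_id):
        now = self._clock()
        start, count = self._windows.get(client_id, (now, 0))
        if now - start >= self.window_seconds:
            # The previous window has expired, so start a fresh one.
            start, count = now, 0
        if count >= self.limit:
            return False  # over the limit: reject (for example, HTTP 429)
        self._windows[client_id] = (start, count + 1)
        return True


current_time = [0.0]  # manual clock, in seconds
throttle = Throttle(limit=3, window_seconds=60.0, clock=lambda: current_time[0])

noisy = [throttle.allow("noisy-client") for _ in range(5)]
quiet = throttle.allow("quiet-client")
print(noisy)  # [True, True, True, False, False]
print(quiet)  # True

current_time[0] = 60.0  # a new window begins
print(throttle.allow("noisy-client"))  # True
```

The noisy client is cut off after its third request, while the quiet client is unaffected because limits are tracked per client. Fixed windows allow short bursts at window boundaries; token bucket or sliding-window limiters are common alternatives when smoother enforcement matters.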