Speed up AWS Lambda cold starts with these strategies

Cold starts are a common and frustrating issue for AWS Lambda users. Learn why they happen and how to prewarm Lambda environments to minimize latency.

Alastair Cooke

Published: 13 Jan 2020

AWS Lambda runs and automatically scales code without the hassle of constantly reserved server capacity; the code simply runs when needed. This probably sounds too good to be true, so naturally, it is.

AWS Lambda users are commonly frustrated by inconsistent startup performance. Lambda functions run inside containers that AWS must spin up before the function can execute its task. This prep time means it can take a while before the code runs. This is called a cold start. Other times, the container is up and available, and the Lambda function runs right away. This is a warm start.

Here's why Lambda cold starts occur, and how to optimize Lambda functions to build the most efficient application.

What is a Lambda cold start?

Cold starts happen when a new container is needed to run a Lambda function. This clean container needs to warm up before the event handler code can run. The process can take from 400 milliseconds up to a few seconds.

In the warmup process, AWS loads the function code into the container so it can execute. The setup loads the custom file system layers, then the language runtime, and then runs the initialization part of the code. After that is complete, the container is ready to handle events, and the event handler code runs. Lambda customers pay for the compute time -- from when AWS starts to set up the container for your function until the handler finishes.

When the container finishes the event task, AWS doesn't immediately terminate it. It can be reused to rerun the function, and customers don't pay to keep the container warm. Note that AWS does not publish or commit to a specific time window that idle containers are retained.

In a warm start, the Lambda function can run again in a container that already has its layers, runtime and initialization complete. This cuts out the initialization time, and the event handler code runs immediately; you only pay for the time the event handler runs. However, each container can only handle one Lambda invocation at a time, so multiple concurrent invocations need multiple warm containers.
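As a minimal sketch of the difference, the Python handler below records whether an invocation is the first one in its container. It relies on the fact that module-level code runs once during initialization, while the handler runs on every event. The field names in the returned dictionary are illustrative assumptions, not anything Lambda defines.

```python
import time

# Module scope: runs once per container, during the cold start.
_CONTAINER_STARTED_AT = time.time()
_invocation_count = 0


def handler(event, context):
    """Report whether this invocation was the first one in its container."""
    global _invocation_count
    _invocation_count += 1
    return {
        # True only for the invocation that triggered the cold start.
        "cold_start": _invocation_count == 1,
        "invocations_in_container": _invocation_count,
        "container_age_seconds": round(time.time() - _CONTAINER_STARTED_AT, 3),
    }
```

Logging the `cold_start` flag is one simple way to see how often a function actually starts cold under real traffic.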

Optimize for Lambda cold starts

If your function runs infrequently or varies in concurrency, you may deal with a lot of cold starts. Try to minimize Lambda cold start times with some application changes.

Only put application elements that you use in layers and initialization. For example, a Java library might export a lot of methods when the application only uses a few of them. Rebuild the library with just the methods that this Lambda function uses, as the smaller library will load faster.

You can also shorten Lambda cold starts by choosing a language that uses a lightweight runtime environment. C# and Java are the slowest, while JavaScript, Python and Ruby are quicker to start. If you have to use a slow-loading language, especially C#, allocate more RAM. Lambda allocates CPU power in proportion to the configured memory, so a larger memory setting also speeds up initialization.

Optimize for warm starts

If your Lambda function runs frequently -- multiple times per hour -- and at a fairly consistent rate, then it will build up a population of containers ready to handle new events, and nearly every invocation will be a warm start. To further reduce execution time, optimize the event handler code. Moving as much setup code as possible into the initialization section makes the event handler smaller and, therefore, faster. Make sure that all library loading (import statements in Python, for example) happens in the initialization.
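A minimal Python sketch of that split might look like the following. The imports, compiled pattern and lookup table sit at module level, so they are paid for once per container, and the handler only does per-event work. The order ID format, the price table and the event fields are hypothetical, chosen only to illustrate the layout.

```python
import json
import re

# Initialization section: runs once per container, not once per event.
ORDER_ID_PATTERN = re.compile(r"^ORD-\d{6}$")
# Stands in for a more expensive configuration load (hypothetical data).
PRICE_TABLE = json.loads('{"basic": 5, "pro": 20}')


def handler(event, context):
    """Per-event work only: validate the input and look up a price."""
    order_id = event.get("order_id", "")
    plan = event.get("plan", "")
    if not ORDER_ID_PATTERN.match(order_id):
        return {"ok": False, "error": "invalid order_id"}
    if plan not in PRICE_TABLE:
        return {"ok": False, "error": "unknown plan"}
    return {"ok": True, "price": PRICE_TABLE[plan]}
```

The same layout applies to database connections or SDK clients: create them at module level so warm invocations can reuse them.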

To keep a container warm, you can proactively invoke a given Lambda function on a schedule, such as every five minutes, to make it less likely that its containers are terminated. Make sure that the event data you send to the function for these warming activities causes the event handler to exit immediately, so the container is available for real events. You will pay for execution time whenever you warm a container; an immediate exit will minimize the cost. Prewarming is useful if your function is not naturally invoked enough to stay warm, and the application is sensitive to the additional latency of a cold start. Prewarming is particularly helpful for C# functions that might take a couple of seconds to warm up but only a few tens of milliseconds to handle events.
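One way to get that immediate exit, sketched in Python, is to have the scheduled invocation send a marker field that the handler checks before doing anything else. The `warmer` field name and the `process` function are assumptions made for this example; any constant payload that real events never contain would work.

```python
# Hypothetical marker field sent only by the scheduled warming invocation.
WARMER_KEY = "warmer"


def process(event):
    """Placeholder for the real event-handling logic (hypothetical)."""
    return {"echo": event}


def handler(event, context):
    # Warming invocation: return at once so billed time stays minimal
    # and the container is free again for real events.
    if isinstance(event, dict) and event.get(WARMER_KEY) is True:
        return {"warmed": True}
    return process(event)
```

A scheduled rule that passes a constant JSON input such as `{"warmer": true}` to the function would trigger the early return. Because each container handles one invocation at a time, a single scheduled call typically keeps only one container warm.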

AWS also recently announced Provisioned Concurrency for Lambda, where you can specify the number of warmed containers for your Lambda function. The chosen number of containers is guaranteed to be available without warmup delays. The downside is that you pay for those containers whether your Lambda function runs or not, although at approximately 60% of the usual Lambda cost. You also get charged the normal execution cost whenever your Lambda code runs.
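As a rough sketch, Provisioned Concurrency can be set from Python through the AWS SDK's Lambda client, which exposes a `put_provisioned_concurrency_config` call. The helper below takes the client as a parameter so it can be exercised without AWS credentials; the helper name and the example values in the usage note are assumptions for illustration.

```python
def set_provisioned_concurrency(lambda_client, function_name, qualifier, count):
    """Ask Lambda to keep `count` containers initialized for one version or alias.

    `lambda_client` is expected to be a boto3 Lambda client, for example
    boto3.client("lambda"). `qualifier` must name a published version or
    an alias; Provisioned Concurrency cannot be set on $LATEST.
    """
    if not isinstance(count, int) or count < 1:
        raise ValueError("count must be a positive integer")
    return lambda_client.put_provisioned_concurrency_config(
        FunctionName=function_name,
        Qualifier=qualifier,
        ProvisionedConcurrentExecutions=count,
    )
```

With boto3 installed and credentials configured, a call such as `set_provisioned_concurrency(boto3.client("lambda"), "my-function", "live", 5)` would request five warmed containers for a hypothetical `live` alias.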

Overall, you may not need to optimize the start time of some Lambda functions. If the function has a cold start time of one second but the function handler runs for ten minutes, then container warmup probably isn't a performance issue. If the function is seldom invoked, then you may get no value from the effort to optimize startup. For example, it's not worth the effort to shave a few milliseconds off the execution time of a daily or weekly reporting Lambda function.
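A quick calculation makes the trade-off concrete. The sketch below computes the share of total time that a cold start contributes, using the one-second, ten-minute example above and the earlier C# case of a two-second warmup. The 30-millisecond handler time is an assumed stand-in for "a few tens of milliseconds."

```python
def cold_start_share(cold_start_seconds, handler_seconds):
    """Fraction of the total invocation time spent on the cold start."""
    return cold_start_seconds / (cold_start_seconds + handler_seconds)


# One-second cold start before a ten-minute (600-second) handler run.
print(f"{cold_start_share(1, 600):.2%}")   # 0.17%

# Two-second warmup before an assumed 30-millisecond handler run.
print(f"{cold_start_share(2, 0.03):.2%}")  # 98.52%
```

In the first case, the cold start is lost in the noise; in the second, it dominates the response time, which is where prewarming or Provisioned Concurrency is more likely to pay off.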

Next Steps

Check out an experiment in AWS Lambda to see cold and warm starts with a Hello World test, from the book Serverless Architectures on AWS by Peter Sbarski.
