Lambda Cold Start Troubleshooting

How to identify cold start patterns in your CloudWatch logs, measure their impact, and reduce init latency.

What is a Lambda cold start?

A cold start occurs when AWS Lambda needs to create a new execution environment to handle a request. This happens when no warm container is available: the function hasn't been invoked recently, traffic has spiked beyond the current pool of containers, or the function was just deployed.

During a cold start, Lambda performs three steps before your code runs:

Downloads your deployment package (or pulls the container image)
Starts the runtime (Node.js, Python, Java, Go, etc.)
Runs your initialization code (module-level code outside the handler)

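The Init Duration field in CloudWatch logs reports the time spent starting the runtime and running your initialization code; the package download happens before that and is not included in the reported value. Your users experience all of this latency on top of the normal function execution time.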

Identifying cold starts in CloudWatch logs

Lambda logs cold starts with a specific line format. Look for these patterns:

INIT_START Runtime Version: nodejs20.x  Runtime Version ARN: arn:aws:lambda:us-east-1::runtime:...

And in the REPORT line, cold starts include an extra Init Duration field:

REPORT RequestId: abc-123  Duration: 45.21 ms  Billed Duration: 946 ms  Memory Size: 512 MB  Max Memory Used: 128 MB  Init Duration: 891.33 ms

The Init Duration field only appears on cold start invocations. If the REPORT line doesn't have it, the invocation reused a warm container.
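The check described above can be automated. The following is a minimal sketch, assuming the REPORT line layout shown in the sample; the function name `parseReportLine` and the returned field names are illustrative, not part of any AWS API.

```javascript
// Parse a Lambda REPORT log line into numeric fields.
// Returns null when the line is not a REPORT line.
function parseReportLine(line) {
  if (!line.startsWith("REPORT ")) return null;

  // Run a regex against the line and return the captured number, or null.
  const num = (regex) => {
    const match = line.match(regex);
    return match ? Number(match[1]) : null;
  };

  const initDurationMs = num(/Init Duration: ([\d.]+) ms/);
  return {
    // The lookbehind skips "Billed Duration" and "Init Duration".
    durationMs: num(/(?<![A-Za-z] )Duration: ([\d.]+) ms/),
    billedDurationMs: num(/Billed Duration: ([\d.]+) ms/),
    initDurationMs,
    // Init Duration is only present on cold start invocations.
    coldStart: initDurationMs !== null,
  };
}

const sample =
  "REPORT RequestId: abc-123  Duration: 45.21 ms  Billed Duration: 946 ms  " +
  "Memory Size: 512 MB  Max Memory Used: 128 MB  Init Duration: 891.33 ms";
console.log(parseReportLine(sample));
```

Running every REPORT line through a parser like this gives you one record per invocation, which is the input the measurements in the next section need.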

Measuring cold start impact

To understand how cold starts affect your users, you need three numbers from your logs:

Cold start rate: What percentage of invocations have Init Duration? Above 10-15% means cold starts are frequent enough to affect user experience.
Avg init duration: How long is the initialization? Under 200ms is generally acceptable. Over 500ms is noticeable. Over 1 second needs attention.
P99 impact: Cold starts disproportionately affect P99 duration. Compare P99 vs P50 - if the gap is large, cold starts are likely the cause.
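As a rough sketch, the three numbers can be computed from parsed invocation records like this. The record shape (`durationMs`, `initDurationMs`) and the function names are assumptions made for illustration, and the percentile uses the simple nearest-rank method. The sketch also assumes user-visible latency is roughly Duration plus Init Duration.

```javascript
// Nearest-rank percentile over an ascending-sorted array (p between 0 and 100).
function percentile(sorted, p) {
  if (sorted.length === 0) return null;
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}

// invocations: [{ durationMs, initDurationMs }]
// initDurationMs is null for warm invocations (assumed record shape).
function coldStartStats(invocations) {
  const cold = invocations.filter((inv) => inv.initDurationMs !== null);

  // Assumption: init time is added on top of the handler duration.
  const totals = invocations
    .map((inv) => inv.durationMs + (inv.initDurationMs ?? 0))
    .sort((a, b) => a - b);

  const initSum = cold.reduce((sum, inv) => sum + inv.initDurationMs, 0);
  return {
    coldStartRatePct: invocations.length
      ? (cold.length / invocations.length) * 100
      : 0,
    avgInitMs: cold.length ? initSum / cold.length : null,
    p50Ms: percentile(totals, 50),
    p99Ms: percentile(totals, 99),
  };
}

console.log(
  coldStartStats([
    { durationMs: 40, initDurationMs: null },
    { durationMs: 50, initDurationMs: null },
    { durationMs: 60, initDurationMs: null },
    { durationMs: 45, initDurationMs: 900 },
  ])
);
```

With only a handful of records the P99 is simply the slowest invocation, so treat the percentiles as meaningful only once you have a few hundred invocations.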

Cold start times by runtime

Runtime choice is the single biggest factor in cold start duration. These are typical observed ranges across standard Lambda configurations (128–512 MB memory, non-VPC, package under 10 MB):

Runtime	Typical cold start	With VPC	Notes
Node.js 20/22	100–300 ms	200–400 ms	Fastest JS option; use esbuild to minimize package size
Python 3.12/3.13	100–250 ms	200–350 ms	Heavy imports (pandas, numpy) add 500ms–2s to init
Go	50–150 ms	150–250 ms	Native binary, minimal startup overhead
Rust (custom runtime)	30–100 ms	130–200 ms	Fastest overall; requires lambda_runtime crate
Java 21 (JVM)	1–4 s	1.2–4.5 s	Use SnapStart to bring this under 200 ms
Java 21 + SnapStart	100–200 ms	200–350 ms	Restores from snapshot; big win for JVM workloads
.NET 8 (JIT)	800 ms–3 s	1–3.5 s	Use NativeAOT for sub-200 ms
.NET 8 NativeAOT	50–200 ms	150–300 ms	Compiled to native; no JIT warmup cost

Figures are typical observed ranges, not AWS-guaranteed SLAs. Actual durations vary with memory allocation, package size, and initialization complexity. Higher memory allocations get proportionally more CPU, which can reduce init time for CPU-bound startup work.

How smplogs detects cold starts

When you analyze a Lambda log file with smplogs, the WASM engine automatically parses every REPORT line, extracts Init Duration values, and produces a finding like this:

HIGH Cold Start Latency

18.3% cold start rate with avg 890ms init. P99 heavily impacted.

-> Enable provisioned concurrency or optimize package size.

smplogs also correlates cold starts with P99 duration spikes in the root cause analysis, so you can see exactly how much cold starts contribute to tail latency.

Common causes of high cold start rates

Large deployment packages

The bigger your deployment package, the longer Lambda takes to download and extract it. Node.js functions with large node_modules or Python functions with heavy dependencies (pandas, numpy) are common offenders.

SDK clients created inside the handler

Creating AWS SDK clients (DynamoDB, S3, SQS) inside the handler function means they get recreated on every invocation. Move them to module scope so they're initialized once per container.

Low traffic

Lambda recycles idle containers after a period of inactivity (typically 5-15 minutes, though this varies). Functions invoked fewer than a few times per minute will see higher cold start rates.
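To get a feel for how idle gaps translate into cold starts, here is a simplified sketch. It assumes a single execution environment and a fixed idle timeout of 10 minutes; both are assumptions, since Lambda does not publish or guarantee a reclaim window, and `estimateColdStarts` is a hypothetical helper name.

```javascript
// Count invocations that arrive after a gap longer than the assumed idle timeout.
// timestampsMs: invocation times in epoch milliseconds, in any order.
// idleTimeoutMs: ASSUMED reclaim window, not a value guaranteed by Lambda.
function estimateColdStarts(timestampsMs, idleTimeoutMs = 10 * 60 * 1000) {
  const sorted = [...timestampsMs].sort((a, b) => a - b);
  if (sorted.length === 0) return 0;

  let coldStarts = 1; // the very first invocation has no warm container
  for (let i = 1; i < sorted.length; i++) {
    if (sorted[i] - sorted[i - 1] > idleTimeoutMs) coldStarts++;
  }
  return coldStarts;
}

const MINUTE = 60 * 1000;
// Invocations at 0, 1, 20, 21, and 45 minutes.
console.log(estimateColdStarts([0, 1, 20, 21, 45].map((m) => m * MINUTE)));
```

The model ignores concurrent requests, so it gives a lower bound for low-traffic functions and is not a prediction for busy ones.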

Traffic spikes

When traffic suddenly increases, Lambda needs to spin up new containers to handle the burst. Each new container experiences a cold start, so they tend to cluster around spikes.
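One way to confirm this clustering in your own logs is to bucket cold starts by minute. The sketch below assumes records shaped as `{ timestampMs, coldStart }`; that shape, the function names, and the default threshold of 5 are illustrative choices rather than anything Lambda defines.

```javascript
// Group cold starts into one-minute buckets keyed by the start of the minute.
// records: [{ timestampMs, coldStart }] (assumed record shape)
function coldStartsPerMinute(records) {
  const buckets = new Map();
  for (const { timestampMs, coldStart } of records) {
    if (!coldStart) continue;
    const minute = Math.floor(timestampMs / 60000) * 60000;
    buckets.set(minute, (buckets.get(minute) ?? 0) + 1);
  }
  return buckets;
}

// Return the minutes whose cold start count reaches the threshold.
function spikeMinutes(records, threshold = 5) {
  return [...coldStartsPerMinute(records)]
    .filter(([, count]) => count >= threshold)
    .map(([minute, count]) => ({
      minute: new Date(minute).toISOString(),
      count,
    }));
}
```

Minutes returned by `spikeMinutes` are good candidates to compare against your request-rate graph for the same period.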

VPC-attached functions

Functions in a VPC used to have significantly longer cold starts due to ENI attachment. AWS improved this with Hyperplane in 2019, but VPC functions still add some overhead compared to non-VPC ones.

How to reduce cold start latency

Move initialization outside the handler

This is the most impactful free optimization. Any code outside the handler function runs once per container, not once per invocation.

// DynamoDBClient comes from the AWS SDK v3 DynamoDB package
const { DynamoDBClient } = require("@aws-sdk/client-dynamodb");

// Bad - creates a new client every invocation
exports.handler = async (event) => {
  const dynamo = new DynamoDBClient({});
  // ...
};

// Good - client created once per container
const dynamo = new DynamoDBClient({});
exports.handler = async (event) => {
  // reuses the client from above
};

Reduce deployment package size

Use a bundler (esbuild, webpack) to tree-shake unused code. For Node.js, this can cut packages from 50MB+ to under 5MB. Import only what you need from the AWS SDK v3 (@aws-sdk/client-dynamodb instead of the full SDK).
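As an illustration, a build script using esbuild's JavaScript API might look like the sketch below. The file paths are placeholders, `bundle` is a hypothetical helper name, and the `node20` target assumes the Node.js 20 runtime; adjust all of them to your project.

```javascript
// build.js - hypothetical build script; paths below are placeholders.
const buildOptions = {
  entryPoints: ["src/handler.js"], // assumed location of the handler source
  outfile: "dist/handler.js", // assumed output file to zip and deploy
  bundle: true, // inline only the modules the handler actually imports
  platform: "node",
  target: "node20", // assumed to match the function's Lambda runtime
  format: "cjs",
  minify: true,
  treeShaking: true, // drop unused exports
};

async function bundle() {
  // Required lazily so this file still loads where esbuild is not installed.
  const esbuild = require("esbuild");
  await esbuild.build(buildOptions);
}

module.exports = { buildOptions, bundle };
```

Calling `bundle()` from your build step writes a single output file, which is what you would package for deployment instead of the full node_modules directory.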

Provisioned concurrency

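Provisioned concurrency keeps a pool of pre-initialized containers warm. It eliminates cold starts for requests served by the provisioned pool, but it adds cost, and traffic beyond the provisioned amount still cold-starts. It is best for latency-sensitive functions where cold starts exceed 1 second.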
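To choose a provisioned amount, a common starting point is the standard concurrency estimate: average requests per second multiplied by average request duration in seconds. The sketch below applies that formula; `estimateConcurrency` is a hypothetical helper, and the result is a baseline to validate against your own concurrency metrics, not a recommendation.

```javascript
// Estimate concurrent environments needed:
// requests per second * average seconds per request, rounded up.
function estimateConcurrency(requestsPerSecond, avgDurationMs) {
  return Math.ceil((requestsPerSecond * avgDurationMs) / 1000);
}

// 50 requests/second at 200 ms each keeps about 10 environments busy.
console.log(estimateConcurrency(50, 200));
```

Because the estimate is based on averages, bursts above the average will still spill over to on-demand containers and incur cold starts.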

Pick a faster runtime

Node.js and Python cold-start in 100-300ms. Java and .NET typically take 1-4 seconds on the JVM or JIT unless you use SnapStart, GraalVM native image, or .NET NativeAOT. Go and Rust compile to native code and typically cold-start in under 150ms.

SnapStart

Lambda SnapStart snapshots the initialized execution environment and restores it on cold start. It brings Java cold starts from seconds down to under 200ms. SnapStart launched as a Java-only feature; AWS has since extended it to Python and .NET runtimes.

Want to skip the manual inspection? Drop your CloudWatch JSON into smplogs and get cold start rate, init duration breakdown, and P99 impact in seconds.


TL;DR