Slow Execution Times

What this means

Slow execution times mean that the tail latency of your Lambda function — specifically the 99th percentile (P99) — exceeds 3 seconds. While your median invocation might complete in 200ms, 1 in every 100 invocations takes more than 3 seconds. For user-facing APIs, this means roughly 1% of your users experience unacceptable response times. For high-traffic services handling thousands of requests per minute, that 1% translates to dozens of slow responses every minute.

P99 latency is often dominated by a few specific factors that do not affect the median. Cold starts are the most obvious contributor: if 15% of your invocations are cold starts and each adds 2 seconds of init time, your P99 will be pulled up even though most warm invocations are fast. Beyond cold starts, slow outliers are typically caused by downstream service variability — a database query that occasionally hits a cold cache, an HTTP API that responds in 50ms usually but 4 seconds when it has to regenerate a token, or an S3 GetObject call that retries once due to a transient 503.

Detection criteria

smplogs triggers this finding when:

● ELEVATED — P99 duration exceeds 3 seconds
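As a rough illustration of the check, the sketch below computes P50 and P99 from a list of durations and compares P99 to the 3-second threshold. It assumes a nearest-rank percentile; the exact percentile method smplogs uses is not stated here, and the sample durations are made up.

```python
import math

P99_THRESHOLD_MS = 3000  # threshold from the detection criteria


def percentile(values, pct):
    """Nearest-rank percentile (an assumed method, chosen for simplicity)."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]


def is_elevated(durations_ms):
    """True when the P99 of the given durations exceeds the threshold."""
    return percentile(durations_ms, 99) > P99_THRESHOLD_MS


# Hypothetical sample: 980 fast invocations and 20 slow ones.
durations = [200.0] * 980 + [4500.0] * 20
p50 = percentile(durations, 50)
p99 = percentile(durations, 99)
print(f"P50={p50}ms P99={p99}ms elevated={is_elevated(durations)}")
```

In this sample the median stays at 200ms while the 2% of slow invocations push P99 to 4,500ms, which is the pattern the finding describes.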

Example finding

ELEVATED · P99 4,712ms Slow Execution Times

P99 duration is 4,712ms across 1,203 invocations. Median (P50) is 187ms. The slowest 1% of invocations are 25x slower than the median, with cold starts accounting for 62% of the slow tail.

-> Profile code for bottlenecks. Check slow database queries, external APIs.

How to fix

Separate cold start impact from handler slowness

Before optimizing anything, determine whether your P99 is driven by cold starts or by genuinely slow handler execution. smplogs shows init duration separately from billed duration. If cold starts account for most of the slow tail, focus on cold start mitigation (Provisioned Concurrency, smaller packages, lighter runtime). If warm invocations are also slow at P99, the problem is in your handler code or its downstream dependencies. Solving the wrong problem wastes effort.
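If you want to reproduce this split yourself, the sketch below separates cold and warm invocations using the REPORT lines Lambda writes to CloudWatch Logs. It assumes the standard REPORT layout, in which Init Duration appears only on cold starts and is not included in Duration. The sample lines and request IDs are invented for illustration.

```python
import re

# "Duration:" that is not part of "Billed Duration:" or "Init Duration:".
DURATION_RE = re.compile(r"(?<!Billed )(?<!Init )Duration: ([\d.]+) ms")
INIT_RE = re.compile(r"Init Duration: ([\d.]+) ms")


def split_durations(lines):
    """Return (cold, warm) lists of total latency in ms, one entry per REPORT line."""
    cold, warm = [], []
    for line in lines:
        if "REPORT RequestId:" not in line:
            continue
        duration = DURATION_RE.search(line)
        if duration is None:
            continue
        init = INIT_RE.search(line)
        if init:
            # Assumption: total latency of a cold start is init time plus handler time.
            cold.append(float(duration.group(1)) + float(init.group(1)))
        else:
            warm.append(float(duration.group(1)))
    return cold, warm


def cold_share_of_slow(cold, warm, threshold_ms=3000.0):
    """Fraction of invocations above the threshold that were cold starts."""
    slow_cold = sum(1 for d in cold if d > threshold_ms)
    slow_warm = sum(1 for d in warm if d > threshold_ms)
    total = slow_cold + slow_warm
    return slow_cold / total if total else 0.0


# Hypothetical log lines.
SAMPLE = [
    "START RequestId: a1 Version: $LATEST",
    "REPORT RequestId: a1\tDuration: 180.00 ms\tBilled Duration: 180 ms\t"
    "Memory Size: 256 MB\tMax Memory Used: 90 MB",
    "REPORT RequestId: a2\tDuration: 1500.00 ms\tBilled Duration: 1500 ms\t"
    "Memory Size: 256 MB\tMax Memory Used: 95 MB\tInit Duration: 2100.00 ms",
    "REPORT RequestId: a3\tDuration: 3400.00 ms\tBilled Duration: 3400 ms\t"
    "Memory Size: 256 MB\tMax Memory Used: 97 MB",
]

cold, warm = split_durations(SAMPLE)
share = cold_share_of_slow(cold, warm)
print(cold, warm, share)
```

A share close to 1 points at cold start mitigation; a share close to 0 points at the handler and its dependencies.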

Add tracing with X-Ray or structured timing logs

Enable AWS X-Ray active tracing on your Lambda function to get subsegment-level visibility into where time is spent. Active tracing by itself records only the function's initialization and invocation segments. Once the X-Ray SDK (or an equivalent wrapper such as the Powertools for AWS Lambda tracer) instruments your AWS SDK clients, each DynamoDB query, S3 operation, or SQS send appears as its own subsegment with its exact latency. For non-AWS calls (HTTP APIs, databases), add custom subsegments. If X-Ray is not an option, add structured log lines with timestamps before and after each significant operation so you can compute durations from the logs.
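For the structured-logging route, one possible shape is a small context manager that wraps each operation and emits a JSON line. This is a minimal sketch using only the standard library: it logs the computed duration directly rather than separate before and after timestamps, and the operation names, the `timed` helper, and the sleeps standing in for real calls are all hypothetical.

```python
import json
import time
from contextlib import contextmanager


@contextmanager
def timed(operation, emit=print):
    """Emit one JSON log line with the wall-clock duration of the wrapped block."""
    start = time.perf_counter()
    try:
        yield
    finally:
        duration_ms = (time.perf_counter() - start) * 1000
        emit(json.dumps({"operation": operation, "duration_ms": round(duration_ms, 2)}))


def handler(event, context):
    with timed("load_user"):
        time.sleep(0.05)  # placeholder for a database query
    with timed("call_pricing_api"):
        time.sleep(0.02)  # placeholder for an external HTTP call
    return {"statusCode": 200}
```

Because the line is logged in a `finally` block, an operation that raises still produces a timing record, which is useful when the slow path and the failing path are the same.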

Parallelize independent operations

If your handler makes multiple independent calls (fetching user data from DynamoDB, checking a feature flag, and calling an external API), run them concurrently instead of sequentially. In Node.js, use Promise.all() or Promise.allSettled(). In Python, use asyncio.gather() with async clients, or concurrent.futures.ThreadPoolExecutor for blocking clients such as boto3. Three sequential 500ms calls take 1.5 seconds; run concurrently, they take about 500ms, the duration of the slowest call. This is often the single highest-leverage optimization for Lambda functions that call multiple services.
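A minimal sketch of the thread-pool variant follows. The three fetch functions are hypothetical stand-ins that sleep for 200ms instead of calling real services, so the numbers only illustrate the shape of the saving.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch_user(user_id):
    time.sleep(0.2)  # stands in for a DynamoDB read
    return {"id": user_id}


def fetch_flag(name):
    time.sleep(0.2)  # stands in for a feature-flag lookup
    return True


def fetch_quote(symbol):
    time.sleep(0.2)  # stands in for an external HTTP API call
    return 42.0


# Created at module scope so warm invocations reuse the same worker threads.
_POOL = ThreadPoolExecutor(max_workers=3)


def handler(event, context):
    # Submit all three calls before waiting on any of them.
    user_future = _POOL.submit(fetch_user, event["user_id"])
    flag_future = _POOL.submit(fetch_flag, "new_checkout")
    quote_future = _POOL.submit(fetch_quote, "ACME")
    return {
        "user": user_future.result(),
        "flag": flag_future.result(),
        "quote": quote_future.result(),
    }


start = time.perf_counter()
result = handler({"user_id": "u-1"}, None)
elapsed = time.perf_counter() - start
print(f"handler took {elapsed:.2f}s")
```

Run sequentially, the three calls would take about 0.6 seconds; submitted together, the handler finishes in roughly the time of the slowest one. Calling `.result()` re-raises any exception from the worker, so failures are not silently dropped.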

Increase memory for CPU-bound functions

Lambda allocates CPU proportionally to memory. A function at 128 MB gets a tiny slice of a vCPU. If your function does any non-trivial computation (JSON parsing of large payloads, image resizing, data transformation, or compression), it may be CPU-starved. Doubling memory from 256 MB to 512 MB roughly doubles available CPU and can halve execution time for CPU-bound work. Because a shorter duration offsets the higher per-millisecond rate, the larger setting is not necessarily more expensive. Use AWS Lambda Power Tuning to empirically find the memory setting that minimizes cost (billed duration times configured memory in GB times the price per GB-second).
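The trade-off can be worked through with a few lines of arithmetic. In the sketch below, the per-GB-second price is an assumed x86 on-demand rate and the durations per memory setting are invented; substitute your region's current price and your own measurements. The flat per-request charge is left out because it does not vary with memory.

```python
# Assumed x86 on-demand rate in USD; check current pricing for your region.
PRICE_PER_GB_SECOND = 0.0000166667


def invocation_cost(memory_mb, duration_ms, price=PRICE_PER_GB_SECOND):
    """Compute cost of one invocation: memory (GB) x duration (s) x price per GB-second."""
    return (memory_mb / 1024) * (duration_ms / 1000) * price


# Hypothetical average durations (ms) measured at each memory setting (MB).
measurements = {128: 2400.0, 256: 1150.0, 512: 560.0, 1024: 300.0, 2048: 290.0}

costs = {mb: invocation_cost(mb, ms) for mb, ms in measurements.items()}
cheapest = min(costs, key=costs.get)

for mb, cost in costs.items():
    print(f"{mb:>5} MB  {measurements[mb]:>7.0f} ms  ${cost * 1_000_000:.2f} per million")
print("cheapest setting:", cheapest, "MB")
```

With these made-up numbers, 512 MB is cheapest: duration keeps halving up to that point, then flattens, so further memory only adds cost. 1024 MB costs slightly more but is almost twice as fast, which may be the better choice when P99 latency is the goal.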

Cache repeated lookups

If your function fetches the same configuration, secret, or reference data on every invocation, cache it in a module-scoped variable. SSM Parameter Store lookups, Secrets Manager retrievals, and DynamoDB reads for slowly-changing data can all be cached in memory with a TTL. The AWS Parameters and Secrets Lambda Extension provides a local HTTP cache for SSM and Secrets Manager. For more complex caching needs, use a module-level Map with expiration timestamps. This eliminates repeated network round-trips for data that changes infrequently.
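A module-scoped cache with a TTL might look like the sketch below, shown here in Python with a dictionary in place of a Map. The `get_cached` helper, the key name, and the loader are hypothetical; in a real function the loader would wrap the SSM, Secrets Manager, or DynamoDB call.

```python
import time

# Module scope: the dictionary survives across warm invocations
# of the same execution environment.
_CACHE = {}


def get_cached(key, loader, ttl_seconds=300.0, now=time.monotonic):
    """Return the cached value for key, calling loader() only when missing or expired."""
    current = now()
    entry = _CACHE.get(key)
    if entry is not None and entry[1] > current:
        return entry[0]
    value = loader()
    _CACHE[key] = (value, current + ttl_seconds)
    return value


def load_config():
    # Placeholder for a network call such as a parameter or secret lookup.
    return {"feature_x": True}


def handler(event, context):
    config = get_cached("app-config", load_config, ttl_seconds=60.0)
    return {"feature_x": config["feature_x"]}
```

Each execution environment keeps its own copy, so a changed value can take up to one TTL to reach every environment. Pick the TTL according to how stale the data is allowed to be.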

Related findings

Function Timeouts: when slow execution crosses the timeout boundary

High Memory Utilization: low memory means low CPU, causing slow execution

High Cold Start Rate: cold starts inflate P99 latency
