DynamoDB Throttling in Lambda

How to debug ProvisionedThroughputExceededException errors, understand capacity planning, and eliminate throttling.

What is DynamoDB throttling?

DynamoDB enforces throughput limits at the table and partition level. When your application sends more read or write requests per second than the table can handle, DynamoDB rejects the excess requests with an HTTP 400 status code and the error ProvisionedThroughputExceededException. This is throttling. It is DynamoDB's way of protecting itself from being overwhelmed and ensuring that capacity is distributed fairly across partitions.

Throttling is not a bug or an outage. It is the expected behavior when demand exceeds the capacity you have allocated. On a provisioned table, you set specific Read Capacity Units (RCU) and Write Capacity Units (WCU). One RCU allows one strongly-consistent read per second for an item up to 4 KB. One WCU allows one write per second for an item up to 1 KB. Any request that pushes the consumed capacity above these limits within a one-second window gets throttled.
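The sizing rules above translate directly into arithmetic. The following is a minimal sketch, not an official calculator: the function names are illustrative, and the half-cost figure for eventually consistent reads is the standard DynamoDB rule rather than something stated earlier in this guide.

```javascript
// Estimate the capacity units a single request consumes.
// Reads are metered in 4 KB steps, writes in 1 KB steps.
const KB = 1024;

function readCapacityUnits(itemSizeBytes, { stronglyConsistent = true } = {}) {
  const units = Math.max(1, Math.ceil(itemSizeBytes / (4 * KB)));
  // Eventually consistent reads cost half as much as strongly consistent ones.
  return stronglyConsistent ? units : units / 2;
}

function writeCapacityUnits(itemSizeBytes) {
  return Math.max(1, Math.ceil(itemSizeBytes / KB));
}

// A 6 KB item: 2 RCU per strongly consistent read, 6 WCU per write.
console.log(readCapacityUnits(6 * KB));  // 2
console.log(writeCapacityUnits(6 * KB)); // 6
```

Multiplying these per-request figures by your expected requests per second gives a first estimate of the RCU and WCU a provisioned table needs.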

When the AWS SDK (v2 or v3) receives a ProvisionedThroughputExceededException, it retries the request automatically using exponential backoff with jitter. The JavaScript SDK v3 defaults to 3 total attempts (the initial request plus up to 2 retries), configurable through the client's maxAttempts option, with a randomized delay before each retry that grows exponentially. In Python, boto3 defaults to the legacy retry mode; the standard and adaptive modes are opt-in, and both cap the backoff between attempts at 20 seconds. If all retries are exhausted, the SDK throws the exception to your application code, which typically causes the Lambda invocation to fail with an unhandled error.
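To make the idea of exponential backoff with jitter concrete, here is a small sketch of the "full jitter" variant. It is an illustration of the general technique, not the SDK's internal code: baseMs, capMs, and both function names are assumptions chosen for the example.

```javascript
// Exponential backoff with full jitter: the delay ceiling doubles with each
// retry, and the actual delay is a random value below that ceiling.
// baseMs and capMs are illustrative parameters, not the SDK's internal values.
function backoffDelayMs(retryNumber, { baseMs = 100, capMs = 20000, random = Math.random } = {}) {
  const ceiling = Math.min(capMs, baseMs * 2 ** retryNumber);
  return Math.floor(random() * ceiling);
}

// Upper bound on the total time spent waiting across a given number of retries.
function maxTotalBackoffMs(retries, options = {}) {
  let total = 0;
  for (let i = 0; i < retries; i++) {
    total += backoffDelayMs(i, { ...options, random: () => 1 });
  }
  return total;
}

console.log(maxTotalBackoffMs(3)); // 100 + 200 + 400 = 700
```

Comparing the upper bound for your retry settings with the Lambda timeout shows how much of the execution window a throttled call can consume before your own code gets a chance to react.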

A common misconception is that on-demand tables cannot be throttled. On-demand capacity mode does scale automatically, but it scales based on the previous traffic peak: DynamoDB can immediately serve up to 2x the previous peak of read or write traffic. If your traffic suddenly jumps far beyond the historical maximum, such as going from 100 writes per second to 10,000 writes per second in a single minute, on-demand tables can still throttle while DynamoDB adds capacity behind the scenes. AWS's guidance is that throttling can occur when traffic exceeds double the previous peak within 30 minutes, so during such a jump you may see intermittent ProvisionedThroughputExceededException errors even though you are using on-demand mode.

The practical impact on Lambda is severe. A function that normally completes a DynamoDB write in 15ms might suddenly take 5-10 seconds when retries kick in. If your Lambda timeout is set to 10 seconds, the backoff delays can push the total execution time past the limit, resulting in a timeout. From the user's perspective, what was a fast API call becomes an error with no clear explanation unless you dig into the CloudWatch logs.

Identifying throttling in CloudWatch logs

DynamoDB throttling errors appear in Lambda CloudWatch logs with a distinctive format. The full exception message includes the table name, the service, the HTTP status code, and the error code. Here is what the raw error looks like:

2026-03-05T14:23:01.847Z abc12345-6789-def0-1234-abcdef012345 ERROR
ProvisionedThroughputExceededException: Rate exceeded for table Payments
(Service: AmazonDynamoDBv2; Status Code: 400;
Error Code: ProvisionedThroughputExceededException;
Request ID: A1B2C3D4E5F6G7H8I9J0)

If your function or SDK logging is configured to record retry attempts, you will also see messages like the following, which indicate that the SDK is attempting to recover from the throttle:

2026-03-05T14:23:01.950Z abc12345 INFO Retry attempt 1 of 3 for PutItem on table Payments
2026-03-05T14:23:02.173Z abc12345 INFO Retry attempt 2 of 3 for PutItem on table Payments
2026-03-05T14:23:02.589Z abc12345 INFO Retry attempt 3 of 3 for PutItem on table Payments
2026-03-05T14:23:02.591Z abc12345 ERROR All retries exhausted for PutItem on table Payments

One of the strongest signals of DynamoDB throttling is a sudden spike in Lambda invocation duration. Functions that normally complete in 30-80ms suddenly show durations of 3,000-15,000ms. This happens because each retry adds a backoff delay. When you see a cluster of invocations with anomalously high durations, check whether those same request IDs have ProvisionedThroughputExceededException earlier in the log stream.

Another common pattern is Lambda timeouts that are actually caused by DynamoDB backoff consuming the entire timeout budget. The log sequence looks like this:

2026-03-05T14:23:01.100Z req-789 INFO Processing order #4521
2026-03-05T14:23:01.115Z req-789 INFO Writing to DynamoDB table Orders
2026-03-05T14:23:01.847Z req-789 WARN ProvisionedThroughputExceededException on PutItem
2026-03-05T14:23:02.150Z req-789 WARN Retry 1 of 3, backing off 302ms
2026-03-05T14:23:02.890Z req-789 WARN ProvisionedThroughputExceededException on PutItem
2026-03-05T14:23:03.520Z req-789 WARN Retry 2 of 3, backing off 628ms
2026-03-05T14:23:04.700Z req-789 WARN ProvisionedThroughputExceededException on PutItem
2026-03-05T14:23:06.100Z req-789 WARN Retry 3 of 3, backing off 1395ms
2026-03-05T14:23:11.100Z req-789 Task timed out after 10.00 seconds

REPORT RequestId: req-789  Duration: 10000.00 ms  Billed Duration: 10000 ms
Memory Size: 256 MB  Max Memory Used: 89 MB  Status: timeout

Notice how the function started at 14:23:01 and timed out exactly 10 seconds later. The DynamoDB retries consumed the entire execution window. Without examining the log lines between START and the timeout, you might conclude the function itself is slow. The root cause is actually DynamoDB throttling, not a Lambda performance issue.
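The manual check described above, matching timed-out request IDs against earlier throttling errors, can be sketched as a small script. This is a rough illustration under one assumption: the request ID is the second whitespace-separated token on each line, as in the sample logs in this section. The function name is hypothetical, and multi-line exception dumps would need extra handling.

```javascript
// Group log lines by request ID and flag invocations that both hit a
// throttling error and timed out. Assumes the request ID is the second
// whitespace-separated token, as in the sample logs above.
function findThrottledTimeouts(logText) {
  const byRequest = new Map();
  for (const line of logText.split('\n')) {
    const [, requestId] = line.trim().split(/\s+/);
    if (!requestId) continue;
    const entry = byRequest.get(requestId) ?? { throttles: 0, timedOut: false };
    if (line.includes('ProvisionedThroughputExceededException')) entry.throttles += 1;
    if (line.includes('Task timed out')) entry.timedOut = true;
    byRequest.set(requestId, entry);
  }
  // Keep only invocations where a timeout followed at least one throttle.
  return [...byRequest.entries()]
    .filter(([, entry]) => entry.throttles > 0 && entry.timedOut)
    .map(([requestId, entry]) => ({ requestId, throttles: entry.throttles }));
}
```

Run against the sample sequence above, this would report req-789 with three throttling errors, which points at DynamoDB rather than at the function's own code.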

Provisioned vs on-demand capacity

DynamoDB offers two capacity modes that handle throughput very differently. Choosing the right one is the single most effective way to prevent throttling.

Provisioned capacity

You specify the exact number of RCU and WCU your table should handle. DynamoDB allocates resources accordingly. This mode is cost-effective when you have predictable, steady traffic because you pay a fixed hourly rate for the capacity you provision. However, any traffic above the provisioned limit is throttled immediately. Provisioned tables also get burst capacity: DynamoDB reserves up to 300 seconds' worth of unused capacity that can absorb short spikes. For example, if your table is provisioned at 100 WCU but only consuming 20 WCU, the unused 80 WCU accumulates as burst credit. A sudden spike to 400 WCU could be absorbed for a short time. But burst capacity is finite, and once it is consumed under sustained load, throttling begins immediately.
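The burst example above can be turned into a rough estimate of how long a spike is absorbed. This is a simplified model, not DynamoDB's actual accounting: it assumes the burst pool holds the last 300 seconds of unused capacity, that the table sat at its baseline for at least that long, and that nothing else draws on the pool. The function name is illustrative.

```javascript
// Rough estimate of how long burst capacity can absorb a spike.
// Assumption: the burst pool holds the last 300 seconds of unused capacity,
// and the table was at its baseline for at least that long before the spike.
const BURST_WINDOW_SECONDS = 300;

function burstSecondsAvailable(provisioned, baseline, spike) {
  if (spike <= provisioned) return Infinity; // no burst capacity needed
  const pool = Math.max(0, provisioned - baseline) * BURST_WINDOW_SECONDS;
  const drainPerSecond = spike - provisioned;
  return pool / drainPerSecond;
}

// 100 WCU provisioned, 20 WCU baseline, spike to 400 WCU:
// pool = 80 * 300 = 24,000 units, drained at 300 units per second.
console.log(burstSecondsAvailable(100, 20, 400)); // 80
```

Under these assumptions the spike is absorbed for a little over a minute, which is why burst capacity helps with brief surges but not with sustained load.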

On-demand capacity

DynamoDB automatically scales to accommodate your workload. You do not set RCU or WCU values. Instead, DynamoDB allocates capacity based on the prior peak traffic for the table, maintaining enough resources to handle up to 2x the previous peak. On-demand is more expensive per request (roughly 5-7x the per-unit cost of provisioned), but it eliminates most capacity planning concerns. It is ideal for workloads with unpredictable or spiky traffic, new tables where you do not yet know the traffic pattern, or development and staging environments where cost optimization is less important than operational simplicity.

The hot partition problem

Even with sufficient total capacity, DynamoDB can throttle if a single partition receives a disproportionate share of the traffic. DynamoDB distributes data across partitions based on the partition key. If one key value receives far more reads or writes than others, that partition's throughput limit (3,000 RCU and 1,000 WCU per partition) can be exceeded while the table-level metrics look healthy. This is the hot partition problem, and it is the most common cause of throttling on tables that appear to have enough provisioned capacity. DynamoDB's adaptive capacity feature helps by borrowing unused capacity from other partitions, but it cannot fully compensate for a severely skewed access pattern. If 90% of your writes target a single partition key, no amount of provisioned capacity will solve the problem. You need to redesign the key.

How smplogs detects DynamoDB throttling

When you upload a Lambda CloudWatch log file to smplogs, the WASM analysis engine scans every log line for ProvisionedThroughputExceededException messages, SDK retry patterns, and the associated request IDs. It then correlates these throttling events with the REPORT lines to determine which invocations experienced duration spikes or timeouts as a direct result of DynamoDB backoff delays.

The root cause analysis engine links throttling events to error rate spikes and timeout patterns, producing a correlation finding like this:

CRITICAL correlation

Error spike correlates with DynamoDB throttling events — 94% of failures occur within 200ms of a ProvisionedThroughputExceededException.

-> Switch table to on-demand capacity or increase provisioned WCU.

smplogs also detects the cascading effect of throttling on Lambda metrics. When DynamoDB retries push invocation durations from 50ms to 5,000ms, the metric cards highlight the duration anomaly. The insights panel explains the chain: throttling caused retries, retries caused duration spikes, duration spikes caused timeouts, and timeouts caused the error rate increase. This chain is surfaced as a single root cause rather than four separate findings, so you can focus on the actual problem instead of chasing symptoms.

If the log contains multiple table names in throttling errors, smplogs groups findings by table so you can see which DynamoDB table is the primary source of failures. This is especially useful in microservices architectures where a single Lambda function interacts with multiple tables and the throttling is only happening on one of them.

Step-by-step fixes

Fix 1: Switch to on-demand capacity mode

This is the simplest fix, and it works almost immediately. You can switch a provisioned table to on-demand in the DynamoDB console or with the AWS CLI. The change takes effect within a few minutes and requires no code changes. On-demand mode eliminates throttling for the vast majority of workloads.

aws dynamodb update-table \
  --table-name Payments \
  --billing-mode PAY_PER_REQUEST

Be aware that DynamoDB limits how often a table can change capacity mode within a 24-hour window, so do not plan on flipping back and forth during an incident. Also, on-demand pricing is roughly 5-7x more expensive per request unit than provisioned. For high-throughput, steady-state workloads, this cost difference can be significant. Use on-demand as an immediate fix, then evaluate whether provisioned with auto-scaling is more cost-effective for your traffic pattern.

Fix 2: Increase provisioned RCU/WCU

If you want to stay on provisioned mode, check the CloudWatch metrics ConsumedReadCapacityUnits and ConsumedWriteCapacityUnits for your table. These metrics are reported as sums per period, so divide each value by the period length in seconds before comparing it to the provisioned per-second capacity. If consumed capacity regularly exceeds 80% of provisioned, you need more headroom.

aws dynamodb update-table \
  --table-name Payments \
  --provisioned-throughput ReadCapacityUnits=500,WriteCapacityUnits=200
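When comparing consumed capacity with provisioned capacity, keep in mind that CloudWatch reports the consumed metrics as a sum over each period. The sketch below shows the conversion; the function names and sample numbers are illustrative, and the 80% threshold is the one suggested in this fix.

```javascript
// CloudWatch reports ConsumedWriteCapacityUnits as a Sum over each period,
// so divide by the period length to get units per second before comparing
// it with the provisioned value.
function capacityUtilization(consumedSum, periodSeconds, provisionedPerSecond) {
  const consumedPerSecond = consumedSum / periodSeconds;
  return consumedPerSecond / provisionedPerSecond;
}

// True if any datapoint exceeds the utilization threshold.
function needsMoreHeadroom(datapoints, periodSeconds, provisionedPerSecond, threshold = 0.8) {
  return datapoints.some(
    (sum) => capacityUtilization(sum, periodSeconds, provisionedPerSecond) > threshold
  );
}

// 1-minute sums for a table provisioned at 200 WCU.
console.log(capacityUtilization(10200, 60, 200));             // 0.85
console.log(needsMoreHeadroom([6000, 9000, 10200], 60, 200)); // true
```

A 1-minute average can still hide sub-minute spikes, so treat a high reading here as a signal to add headroom rather than as proof that lower readings are safe.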

Fix 3: Enable auto-scaling

Auto-scaling adjusts your provisioned capacity automatically based on actual consumption. Configure it with a target utilization of 70% (the default), and set minimum and maximum values that match your expected traffic range. Auto-scaling uses Application Auto Scaling under the hood and adjusts capacity within 1-2 minutes of detecting a sustained change. It does not react to sudden spikes as fast as on-demand mode, so you still need enough base capacity to handle short bursts. Set the minimum to your baseline traffic level and the maximum to your expected peak plus a 30% buffer.
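As a sketch of what that configuration looks like in practice, the helper below builds request parameters for the Application Auto Scaling RegisterScalableTarget and PutScalingPolicy operations for a table's write capacity. The helper name, the policy name, and the traffic figures are assumptions for illustration; you would pass the resulting objects to whichever SDK client or tool you use to call those operations.

```javascript
// Build request parameters for Application Auto Scaling's
// RegisterScalableTarget and PutScalingPolicy operations.
// The table name and traffic figures are placeholders.
function writeAutoScalingParams(tableName, baselineWcu, expectedPeakWcu) {
  const target = {
    ServiceNamespace: 'dynamodb',
    ResourceId: `table/${tableName}`,
    ScalableDimension: 'dynamodb:table:WriteCapacityUnits',
  };
  return {
    registerScalableTarget: {
      ...target,
      MinCapacity: baselineWcu,
      MaxCapacity: Math.ceil((expectedPeakWcu * 13) / 10), // peak plus a 30% buffer
    },
    putScalingPolicy: {
      ...target,
      PolicyName: `${tableName}-write-target-tracking`,
      PolicyType: 'TargetTrackingScaling',
      TargetTrackingScalingPolicyConfiguration: {
        TargetValue: 70, // target utilization in percent
        PredefinedMetricSpecification: {
          PredefinedMetricType: 'DynamoDBWriteCapacityUtilization',
        },
      },
    },
  };
}

console.log(writeAutoScalingParams('Payments', 200, 1000).registerScalableTarget.MaxCapacity); // 1300
```

Read capacity needs its own scalable target and policy, using the dynamodb:table:ReadCapacityUnits dimension and the DynamoDBReadCapacityUtilization metric.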

Fix 4: Fix hot partitions

If your table has sufficient total capacity but is still throttling, the problem is almost certainly a hot partition. Examine your access patterns: are most writes going to the same partition key? Common culprits include using a date as the partition key (all writes for today hit one partition), using a status field (all "pending" items in one partition), or using a tenant ID where one tenant dominates traffic.

The fix is to add entropy to the partition key. Two strategies work well:

// Strategy 1 - write sharding: append a random suffix to the partition key
// (orderId is assumed to come from the incoming request)
const SHARD_COUNT = 10;
const shardId = Math.floor(Math.random() * SHARD_COUNT);
const shardedOrderKey = `${orderId}#${shardId}`;

// Strategy 2 - date-based keys: use a composite key with a shard prefix
const shardedDateKey = `${shardId}#2026-03-06`;
// To read, query all shards in parallel and merge the results

Write sharding requires changes to your query logic since reads must now fan out across all shard suffixes and merge results. This adds complexity but eliminates the hot partition bottleneck entirely.
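The read-side fan-out might look like the sketch below. It assumes the date-sharded key format shown above (shard prefix, then date) and 10 shards; queryShard is a hypothetical async function you supply that takes a full partition key and resolves to an array of items.

```javascript
// Fan a read out across every shard of a date-sharded key and merge the results.
// queryShard is a hypothetical async function supplied by the caller; it
// takes a full partition key and resolves to an array of items.
const SHARD_COUNT = 10;

function shardKeysFor(date, shardCount = SHARD_COUNT) {
  return Array.from({ length: shardCount }, (_, shardId) => `${shardId}#${date}`);
}

async function queryAllShards(date, queryShard, shardCount = SHARD_COUNT) {
  // Issue one query per shard in parallel, then flatten into a single list.
  const pages = await Promise.all(
    shardKeysFor(date, shardCount).map((key) => queryShard(key))
  );
  return pages.flat();
}
```

The merged list comes back in shard order rather than in any global sort order, so sort it in the application if ordering matters. The shard count has to stay in sync between writers and readers, which is worth keeping in shared configuration.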

Fix 5: Add DAX caching for read-heavy workloads

DynamoDB Accelerator (DAX) is a fully managed, in-memory cache that sits in front of DynamoDB. It reduces read latency from single-digit milliseconds to microseconds and absorbs read traffic that would otherwise consume RCU. If your throttling is primarily on reads (high ConsumedReadCapacityUnits but moderate write load), DAX can dramatically reduce the read pressure on your table. DAX requires a code change: you replace the DynamoDB client with the DAX client, but the API is identical. Note that DAX runs inside a VPC, so your Lambda function must also be VPC-attached.

Fix 6: Implement batch operations and application-level backoff

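If your Lambda function makes many individual PutItem or GetItem calls in a loop, replace them with BatchWriteItem or BatchGetItem. Batch operations are more efficient because they reduce the number of network round trips and allow DynamoDB to process items in parallel internally. Batching does not reduce the capacity each item consumes, so the other half of this fix is handling the UnprocessedItems that DynamoDB returns when part of a batch is throttled.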

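import { DynamoDBClient, PutItemCommand, BatchWriteItemCommand } from '@aws-sdk/client-dynamodb';

const dynamodb = new DynamoDBClient({});
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));
// Split an array into chunks of at most `size` elements
const chunkArray = (arr, size) =>
  Array.from({ length: Math.ceil(arr.length / size) }, (_, i) => arr.slice(i * size, (i + 1) * size));

// `items` is assumed to be an array of items in DynamoDB attribute-value format

// Bad - sequential individual writes, each can throttle independently
for (const item of items) {
  await dynamodb.send(new PutItemCommand({ TableName: 'Orders', Item: item }));
}

// Good - batch write with unprocessed item handling
const MAX_RETRIES = 5; // illustrative cap so the loop cannot run until the Lambda timeout
const chunks = chunkArray(items, 25); // BatchWriteItem max is 25 items
for (const chunk of chunks) {
  const requestItems = { Orders: chunk.map((item) => ({ PutRequest: { Item: item } })) };
  let result = await dynamodb.send(new BatchWriteItemCommand({ RequestItems: requestItems }));
  // Retry unprocessed items with exponential backoff
  let retryCount = 0;
  while (result.UnprocessedItems?.Orders?.length) {
    if (retryCount >= MAX_RETRIES) throw new Error('Unprocessed items remain after retries');
    await sleep(Math.pow(2, retryCount) * 100);
    retryCount += 1;
    result = await dynamodb.send(new BatchWriteItemCommand({ RequestItems: result.UnprocessedItems }));
  }
}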

Prevention best practices

Set CloudWatch alarms on ThrottledRequests

DynamoDB publishes a ThrottledRequests metric to CloudWatch for every table. Create an alarm that triggers when this metric exceeds zero for 3 consecutive 1-minute periods. This catches throttling before it becomes a sustained outage. Also monitor ReadThrottleEvents and WriteThrottleEvents separately so you know whether to add RCU, WCU, or both. If an alarm on any throttling at all is too noisy for paging, a typical on-call configuration sends an SNS notification when throttle events exceed 5 per minute for 3 consecutive evaluation periods.
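The on-call alarm described above could be expressed as parameters for the CloudWatch PutMetricAlarm operation, roughly as follows. The helper name, alarm name, and SNS topic ARN are placeholders, and treating missing data as not breaching is an assumption that suits a metric which is only emitted when throttling happens.

```javascript
// Build PutMetricAlarm parameters for write throttling on one table.
// The alarm name and SNS topic ARN are placeholders.
function throttleAlarmParams(tableName, snsTopicArn, { threshold = 5, periods = 3 } = {}) {
  return {
    AlarmName: `${tableName}-write-throttle-events`,
    Namespace: 'AWS/DynamoDB',
    MetricName: 'WriteThrottleEvents',
    Dimensions: [{ Name: 'TableName', Value: tableName }],
    Statistic: 'Sum',
    Period: 60, // seconds, so the threshold is "events per minute"
    EvaluationPeriods: periods,
    Threshold: threshold,
    ComparisonOperator: 'GreaterThanThreshold',
    TreatMissingData: 'notBreaching', // no datapoints means no throttling
    AlarmActions: [snsTopicArn],
  };
}

console.log(throttleAlarmParams('Payments', 'arn:aws:sns:us-east-1:123456789012:on-call'));
```

Swapping MetricName for ReadThrottleEvents gives the matching read-side alarm, and passing a threshold of 0 gives the stricter alarm that fires on any throttling.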

Design partition keys for even distribution

Choose a partition key with high cardinality that distributes traffic evenly. Good partition keys include user IDs, order IDs, device IDs, or UUIDs. Bad partition keys include dates, status values, boolean flags, or any field with fewer than a few hundred unique values. If your access pattern requires querying by a low-cardinality attribute (like status or date), use a Global Secondary Index with a different partition key rather than making the low-cardinality field the table's primary partition key.

Match capacity mode to your traffic pattern

Use on-demand for unpredictable or bursty workloads: new applications, event-driven architectures, development environments, or any table where the peak-to-average ratio exceeds 4:1. Use provisioned with auto-scaling for steady, predictable workloads where the cost savings of provisioned mode justify the operational overhead. Review your choice quarterly by examining the CloudWatch metrics for each table. Traffic patterns change as applications evolve, and a table that was bursty at launch may become predictable at scale.

Implement circuit breakers in Lambda

When DynamoDB is throttling heavily, continuing to send requests only makes the problem worse. Implement a circuit breaker pattern in your Lambda function that stops making DynamoDB calls after a configurable number of consecutive throttle errors. Return a 503 (Service Unavailable) to the caller with a Retry-After header instead of timing out. This preserves Lambda concurrency for requests that can succeed and gives DynamoDB time to recover. Libraries like opossum (Node.js) or pybreaker (Python) make circuit breaker implementation straightforward.
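For readers who want to see the pattern without a library, here is a deliberately minimal sketch. The class name, thresholds, and response shape are assumptions for illustration, and it closes fully after the cooldown instead of using a half-open probe as production libraries typically do. It relies on the SDK v3 convention that the error's name property is ProvisionedThroughputExceededException.

```javascript
// Minimal circuit breaker keyed on consecutive DynamoDB throttling errors.
class ThrottleCircuitBreaker {
  constructor({ failureThreshold = 3, cooldownMs = 5000, now = Date.now } = {}) {
    this.failureThreshold = failureThreshold;
    this.cooldownMs = cooldownMs;
    this.now = now; // injectable clock, useful for testing
    this.consecutiveThrottles = 0;
    this.openedAt = null;
  }

  isOpen() {
    if (this.openedAt === null) return false;
    if (this.now() - this.openedAt >= this.cooldownMs) {
      // Cooldown elapsed: close the breaker and let the next call through.
      this.openedAt = null;
      this.consecutiveThrottles = 0;
      return false;
    }
    return true;
  }

  async call(operation) {
    if (this.isOpen()) {
      // Fail fast without touching DynamoDB.
      const retryAfter = String(Math.ceil(this.cooldownMs / 1000));
      return { statusCode: 503, headers: { 'Retry-After': retryAfter } };
    }
    try {
      const body = await operation();
      this.consecutiveThrottles = 0;
      return { statusCode: 200, body };
    } catch (err) {
      if (err.name !== 'ProvisionedThroughputExceededException') throw err;
      this.consecutiveThrottles += 1;
      if (this.consecutiveThrottles >= this.failureThreshold) this.openedAt = this.now();
      return { statusCode: 503, headers: { 'Retry-After': '1' } };
    }
  }
}
```

Create the breaker outside the handler so its state survives across warm invocations. Each Lambda execution environment keeps its own copy, so the breaker limits pressure per environment rather than across the whole fleet.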

Pre-warm tables before expected traffic spikes

If you know a traffic spike is coming (marketing campaign launch, seasonal sale, product announcement), increase your provisioned capacity or switch to on-demand mode in advance. For provisioned tables with auto-scaling, increase the minimum capacity to your expected peak at least 30 minutes before the event. Auto-scaling reacts to actual traffic, which means it lags behind sudden jumps. Pre-warming ensures capacity is already available when the spike hits. For on-demand tables, you can pre-warm by running a load test that gradually ramps traffic to the expected peak level. Because on-demand capacity is based on the table's previous peak, raise the load in steps that no more than double the request rate within any 30-minute window. Once the test has reached the target, the table is sized for that level of traffic.
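A gradual ramp for an on-demand table can be planned with a few lines of code. This sketch leans on the rule that on-demand accommodates up to 2x the previous peak, plus the assumption that traffic should not more than double within a 30-minute window; the function name and default step length are illustrative.

```javascript
// Plan a pre-warming load test for an on-demand table as a series of steps
// that at most double the request rate each time.
// stepMinutes defaults to 30 on the assumption that traffic should not more
// than double within a 30-minute window.
function rampPlan(currentPeak, targetPeak, stepMinutes = 30) {
  if (currentPeak <= 0) throw new Error('currentPeak must be positive');
  const steps = [];
  let rate = currentPeak;
  let minute = 0;
  while (rate < targetPeak) {
    rate = Math.min(rate * 2, targetPeak);
    steps.push({ startMinute: minute, requestsPerSecond: rate });
    minute += stepMinutes;
  }
  return steps;
}

// From a previous peak of 1,000 writes per second to a target of 10,000:
// 2,000 at minute 0, 4,000 at minute 30, 8,000 at minute 60, 10,000 at minute 90.
console.log(rampPlan(1000, 10000));
```

Working backwards from the last step tells you how early the load test has to start before the event.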

Seeing throttling errors in your logs? Drop your CloudWatch JSON into smplogs to instantly correlate DynamoDB throttling with Lambda failures, timeouts, and error rate spikes.


