Prevent API overload with rate limiting in AWS

Ali Haydar
5 min read · Apr 11, 2023


In the early days of the COVID-19 pandemic, and for a significant period afterwards, going to the supermarket meant waiting in a queue for our turn to enter. Only a certain number of customers could shop at the same time. The intention was to keep a physical distance between individuals to limit the spread of the virus, keeping people safe and limiting the load on the health system.

That’s similar to what we do in software when we rate-limit our APIs. We do it for multiple reasons, including preventing abuse (e.g. a malicious user overwhelming the system with too many requests), protecting downstream services and resources (e.g. a database), or simply using rate limits as a way to monetize the APIs (e.g. letting free users call the APIs less frequently than paid users).

Photo by Enrique Zafra on Pexels

In this post, we will build an API using AWS API Gateway and explore how to rate-limit calls to our endpoints. API Gateway is a managed AWS service that helps you publish, manage and monitor your APIs. We will use Terraform to set up our infrastructure.

In API Gateway, we deploy APIs to stages, and each stage points at a single deployment. You could have one stage for development, one for testing and one for production. Each stage has its own URL and settings.
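For example, adding another stage is just another resource. Here is a hypothetical test stage (the deployment resource it references is defined further down in this post):

# Hypothetical second stage; it would get its own URL and settings,
# independent of the "prod" stage we create later.
resource "aws_api_gateway_stage" "test" {
  deployment_id = aws_api_gateway_deployment.api_deployment.id
  rest_api_id   = aws_api_gateway_rest_api.api.id
  stage_name    = "test"
}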

We will build an endpoint that returns the number of people in the supermarket. It will have the following shape: GET /…/prod/customers.

Build the API

In Terraform:

  • Create the REST API:
resource "aws_api_gateway_rest_api" "api" {
name = "SupermarketCustomers"
description = "Tracks the number of customers that enter the supermarket"
endpoint_configuration {
types = ["REGIONAL"]
}
}
  • Create the “Customers” resource. That will be the first endpoint:
resource "aws_api_gateway_resource" "customers_resource" {
rest_api_id = aws_api_gateway_rest_api.api.id
parent_id = aws_api_gateway_rest_api.api.root_resource_id
path_part = "customers"
}
  • Create the method for this resource — GET in this case:
resource "aws_api_gateway_method" "get_method" {
rest_api_id = aws_api_gateway_rest_api.api.id
resource_id = aws_api_gateway_resource.customers_resource.id
http_method = "GET"
authorization = "NONE"
}

Notice the http_method in this case, and that authorization is set to NONE, as we want this API to be accessible to everyone (some people would like to check the number of customers in the supermarket from home). A secured variant is sketched below.
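If we ever wanted to lock the endpoint down instead, the method could require IAM authorization or an API key. A hypothetical variant, not used in the rest of this post:

resource "aws_api_gateway_method" "get_method_secured" {
  rest_api_id      = aws_api_gateway_rest_api.api.id
  resource_id      = aws_api_gateway_resource.customers_resource.id
  http_method      = "GET"
  authorization    = "AWS_IAM" # or keep "NONE" and rely on the API key alone
  api_key_required = true      # callers must send a valid x-api-key header
}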

  • Until now, we have configured the API and the method. Now we must move to the integration part, which covers how API Gateway interacts with the backend. The integration type can be an AWS service, an HTTP backend, or MOCK. In this example, we will use MOCK and hardcode the response that API Gateway returns.
resource "aws_api_gateway_integration" "mock_backend" {
http_method = aws_api_gateway_method.get_method.http_method
resource_id = aws_api_gateway_resource.customers_resource.id
rest_api_id = aws_api_gateway_rest_api.api.id
type = "MOCK"
request_templates = {
"application/json" = jsonencode(
{
statusCode = 200
})
}
}
  • Define the method response and the integration response:
resource "aws_api_gateway_method_response" "response_200" {
rest_api_id = aws_api_gateway_rest_api.api.id
resource_id = aws_api_gateway_resource.customers_resource.id
http_method = aws_api_gateway_method.get_method.http_method
status_code = 200
}
resource “aws_api_gateway_integration_response” “mock_backend_response” {
rest_api_id = aws_api_gateway_rest_api.api.id
resource_id = aws_api_gateway_resource.customers_resource.id
http_method = aws_api_gateway_method.get_method.http_method
status_code = aws_api_gateway_method_response.response_200.status_code

# Transforms the backend JSON response to XML
response_templates = {
"application/json" = <<EOF
{
"message": "hello from the mocked backend"
}
EOF
}
}
  • As mentioned above, changes in API Gateway won’t take effect without a deployment, and a deployment needs to target a stage:
resource "aws_api_gateway_deployment" "api_deployment" {
rest_api_id = aws_api_gateway_rest_api.api.id
lifecycle {
create_before_destroy = true
}
}

resource "aws_api_gateway_stage" "prod" {
deployment_id = aws_api_gateway_deployment.api_deployment.id
rest_api_id = aws_api_gateway_rest_api.api.id
stage_name = "prod"
}

Deploy this configuration to your AWS account. You should be able to see your API, resource and method in the API Gateway console.

Click the “Test” button on the GET method to get back the mocked response.

Rate & Burst Limits

AWS API Gateway has a default limit of 10,000 requests per second per region within an AWS account, with a burst of 5,000 requests. That limit is shared across all APIs of different types (e.g. REST, WebSocket) in the region.

AWS uses the token bucket algorithm to throttle requests, where each request consumes a token from the bucket. The burst limit is the bucket’s capacity: the maximum number of requests that can be served simultaneously. The rate limit is the speed at which the bucket is refilled with new tokens; you can also look at it as the maximum number of tokens that can be consumed sustainably within one second (or another period).

Assume we set a rate limit of 10 requests per second and a burst limit of 15 requests. That means ten tokens are added to the bucket every second, up to the maximum of 15. If we consume 15 requests per second, the usage rate is higher than the refill rate, so we drain the bucket faster than it fills. Within 2 seconds, we will start throttling requests and returning 429: Too Many Requests errors. Here are the details:

  • After 1 second: we started with 15 tokens in the bucket, consumed 15 requests and refilled 10. That leaves 10 tokens in the bucket.
  • After 2 seconds: we have 10 tokens in the bucket but try to consume 15 requests. 5 of those requests will return an error, as there are no more tokens available in the bucket.

However, suppose we consume requests at the same speed as the refill rate (10 requests per second). In that case, we will always be able to serve these requests without a problem, as the bucket is refilled as fast as it is drained and never runs empty.

Setting the rate and burst limits properly based on the expected traffic levels and available resources is essential to ensure reliable service.

This is how we set the limits in Terraform at the stage and method level:

resource "aws_api_gateway_method_settings" "get_method_settings" {
rest_api_id = aws_api_gateway_rest_api.api.id
stage_name = aws_api_gateway_stage.prod.stage_name
method_path = "*/*"
settings {
throttling_burst_limit = 1
throttling_rate_limit = 2
}
}
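The method_path of */* applies these settings to every method in the stage. If we wanted to throttle only the customers endpoint, we could target it explicitly. A variant sketch (the resource name is ours):

resource "aws_api_gateway_method_settings" "customers_get_settings" {
  rest_api_id = aws_api_gateway_rest_api.api.id
  stage_name  = aws_api_gateway_stage.prod.stage_name
  method_path = "customers/GET" # <resource path>/<HTTP method>

  settings {
    throttling_burst_limit = 1
    throttling_rate_limit  = 2
  }
}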

After you deploy this change, invoke the endpoint URL (you can find it under Stages → prod → / → /customers → GET). Notice the “Too many requests” error when you invoke the endpoint more than once in quick succession.

With this approach, a single customer might overuse the API, causing throttling for every other customer. It is possible to control this with usage plans. These require each customer to send an API key, which we haven’t set up in our example; a rough sketch follows.
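As an illustration of what that could look like (the resource names are ours, and the GET method would also need api_key_required = true):

resource "aws_api_gateway_api_key" "customer_key" {
  name = "customer-key"
}

resource "aws_api_gateway_usage_plan" "basic" {
  name = "basic"

  api_stages {
    api_id = aws_api_gateway_rest_api.api.id
    stage  = aws_api_gateway_stage.prod.stage_name
  }

  # Per-key throttling: each customer gets their own token bucket
  throttle_settings {
    burst_limit = 1
    rate_limit  = 2
  }
}

resource "aws_api_gateway_usage_plan_key" "customer_key" {
  key_id        = aws_api_gateway_api_key.customer_key.id
  key_type      = "API_KEY"
  usage_plan_id = aws_api_gateway_usage_plan.basic.id
}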

Have you implemented rate limiting in the past? What would you do differently?

Thanks for reading this far. Did you like this article, and do you think others might find it helpful? Feel free to share it on Twitter or LinkedIn.


Ali Haydar

Software engineer (JS | REACT | Node | AWS | Test Automation)