API Management 101: Rate Limiting

API rate limiting is one of the fundamental aspects of API management, and one of the easiest and most effective ways to control traffic to your APIs. It matters for quality of service, efficiency and security.

Rate limiting can guard against API overuse caused by accidental bugs in client code that slam the API with requests. On the malicious side, it can blunt a denial of service attack meant to overwhelm the API's resources, which is easy to execute when no rate limits are in place. The risks are large enough that unrestricted resource consumption earned a spot in the OWASP list of top 10 API security risks in 2023. 

With API rate limiting in place, you can ensure that everyone who calls your API receives an efficient, quality service while protecting your API from events that would impact its availability. 

What is API rate limiting and how does it work? 

An API rate limit is the number of calls a client (API consumer) is allowed to make within a given time window. Rate limits are most commonly expressed in requests per second, or RPS. 

Let’s say a developer wants to allow a client to call an API a maximum of 10 times per minute. In this case, the developer would apply a rate limit to their API expressed as “10 requests per 60 seconds”. The client can then successfully call the API up to 10 times within any 60-second interval; an 11th call within that window will return an error stating that the rate limit has been exceeded. 
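To make the mechanics concrete, here is a minimal sketch of such a check in Python. This is a hypothetical illustration of the concept using a fixed window, not how any particular gateway implements it:

```python
import time

class FixedWindowLimiter:
    """Allow at most `limit` requests per `window` seconds (fixed window)."""

    def __init__(self, limit=10, window=60):
        self.limit = limit
        self.window = window
        self.window_start = 0.0
        self.count = 0

    def allow(self, now=None):
        now = time.time() if now is None else now
        # Start a fresh window once the current one has elapsed
        if now - self.window_start >= self.window:
            self.window_start = now
            self.count = 0
        if self.count < self.limit:
            self.count += 1
            return True
        return False  # the caller would respond with an error here

limiter = FixedWindowLimiter(limit=10, window=60)
results = [limiter.allow(now=100.0) for _ in range(11)]
# the first 10 calls pass, the 11th is rejected
```

Calls 1–10 within the window return `True`; the 11th returns `False`, which is the point at which a gateway would send back an error.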

Implementing API rate limiting therefore provides a robust defence against distributed denial of service (DDoS) attacks. DDoS attacks are on the rise. Cloudflare’s DDoS threat report for 2023 Q4 observed a 117% year-on-year increase in network layer DDoS attacks – a stark reminder that businesses need to do all they can to defend their resources.  

Are there different types of rate limiting?

There are different ways that you can approach API rate limiting. Tyk has two methods of doing so: 

Key-level

Key-level limiting is more focused on controlling API traffic from individual sources and making sure that users are staying within their prescribed limits. This approach to API rate limiting allows you to configure a policy to rate limit in two possible ways: 

  • Limiting the rate of calls the holder of a key can make across all APIs they have access to – effectively a global rate limit for that one specific user
  • Limiting the rate of calls to specific individual APIs – also known as a “per-API rate limit”

API-level

API-level rate limiting assesses all traffic coming into an API from all sources and ensures that the overall rate limit is not exceeded. Overwhelming an endpoint with traffic is an easy and efficient way to execute a denial of service attack. By using a global rate limit, you can easily ensure that all incoming requests stay within a specific limit. This limit may be based on something as simple as a good estimate of the maximum number of requests you could expect from users of your API. It may also be something more scientific and precise, such as the number of requests your system can handle while still performing at a high level. You can quickly establish this threshold with performance testing. 

When you put rate limiting measures in place, they are assessed in this order (if applied):

  1. API-level global rate limit
  2. Key-level global rate limit
  3. Key-level per-API rate limit
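The ordering above can be sketched as a simple chain of checks. This is a toy illustration, with the hypothetical `CountLimiter` standing in for real limiters:

```python
class CountLimiter:
    """Toy limiter: admit the first `limit` calls, refuse the rest."""
    def __init__(self, limit):
        self.limit = limit
        self.count = 0

    def allow(self):
        self.count += 1
        return self.count <= self.limit

def allowed(api_limit, key_limit, key_api_limit):
    """Apply the three rate limits in the order they are assessed."""
    # 1. API-level global rate limit (all traffic to the API)
    if not api_limit.allow():
        return False
    # 2. Key-level global rate limit (all calls made with this key)
    if not key_limit.allow():
        return False
    # 3. Key-level per-API rate limit (this key on this specific API)
    return key_api_limit.allow()

api, key, key_api = CountLimiter(5), CountLimiter(3), CountLimiter(2)
results = [allowed(api, key, key_api) for _ in range(4)]
# the per-API limit trips on call 3, the key-level global limit on call 4
```

A request only succeeds if it clears every limit that applies to it; the first limit it breaches blocks it.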

These two approaches have different use cases and they may even be used in unison to power an overall API rate limiting strategy.

When might I want to use rate limiting?

A key-level rate limit is ideal if you want to ensure that one particular user or system accessing the API does not exceed a determined rate. This makes sense in a scenario such as APIs which are associated with a monetisation scheme, where you may allow a certain number of requests per second based on the tier the consumer is subscribed to or paying for.

An API-level global rate limit, meanwhile, can serve as an extra line of defence against attempted denial of service attacks. For instance, if you’ve load tested your current system and established a performance threshold that you do not want to exceed, to ensure system availability and/or performance, then you may want to set a global API rate limit to make sure that threshold is never breached.

Of course, there are plenty of other scenarios where applying a rate limit may benefit your APIs and the systems that your APIs leverage behind the scenes. Essentially, if you’re ready to build dynamic API products, it’s time to implement rate limiting. The simplest way to figure out which type of rate limits to apply is to ask a few questions:

Do you want to protect against denial of service attacks or overwhelming amounts of traffic from all users of the API? Yes? Then, go for an API-level global rate limit!

Do you want to limit the number of API requests a specific user can make to all APIs they have access to? Then choose a key-level global rate limit!

Do you want to limit the number of requests a specific user can make to specific APIs they have access to? Then it’s time for a key-level per-API rate limit.

How to implement rate limiting in API environments

If you want to implement API rate limiting, you have various strategies available to you, including several algorithm-based approaches. These include:

  • Leaky bucket – a first come, first served approach that queues items and processes them at a regular rate. 
  • Fixed window – a fixed number of requests are permitted in a fixed period of time (per second, hour, day and so on). 
  • Moving/sliding window – similar to a fixed window but with a sliding timescale, to avoid bursts of intense demand each time the window opens again. 
  • Sliding log – each request is logged with a timestamp; the total within the window is calculated, with a limit set on that total. 
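As one example, the leaky bucket approach can be sketched in a few lines. This is a generic illustration; real implementations also handle concurrency and persistence:

```python
class LeakyBucket:
    """Leaky bucket: requests fill the bucket, which drains at a steady
    `rate` per second. A request is allowed only if the bucket has room."""

    def __init__(self, capacity, rate):
        self.capacity = capacity  # maximum burst size
        self.rate = rate          # drain rate, requests per second
        self.level = 0.0
        self.last = 0.0

    def allow(self, now):
        # Drain the bucket according to the time elapsed since last call
        self.level = max(0.0, self.level - (now - self.last) * self.rate)
        self.last = now
        if self.level + 1 <= self.capacity:
            self.level += 1
            return True
        return False
```

With a capacity of 2 and a drain rate of 1 per second, two back-to-back requests pass, a third immediate request is refused, and capacity frees up again as time passes.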

To go into a bit more detail on the specifics of how to handle API rate limiting, let’s take a look at our Tyk documentation about how API rate limits are handled and enforced within Tyk:

“Limit is enforced using a pseudo “leaky bucket” mechanism: Tyk will record each request in a timestamped list in Redis, at the same time it will count the number of requests that fall between the current time, and the maximum time in the past that encompasses the rate limit (and remove these from the list). If this count exceeds the number of requests over the period, the request is blocked.”

“The actual limit is a “moving window” so that there is no fixed point in time to flood the limiter or execute more requests than is permitted by any one client.”
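The mechanism described in those quotes can be approximated with a plain Python list standing in for the Redis timestamped list. This is a deliberate simplification for illustration, not Tyk's actual code:

```python
class SlidingLogLimiter:
    """Record each request's timestamp; allow a request only if fewer
    than `limit` requests fall within the last `window` seconds."""

    def __init__(self, limit, window):
        self.limit = limit
        self.window = window
        self.log = []  # timestamps of recent requests

    def allow(self, now):
        # Drop entries older than the window, then count what remains
        self.log = [t for t in self.log if now - t < self.window]
        if len(self.log) < self.limit:
            self.log.append(now)
            return True
        return False
```

Because the window slides with the clock rather than resetting at fixed points, there is no moment at which a client can flood the limiter with a double burst.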

Using an API management solution such as Tyk means you can tick rate limiting off your list of things to worry about – see below for a step-by-step guide on how to implement rate limiting in your API environment with Tyk. 

How to test API rate limiting

Naturally, you’ll want to test that your API rate limit is working as it should. It’s not the kind of thing you want untested when you’re facing a denial of service attack!

If you want to know how to test API rate limiting using external support, there are companies out there who will undertake API pentesting to test how robust your API security is, including how well your rate limiting works. 
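You can also run a basic check yourself: fire one more request than the limit allows and verify that the final response is rejected. In the sketch below, `verify_rate_limit` and `FakeAPI` are hypothetical names, and `FakeAPI` stands in for a real HTTP call to your gateway:

```python
def verify_rate_limit(call, limit):
    """Send limit + 1 requests and check that the last one is rejected.
    `call` is any zero-argument function returning an HTTP status code."""
    statuses = [call() for _ in range(limit + 1)]
    within = all(s == 200 for s in statuses[:limit])
    return within and statuses[-1] == 429, statuses

class FakeAPI:
    """Stand-in for an endpoint enforcing a limit of `limit` calls."""
    def __init__(self, limit):
        self.limit = limit
        self.count = 0

    def __call__(self):
        self.count += 1
        return 200 if self.count <= self.limit else 429

ok, statuses = verify_rate_limit(FakeAPI(3), limit=3)
# ok is True; statuses is [200, 200, 200, 429]
```

In practice you would replace `FakeAPI` with a function that performs a real request against the rate-limited endpoint.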

How to check API rate limit

API rate limiting isn’t just a question of looking up how to limit API calls, implementing your limits and then forgetting about it. You’ll need to check your API rate limits are appropriate and likely change them as your business grows and develops. After all, little in the API world stays still for very long. 

How to check your API rate limit will depend on which API management tools you are using. If you’re using a tool with a handy dashboard, you should be able to see easily which limits you have in place. 

How long does the rate limit last?

Asking how long an API rate limit lasts is similar to asking how long a piece of string is – there is no fixed answer. It is common to apply a dynamic rate limit based on the number of requests per second, but you could also think in terms of minutes, hours or whatever time frame best suits your business model. 

What is API throttling vs rate limiting?

There are two ways that requests can be handled once they exceed the prescribed limit: returning an error (API rate limiting) or queueing the request to be executed later (throttling).

You can implement throttling at key or policy level, depending on your requirements. It’s a versatile approach that can work well if you prefer not to throw an error back when a rate limit is exceeded. By using throttling, you can instead queue the request and automatically retry it once the rate limit is no longer being exceeded. 

Throttling means that you can protect your API while still enabling people to use it. However, it can slow down the service that the user receives considerably, so how to throttle API requests needs careful thought in terms of maintaining service quality as well as availability. 
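The queue-and-retry behaviour can be sketched as follows. All names here are hypothetical, and the toy limiter stands in for whatever limiter your gateway actually uses:

```python
import time

class SimpleLimiter:
    """Toy sliding-log limiter: at most `limit` calls per `window` seconds."""
    def __init__(self, limit, window):
        self.limit, self.window, self.log = limit, window, []

    def allow(self, now):
        self.log = [t for t in self.log if now - t < self.window]
        if len(self.log) < self.limit:
            self.log.append(now)
            return True
        return False

def throttled_call(limiter, max_retries=3, delay=0.01):
    """Throttling: instead of returning an error when the limit is hit,
    hold the request back and retry until it can be served or retries
    run out."""
    for _ in range(max_retries + 1):
        if limiter.allow(time.monotonic()):
            return "processed"
        time.sleep(delay)  # queue/back off rather than erroring out
    return "rejected"
```

The trade-off is visible in the `time.sleep` call: the client eventually gets served, but each retry adds latency to the response.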

What does “API rate limit exceeded” mean?

API rate limit exceeded means precisely what it says – the client trying to call an API has exceeded its rate limit. This will result in the service returning a 429 (Too Many Requests) status response. You can modify that response to include relevant details about why it has been triggered. 
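As a sketch, a customised 429 response might carry a `Retry-After` header and an explanatory body. The shape below is illustrative, not any gateway's default error template:

```python
def rate_limit_response(retry_after_seconds):
    """Build an HTTP 429 response as a plain dict (illustrative shape)."""
    return {
        "status": 429,
        "headers": {"Retry-After": str(retry_after_seconds)},
        "body": {
            "error": "API rate limit exceeded",
            "detail": f"Retry after {retry_after_seconds} seconds",
        },
    }

resp = rate_limit_response(30)
# resp["status"] is 429; the Retry-After header tells clients when to retry
```

Including `Retry-After` lets well-behaved clients back off for the right amount of time instead of hammering the endpoint.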

Examples of popular services and their rate limits

Rate limiting serves a range of purposes. Tyk’s clients have provided some excellent examples of this. Embedded digital payments platform Modulr, for example, uses rate limiting and quota management via security policies, removing the need to configure specific clients on a per-sign-up basis. It makes rolling out services for new clients far easier and more efficient. 

Over at PA Digital, the firm that has brought the Yellow Pages service to businesses across Spain for 50+ years, rate limiting has been key to the delivery of the firm’s Bee Digital product. Security and a seamless service for end users were the company’s main goals when implementing rate limiting and other features via a Tyk Hybrid deployment on AWS. 

Such diverse use cases highlight the value of rate limiting in commercial environments around the globe. 

How to bypass an API rate limit

While API rate limiting can go a long way towards protecting the availability of your APIs and downstream services, it is not without its flaws. As such, some individuals have worked out how to bypass an API rate limit. In fact, they’ve worked out several ways to do so. 

If you use an IP-based rate limiter, rather than key-level rate limiting, then it’s possible for people to bypass your limits using proxy servers, multiplying their usual quota by the number of proxies they can use. 

Key-based API rate limiting can also be bypassed by those who are prepared to create multiple accounts and obtain numerous keys. 

There are other techniques out there, such as using client-side JavaScript to bypass rate limits, so be aware that knowing how to rate limit API products doesn’t make them impervious to being bypassed! 

Applying rate limiting in Tyk

Now that you know what rate limiting is, let’s look at how easy it is to set up within Tyk! 

Setting up an API-Level global rate limit

Adding an API-level global rate limit requires the least effort and can be done as soon as you’ve created the API in Tyk. To apply a global rate limit you simply need to:

  1. Navigate to the API you want to set the global rate limit on
  2. In the Core Settings tab, navigate to the Rate Limiting and Quotas section
  3. Ensure that Disable rate limiting is unchecked
  4. Enter your requests per second threshold
  5. Save/Update your changes

Want to see it in action? Check out the video below!

Setting up a key-level global rate limit *

To set a key-level global rate limit:

  1. Navigate to the Tyk policy that you want to enforce rate limiting on
  2. Ensure that API(s) that you want to apply rate limits to are selected
  3. Under Global Limits and Quota , make sure that Disable rate limiting is unchecked and enter your requests per second threshold
  4. Save/Update the policy

Setting up a key-level per-API rate limit *

To set a key-level per-API rate limit:

  1. Navigate to the Tyk policy that you want to enforce rate limiting on
  2. Ensure that API(s) that you want to apply rate limits to are selected
  3. Under API Access , turn on Set per API Limits and Quota
  4. Under Rate Limiting , make sure that Disable rate limiting is unchecked and enter your requests per second threshold
  5. Save/Update the policy

Want to see key-level rate limiting in action? Check out the video below!

* It is assumed that the APIs being protected with a rate limit use the Authentication token authentication mode and already have policies created

And with that we have explored a few ways to apply rate limiting within Tyk. Stay tuned for other overviews of key features in Tyk with our APIM 101 blog series. Want to know more about rate limiting within Tyk? Check out our docs on the subject! Learn more about Tyk’s API gateway.
