How to reduce API latency and optimize your API

November 28, 2024
By Jennifer Craig


Discover how to reduce gateway latency issues and optimize your APIs today. Find out the best strategies to conquer API gateway latency problems now.

Implementing an API gateway can work wonders for your API security and governance, but it can also add a latency overhead. If you’re worried about API latency in relation to your gateway, read on to discover common causes of latency and how to reduce them. We’ve rounded up the best strategies to conquer API gateway latency problems, so you can achieve the blistering performance your enterprise demands.

What is API gateway latency?

API gateway latency is the amount of time it takes an API gateway to:

Process a request
Compose a response to that request
Send the response to the client

This metric is measured in milliseconds. The complete request-response cycle through a gateway includes:

Request reception
Request validation and transformation
Backend communication
Response processing
Response delivery

According to Amazon Web Services documentation, well-optimized API gateways typically introduce 5-20ms of overhead per request.

API gateway integration latency

Another metric worth noting is API gateway integration latency: the time it takes the gateway to send a request to the backend and receive a response from it. It is also measured in milliseconds. Any API gateway latency figure includes integration latency.

Metric	What it measures	Typical range
Gateway latency	Total time from request arrival to response delivery	10-100ms
Integration latency	Time waiting for backend response only	5-500ms
Gateway overhead	Processing time excluding backend wait	5-20ms

Table 1: API gateway latency metrics and typical performance ranges for optimized systems
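The relationship between the three metrics comes down to a subtraction: because gateway latency includes integration latency, gateway overhead is whatever remains once the backend wait is taken out. The short Python sketch below illustrates that arithmetic. The helper name and the sample figures are made up for illustration and are not measurements.

```python
def gateway_overhead_ms(gateway_latency_ms: float, integration_latency_ms: float) -> float:
    """Return the time the gateway itself spent on a request, in milliseconds.

    Gateway latency includes integration latency, so overhead is the difference.
    """
    if integration_latency_ms > gateway_latency_ms:
        raise ValueError("integration latency cannot exceed total gateway latency")
    return gateway_latency_ms - integration_latency_ms


# Illustrative figures only, not measurements.
total_ms = 62.0        # request arrival to response delivery
integration_ms = 48.0  # time spent waiting for the backend
overhead_ms = gateway_overhead_ms(total_ms, integration_ms)
print(f"gateway overhead: {overhead_ms:.1f} ms")
```

Tracking the two inputs separately shows whether a slow request is the gateway's doing or the backend's.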

What causes high API gateway latency?

For some enterprises, every millisecond counts – that’s why we’ve designed Tyk to be ultra performant. It’s one of the many benefits of Tyk. Whichever API gateway you use, if you’re running into latency issues, some of these common causes might be to blame.

Network issues

High traffic and connectivity issues can result in API gateway latency problems. If you’re keen to reduce API latency, investigating your network stability and reliability is a good starting point.

Inadequate server resources

If the gateway is calling a server with inadequate resources, you’re likely to run into latency issues, with requests taking longer to process. Likewise, if the gateway is calling a server in a different region, the extra milliseconds can quickly add up.

Inefficient code and configuration

Unoptimized gateway code creates processing bottlenecks. Common inefficiencies include:

Synchronous blocking operations
Excessive logging
Redundant transformations
Memory leaks

These issues accumulate across high request volumes, causing gateway timeouts when latency exceeds configured thresholds.

Authentication and authorization overhead

Security operations add measurable latency:

Token validation, especially with external identity providers, introduces 5-50ms per request
According to Auth0 documentation, JWT validation adds approximately 5-10ms
OAuth token introspection with external calls can add 50-100ms

Caching configuration

While caching reduces backend load, cache encryption and decryption operations add processing overhead. Tests show encrypted cache operations can add 2-10ms per request compared to unencrypted caching.

How do you measure API gateway latency?

If you’re worried about latency, start measuring and monitoring. What is API latency in the context of your enterprise? Unless you’ve taken the time to benchmark your latency, you won’t know how big a problem you have or whether any troubleshooting measures you implement are effective.
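As a rough illustration of what a benchmark baseline might look like, the Python sketch below summarizes a set of latency samples as a mean plus percentiles. The sample values are invented, and the function name is hypothetical. In practice the samples would come from your gateway logs or a load-testing tool.

```python
import statistics


def summarize_latency(samples_ms: list[float]) -> dict[str, float]:
    """Summarize latency samples (milliseconds) as mean, p50, p95 and p99."""
    if len(samples_ms) < 2:
        raise ValueError("need at least two samples")
    # quantiles(n=100) returns 99 cut points: index 49 is p50, 94 is p95, 98 is p99.
    cuts = statistics.quantiles(samples_ms, n=100, method="inclusive")
    return {
        "mean": statistics.fmean(samples_ms),
        "p50": cuts[49],
        "p95": cuts[94],
        "p99": cuts[98],
    }


# Made-up samples: mostly fast requests with a slow tail.
baseline = [12.0] * 90 + [40.0] * 8 + [250.0] * 2
summary = summarize_latency(baseline)
print(summary)
```

Percentiles are worth recording alongside the average because a small number of very slow requests can hide behind a healthy-looking mean.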

There are various ways you can approach API latency monitoring:

Tyk API Gateway latency monitoring

The Tyk API Gateway provides a handy example of different ways in which you can monitor latency. One approach is to use Tyk Dashboard, where you can view your average latency over time on a graph, alongside other metrics.

You can also use third-party tools such as Grafana and Prometheus to view and monitor latency. When you use these tools with Tyk, you can configure them to raise alerts if latency goes beyond a threshold of your choosing.

Using CloudWatch metrics

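If you’re using AWS, you can monitor your gateway latency with CloudWatch metrics. Amazon API Gateway publishes both a Latency metric and an IntegrationLatency metric to CloudWatch, which correspond to the gateway latency and integration latency described earlier.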
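For Amazon API Gateway REST APIs, CloudWatch exposes Latency and IntegrationLatency metrics in the AWS/ApiGateway namespace. The Python sketch below shows one way to query them with boto3's get_metric_statistics call. It assumes boto3 is installed and AWS credentials are configured. The API name "orders-api", the helper names, and the five-minute period are illustrative choices, not recommendations.

```python
from datetime import datetime, timedelta, timezone


def latency_query(api_name: str, metric_name: str, end: datetime, hours: int = 1) -> dict:
    """Build keyword arguments for CloudWatch's GetMetricStatistics call."""
    return {
        "Namespace": "AWS/ApiGateway",
        "MetricName": metric_name,  # "Latency" or "IntegrationLatency"
        "Dimensions": [{"Name": "ApiName", "Value": api_name}],
        "StartTime": end - timedelta(hours=hours),
        "EndTime": end,
        "Period": 300,  # five-minute buckets (an arbitrary choice)
        "Statistics": ["Average", "Maximum"],
        "Unit": "Milliseconds",
    }


def fetch_latency(api_name: str, metric_name: str) -> list[dict]:
    """Fetch datapoints; needs boto3 and AWS credentials, so it is not run here."""
    import boto3  # imported lazily so the query builder works without it

    client = boto3.client("cloudwatch")
    query = latency_query(api_name, metric_name, datetime.now(timezone.utc))
    response = client.get_metric_statistics(**query)
    return sorted(response["Datapoints"], key=lambda point: point["Timestamp"])


# "orders-api" is a hypothetical API name.
query = latency_query(
    "orders-api",
    "IntegrationLatency",
    datetime(2024, 11, 28, 12, 0, tzinfo=timezone.utc),
)
print(query["StartTime"], query["EndTime"])
```

Fetching both metrics for the same window lets you compare the total against the backend wait and see how much time the gateway itself contributes.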

Using third-party tools

As with every aspect of API management, there are a host of tools out there to help you:

The free, open source OpenTelemetry can provide you with sophisticated insights into many aspects of your systems
Prometheus is a free, open source tool that records a range of metrics
Grafana OSS is particularly handy as a data visualization and dashboarding tool

Why the emphasis on open source? Because our industry-leading open source API Gateway here at Tyk has shown just how powerful open source tools can be in securing, processing and governing APIs for global enterprises.

How can you reduce API gateway latency?

Latency reduction requires systematic troubleshooting and strategic optimization across multiple layers.

Infrastructure optimization

Right-sizing gateway instances prevents resource bottlenecks. Monitor CPU and memory utilization, scaling vertically (larger instances) or horizontally (more instances) when utilization consistently exceeds 70%. Co-locating gateways with backend services in the same region or availability zone removes most of the latency that geographic distance adds.

Code and configuration improvements

Profile gateway code to identify inefficient operations. Replace synchronous blocking calls with asynchronous processing where possible. Optimize transformation logic, removing unnecessary data manipulations. Configure appropriate connection pooling to backend services: without pooling, connection establishment can add 10-50ms per request.
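To make the point about blocking calls concrete, here is a minimal Python sketch that compares three sequential waits with the same three waits overlapped using asyncio. The backend is simulated with asyncio.sleep, and the call names ("auth", "quota", "profile") are hypothetical. It illustrates the general technique rather than how any particular gateway is built.

```python
import asyncio
import time

CALLS = ("auth", "quota", "profile")  # hypothetical downstream lookups


async def call_backend(name: str, delay_s: float) -> str:
    """Stand-in for a backend call; asyncio.sleep simulates the network wait."""
    await asyncio.sleep(delay_s)
    return f"{name}: ok"


async def sequential(delay_s: float) -> list[str]:
    # Each call waits for the previous one to finish.
    return [await call_backend(name, delay_s) for name in CALLS]


async def concurrent(delay_s: float) -> list[str]:
    # All three waits overlap, so total time is close to a single call.
    return list(await asyncio.gather(*(call_backend(name, delay_s) for name in CALLS)))


def timed(coro_factory, delay_s: float) -> tuple[list[str], float]:
    """Run a coroutine factory and return its result with elapsed seconds."""
    start = time.perf_counter()
    result = asyncio.run(coro_factory(delay_s))
    return result, time.perf_counter() - start


seq_result, seq_elapsed = timed(sequential, 0.05)
con_result, con_elapsed = timed(concurrent, 0.05)
print(f"sequential: {seq_elapsed * 1000:.0f} ms, concurrent: {con_elapsed * 1000:.0f} ms")
```

The gain only applies to work that is independent. Calls that depend on each other's results still have to run in order.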

Caching strategies

Implement response caching for frequently accessed, relatively static data. Cache hits eliminate backend calls entirely, reducing latency from 50-500ms to under 5ms. Balance cache freshness requirements against performance gains. For encrypted cache requirements, evaluate whether the 2-10ms encryption overhead justifies the security benefit for specific data types.
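The trade-off between freshness and speed can be sketched with a small time-to-live (TTL) cache. The Python example below is an in-memory illustration only. The class, the handler, the 30-second TTL, and the fake clock are all assumptions made for the example, and a real gateway would normally rely on its built-in caching or an external store.

```python
import time
from typing import Callable, Optional


class TTLCache:
    """A tiny in-memory response cache with a per-entry time to live."""

    def __init__(self, ttl_s: float, clock: Callable[[], float] = time.monotonic):
        self._ttl_s = ttl_s
        self._clock = clock
        self._entries = {}  # key -> (expires_at, value)

    def get(self, key: str) -> Optional[str]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # stale: force a fresh backend call
            return None
        return value

    def put(self, key: str, value: str) -> None:
        self._entries[key] = (self._clock() + self._ttl_s, value)


backend_calls = 0


def fetch_from_backend(path: str) -> str:
    """Hypothetical backend call; counts how often it is invoked."""
    global backend_calls
    backend_calls += 1
    return f"response for {path}"


def handle(cache: TTLCache, path: str) -> str:
    cached = cache.get(path)
    if cached is not None:
        return cached  # cache hit: no backend round trip
    response = fetch_from_backend(path)
    cache.put(path, response)
    return response


now = [0.0]  # fake clock so the example is deterministic
cache = TTLCache(ttl_s=30.0, clock=lambda: now[0])
handle(cache, "/products")  # miss: goes to the backend
handle(cache, "/products")  # hit: served from the cache
now[0] = 31.0
handle(cache, "/products")  # entry expired, so a miss again
print(backend_calls)
```

A longer TTL means more hits and lower latency, at the cost of serving data that may be out of date for longer.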

Rate limiting and traffic management

Configure rate limiting to prevent CPU overload during traffic spikes. Load tests show that preventing CPU saturation above 80% maintains consistent latency under load. Implement request prioritization, ensuring critical traffic receives resources during high-load periods.
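One common way to implement rate limiting is a token bucket, sketched below in Python. This is a generic illustration under assumed settings (5 requests per second with a burst of 10), not a description of how Tyk or any other gateway implements its limiter. The fake clock is there only to make the example deterministic.

```python
import time
from typing import Callable


class TokenBucket:
    """Allow bursts up to `capacity`, refilling at `rate_per_s` tokens per second."""

    def __init__(self, rate_per_s: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic):
        self._rate = rate_per_s
        self._capacity = capacity
        self._clock = clock
        self._tokens = capacity
        self._last = clock()

    def allow(self) -> bool:
        now = self._clock()
        # Top up for the time elapsed since the last check, capped at capacity.
        self._tokens = min(self._capacity, self._tokens + (now - self._last) * self._rate)
        self._last = now
        if self._tokens >= 1.0:
            self._tokens -= 1.0
            return True
        return False  # a gateway would typically answer with HTTP 429 here


fake_now = [0.0]
bucket = TokenBucket(rate_per_s=5.0, capacity=10.0, clock=lambda: fake_now[0])
burst = [bucket.allow() for _ in range(12)]        # 12 requests at the same instant
fake_now[0] = 1.0                                  # one second later: 5 tokens refilled
after_refill = [bucket.allow() for _ in range(6)]
print(burst.count(True), after_refill.count(True))
```

Rejecting excess requests early is cheap, which is what keeps the CPU free for the traffic that is allowed through.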

Authentication optimization

Choose authentication methods based on latency requirements. Local JWT validation (5-10ms) outperforms OAuth token introspection with external calls (50-100ms). For AWS environments, resource-based IAM policies avoid Secure Token Service calls required by role-based authentication, eliminating 10-30ms of overhead per request.
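Local JWT validation is fast because it needs no network call: the gateway recomputes the signature and checks the claims itself. The Python sketch below shows the idea for an HS256 token using only the standard library. The secret, the claims, and the function names are made up for the example. A production gateway should use a maintained JWT library and validate further claims such as issuer and audience.

```python
import base64
import hashlib
import hmac
import json


def _b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_hs256(claims: dict, secret: bytes) -> str:
    """Create an HS256 JWT (only here so the example has a token to validate)."""
    header = _b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url_encode(json.dumps(claims).encode())
    signature = hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256).digest()
    return f"{header}.{payload}.{_b64url_encode(signature)}"


def validate_hs256(token: str, secret: bytes, now: float) -> dict:
    """Validate signature and expiry locally; raises ValueError on failure."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("malformed token")
    header_b64, payload_b64, signature_b64 = parts
    header = json.loads(_b64url_decode(header_b64))
    if header.get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    expected = hmac.new(secret, f"{header_b64}.{payload_b64}".encode(), hashlib.sha256).digest()
    # compare_digest avoids leaking information through timing differences.
    if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
        raise ValueError("bad signature")
    claims = json.loads(_b64url_decode(payload_b64))
    if claims.get("exp", 0) <= now:
        raise ValueError("token expired")
    return claims


secret = b"demo-secret"  # illustrative only; never hard-code real secrets
token = sign_hs256({"sub": "client-42", "exp": 2000}, secret)
claims = validate_hs256(token, secret, now=1000)
print(claims["sub"])
```

Token introspection, by contrast, replaces the local signature check with a round trip to the identity provider on every request, which is where the extra latency comes from.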

Troubleshooting high latency

If your API latency monitoring indicates you need to troubleshoot high gateway latency, there are various steps you can take. Looking for network issues, inadequate server resources, and unoptimized code can often solve common latency problems.

If you still need to reduce API latency after ticking those items off your troubleshooting list, the following could help:

Look into the sizing and configuration of your gateway and its dependencies. A lightweight, highly performant gateway without computationally demanding and memory intensive dependencies will serve you best.
Benchmark your gateway and experiment with rate limiting to understand how keeping your CPU from becoming overloaded affects both your baseline latency and the latency the gateway introduces.

To dive deeper into fine-tuning your API gateway to reduce latency, check out this detailed example.

In addition to troubleshooting your current latency issues, use alerting to flag up any future API gateway latency problems. Doing so means you can troubleshoot proactively and jump on problems as soon as they arise – hopefully before your users even notice.
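In practice, alerting is usually configured in a monitoring tool, but the underlying logic is simple enough to sketch. The Python example below fires when the average of the most recent samples exceeds a threshold. The 100ms threshold, the window of three samples, and the latency values are arbitrary illustrations.

```python
from collections import deque


class LatencyAlert:
    """Fire when the average of the last `window` samples exceeds `threshold_ms`."""

    def __init__(self, threshold_ms: float, window: int):
        self._threshold_ms = threshold_ms
        self._samples = deque(maxlen=window)  # oldest samples drop off automatically

    def observe(self, latency_ms: float) -> bool:
        self._samples.append(latency_ms)
        window_full = len(self._samples) == self._samples.maxlen
        average = sum(self._samples) / len(self._samples)
        # Averaging over a full window keeps one moderately slow request from paging anyone.
        return window_full and average > self._threshold_ms


alert = LatencyAlert(threshold_ms=100.0, window=3)
samples_ms = (40.0, 45.0, 110.0, 50.0, 60.0, 150.0, 160.0, 170.0)
fired = [alert.observe(ms) for ms in samples_ms]
print(fired)
```

Here the isolated 110ms request does not trigger anything, while the sustained climb at the end does.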

Performance issue workarounds

As well as the troubleshooting ideas mentioned above, there are various workarounds you can use to help address and avoid gateway performance issues. Let’s look at a few of these now.

Co-locating resources

Is your gateway regularly calling servers in far-off regions? Doing so can result in higher latency, so consider co-locating resources as an effective workaround.

Modifying authentication

Do you assign identity and access management roles that are verified in the backend as part of your authentication measures, for example by using AWS Identity and Access Management (IAM) roles? The way you do so can impact latency. Using a role-based invocation means the API gateway will call the Secure Token Service, thus adding latency. You can work around this by using a caller-identity or resource policy-based invocation instead (which won’t call the Secure Token Service).

Disabling cache encryption

Have you enabled encryption for your request caching? Doing so will add latency, as a result of the encryption and decryption of cache entries. Turning off cache encryption is a swift workaround for this. Just be sure that doing so is in line with your API security approach.

Alternative approaches to address high latency

Need further options to reduce API latency in relation to your gateway? These alternative approaches could provide the solution you need to address your latency issues.

Hosting functions on EC2

Amazon’s Elastic Compute Cloud (EC2) is designed to make it easier for developers to undertake web-scale computing. Hosting your functions directly on EC2 removes the API gateway entirely, and with it any latency the gateway adds. Instead, you handle requests with API wrappers and web servers.

Containerizing serverless functions

Microservices and containers, along with microservice gateway access patterns, go hand-in-hand when it comes to building distributed applications. Containers can also be handy for delivering efficiencies within a serverless environment – including as a strategy to reduce API latency.

Comparing serverless hosts

If the serverless route appeals to you, there are various solutions you can implement. AWS Lambda, Azure Functions, and Google Cloud Functions are all options to investigate if this is the approach you want to take. The advantage is that you can call resources directly and take the API gateway out of the loop entirely, thus cutting out the associated latency.

Building your own web server

Prefer to take matters into your own hands? If so, building your own web server might be the ideal way forward. You can use NGINX or Apache to achieve what you need, providing you with an ecosystem to tweak and fine-tune however you see fit. This means you have full control over what you change and by how much, enabling you to find the ideal latency scenario to meet your business needs.

Conclusion

We’ve provided a range of options above, ensuring you have plenty of flexibility in your approach to reducing API gateway latency. Using the strategies we’ve outlined, it should be fairly simple to benchmark your performance, then implement the changes you need to conquer your API gateway latency problems and optimize your API ecosystem.

If the idea of using OpenTelemetry particularly appeals, it’s well worth checking out our tutorial on how to reduce latency issues and optimize your APIs using OpenTelemetry and Tyk.

The Tyk team is always here to help. Our in-house experts are happy to share their advice and guidance, as are API professionals around the globe via the Tyk Community.
