API Gateway Edge or Regional? Comparing deployment options

API Gateway is the Amazon Web Services universal translation layer for synchronous integration patterns, coming built in with a ton of flexibility and options to connect to various back end services. One of the commonly misunderstood tweaks is the deployment method. API Gateway lets you set up endpoints as Edge-Optimised (proxied via AWS points of presence around the world) or Regional (existing in a single AWS region). The official documentation suggests that edge-optimised deployment “typically improves connection time for geographically diverse clients”. Is that actually true? Let’s test it.


Kai Hendri prompted this post on the AWS Developers Slack by asking the question:

I don’t quite understand when you would choose Edge over Regional. Regional sounds far more flexible because you slap the CF on yourself?

Regional deployments set up an API in a specific AWS region (for example, us-east-1 in North Virginia or eu-north-1 in Stockholm). Clients connect to the API over the public internet, so requests are routed the usual way.

Edge-Optimised deployments set up the API in a specific region, but then create local connection points in all AWS points of presence. This effectively uses the Amazon Web Services CDN product, CloudFront, to bring the API connectivity close to client devices. The requests from clients get routed to the closest AWS point of presence, and then go to the API using Amazon’s private links.

Both methods are available behind a simple switch. For example, in a CloudFormation template that uses the AWS SAM transform, all you need to set is the EndpointConfiguration property:

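RegionalRestApi:
  Type: AWS::Serverless::Api
  Properties:
    StageName: Prod  # required by AWS::Serverless::Api; "Prod" is just an example name
    EndpointConfiguration: REGIONAL  # API is reachable in a single AWS region

EdgeRestApi:
  Type: AWS::Serverless::Api
  Properties:
    StageName: Prod  # example stage name
    EndpointConfiguration: EDGE  # requests are proxied via CloudFront points of presence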

Importantly, the edge deployment method doesn’t replicate the API in every region. It just sets up a bunch of HTTPS proxies around the world. The API still runs primarily in one location.

CDNs are great when content can get cached on the edge, but API responses aren’t usually something people want to cache, so the benefits of CloudFront are reduced to just better routing. In theory, Edge-Optimised should be better for anyone who’s not close to the one specific region where the API is deployed. But my tests show that the situation isn’t so clear cut.

Testing API Gateway latency

API Gateway latencies are relatively easy to test using the Serverless Inquisitor. It sets up a bunch of combinations of APIs backed by AWS Lambda functions, and provides a convenient web site front-end to measure request duration and latencies.
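If you just want a rough command-line check rather than the browser-based Inquisitor, the measurement idea can be sketched in a few lines of Python. This is not how Inquisitor works internally; the function names are made up for illustration, and the URLs in the usage note are placeholders for your own endpoints.

```python
import time
import urllib.request
from typing import Callable, List


def fetch(url: str, timeout: float = 10.0) -> None:
    """Send one GET request and read the whole response body."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        response.read()


def measure_ms(request: Callable[[], None], runs: int = 20) -> List[float]:
    """Call `request` sequentially `runs` times; return each duration in milliseconds."""
    samples = []
    for _ in range(runs):
        started = time.perf_counter()
        request()
        samples.append((time.perf_counter() - started) * 1000.0)
    return samples


def compare(endpoints: dict, runs: int = 20) -> dict:
    """Measure every endpoint in a {label: url} mapping and return {label: samples}."""
    return {
        label: measure_ms(lambda url=url: fetch(url), runs)
        for label, url in endpoints.items()
    }
```

Calling `compare({"Edge": "https://<edge-api-url>/", "Regional": "https://<regional-api-url>/"})` returns one list of timings per endpoint. Note that `urllib` opens a new connection for every request, so these numbers include the TLS handshake and will be higher than what a browser reusing a connection would see.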

Here’s one test result. Testing from Stockholm, and connecting to the AWS region in East US:

| Stockholm -> East US | Edge | Regional |
| --- | --- | --- |
| average | 145ms | 134ms |
| min | 125ms | 125ms |
| 65% | <137ms | <134ms |
| 95% | <156ms | <147ms |

In theory, since Stockholm hosts an AWS region, the Edge connection should work better. However, the Regional API Gateway connection is slightly better, with less variance. I guess public internet routing between Sweden and East US is good enough so internal AWS networks can’t beat it.
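The rows in these tables (average, min, and the 65% and 95% thresholds) can be derived from raw timings along the following lines. This is a sketch that assumes a nearest-rank percentile; Inquisitor's exact method may differ, and the sample values in the usage note are made up.

```python
from statistics import mean
from typing import Dict, List


def percentile_below(samples: List[float], percent: int) -> float:
    """Nearest-rank percentile: the smallest sample such that at least
    `percent`% of all samples are less than or equal to it."""
    ordered = sorted(samples)
    # Integer ceiling of percent * n / 100, avoiding floating-point rounding.
    rank = max(1, (percent * len(ordered) + 99) // 100)
    return ordered[rank - 1]


def summarise(samples: List[float]) -> Dict[str, float]:
    """Produce the same four rows as the result tables."""
    return {
        "average": mean(samples),
        "min": min(samples),
        "65%": percentile_below(samples, 65),
        "95%": percentile_below(samples, 95),
    }
```

For ten invented samples `[125, 130, 131, 133, 134, 136, 137, 140, 147, 200]`, `summarise` gives an average of 141.3, a min of 125, a 65% threshold of 137 and a 95% threshold of 200. A single slow request barely moves the 65% row but dominates the 95% row, which is why the 95% figures are the ones to watch for variance.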

OK, but what if there’s no CloudFront point immediately next door? I tested from Belgrade, Serbia (the closest AWS POPs are, in theory, Milan in Italy or Frankfurt in Germany), going to an API deployed in us-west-2 (Oregon). Regional API access is still faster on average, quite significantly, with far less variance at the 95% mark.

| Belgrade -> West US | Edge | Regional |
| --- | --- | --- |
| average | 278ms | 232ms |
| min | 216ms | 219ms |
| 65% | <229ms | <231ms |
| 95% | <597ms | <241ms |

Interestingly, when testing a nearby region, the results are almost the same. I’d expect Edge to be slightly slower, since it has to go through CloudFront, but it actually isn’t. This is a fantastic testament to CloudFront performance.

Testing from Stockholm, connecting to the eu-north-1 region that should be very close by, here are the results:

| Stockholm -> Stockholm | Edge | Regional |
| --- | --- | --- |
| average | 48ms | 48ms |
| min | 27ms | 26ms |
| 65% | <49ms | <48ms |
| 95% | <59ms | <56ms |

We’re getting millisecond differences here, which are easily a rounding error. Another interesting result here is how quick the whole on-demand infrastructure can be. From a client browser to API Gateway, invoking a Lambda function, and coming back to the browser, is around 50ms if the user is close to the AWS region. It works even better if you’re close to some larger regions. Here’s a test from south-east UK, connecting to the API deployed in the AWS London region:

| UK -> London | Edge | Regional |
| --- | --- | --- |
| average | 31ms | 27ms |
| min | 21ms | 20ms |
| 65% | <26ms | <25ms |
| 95% | <69ms | <29ms |

I’ve run these tests a bunch of times from various locations in Europe and the US, and the conclusion is always the same. In terms of latency, the Regional API is slightly better, with less variance. Whether the difference will be visible at all to any end users is a big question. We’re talking about a few dozen milliseconds at most.

Do Edge-Optimised deployments have any benefits?

Latency and connection time didn’t differ much between Edge and Regional endpoints in the tests I conducted, but this doesn’t necessarily mean there are no benefits to Edge-Optimised deployments.

One key difference is that by using an edge-optimised deployment, you can get CloudFront headers. CloudFront adds some interesting headers, such as geo-location (CloudFront-Viewer-Country-Name) and device autodetection (CloudFront-Is-Mobile-Viewer), which become available to the API in an edge-optimised deployment. This can be useful for analytics purposes, or for customising API responses. If you need those additional headers, Edge deployments will be better. (Of course, as Kai suggests, you can “slap CloudFront on the API Gateway yourself”, but that’s just more work without any clear benefits).
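To illustrate how those headers might be used, here is a minimal sketch of a Lambda handler that customises its response based on them. It assumes the Lambda proxy integration event shape (a `headers` dictionary on the event); the function names and the response fields are invented for this example, and which CloudFront headers actually reach your function depends on your setup, so check a real event first.

```python
import json
from typing import Any, Dict, Optional


def header(event: Dict[str, Any], name: str) -> Optional[str]:
    """Case-insensitive lookup in the proxy event's headers (may be missing or None)."""
    headers = event.get("headers") or {}
    wanted = name.lower()
    for key, value in headers.items():
        if key.lower() == wanted:
            return value
    return None


def handler(event: Dict[str, Any], context: Any) -> Dict[str, Any]:
    """Return a JSON body that depends on the viewer's country and device type."""
    country = header(event, "CloudFront-Viewer-Country-Name")
    is_mobile = header(event, "CloudFront-Is-Mobile-Viewer") == "true"
    body = {
        # On a Regional endpoint these headers are absent, so fall back to defaults.
        "country": country or "unknown",
        "layout": "mobile" if is_mobile else "desktop",
    }
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(body),
    }
```

Writing the handler so that it falls back to defaults keeps the same code usable behind both endpoint types, which is handy when comparing them side by side.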

Also, “geographically diverse” might mean different things to different people. Your users might be located in different places from our users, so it’s worth running your own tests. Spin up an Inquisitor instance in the target regions and ask some friendly users to send you the results. That will help you decide if proxying via CloudFront is worth it.

Reducing API latencies, the right way

If you have lots of users located on the other side of the world from your API, just toggling a simple switch unfortunately won’t do much. Edge-optimised APIs are still located in a single region; AWS just handles connections locally and proxies them to the destination internally.

Here’s another test, going from Stockholm to a nearby region, to east US and to west US.

| From Stockholm (regional) | eu-north-1 | us-east-2 | us-west-2 |
| --- | --- | --- | --- |
| average | 37ms | 134ms | 205ms |
| min | 22ms | 125ms | 182ms |
| 65% | <36ms | <135ms | <205ms |
| 95% | <49ms | <148ms | <218ms |

As a general rule, connecting to an API close-by will give your users latencies of roughly 50ms. Connecting to another continent adds at least 100 milliseconds, regardless of how you do it. Putting the API and the application stack close to your users will improve connection performance much more than using a better routing method.
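The "roughly 100 milliseconds extra" rule of thumb can be checked against the Stockholm numbers above with a bit of arithmetic. The values below are copied from that table; the variable and function names are just for illustration.

```python
# Regional results measured from Stockholm, in milliseconds (from the table above).
results = {
    "eu-north-1": {"average": 37, "min": 22},
    "us-east-2": {"average": 134, "min": 125},
    "us-west-2": {"average": 205, "min": 182},
}
NEARBY = "eu-north-1"


def extra_latency(region: str, metric: str) -> int:
    """Milliseconds added by going to `region` instead of the nearby region."""
    return results[region][metric] - results[NEARBY][metric]


for region in ("us-east-2", "us-west-2"):
    print(region, extra_latency(region, "average"), extra_latency(region, "min"))
# prints:
# us-east-2 97 103
# us-west-2 168 160
```

So crossing the Atlantic costs around 100ms in this test, and going on to the US west coast costs 160ms or more, whichever endpoint type is used.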

As a final tip, if you really want to shave off milliseconds from the user connection latency, and you use API Gateway to send stuff to Lambda functions, consider not using API Gateway at all. Cognito and the AWS SDK can help you call Lambda functions directly from client devices in a safe way. Inquisitor also runs those tests. So here’s a quick example, comparing calls from Stockholm to a Lambda function running in the nearby region, with and without an API Gateway in between:

| Stockholm -> eu-north-1 | Regional API | Direct Lambda |
| --- | --- | --- |
| average | 37ms | 27ms |
| min | 22ms | 18ms |
| 65% | <36ms | <26ms |
| 95% | <49ms | <36ms |

So going to Lambda directly shaves off 5-10 milliseconds on average. Again, I’m amazed at how low the overhead of API Gateway ends up being, which is a great example of why, for most people, it’s better to use modern cloud infrastructure instead of rolling your own stuff.
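For a rough idea of what a direct call involves, here is a sketch using boto3, the AWS SDK for Python (a browser client would use the JavaScript SDK instead). It assumes a Cognito identity pool that allows unauthenticated identities and whose role permits `lambda:InvokeFunction` on the target function; the pool ID, region and function name in the usage note are placeholders, and the helper names are invented.

```python
import json
from typing import Any


def cognito_lambda_client(identity_pool_id: str, region: str) -> Any:
    """Exchange an unauthenticated Cognito identity for temporary AWS
    credentials and return a Lambda client that uses them."""
    import boto3  # third-party; imported lazily so invoke_direct works without it
    from botocore import UNSIGNED
    from botocore.config import Config

    # These Cognito calls are made without AWS credentials, so requests are unsigned.
    cognito = boto3.client(
        "cognito-identity",
        region_name=region,
        config=Config(signature_version=UNSIGNED),
    )
    identity_id = cognito.get_id(IdentityPoolId=identity_pool_id)["IdentityId"]
    creds = cognito.get_credentials_for_identity(IdentityId=identity_id)["Credentials"]
    return boto3.client(
        "lambda",
        region_name=region,
        aws_access_key_id=creds["AccessKeyId"],
        aws_secret_access_key=creds["SecretKey"],
        aws_session_token=creds["SessionToken"],
    )


def invoke_direct(lambda_client: Any, function_name: str, payload: dict) -> Any:
    """Invoke the function synchronously and decode its JSON result."""
    response = lambda_client.invoke(
        FunctionName=function_name,
        InvocationType="RequestResponse",
        Payload=json.dumps(payload).encode("utf-8"),
    )
    return json.loads(response["Payload"].read())
```

Usage would look something like `invoke_direct(cognito_lambda_client("<identity-pool-id>", "eu-north-1"), "<function-name>", {"name": "test"})`. Keep in mind that the function then receives the raw payload rather than an API Gateway proxy event, so a handler written for API Gateway may need adjusting.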

Conclusions

For most cases, if you deploy to a single region, and your users are located in Europe or the US, they will not notice any significant difference between Edge-Optimised and Regional. This might not be true if your users are somewhere else in the world, so test it.

If you want CloudFront headers in the API, for analytics or processing purposes, use the Edge version.

So if you really want to reduce latency for user requests, then you’ll have to deploy app workflows close to your users. That will give you the biggest effect, but it significantly increases complexity. For those who really want to reduce every single millisecond they can, go to Lambda directly and avoid API Gateway. (Although, I’d argue that anything like that will not really be observable by end users anyway).

Cover photo by Shubham Dhage on Unsplash.
