Understanding Mutual TLS Options in the Public Cloud
When delivering an API over the public internet via a cloud provider, some organizations and frameworks require mutual TLS verification as part of the interaction with that API. Mutual TLS can be used to identify clients in a server-to-server interaction. The certificate exchange for mutual TLS happens inside the standard TLS handshake and does not add an extra round trip, making it a good fit for securely identifying clients to target resources without adding significant latency. Because TLS runs on top of TCP, below the application protocol, this method of identifying clients extends beyond HTTP APIs to use cases such as gRPC and PostgreSQL server authentication.
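To make the difference from ordinary TLS concrete, here is a minimal sketch using Python's standard `ssl` module. The function name and its parameters are hypothetical; the only point is that a server requiring a client certificate sets `verify_mode` to `CERT_REQUIRED` and trusts a specific CA bundle. The managed cloud services discussed below expose the same idea through their own configuration rather than through code like this.

```python
import ssl
from typing import Optional


def build_mtls_server_context(
    client_ca_file: Optional[str] = None,
    server_cert_file: Optional[str] = None,
    server_key_file: Optional[str] = None,
) -> ssl.SSLContext:
    """Return a server-side context that rejects clients without a valid certificate.

    All file paths are hypothetical placeholders supplied by the caller.
    """
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    # CERT_REQUIRED is what turns ordinary TLS into mutual TLS: the handshake
    # fails unless the client presents a certificate that chains to a trusted CA.
    context.verify_mode = ssl.CERT_REQUIRED
    if client_ca_file is not None:
        # CA bundle (root plus intermediates) used to validate client certificates.
        context.load_verify_locations(cafile=client_ca_file)
    if server_cert_file is not None:
        # The server's own certificate and key, as in standard one-way TLS.
        context.load_cert_chain(certfile=server_cert_file, keyfile=server_key_file)
    return context
```

The resulting context can be passed to any server that accepts an `ssl.SSLContext`, which is why the same approach works for protocols other than HTTP.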
When looking for solutions to deliver publicly facing APIs in the cloud using Mutual TLS, there are a number of factors to consider when selecting a service within your cloud provider. These factors are:
- Cloud service Mutual TLS support
- Multi-region support and high availability
- Certificate management options
Mutual TLS Support
Not every cloud service that terminates TLS connections supports mutual TLS. This isn’t driven by a lack of architectural support; rather, the cloud providers intend for you to use specific services to manage these types of APIs. The following services support mutual TLS:
- AWS API Gateway – GA as of September 2020
- Azure API Management
- Azure App Services
- GCP Extensible Service Proxy (via Nginx) and Extensible Service Proxy v2 (via Envoy)
When reviewing this list of services, it is important to see what is missing from the picture. Native firewalls are not directly supported, though the firewall model is different in each of the clouds. AWS API Gateway can provide WAF directly within a mutual TLS authentication pattern, as AWS WAF can be attached to an API Gateway resource. Azure API Management provides documentation on how to put an Application Gateway behind the API Management resource to provide WAF before traffic hits the backend resource. Azure Web Apps can provide a built-in WAF, but only for the premium tier of the service, the App Service Environment. In GCP, I could not figure out a way to put a WAF in front of or behind the ESP. Feel free to leave a comment if you can figure out how to do this.
Another model for deploying mutual TLS is to authenticate one cloud service to another. One example deployment model that demonstrates this is Azure API Management in front of Azure App Services, using client certificates to secure communication between the two. This model allows the App Service to accept only traffic that originates through Azure API Management, without requiring network restriction enforcement between the two. It also allows additional testers to be granted temporary certificates that let them communicate directly with the App Service if desired.
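As a rough illustration of the backend side of this pattern, the sketch below checks a forwarded client certificate against an allowlist of SHA-256 fingerprints. It assumes the platform forwards the certificate as base64-encoded DER in a request header; the header name `X-ARR-ClientCert` is the one I understand App Service to use, but treat it and every function name here as assumptions to verify against the documentation. Fingerprint pinning is also a swapped-in simplification: it does not validate the chain, expiry, or revocation.

```python
import base64
import hashlib
import hmac
from typing import Iterable, Mapping

# Assumption: the platform forwards the client certificate in this header as
# base64-encoded DER. Confirm the exact name and encoding for your service.
CLIENT_CERT_HEADER = "X-ARR-ClientCert"


def certificate_fingerprint(header_value: str) -> str:
    """Return the lowercase hex SHA-256 fingerprint of a base64-encoded certificate."""
    der_bytes = base64.b64decode(header_value)
    return hashlib.sha256(der_bytes).hexdigest()


def is_trusted_caller(
    headers: Mapping[str, str], allowed_fingerprints: Iterable[str]
) -> bool:
    """Accept the request only if the forwarded certificate is on the allowlist."""
    # Note: this lookup is case-sensitive; most web frameworks normalise header names.
    header_value = headers.get(CLIENT_CERT_HEADER)
    if not header_value:
        return False
    try:
        fingerprint = certificate_fingerprint(header_value)
    except ValueError:
        # Malformed base64 is treated as an untrusted caller.
        return False
    # Compare every entry in constant time.
    return any(
        hmac.compare_digest(fingerprint, allowed.lower())
        for allowed in allowed_fingerprints
    )
```

Granting a tester temporary access then amounts to adding that tester's certificate fingerprint to the allowlist and removing it afterwards.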
Cloud CDN services, such as Azure CDN, GCP Cloud CDN, and AWS CloudFront, are not supported. This is intentional, as these caching services are not intended for this type of service-to-service architectural model. Global entry point services, such as Azure Front Door and AWS CloudFront, generally do not support mutual TLS for the same reason.
The only service I could find with an active request for Mutual TLS support was Azure Application Gateway. According to the Azure feedback forum, there was a request planned in 2018 and another triaged in 2019, but no movement has taken place on either request since the product team’s comment.
Multi-Region Support and High Availability
When looking to support an API in a multi-region deployment model, it is critical to understand how the service you are using for multi-region routing sends traffic to the downstream endpoints. If the service terminates the TLS connection, it must support mutual TLS at that endpoint. The service in the list above that supports a multi-region deployment model natively is Azure API Management. This does not mean that the other services cannot be part of a multi-region solution, but native support lets you simplify the management of the client certificates within the solution.
While GCP Endpoints does not natively support a multi-region deployment model, some of the solutions allow you to specify endpoints across multiple regions as the upstream source. This is specifically documented in the Cloud Run documentation. You could also use one of the performance-based DNS services provided by the other two cloud providers if desired.
For other API-driven solutions, a cloud provider will generally not support a caching layer in front of the API. This means that global routing can be handled via DNS, as DNS resolution does not sit in the TCP connection path. AWS Route 53 and Azure Traffic Manager support more sophisticated traffic profiles for routing requests. In a server-to-server model, DNS caching on the client side is also a critical component of the uptime of the system. This doesn’t specifically affect how you deploy APIs serving mutual TLS, but it is a consideration when working with client services that need to communicate with a multi-region service routed via DNS. If all your clients live in US East, your active-active deployment model can unintentionally become active-passive.
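The active-passive effect is easy to see with a toy model of latency-based routing. The sketch below is not how Route 53 or Traffic Manager are implemented; the region names and latency figures are made up purely for illustration.

```python
from collections import Counter
from typing import Dict, Iterable

# Hypothetical round-trip latencies in milliseconds from each client location
# to each API region. These numbers are invented for the example.
LATENCY_MS: Dict[str, Dict[str, int]] = {
    "us-east": {"api-us-east": 10, "api-eu-west": 85},
    "eu-west": {"api-us-east": 85, "api-eu-west": 12},
}


def nearest_region(client_location: str) -> str:
    """Pick the API region with the lowest latency for a client location."""
    regions = LATENCY_MS[client_location]
    return min(regions, key=regions.get)


def traffic_split(client_locations: Iterable[str]) -> Counter:
    """Count how many clients a latency-based policy sends to each region."""
    return Counter(nearest_region(location) for location in client_locations)
```

With clients spread across both locations, each region receives traffic. With every client in `us-east`, `traffic_split` reports all requests landing on `api-us-east`, and the second region sits idle until a failover occurs and the clients' cached DNS answers expire.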
Certificate Management Options
To manage certificates for any of these services, you will need to create a Certificate Authority (CA) or use a managed CA such as AWS Certificate Manager or the beta GCP Certificate Authority service. There are many external CA service providers available as well.
From this certificate authority, you can create root and intermediate certificates through which to issue your client certificate. AWS provides some excellent documentation about how to run this yourself in their release page for mutual TLS on AWS API Gateway.
From the root and intermediate certificate authorities, you can issue client certificates with specific CNs. Sign those certificates with the required CA (the depth required is determined by the number of intermediate CAs in the chain), then provide the client with the signed certificate. Each of these cloud services allows you to upload a chain of root and intermediate certificates to validate the client certificate against. Ensure that your chain is not too long for the cloud service; for example, AWS API Gateway can only handle a chain of four CAs.
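A quick pre-upload sanity check on the chain file can catch the length problem early. The sketch below simply counts PEM certificate blocks in a bundle using the standard library; the function names and the default limit of four are assumptions taken from the AWS API Gateway figure mentioned above. Counting blocks is only an approximation of chain depth, since a bundle may hold more than one chain.

```python
import re

# Matches one PEM-encoded certificate block, including its delimiters.
PEM_CERT_PATTERN = re.compile(
    r"-----BEGIN CERTIFICATE-----.+?-----END CERTIFICATE-----", re.DOTALL
)


def count_certificates(pem_bundle: str) -> int:
    """Return the number of PEM certificate blocks found in the bundle text."""
    return len(PEM_CERT_PATTERN.findall(pem_bundle))


def chain_fits(pem_bundle: str, max_chain_length: int = 4) -> bool:
    """Check that the bundle is non-empty and within the assumed length limit."""
    count = count_certificates(pem_bundle)
    return 0 < count <= max_chain_length
```

This does not parse or validate the certificates themselves; it only guards against uploading an empty or oversized bundle.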
This chain file is then used to configure the cloud service to authenticate the client for your API. Many of the cloud services will pass the client certificate or client identity to the backend in a header so that the backend application can use the information about the validated client to understand the context of the request. Check the documentation for your selected service to understand how it passes the client information to the backend.
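On the backend, using that forwarded identity often comes down to pulling the CN out of a subject string. The sketch below assumes the gateway hands the backend an event containing a `subjectDN` value; the key path is modelled on how I understand AWS API Gateway to expose client certificate details in its request context, and should be treated as an assumption. The DN parsing is deliberately naive and does not handle escaped commas or multi-valued RDNs.

```python
from typing import Any, Mapping, Optional


def extract_common_name(subject_dn: str) -> Optional[str]:
    """Return the CN from a subject DN such as 'CN=client-a,O=Example'."""
    # Naive split on commas: escaped commas inside values are not supported.
    for part in subject_dn.split(","):
        key, _, value = part.strip().partition("=")
        if key.upper() == "CN" and value:
            return value
    return None


def client_identity(event: Mapping[str, Any]) -> Optional[str]:
    """Look up the validated client's CN from a gateway event.

    The key path below is an assumption; confirm it for your service.
    """
    cert = event.get("requestContext", {}).get("identity", {}).get("clientCert", {})
    subject_dn = cert.get("subjectDN")
    return extract_common_name(subject_dn) if subject_dn else None
```

The backend can then map the returned CN to a tenant or permission set, trusting it only because the gateway has already validated the certificate.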
In this document, we have discussed the options for serving an API using mutual TLS and client certificates via the three cloud providers. For those of you who have to deploy such services, good luck and I would love to hear about the cloud services you used to deploy your solution. Drop a note in the comments to share what you have been able to deploy successfully.