More Charts: Adding TLS to Airflow

In this post, we will be adding TLS to Airflow on Azure Kubernetes Service.

This is part three of a five-part series addressing Airflow at an enterprise scale. I will update these with links as they are published.

Previously, we deployed Airflow to an Azure Kubernetes Service cluster using the official Helm chart and an Azure PostgreSQL Single Server instance for the metadata database. This post will focus on configuring TLS for Airflow using Cert-Manager, LetsEncrypt and Azure DNS Zones.

Getting Started with TLS

None of this is unique to Airflow, since we terminate TLS at the Ingress resource, but it is a useful skill nonetheless. The jetstack/cert-manager chart installs a set of Custom Resource Definitions (CRDs) to our cluster. Specifically, we will use the ClusterIssuer resource to obtain a TLS certificate from LetsEncrypt. That certificate is attached to the Ingress resource through its standard tls block. Before any of that, we need an ingress controller deployed to the cluster.

Let’s start by installing an Nginx Ingress Controller:

helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm repo update
helm install --set defaultBackend.enabled=true \
    --namespace airflow \
    nginx \
    ingress-nginx/ingress-nginx
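
Before moving on, it can help to confirm that the controller's LoadBalancer service has been assigned an external IP, since that address is needed for DNS later. The sketch below is a small shell helper, not part of the chart. The name `ingress_external_ip` is hypothetical, and the service name is left as a parameter because it depends on your release name.

```shell
# Hypothetical helper: print the external IP of a LoadBalancer service.
# Usage: ingress_external_ip <service-name> [namespace]
# The namespace defaults to "airflow", matching the helm install above.
ingress_external_ip() {
  kubectl get service "$1" \
    --namespace "${2:-airflow}" \
    --output jsonpath='{.status.loadBalancer.ingress[0].ip}'
}
```

Run `kubectl get services --namespace airflow` to find the controller service name, then pass it to `ingress_external_ip`. Empty output usually means Azure is still provisioning the load balancer.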

Now, we are ready to install the Cert-Manager:

helm repo add jetstack https://charts.jetstack.io
helm repo update
helm install cert-manager jetstack/cert-manager --namespace ingress-basic --create-namespace --version v1.5.4 --set installCRDs=true
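
A quick sanity check after the install is to list the CRDs that Cert-Manager registered. This is a minimal sketch; `cert_manager_crds` is a hypothetical helper name, and it assumes `kubectl` is pointed at the same cluster.

```shell
# Hypothetical helper: list the CRDs whose API group ends in cert-manager.io.
# "kubectl get crds --output name" prints one resource name per line.
cert_manager_crds() {
  kubectl get crds --output name | grep 'cert-manager\.io$'
}
```

The list should include a `clusterissuers.cert-manager.io` entry. Running `kubectl get pods --namespace ingress-basic` alongside it shows whether the Cert-Manager pods themselves are up.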

And now we are ready to configure our ClusterIssuer manifest for LetsEncrypt:

apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt
  namespace: airflow
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: <your-email>@gmail.com
    privateKeySecretRef:
      name: letsencrypt
    solvers:
    - http01:
        ingress:
          class: nginx
  • server is set to the prod instance of LetsEncrypt’s ACME servers.
  • email contains contact information for certificate renewal.
  • privateKeySecretRef.name is the secret name to store LetsEncrypt keys.
  • solvers[0].http01.ingress.class sets the ingress class that will service this http01 solver. More on this during the Ingress configuration.
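
Once the manifest is applied, the issuer should report a Ready condition after it registers an ACME account. The helper below is a sketch for checking that; `cluster_issuer_ready` is a hypothetical name, and the default issuer name `letsencrypt` comes from the manifest above.

```shell
# Hypothetical helper: print the status of the Ready condition
# ("True" or "False") for a ClusterIssuer.
# Usage: cluster_issuer_ready [issuer-name]
cluster_issuer_ready() {
  kubectl get clusterissuer "${1:-letsencrypt}" \
    --output jsonpath='{.status.conditions[?(@.type=="Ready")].status}'
}
```

Assuming the manifest is saved as `cluster-issuer.yaml` (a file name chosen here for illustration), run `kubectl apply -f cluster-issuer.yaml` and then `cluster_issuer_ready`. If it does not print `True`, `kubectl describe clusterissuer letsencrypt` shows the reason.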

It’s Always DNS

As always, the answer to “How should I configure the DNS?” is “It depends.” That being said, I am hosting this Airflow cluster on a dedicated subdomain of a domain that I own. My configuration is as follows:

  • Azure DNS Zone
    • Get the external IP from the nginx-ingress-controller (or something similarly named) service.
    • Create an A record pointing the subdomain, in my case, to that load balancer IP.
  • Go to your domain provider’s management portal and register the nameservers that Azure DNS Zones lists on the overview tab.
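
The same DNS steps can also be done from the Azure CLI instead of the portal. The following is a sketch under assumptions: the helper names are hypothetical, and the resource group, zone name, record name and IP address are placeholders you supply.

```shell
# Hypothetical helper: add an A record to an Azure DNS zone.
# Usage: create_a_record <resource-group> <zone-name> <record-name> <ipv4-address>
create_a_record() {
  az network dns record-set a add-record \
    --resource-group "$1" \
    --zone-name "$2" \
    --record-set-name "$3" \
    --ipv4-address "$4"
}

# Hypothetical helper: print the nameservers assigned to the zone,
# which are the values to register with your domain provider.
# Usage: zone_name_servers <resource-group> <zone-name>
zone_name_servers() {
  az network dns zone show \
    --resource-group "$1" \
    --name "$2" \
    --query nameServers \
    --output tsv
}
```

The record name depends on how the zone is set up. If the zone is the subdomain itself, Azure DNS uses `@` for a record at the zone apex; if the zone is the parent domain, the record name is the subdomain label.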

Building the Ingress Resource

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: airflow-ingress
  namespace: airflow
  annotations:
    kubernetes.io/ingress.class: nginx
    cert-manager.io/cluster-issuer: letsencrypt
spec:
  ingressClassName: nginx
  tls:
    - hosts:
      - airflow.************.com
      secretName: airflow-tls
  rules:
  - host: airflow.**************.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: airflow-webserver
            port: 
              number: 8080
  • metadata.annotations.kubernetes.io/ingress.class sets the ingress class to nginx, the same class named in the ClusterIssuer's http01 solver.
  • metadata.annotations.cert-manager.io/cluster-issuer: letsencrypt tells Cert-Manager to use the ClusterIssuer named letsencrypt to service any tls blocks in the Ingress.
  • spec.tls[0].secretName is standard Ingress functionality: it names the Secret that holds the certificate and key for the listed hosts.

The real special sauce of using Cert-Manager and LetsEncrypt is that when this Ingress resource is created with the cluster-issuer annotation, Cert-Manager requests a certificate for the host and creates the secret under our configured secretName.
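
To watch this happen, you can check the Certificate resource that Cert-Manager creates on the Ingress's behalf. This is a minimal sketch: `certificate_ready` is a hypothetical helper, and it assumes the Certificate is named after the secretName (`airflow-tls`), which is how Cert-Manager normally names certificates generated from an Ingress.

```shell
# Hypothetical helper: print the status of the Ready condition
# ("True" or "False") for a Certificate.
# Usage: certificate_ready [certificate-name] [namespace]
certificate_ready() {
  kubectl get certificate "${1:-airflow-tls}" \
    --namespace "${2:-airflow}" \
    --output jsonpath='{.status.conditions[?(@.type=="Ready")].status}'
}
```

If the output stays at `False`, `kubectl describe certificate airflow-tls --namespace airflow` lists the events for the pending request, and the http01 challenge will not pass until the DNS record resolves to the ingress controller.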

TLS In Action

But why does Chrome give a “Not Secure” warning?

The SSL Report looks good.

Oh no, it’s always DNS. This can be solved by adding a CAA record to the DNS configuration. The record should look like this:
airflow.*******.com CAA 0 issue "letsencrypt.org"
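
Since the zone lives in Azure DNS, the CAA record can be added with the Azure CLI as well. The helper below is a sketch with a hypothetical name; the resource group, zone and record name are placeholders, and the flags and issuer value are passed in so they match the record shown above.

```shell
# Hypothetical helper: add a CAA "issue" record to an Azure DNS zone.
# Usage: create_caa_record <resource-group> <zone-name> <record-name> <flags> <issuer-value>
create_caa_record() {
  az network dns record-set caa add-record \
    --resource-group "$1" \
    --zone-name "$2" \
    --record-set-name "$3" \
    --flags "$4" \
    --tag issue \
    --value "$5"
}
```

After the record propagates, `dig +short CAA <your-host>` should return it.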

And that’s it, Airflow is secured with TLS.
