May 4, 2021

AWS Cloud HSM, Docker and NGINX

There is plenty of easily searchable content on the security benefits of using a Hardware Security Module (HSM) to manage cryptographic keys, so I will leave that to another article. The short story is that an HSM is a computing device designed to safeguard cryptographic keys. I recently deployed a solution that pairs AWS Cloud HSM with an NGINX container running as a reverse proxy server.

Beyond the intricacies of getting AWS Cloud HSM up and running, which are covered in the AWS Cloud HSM setup guide, the biggest challenge I faced was getting the HSM client ‘service’ running in the container image. The logic behind the certificates this solution requires was also a little tough to follow.

So for this summary blog, I decided to put the missing components together from other documents in order to help you get this solution deployed.

Also, please note that there is no free tier for Cloud HSM, so if left deployed and unattended, your AWS bill could increase.

Once you’ve made it to the ‘Initialize the Cluster’ step in the AWS Cloud HSM setup guide, note that you will need to keep a copy of the cluster CSR in a convenient location.

In production, this CSR would be signed by a Certificate Authority, but for our purposes here we are just signing it with our own self-signed CA certificate. The commands below, also listed in the AWS docs, generate the CRT files your NGINX container needs to connect to the HSM cluster. Please replace ‘cluster_ID’ with the ID of your HSM cluster.

openssl genrsa -aes256 -out customerCA.key 2048
openssl req -new -x509 -days 3652 -key customerCA.key -out customerCA.crt
openssl x509 -req -days 3652 -in cluster_ID_ClusterCsr.csr \
-CA customerCA.crt \
-CAkey customerCA.key \
-CAcreateserial \
-out cluster_ID_CustomerHsmCertificate.crt

The customerCA.crt is a self-signed certificate required for any host connected to the HSM cluster. In our case, NGINX will connect to the HSM cluster so it is indeed required.

The cluster_ID_CustomerHsmCertificate.crt is the certificate used to initialize the cluster. Both of these certificates are copied into the image by the NGINX Dockerfile.

Important: You will need the .crt files from the above commands to add to the NGINX container.
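
Before adding these files to the image, it can be worth confirming that the cluster certificate really was signed by your customerCA.crt. The following is a minimal sketch that wraps `openssl verify`. The function name cert_signed_by is my own invention, and the file names in the usage comment assume you ran the commands above with your cluster ID substituted.

```shell
# cert_signed_by CA_CERT CERT
# Succeeds (exit 0) when CERT chains to the self-signed CA_CERT.
# Hypothetical helper name; it only wraps `openssl verify -CAfile`.
cert_signed_by() {
    openssl verify -CAfile "$1" "$2" >/dev/null 2>&1
}

# Example usage (replace cluster_ID with your cluster's ID):
# cert_signed_by customerCA.crt cluster_ID_CustomerHsmCertificate.crt \
#     && echo "cluster certificate chains to customerCA.crt"
```

A non-zero exit status usually means the certificate was signed with a different CA key than the one you are about to ship in the container.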

Once the cluster is initialized and your HSM is up and running, you will need to activate and configure the HSM. Essentially, you install the HSM client on an instance that can reach the cluster, then use it to create the necessary users and to generate a PEM key file with the OpenSSL engine for Cloud HSM.

It’s common practice to use an EC2 instance for this configuration, and that instance will need the HSM tools installed. The AWS docs contain the setup for both the Linux and Windows HSM clients. The client also needs to be installed in the NGINX container, so you will see the same installation commands in the resulting Dockerfile.

Using the installed client, activate your HSM cluster and use the client to configure one or more ‘Crypto Users’. The NGINX Dockerfile will need one of these usernames and passwords so be sure to make note of these credentials.

And now that the Cloud HSM Dynamic Engine has been installed, you will also need to run:

openssl genrsa -engine cloudhsm -out web_server_fake_PEM.key 2048
openssl req -engine cloudhsm -new -key web_server_fake_PEM.key -out web_server.csr
openssl x509 -engine cloudhsm -req -days 365 -in web_server.csr -signkey web_server_fake_PEM.key -out web_server.crt

The web_server_fake_PEM.key and the web_server.crt will be added to the Dockerfile. In a production environment, your web_server.crt would be provided by an official Certificate Authority.

Your HSM cluster should now be ready for an application, which in our case is the NGINX Reverse Proxy in a Docker container. Please note that you may need to open ports 2223-2225 between your NGINX container and your AWS Cloud HSM instance. Please review the AWS Security Group Requirements for refresher information.
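
If you want to confirm those ports are reachable before debugging the client itself, a rough check from the NGINX host or container might look like the sketch below. It relies on bash's built-in /dev/tcp redirection and the coreutils timeout command. The function names and the 3-second timeout are my own choices, not part of any AWS tooling.

```shell
# port_open HOST PORT
# Succeeds if a TCP connection to HOST:PORT opens within 3 seconds.
port_open() {
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}

# check_hsm_ports HOST
# Reports on each of the ports 2223-2225 and returns 1 if any is unreachable.
check_hsm_ports() {
    local host="$1" status=0 port
    for port in 2223 2224 2225; do
        if port_open "$host" "$port"; then
            echo "port ${port}: reachable"
        else
            echo "port ${port}: NOT reachable"
            status=1
        fi
    done
    return "$status"
}

# Example usage, with HSM_IP set to the IP of one HSM in the cluster:
# check_hsm_ports "$HSM_IP"
```

An unreachable port most often points back at the security group rules rather than at the client configuration.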

As mentioned before, getting the HSM client running as a ‘service’ in the NGINX container image is a little tricky.

Fortunately, I also found this helpful AWS Security Blog describing a method for doing just that. My solution added this as a script entitled ‘start_nginx.sh’ to the Dockerfile. The script runs the HSM client in the background and starts the NGINX service in the container.

At this point, we are ready to configure our NGINX container image. Below I’ve provided the following sample files: Dockerfile, nginx.conf, start_nginx.sh, nginx.service. This bare minimum configuration should allow your NGINX reverse proxy to connect to the Cloud HSM instance, provided that the network allowances are correct.

Dockerfile

Comments regarding the Cloud HSM configuration have been added throughout this Dockerfile.

# Pin Amazon Linux 2: amazon-linux-extras and the EL7 CloudHSM RPMs below target it
FROM amazonlinux:2

# Install Nginx
RUN amazon-linux-extras install -y nginx1

# Install CloudHSM Client Tools
RUN curl -vLo ./cloudhsm-client-latest.el7.x86_64.rpm https://s3.amazonaws.com/cloudhsmv2-software/CloudHsmClient/EL7/cloudhsm-client-latest.el7.x86_64.rpm
RUN yum install -y ./cloudhsm-client-latest.el7.x86_64.rpm
RUN yum clean all

# Install CloudHSM OpenSSL Dynamic Engine
RUN curl -vLo ./cloudhsm-client-dyn-latest.el7.x86_64.rpm https://s3.amazonaws.com/cloudhsmv2-software/CloudHsmClient/EL7/cloudhsm-client-dyn-latest.el7.x86_64.rpm
RUN yum install -y ./cloudhsm-client-dyn-latest.el7.x86_64.rpm
RUN yum clean all

# Build-time inputs: without these ARG declarations the $VARIABLES below expand to empty strings
ARG HSM_USER_NAME
ARG HSM_USER_PASSWORD
ARG HSM_IP

# The ENV credentials below were configured on the EC2 instance and are the crypto user's credentials
ENV HSM_USERNAME=$HSM_USER_NAME
ENV HSM_USER_PASSWORD=$HSM_USER_PASSWORD

# Use an IP from any one node in the cluster
ENV HSM_IP=$HSM_IP

# This ENV variable is required to get the cloudhsm engine running
ENV n3fips_password=$HSM_USERNAME:$HSM_USER_PASSWORD

# Sample nginx.conf below
COPY ./nginx.conf /etc/nginx/nginx.conf

RUN mkdir -p /etc/ssl/private

# HSM IP ENV Var from above 
RUN /opt/cloudhsm/bin/configure -a ${HSM_IP}

# CRT files configured from above
COPY ./customerCA.crt /opt/cloudhsm/etc/
COPY ./<cluster_ID>_CustomerHsmCertificate.crt /opt/cloudhsm/etc/

# Openssl Key generated using the cloudhsm engine
COPY  ./web_server_fake_PEM.key /etc/ssl/private/web_server_fake_PEM.key 

# This should be the production cert generated by a CA
## or the self-signed web_server.crt generated with the cloudhsm engine above
COPY ./web_server.crt /etc/ssl/certs/web_server.crt

# This custom file configures the NGINX service to read the n3fips_password configuration from /etc/sysconfig/nginx. Sample below
COPY ./nginx.service /lib/systemd/system/nginx.service

RUN touch /etc/sysconfig/nginx
RUN echo "n3fips_password=${HSM_USERNAME}:${HSM_USER_PASSWORD}" > /etc/sysconfig/nginx

# This is a custom bash script found here: https://aws.amazon.com/blogs/security/how-to-run-aws-cloudhsm-workloads-on-docker-containers/
## Essentially, the NGINX container cannot keep the CloudHSM service running and this file provides a workaround. Shown below.

COPY start_nginx.sh .

CMD ["bash", "start_nginx.sh"]
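
The ENV lines in this Dockerfile read $HSM_USER_NAME, $HSM_USER_PASSWORD and $HSM_IP while the image is being built. One way to supply them is with `--build-arg`, which only takes effect if the Dockerfile declares matching ARG instructions. The sketch below is a hypothetical wrapper: the function name, the default image tag and the example values are all made up for illustration.

```shell
# build_hsm_nginx_image USER PASSWORD HSM_IP [TAG]
# Hypothetical wrapper around `docker build`; run it from the directory
# that holds the Dockerfile. The default tag "nginx-cloudhsm" is made up.
build_hsm_nginx_image() {
    docker build \
        --build-arg HSM_USER_NAME="$1" \
        --build-arg HSM_USER_PASSWORD="$2" \
        --build-arg HSM_IP="$3" \
        -t "${4:-nginx-cloudhsm}" \
        .
}

# Example usage (placeholder values):
# build_hsm_nginx_image crypto_user 'example-password' 10.0.0.10
```

Values stored with ENV remain readable in the image metadata, so treat the resulting image as sensitive.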

nginx.conf

This is a mostly typical configuration with the exception of the ‘ssl_engine cloudhsm’ and the ‘env n3fips_password’ lines. For more info on testing a standard NGINX deployment, see here. Essentially, our big difference is in telling NGINX to use Cloud HSM.

worker_processes  auto;

# tells NGINX to use the installed cloudhsm engine and to keep the env variable configured by nginx.service
ssl_engine cloudhsm;
env n3fips_password;

error_log  /var/log/nginx/error.log info;

events {
    worker_connections  1024;
}

http {

    default_type  application/json;

    log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '"$http_user_agent" "$http_x_forwarded_for"';

    root /var/share/nginx;
    access_log  /var/log/nginx/access.log;
    keepalive_timeout  65;

    server {
        listen [::]:443 ssl http2;
        listen      443 ssl http2;
        
        ssl_certificate /etc/ssl/certs/web_server.crt;                  
        ssl_certificate_key /etc/ssl/private/web_server_fake_PEM.key;                          

        server_name nginx-test;
      
        location / {
            proxy_pass https://nginx-test;
        }
   
    }
}

start_nginx.sh

This file allows the cloudhsm ‘service’ to run inside the container and should allow for cloudhsm logs to be provided to your container. Additional reading on this can be found at this AWS security blog.

#! /bin/bash
# Seen here: https://aws.amazon.com/blogs/security/how-to-run-aws-cloudhsm-workloads-on-docker-containers/
# start cloudhsm client
echo -n "* Starting CloudHSM client ... "
/opt/cloudhsm/bin/cloudhsm_client /opt/cloudhsm/etc/cloudhsm_client.cfg &> /tmp/cloudhsm_client_start.log &

# wait for startup
while true
do
    if grep 'libevmulti_init: Ready !' /tmp/cloudhsm_client_start.log &> /dev/null
    then
        echo "[OK]"
        cat /tmp/cloudhsm_client_start.log
        break
    fi
    sleep 0.5
done
echo -e "\n* CloudHSM client started successfully ... \n"

/usr/sbin/nginx -g "daemon off;"
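
One thing to be aware of: the `while true` loop above waits forever if the client never logs its ready message, so the container hangs instead of failing. A possible variation, sketched below, polls with a timeout. The function name wait_for_ready and the 60-second value in the usage comment are my own assumptions, not something from the AWS blog.

```shell
# wait_for_ready LOG_FILE TIMEOUT_SECONDS
# Polls LOG_FILE once per second for the client's ready message.
# Returns 0 once the message appears, 1 if TIMEOUT_SECONDS elapse first.
wait_for_ready() {
    local log_file="$1" timeout_s="$2" waited=0
    while [ "$waited" -lt "$timeout_s" ]; do
        if grep -q 'libevmulti_init: Ready !' "$log_file" 2>/dev/null; then
            return 0
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 1
}

# Example usage inside start_nginx.sh, in place of the `while true` loop:
# wait_for_ready /tmp/cloudhsm_client_start.log 60 || {
#     cat /tmp/cloudhsm_client_start.log
#     exit 1
# }
```

Exiting with a non-zero status lets the container platform report the task as failed and surface the client log, rather than leaving it stuck in a starting state.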

nginx.service

Inspired by this blog, this allows the nginx service to use the ‘n3fips_password’ environment variable

[Unit]
Description=The nginx HTTP and reverse proxy server
After=network.target remote-fs.target nss-lookup.target

[Service]
Type=forking
PIDFile=/run/nginx.pid
# Nginx will fail to start if /run/nginx.pid already exists but has the wrong
# SELinux context. This might happen when running `nginx -t` from the cmdline.
# https://bugzilla.redhat.com/show_bug.cgi?id=1268621
ExecStartPre=/usr/bin/rm -f /run/nginx.pid
ExecStartPre=/usr/sbin/nginx -t
ExecStart=/usr/sbin/nginx
ExecReload=/bin/kill -s HUP $MAINPID
KillSignal=SIGQUIT
TimeoutStopSec=5
KillMode=mixed
PrivateTmp=true
EnvironmentFile=/etc/sysconfig/nginx

[Install]
WantedBy=multi-user.target

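Once all is configured, you should have a single build directory containing the Dockerfile alongside every file it copies into the image: the NGINX configuration files, the start script, the certificates and the key.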
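
A quick way to check that nothing is missing before building is sketched below. The helper name missing_build_files is hypothetical, and the file list simply mirrors the COPY lines in the Dockerfile above, with the cluster ID passed in as an argument.

```shell
# missing_build_files DIR CLUSTER_ID
# Prints one "missing: NAME" line per absent file and returns 1 if any
# are missing. Hypothetical helper; the list mirrors the Dockerfile's COPY lines.
missing_build_files() {
    local dir="$1" cluster_id="$2" status=0 name
    for name in \
        Dockerfile nginx.conf nginx.service start_nginx.sh \
        customerCA.crt "${cluster_id}_CustomerHsmCertificate.crt" \
        web_server_fake_PEM.key web_server.crt
    do
        if [ ! -f "$dir/$name" ]; then
            echo "missing: $name"
            status=1
        fi
    done
    return "$status"
}

# Example usage (the cluster ID is a placeholder):
# missing_build_files . cluster-abc123 && echo "build context looks complete"
```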

Provided that your security groups and permissions are correct, this should allow you to run your NGINX container in the Elastic Container Service (Fargate in particular).

One last tip: you could add this one-liner to your Dockerfile to send the NGINX logs to stdout and stderr, which helps get the container logs into CloudWatch for additional troubleshooting:

RUN ln -sf /dev/stdout /var/log/nginx/access.log && ln -sf /dev/stderr /var/log/nginx/error.log

Thanks for reading!

About the Author

Dan Peterson

Sr Consultant