HashiCorp’s Vault and Ansible Integrations

Introduction

After all the late nights I’ve spent juggling SSH keys for automation users with Ansible, it became clear that I needed a better way to manage them. There are plenty of high-level services to help with this, but I like to know what’s happening at a lower level and take a minimalist’s approach. A basic integration of Ansible and HashiCorp’s Vault seemed a likely place to start. The entire process remains lightweight and shows enormous potential.

What you should already know

To get the most out of this article, you should already be comfortable with a few things: running Ansible ad-hoc commands and playbooks, SSH public-key authentication, and bringing up VMs with Vagrant.

The Problem

Ansible provides us with a really easy way to manage large numbers of machines that are provisioned with automation users. To illustrate the problem, I’ll assume a single, common automation user (golem). Now that I have servers and golems, I can use Ansible like this:

$ ansible -u golem --private-key /super/protected/golem-key -a 'echo Hello' serverlist

Great, right? Admittedly, it’s pretty nice to be able to do this, but now I have a /super/protected/golem-key to keep super protected. So maybe I start chmodding directories to limit access, but that’s pretty flimsy security. I could encrypt the key for safekeeping, but then I have encryption keys to keep secret, and I’m looking at the same problem all over again.

My entire cluster is always one secret away from being completely compromised.

Given this failing, I need a better way to manage the keys. Ideally, I wouldn’t need to keep the key myself. In fact, the only things I need to know should be disposable secrets that are replaceable, manageable, and low stress. If access secrets are compromised, I’d like the ability to lock down access to all secrets and prevent damage to my cluster environment.

    Minimal Requirements for Key Management

  • Keys are kept in a central repository
  • Access to the repository is managed
  • Access to secrets (our keys) is auditable

A Solution: Hashicorp’s Vault

Some searching around the web showed me that HashiCorp’s Vault is suited to my lightweight key-management needs. It is distributed as an all-in-one client/server binary file that you just unpack in place. A Vault server can accommodate multiple vaults, and each vault can be managed by one or more people to control access to that vault’s secrets at a fine-grained level. To fulfill my last criterion, Vault is also fully auditable.


Implementing

My plan now is to build a Host-Only Network to house 3 servers: vault, ansible, and webserver. We won’t be using the webserver VM much, but it exists to illustrate the power of a very simple Ansible/Vault integration. If you’re unfamiliar with the construction of a Host-Only Network, you should follow the link to my blog post and give it a read.

And, indeed, I’ll try to keep this very simple. High-level tools and integrations (Consul, for example) exist to help us on our way once we understand this process at a low level. My intention is to take the mystery out of an Ansible/Vault integration, not to present a bulletproof one.

I built a Vagrantfile to launch and provision my servers, ready to go. Take a moment to familiarize yourself with this file and we’ll go over it momentarily. Note that I’m using the 192.168.33.0/24 network.

I’ve also made the scripts used during the demo available in my GitHub repo.

Those Bootstrap Keys (stupid and stupid.pub)

Before moving on, I’d like to take a moment to address something you may see in the Vagrantfile. It’s true that I’m keeping ssh keys in my public github repo: stupid and stupid.pub. These are bootstrapping keys that are loaded onto the servers in my cluster before we begin. The point of this exercise is to replace these keys with something less… er… well… stupid.

Over and over, I’ll point out the difference between the security requirements for the private key (stupid) and the public key (stupid.pub). I’ll start now by saying: keep the private key secured. The public key… well, you can do anything you want with that. Post it on Pastebin, hire a sky-writer, etc. The public key is made to be distributed; the security is really all about the private key.
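For reference, minting a replacement pair takes a single command. The file name below is illustrative, not one the repo actually uses:

```shell
# Generate a fresh 2048-bit RSA keypair with an empty passphrase.
# 'lessstupid' is an illustrative name, not a file from the repo.
ssh-keygen -t rsa -b 2048 -N "" -f ./lessstupid -q

# Lock down the private half; the public half needs no such care.
chmod 600 ./lessstupid
```

This produces ./lessstupid (guard it) and ./lessstupid.pub (share it freely).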

A Peek at the VM: vault

The vault VM is configured for 2 major elements:

  1. The Vault server is configured and running under systemd
  2. The golem user is set up and ready to be accessed

It all comes from this segment of the Vagrantfile:

     yum update -y
     yum install zip -y
     yum install python -y
     yum install git -y
     yum install libselinux-python -y
     adduser golem
     curl -s https://releases.hashicorp.com/vault/0.9.1/vault_0.9.1_linux_amd64.zip > /vault.zip
     cd /
     unzip vault.zip
     mv vault /bin/vault
     rm vault.zip
     echo "golem   ALL=(ALL)       NOPASSWD: ALL" >> /etc/sudoers
     mkdir /home/golem/.ssh/
     chmod 700 /home/golem/.ssh/
     chown golem:golem /home/golem/.ssh/
     mkdir /src
     cd /src
     git clone https://github.com/OPI-doug/public-utils.git
     cd public-utils
     cp stupid.pub /home/golem/.ssh/authorized_keys
     chmod 600 /home/golem/.ssh/authorized_keys
     chown golem:golem /home/golem/.ssh/authorized_keys
     cp vault-stuff/vault.service /etc/systemd/system/vault.service
     mkdir /etc/vault.d/
     cp vault-stuff/vconf.conf /etc/vault.d/vconf.conf
     systemctl daemon-reload
     systemctl start vault

To get the Vault server up, I download and install Vault (from a binary file), create a systemd service file for it, then start it up. Of special note, if you examine the service file, you’ll find this command being run to start the server:

vault server -config=/etc/vault.d/vconf.conf

This command seemed conspicuously missing from all the Vault instructions I found. It starts up a Vault server and allows connections to be made to it for ‘vault init’ and everything else you end up doing. Most of the Vault commands are well documented, and ‘vault server’ may be too, but I had to spend time figuring it out, so I thought I’d share that tidbit with you.
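The config file itself isn’t reproduced here, but a minimal config in its spirit would look something like the following. The storage path is my assumption; TLS is disabled to match the http:// VAULT_ADDR exported on the ansible VM:

```
storage "file" {
  path = "/vault-data"
}

listener "tcp" {
  address     = "192.168.33.12:8200"
  tls_disable = 1
}
```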

The other highlight of the provisioning process involves setting up the golem user. You’ll see that stupid.pub was dropped into the /home/golem/.ssh/authorized_keys file. This is what allows us to use golem’s private key to log into this server via SSH (and, therefore, Ansible).

Ultimately, our key-refresh process just has to overwrite the public key found at /home/golem/.ssh/authorized_keys.
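The repo’s golem-key-refresh.yaml does exactly that with a single copy task. It isn’t reproduced here; a hypothetical minimal version, assuming the new public key has been staged at /home/vagrant/files/newgolem.pub and that the hosts group is the same serverlist used earlier, might read:

```
- name: golem-key-refresh
  hosts: serverlist
  become: yes
  tasks:
    - name: copy in the golem public key
      copy:
        src: /home/vagrant/files/newgolem.pub
        dest: /home/golem/.ssh/authorized_keys
        owner: golem
        group: golem
        mode: 0600
```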

A Peek at the VM: ansible

Most of the demonstration will occur on this VM. Four major events happen on the ansible server during provisioning:

  1. Vault gets installed and configured for client-side use.
  2. Ansible gets installed and configured for our cluster.
  3. The golem user isn’t created here, but the private key is left in our /home/vagrant directory for use with Ansible.
  4. All necessary scripts and Ansible Playbooks are left in /home/vagrant.

After provisioning, log into the ansible VM:

doug@OPI demo-dir$ vagrant ssh ansible
[vagrant@localhost] ~$

It all comes from this segment of the Vagrantfile:

     yum update -y
     yum install ansible -y
     yum install git -y
     yum install zip -y
     yum install vim -y
     curl -s https://releases.hashicorp.com/vault/0.9.1/vault_0.9.1_linux_amd64.zip > /vault.zip
     cd /
     unzip vault.zip
     mv vault /bin/vault
     rm vault.zip
     mkdir /src
     cd /src
     git clone https://github.com/OPI-doug/public-utils.git
     cd public-utils
     cp stupid /home/vagrant/golem-key
     chmod 600 /home/vagrant/golem-key
     chown vagrant:vagrant /home/vagrant/golem-key
     cp ansible-stuff/hosts /etc/ansible/hosts
     echo "export VAULT_ADDR=http://192.168.33.12:8200" >> /etc/bashrc
     mkdir /home/vagrant/files
     chmod 755 /home/vagrant/files
     chown vagrant:vagrant /home/vagrant/files
     cp ansible-stuff/golem-key-refresh.yaml /home/vagrant/golem-key-refresh.yaml
     chmod 644 /home/vagrant/golem-key-refresh.yaml
     chown vagrant:vagrant /home/vagrant/golem-key-refresh.yaml
     cp ansible-stuff/vaultinit.sh /home/vagrant/vaultinit.sh
     chmod 755 /home/vagrant/vaultinit.sh
     chown vagrant:vagrant /home/vagrant/vaultinit.sh
     cp ansible-stuff/ansiblew.sh /home/vagrant/ansiblew.sh
     chmod 755 /home/vagrant/ansiblew.sh
     chown vagrant:vagrant /home/vagrant/ansiblew.sh
     cp ansible-stuff/keyrefreshw.sh /home/vagrant/keyrefreshw.sh
     chmod 755 /home/vagrant/keyrefreshw.sh
     chown vagrant:vagrant /home/vagrant/keyrefreshw.sh
     cp ansible-stuff/sshconfig /home/vagrant/.ssh/config
     chmod 600 /home/vagrant/.ssh/config
     chown vagrant:vagrant /home/vagrant/.ssh/config

Aside from the simple pleasure of watching a VM provision itself, not a whole lot of magic happens here. Our scripts are in place for the demo, and Ansible is fully configured and ready to go with a hosts file in /etc/ansible/.

Important to this demo, /home/vagrant/golem-key exists. This is a copy of ‘stupid’, the golem private key, and it will be replaced shortly during the demo.

A Peek at the VM: webserver

There are only two things happening on the webserver.

  1. Install httpd and run it.
  2. Install the golem automation user and provide the public key.

This server doesn’t even need Vault installed; it gets a pretty minimal setup. There is also a port forwarded from the host, so we can see our standard Fedora index.html page by browsing to ‘localhost:8080’.

Process Walkthrough

Now that the VMs have been provisioned simply by typing “vagrant up”, we can move on to the demonstration. This process has a number of steps to successfully illustrate the Vault/Ansible integration and key-refresh process.

  1. Initialize the Vault.
  2. Unseal the Vault.
  3. Auth the Vault.
  4. Enable Auditing.
  5. Refresh the keys with an Ansible Wrapper.
  6. Use the new keys with another Ansible Wrapper.
  7. Look at the audit logs.

Initialize the Vault and Unseal It

As every Vault server starts out with no initialized vaults, it’s now time to set one up. Note that each vault can have a number of owners: the people who hold the keys. There are two concepts to understand here, ‘key shares’ and ‘key thresholds’: if key shares == 3 and key threshold == 2, then 3 key shares are created and any 2 of them are required to unseal the vault. First, we use vaultinit.sh to initialize the vault and capture the necessary tokens. We set both the shares and the threshold to 1, taking a minimalist approach for this walkthrough.

#! /bin/bash
 
vault init -key-shares=1 -key-threshold=1 > init.log
 
cat init.log | grep Unseal | awk '{print $4}' > unseal.token
cat init.log | grep Root | awk '{print $4}' > root.token
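That grep/awk pipeline leans on the column layout of ‘vault init’ output in the 0.9.x CLI. To see exactly what it captures, here is the same extraction run against fabricated output in that layout (both token values below are made up for illustration):

```shell
# Mimic the relevant lines of Vault 0.9.x 'vault init' output.
# Both token values are fabricated for illustration.
cat > init.log <<'EOF'
Unseal Key 1: dGhpc0lzTm90QVJlYWxVbnNlYWxLZXkA=
Initial Root Token: 8f02e57c-1111-2222-3333-444444444444
EOF

# Field 4 of each matching line is the token itself.
grep Unseal init.log | awk '{print $4}' > unseal.token
grep Root init.log | awk '{print $4}' > root.token
```

If a later Vault version shifts those columns, the awk field number is the first thing to re-check.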

After this step is complete, we should be able to unseal the vault.

$ vault unseal `cat unseal.token`

Auth

Now that the vault is unsealed, we can authenticate with our root token.

$ vault auth `cat root.token`

Worst Practice: Using the Root Auth Token

The root token we just authed with has power over the entire vault we just initialized. By all rights, we should be using it to create less-privileged tokens and using those tokens in this process. For now, though, it’s enough to remember that we have much finer-grained token control available than we’re using. In the interest of simplicity, I’m going to stick with the root token, which should never be done on a production system.

Enable Auditing

After authorization is granted, we can tell our vault to begin keeping audit logs. A number of audit backends are available, but I’m going to use syslog.

$ vault audit-enable syslog

Now our vault is adding request/response information to its logs for auditing. We’ll take a look at these soon.

Refresh the Keys

It’s time to examine the keyrefreshw.sh script. At the command-line level it accomplishes a few things: it creates a new set of SSH keys, uses Ansible to distribute the public key, then cleans up after itself. Afterward, you’ll note that we no longer have a ‘golem-key’ file in /home/vagrant.
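The script itself lives in the repo and isn’t reproduced here. A hypothetical sketch of its shape, wrapped in a function so nothing runs on load (the secret paths match the output below; the key file names are my own guesses):

```shell
#!/bin/bash
# Hypothetical sketch of keyrefreshw.sh; not the repo's actual script.
refresh_golem_key() {
  # 1. Mint a new keypair (file name assumed for illustration).
  ssh-keygen -t rsa -b 2048 -N "" -f newgolem -q

  # 2. Stage both halves in the vault under the 'new' paths.
  vault write secret/newgolemkey value=@newgolem
  vault write secret/newgolemkeypub value=@newgolem.pub

  # 3. Push the new public key into each authorized_keys via Ansible.
  ansible-playbook golem-key-refresh.yaml

  # 4. Promote the new keys to the canonical paths.
  vault write secret/golemkey value=@newgolem
  vault write secret/golemkeypub value=@newgolem.pub

  # 5. Clean up: no private key is left on disk.
  shred -u newgolem newgolem.pub
}
```

Note the ordering: the play still authenticates with the old key, and only after it succeeds do the new keys become canonical.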

[vagrant@localhost ~]$ ./keyrefreshw.sh 
Success! Data written to: secret/newgolemkey
Success! Data written to: secret/newgolemkeypub
 
PLAY [golem-key-refresh] *******************************************************
 
TASK [Gathering Facts] *********************************************************
ok: [192.168.33.14]
ok: [192.168.33.12]
 
TASK [copy in the golem public key] ********************************************
changed: [192.168.33.12]
changed: [192.168.33.14]
 
PLAY RECAP *********************************************************************
192.168.33.12              : ok=2    changed=1    unreachable=0    failed=0   
192.168.33.14              : ok=2    changed=1    unreachable=0    failed=0   
 
Success! Data written to: secret/golemkey
Success! Data written to: secret/golemkeypub

The process is a success; it’s now impossible to access a golem account without first pulling the private key out of the vault.

Use the New Keys

Since we no longer keep a private key around, the only way to access golem is to pull the key from the vault first. I use a simple bash script to do this (ansiblew.sh). It fetches the key, runs the Ansible command, and then destroys the key. Let’s try it:
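ansiblew.sh is in the repo as well; a hypothetical sketch of the pattern, again in function form so it reads without running (the secret path matches what the refresh wrote):

```shell
#!/bin/bash
# Hypothetical sketch of ansiblew.sh; not the repo's actual script.
run_as_golem() {
  # 1. Pull the current private key out of the vault.
  vault read -field=value secret/golemkey > golem-key
  chmod 600 golem-key

  # 2. Use it for the ad-hoc Ansible run.
  ansible -u golem --private-key golem-key -a 'echo I worked!' serverlist

  # 3. Destroy the on-disk copy immediately.
  shred -u golem-key
}
```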

[vagrant@localhost ~]$ ./ansiblew.sh 
192.168.33.12 | SUCCESS | rc=0 >>
I worked!
 
192.168.33.14 | SUCCESS | rc=0 >>
I worked!
 
[vagrant@localhost ~]$

We can run this as many times as we like; each run fetches the private key from the vault before running the Ansible command.

Let’s see the Logs

At this point, the only major item left is to examine the audit logs we enabled earlier. Since we started Vault under systemd on the vault VM, its logs are managed by journald. We can log out of ansible, into vault, and access the logs with the following short run of commands:

[vagrant@localhost] ~$ exit
doug@OPI demo-dir$ vagrant ssh vault
[vagrant@localhost] ~$ sudo journalctl -f -u vault

Once auditing is enabled, Vault begins logging all of its access requests and responses in JSON format to the system logs. Because we picked syslog, we are free to use all the powers of rsyslogd to process and monitor these access logs. An example entry is below.

Jan 15 20:27:41 localhost.localdomain vault[20681]: {"time":"2018-01-15T20:27:41.486073456Z","type":"request","auth":{"client_token":"hmac-sha256:3a6171150204efa4003a7e34f73efb47e75f7536e78aa185120cee243952cce6","accessor":"hmac-sha256:6d09a91c9eee58388e6828f4c1f092b43e0d4d345033f42ee1a7722cc41ae4f0","display_name":"root","policies":["root"],"metadata":null,"entity_id":""},"request":{"id":"4e4fe614-e095-8452-6ceb-02999d0cf01b","operation":"read","client_token":"hmac-sha256:3a6171150204efa4003a7e34f73efb47e75f7536e78aa185120cee243952cce6","client_token_accessor":"hmac-sha256:6d09a91c9eee58388e6828f4c1f092b43e0d4d345033f42ee1a7722cc41ae4f0","path":"secret/newgolemkey","data":null,"policy_override":false,"remote_address":"192.168.33.13","wrap_ttl":0,"headers":{}},"error":""}
Jan 15 20:27:41 localhost.localdomain vault[20681]: {"time":"2018-01-15T20:27:41.48698996Z","type":"response","auth":{"client_token":"hmac-sha256:3a6171150204efa4003a7e34f73efb47e75f7536e78aa185120cee243952cce6","accessor":"hmac-sha256:6d09a91c9eee58388e6828f4c1f092b43e0d4d345033f42ee1a7722cc41ae4f0","display_name":"root","policies":["root"],"metadata":null,"entity_id":""},"request":{"id":"4e4fe614-e095-8452-6ceb-02999d0cf01b","operation":"read","client_token":"hmac-sha256:3a6171150204efa4003a7e34f73efb47e75f7536e78aa185120cee243952cce6","client_token_accessor":"hmac-sha256:6d09a91c9eee58388e6828f4c1f092b43e0d4d345033f42ee1a7722cc41ae4f0","path":"secret/newgolemkey","data":null,"policy_override":false,"remote_address":"192.168.33.13","wrap_ttl":0,"headers":{}},"response":{"secret":{"lease_id":""},"data":{"value":"hmac-sha256:3dceea3651130ae851db1411376f4b1cd3a0b22c47fdb9aedffb44a19c3d6a3f"}},"error":""}

A few data items are of immediate note. Check through the following list and find each item in the JSON above:

  • type: “request” or “response”
  • remote_address: the IP address the request originated from
  • path: here, “secret/newgolemkey”; the secret requested and returned
  • data: a hash with a “value” key; note the value is HMAC-hashed, not plaintext

All of the fields in these JSON entries should be understood before production use, but the four above are enough to show how requests and responses are logged.
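As a toy example of processing these entries downstream, the two most interesting fields can be pulled out with nothing fancier than sed. A real pipeline would use a JSON-aware tool like jq; the sample entry here is abbreviated from the request line above:

```shell
# Abbreviated audit entry, based on the request logged above.
entry='{"type":"request","request":{"path":"secret/newgolemkey","remote_address":"192.168.33.13"}}'

# Pull out the requested secret path and the caller's IP.
path=$(printf '%s' "$entry" | sed -n 's/.*"path":"\([^"]*\)".*/\1/p')
addr=$(printf '%s' "$entry" | sed -n 's/.*"remote_address":"\([^"]*\)".*/\1/p')

echo "$path requested from $addr"
```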

Further Study

This has been a brief look into the power and usability of Vault as a secret store for an Ansible-controlled cluster. There are endless possibilities from here, but a short list springs to mind immediately.

  • Use Consul, which is commonly paired with Vault.
  • Examine Ansible’s built-in vault, which seems like a less powerful way to store secrets but deserves closer study.
  • Vault ships with a built-in SSH backend that should be explored.
  • Use of the Vault HTTP API could provide a robust suite of CLI tools.

Conclusion and Thanks

Thank you for reading this far. I hope you’ve enjoyed this little demonstration of an extremely basic Ansible/Vault integration. You should now understand this integration at a low-level, allowing you to make decisions about high-level integrations with greater confidence.

I’d like to give special thanks to Mr. Teigen Leonard, without whose support and inquisitive mind I would never have tackled this topic.


Also published on Medium.
