Building a Scalable Terraform Project Framework

In this post I will be introducing a framework we have developed for managing Terraform projects, which helps us keep our code reusable and consistent across environments. This post assumes you are already familiar with the basic operations of managing infrastructure with Terraform, and familiar with the ideas behind using modules. My examples are quite simplified and only serve to show our project structure.

Laying the Groundwork

The first step in creating a robust, scalable Terraform project is to develop a centralized library of Terraform modules. By bundling resources using modules we can guarantee consistency, as well as enforce default policy and configuration of services. Especially in the large enterprise, Terraform is often utilized as a self-service tool that developers can leverage to deploy their own infrastructure. If we are to trust users to manage their own resources, we should define what an approved usage looks like. As long as someone uses our module, they are implicitly approved to deploy that infrastructure.

This particular project framework leverages the fact that each module exists in its own Git repository. This allows us to use Git references (tags, branches, commit SHAs, etc.) when telling Terraform where to find the source code for our module. By default, a Git source in a Terraform module declaration will clone the default branch of that project (usually master). Isolating modules in their own repositories comes with a number of additional benefits. If I am working on a new feature that I don’t want to force upon our user base, I can build on a new branch and isolate testing of that module. I can also use tags to version the modules and control when and if breaking changes are introduced. When we need to push security updates or change default configuration for everyone, we can release a new version of our module and notify teams that they should begin upgrading.
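As a minimal sketch of the Git references described above, the declarations below pull the same EC2 module at a tag, on a branch, and at a single commit. The repository URL and the v0.1.0 tag are the ones used later in this post; the branch name, the commit SHA, and the module labels are hypothetical placeholders.

```hcl
# Pin to a released tag: consumers only move when they change the ref.
module "ec2_tagged" {
  source = "git::https://gitlab.com/terraform-framework-example/aws_terraform_modules/ec2_instance.git?ref=v0.1.0"
}

# Track a feature branch (hypothetical name) to test unreleased work in isolation.
module "ec2_feature" {
  source = "git::https://gitlab.com/terraform-framework-example/aws_terraform_modules/ec2_instance.git?ref=new-feature"
}

# Pin to one exact commit (hypothetical SHA) for a fully fixed source.
module "ec2_commit" {
  source = "git::https://gitlab.com/terraform-framework-example/aws_terraform_modules/ec2_instance.git?ref=0a1b2c3"
}
```

Leaving the ref argument off entirely falls back to the repository's default branch.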

In my example module below, I’m just tying my EC2 instance definition to an IAM instance profile, and using a variable instance_type to scale my EC2 instance size depending on my environment.

data "aws_ami" "amzn2_linux" {
  most_recent = true
  owners      = ["amazon"]
 
  filter {
    name   = "name"
    values = ["amzn2-ami-hvm-2.0.*-x86_64-gp2"]
  }
}
 
resource "aws_iam_instance_profile" "default" {
  name = "default"
  role = aws_iam_role.default.name
}
 
resource "aws_iam_role" "default" {
  name = "default"
  path = "/"
 
  assume_role_policy = <<EOF
{
	"Version": "2012-10-17",
	"Statement": [{
		"Sid": "",
		"Effect": "Allow",
		"Principal": {
			"Service": [
				"ec2.amazonaws.com",
				"ssm.amazonaws.com"
			]
		},
		"Action": "sts:AssumeRole"
	}]
}
EOF
}
 
resource "aws_instance" "default" {
  ami                  = data.aws_ami.amzn2_linux.id
  instance_type        = var.instance_type
  iam_instance_profile = aws_iam_instance_profile.default.name
 
  tags = {
    Name = var.instance_name
  }
}
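
The module above reads var.instance_type and var.instance_name, so it also needs matching variable declarations. A minimal sketch of what those might look like follows; the descriptions and default values are assumptions for illustration, not taken from the real module.

```hcl
# variables.tf for the EC2 module (sketch; defaults are assumed values)

variable "instance_type" {
  description = "EC2 instance size, overridden per environment"
  type        = string
  default     = "t3.micro" # assumed default
}

variable "instance_name" {
  description = "Value used for the instance's Name tag"
  type        = string
  default     = "example-instance" # assumed default
}
```

Giving every variable a default is what lets a caller reference the module without passing any arguments.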

Building our Project

When we have decided what infrastructure to deploy in each environment, we create what we’re calling a “baseline”. This baseline should also live in its own repository and be treated as a module itself. We can then start to define our environments by referencing the baseline module and passing in our environment-specific configuration. For now let’s assume our baseline just contains this one EC2 module reference.

module "ec2" {
  source = "git::https://gitlab.com/terraform-framework-example/aws_terraform_modules/ec2_instance.git?ref=v0.1.0"
  instance_type = var.instance_type
}

We can now define our dev environment, accepting the baseline’s default variables:

module "baseline" {
  source = "git::https://gitlab.com/terraform-framework-example/baseline.git"
}

And prod will have customized variables when we want to override the default values:

module "baseline" {
  source = "git::https://gitlab.com/terraform-framework-example/baseline.git"
 
  instance_type = "r4.xlarge"
}

This very simple example shows how you can start to build out repeatable environments, while simply tweaking individual configuration values per environment. I can now guarantee that my dev and prod environments will have the exact same IAM configuration when launching EC2 instances with this module, without rewriting the entire EC2 module configuration in both environments. This concept extends to many of the services you want to provision with Terraform. With this baseline we can define the same service topology while scaling things like compute and storage resources to match our environment’s use case.

You may notice I’ve left my module name the same in both my dev and prod projects when calling the baseline. This is a step that makes refactoring and debugging of Terraform changes across environments super easy. If I want to rename a module in the baseline and update my dev statefile, I can now write a script to perform my various ‘terraform state mv’ commands and rerun it in stage/prod environments. By wrapping everything in the baseline, each environment now has consistent resource naming.
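
To make the rename scenario concrete, here is a sketch of a baseline where the module call formerly labeled "ec2" is renamed. The new label app_server is hypothetical. The comment at the bottom shows the kind of state mv command the script would run in each environment. The moved block is an alternative to scripting, not the approach described in this post, and it assumes Terraform 1.1 or later.

```hcl
# baseline/main.tf (sketch)

variable "instance_type" {
  type    = string
  default = "t3.micro" # assumed default
}

# The module call that used to be labeled "ec2".
module "app_server" { # hypothetical new label
  source        = "git::https://gitlab.com/terraform-framework-example/aws_terraform_modules/ec2_instance.git?ref=v0.1.0"
  instance_type = var.instance_type
}

# Terraform 1.1+ only: records that state for module.ec2 now belongs to
# module.app_server, so each environment calling the baseline picks up the
# rename on its next plan instead of destroying and recreating resources.
moved {
  from = module.ec2
  to   = module.app_server
}

# Scripted equivalent, run once per environment, which works because every
# environment calls the baseline under the same name:
#   terraform state mv module.baseline.module.ec2 module.baseline.module.app_server
```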

It’s worth mentioning that Terraform itself has some limitations that can sometimes make this structure a bit cumbersome, mostly revolving around the fact that you need to “bubble up” all of your variable and/or output blocks from individual modules to your baseline and/or environment projects. We work around this by keeping our names consistent across different modules and different layers of our project. If we name something ‘instance_type’ in our EC2 module, we will just copy/paste that exact variable block throughout our baseline and environment variables.tf files.
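
A sketch of what this bubbling up looks like at the baseline layer is shown below. It assumes the EC2 module exports an output named instance_id, which is a hypothetical name; the default value is also an assumption.

```hcl
# Baseline layer (sketch): re-declare the variable, pass it down,
# and re-export the module's output.

variable "instance_type" {
  description = "EC2 instance size; same name as in the EC2 module"
  type        = string
  default     = "t3.micro" # assumed default
}

module "ec2" {
  source        = "git::https://gitlab.com/terraform-framework-example/aws_terraform_modules/ec2_instance.git?ref=v0.1.0"
  instance_type = var.instance_type
}

# Assumes the EC2 module declares:
#   output "instance_id" { value = aws_instance.default.id }
output "instance_id" {
  value = module.ec2.instance_id
}
```

An environment project that wants to expose the same value repeats the pattern one level up, with an output whose value is module.baseline.instance_id.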

Finally, we’re expanding even further on the value of Terraform modules by reducing the complexity and amount of code needed to define our infrastructure. In our baseline we’ve taken our initial EC2 module example from >50 lines of code down to 4. This also adds value to our module library when used in a self-service model. When we can abstract away all that extra complexity, non-infrastructure folks are much more willing to deploy their own resources if they just need to give something a name and “hit Go”. In our experience, teams are more willing to start building their own “baseline” infrastructure when the building block modules are as straightforward as they can be. More on building solid, reusable Terraform modules in a future blog post.

Wrap-up

I would be remiss to not mention Terragrunt in this post. Terragrunt is a wrapper for Terraform that helps implement a lot of these same ideas and goals. I feel the approach I’ve described provides a number of benefits over that tool. If we decide to also version our baseline, we can promote different versions of our whole collection of modules to each environment without relying on pipelining or the order in which individual pipelines are triggered. By sticking with the native Terraform module concept and wrapping our modules themselves within our “baseline”, we avoid introducing a specific required directory structure or third-party dependency.
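
As a sketch of what versioning the baseline could look like, the prod project below pins the baseline to a tag. The tag v1.0.0 is hypothetical, since the examples in this post do not version the baseline.

```hcl
# prod/main.tf (sketch): pin the baseline to a released tag.
module "baseline" {
  source = "git::https://gitlab.com/terraform-framework-example/baseline.git?ref=v1.0.0" # hypothetical tag

  instance_type = "r4.xlarge"
}

# dev/main.tf would be the same call with a newer ref (for example
# ?ref=v1.1.0), so a baseline release is promoted by bumping one string.
```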

Feel free to leave a comment and let me know your thoughts. I will be expanding on these ideas in the future and would love to include any feedback you have. If you’d like to see the examples used in this post in their more refined file structure, you can check those out here.

One thought on “Building a Scalable Terraform Project Framework”

  1. John Engelman says:

    Great write up Andy! Glad to see you still working in this space!

  2. Chris Christensen says:

    Great write up. Are you using anything like Atlantis or Terraform Teams for collaboration, or does everyone work independently?

    1. Currently all collaboration is provided through GitLab pipelines utilizing merge requests and approvals. We may look more into the recently announced Terraform Cloud though! You can learn more about that here: https://www.hashicorp.com/blog/announcing-terraform-cloud
