Feature Flags in Terraform

Feature flagging is useful in any codebase, but many developers don’t know that it can be done in Terraform, let alone how to do it.

Some Benefits of Feature Flagging Your Code

  • You can enable different features for different environments
    • For example, you can enable something in a lower environment to test it before you are ready to deploy it to production
    • This can also save money: you might want a large deployment only in production, so you enable the feature there and disable it in your lower environments
  • Enable/Disable features quickly
    • If a resource is causing issues in your environment, you can quickly disable it by changing the Terraform variable to false and re-running your pipeline
    • Otherwise, you would have to delete or comment out that section of your code

How to Feature Flag in Terraform

First, create a variable whose name makes it obvious to anyone reading the code what it enables or disables:

variable "enable_postgres" {
  description = "Variable to enable or disable postgres deployment"
  type        = bool
  default     = false
}
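
One way to give the flag a different value per environment is a variable definitions file for that environment. This is only a minimal sketch; the file name prod.tfvars is an assumption for illustration, not something Terraform requires:

```hcl
# prod.tfvars (hypothetical file name)
# Overrides the default of false so postgres is deployed in production only.
enable_postgres = true
```

You would then run terraform apply -var-file=prod.tfvars for production, while lower environments that pass no value keep the default of false. For a one-off change, terraform apply -var="enable_postgres=false" should work as well.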

Now you can use this variable with a conditional expression in the resources related to the feature (postgres in this example). You do this with the count meta-argument, which can be added to any Terraform resource.

resource "azurerm_postgresql_flexible_server" "example" {
  name                   = "example-psqlflexibleserver"
  resource_group_name    = azurerm_resource_group.example.name
  location               = azurerm_resource_group.example.location
  version                = "12"
  administrator_login    = "psqladmin"
  administrator_password = "<password>"
  storage_mb             = 32768
  sku_name               = "GP_Standard_D4s_v3"
  # Feature flag: create one server when enabled, none when disabled.
  count                  = var.enable_postgres ? 1 : 0
}

resource "azurerm_postgresql_flexible_server_database" "example" {
  name      = "example-db"
  server_id = azurerm_postgresql_flexible_server.example[0].id
  collation = "en_US.utf8"
  charset   = "utf8"
  # Same flag as the server, so the database is only created alongside it.
  count     = var.enable_postgres ? 1 : 0
}

Here the count argument is set by a conditional expression that checks your variable. If the variable is true, count is set to 1 and Terraform deploys the resource. If the variable is false, count is set to 0 and Terraform does not deploy the resource (and, if the resource already exists, plans to destroy it).

You can use a single feature flag with multiple resources, since a deployment usually needs more than one. The flag then controls whether all of them are deployed together on the next terraform apply. In my postgres example, both azurerm_postgresql_flexible_server and azurerm_postgresql_flexible_server_database use enable_postgres, so when the variable is set to true they are both deployed.

Additional Considerations

When you add a count argument to a resource in Terraform, it changes how you reference that resource in the rest of your Terraform code. Notice that in my example above, the azurerm_postgresql_flexible_server_database block has to reference the server with a [0].

Example calling without count:

resource "azurerm_postgresql_flexible_server_database" "example" {
  name      = "example-db"
  server_id = azurerm_postgresql_flexible_server.example.id
  collation = "en_US.utf8"
  charset   = "utf8"
}

Example calling with count:

resource "azurerm_postgresql_flexible_server_database" "example" {
  name      = "example-db"
  server_id = azurerm_postgresql_flexible_server.example[0].id
  collation = "en_US.utf8"
  charset   = "utf8"
}

This is because count is a meta-argument that tells Terraform how many instances of a resource to create, so setting it to 0 creates none. If you told Terraform to create 5, you would have to pick out the one you need, and that is where the index comes into play. When you use count as a feature flag, you still have to use the index even though there is at most one instance of the resource.

Why is it [0] in this example? Terraform uses zero-based indexing, which means counting starts at 0, so the first instance is always at index 0.
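
The same index rule applies anywhere else the flagged resource is referenced, such as an output. Below is a minimal sketch, assuming Terraform 0.15 or later for the one function; the output name postgres_server_id is made up for illustration:

```hcl
# Hypothetical output; the name is an assumption for illustration.
output "postgres_server_id" {
  description = "ID of the postgres server, or null when enable_postgres is false"

  # The splat expression yields a list with zero or one element.
  # one() returns that element, or null if the list is empty, so this
  # does not fail when count is 0 the way a bare [0] would.
  value = one(azurerm_postgresql_flexible_server.example[*].id)
}
```

A bare [0] is fine inside a resource that shares the same flag, because that resource has no instances to evaluate when the flag is off. An output or an unflagged resource is evaluated either way, which is why a null-safe form like this can help there.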

What about Modules?

You can also feature flag an entire module instead of each individual resource inside it (count on a module block requires Terraform 0.13 or later). Everything in the module is then either built or not built, depending on how your feature flag variable is set.

module "postgresql" {
  source = "Azure/postgresql/azurerm"

  resource_group_name = azurerm_resource_group.example.name
  location            = azurerm_resource_group.example.location

  server_name                  = "example-server"
  sku_name                     = "GP_Gen5_2"
  storage_mb                   = 5120
  backup_retention_days        = 7
  geo_redundant_backup_enabled = false
  administrator_login          = "login"
  administrator_password       = "password"
  server_version               = "9.5"
  ssl_enforcement_enabled      = true
  db_names                     = ["my_db1", "my_db2"]
  db_charset                   = "UTF8"
  db_collation                 = "English_United States.1252"

  firewall_rule_prefix = "firewall-"
  firewall_rules = [
    { name = "test1", start_ip = "10.0.0.5", end_ip = "10.0.0.8" },
    { start_ip = "127.0.0.0", end_ip = "127.0.1.0" },
  ]

  vnet_rule_name_prefix = "postgresql-vnet-rule-"
  vnet_rules = [
    { name = "subnet1", subnet_id = "<subnet_id>" }
  ]

  tags = {
    Environment = "Production",
    CostCenter  = "Contoso IT",
  }

  postgresql_configurations = {
    backslash_quote = "on",
  }

  depends_on = [azurerm_resource_group.example]
  # Feature flag: build everything in this module when enabled, nothing when disabled.
  count = var.enable_postgres ? 1 : 0
}
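
Because the module block has count, references to the module need an index as well, just like a resource. Here is a sketch under the assumption that the module exposes an output called server_fqdn; that name and the output name postgres_fqdn are used only for illustration, so check the module's documented outputs before relying on them:

```hcl
# Hypothetical output names; server_fqdn is assumed, not confirmed for this module.
output "postgres_fqdn" {
  # module.postgresql[0].server_fqdn would work while the flag is on.
  # The splat plus one() form returns null instead of failing when it is off.
  value = one(module.postgresql[*].server_fqdn)
}
```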

About the Author

Andrew Huddleston

Sr. Consultant