# Using Terraform

Posted on Jul 29, 2019

Let's explore the basics of getting started with Terraform on AWS. This brief article covers Terraform 0.12, the latest release at the time of writing.

### Why?

This is the right question to be asking at this point. And to expand on it a little:

> Why should I learn to use Terraform, specifically? What makes it special versus, say, CloudFormation?

If you're managing any Cloud-based infrastructure manually, then you're putting your business at risk.

Terraform works on a concept known as "Infrastructure As Code" (IaC). The concept is simple to grasp: instead of using the web UIs or CLI tools to build, manage and decommission infrastructure, use code and store it in Git; test it; lint it; validate it; do code reviews on it; audit it; and treat it just like regular code.

If you take the steps to appreciate the power of IaC and start learning Terraform, you’ll be in a much better position further down the line and you’ll mitigate a lot of the risks that are associated with humans doing the work robots are better at doing.

### Notes

### The Code

If you’ve been here before and you’re just looking for the code, it can be found in this GitLab repository.

If you’d like to follow along, I recommend cloning the repository and reading through the guide. A better option for you to learn the concepts might be to write the code as you read.

### Installation

Terraform is written in Google's Go. It ships as a single binary and, as such, is very easy to install.

If you’re on Windows or Linux…

  1. Head over to the download page
  2. Download the version suitable for your OS and/or architecture
  3. Unzip the archive you're given and place the `terraform` binary somewhere in your `PATH`

If, however, you're using macOS, consider installing Homebrew and using that instead of managing the installation manually:

```
brew install terraform
```

If you’re looking to avoid Homebrew for whatever reason, then follow the same steps as above for Windows and Linux, downloading the macOS option instead.
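
Whichever route you take, you can verify the installation by checking the version:

```
terraform version
```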

### AWS Credentials

You'll need AWS credentials to use Terraform (at least to play along with everything below this point).

  1. Log in to your AWS Console and go to IAM
  2. Go to your account and create an access key and secret key pair
  3. Save these locally into `~/.aws/credentials` or a location relevant for your OS
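
If you've not used the AWS CLI or SDKs before, the credentials file is a simple INI-style file. A minimal example, with placeholder values, looks like this:

```
[default]
aws_access_key_id     = AKIA...
aws_secret_access_key = ...
```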

You’re ready to use Terraform to manage AWS resources.

### Directory Structure

The first thing I want to cover is the directory structure and how Terraform looks for and handles `.tf` files.

If your current working directory is `~/terraform/` and you execute `terraform plan`, Terraform is going to look in your current working directory for `.tf` files and consider them all one whole "piece". That is, Terraform will create one large "code base" from all the files it finds.

What Terraform does not do is look above or below your current working directory for files. It simply looks in `.` (the current directory the `terraform ...` commands were executed inside of).

The only time this general rule is broken is when using modules, but this article doesn’t cover modules and you don’t need to worry about those to get started with and begin using Terraform.
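
For example, a working directory might look like this (the file names here are illustrative); Terraform would merge both `.tf` files into one configuration:

```
~/terraform/
├── webserver.tf
├── variables.tf
└── webserver.sh
```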

### The Objective

We're going to write some Terraform that will build an environment in AWS. This environment will have all the networking provisioned and a simple EC2 Instance. We'll install Nginx on this EC2 Instance and make sure it's publicly available over the Internet.

The Terraform file we'll create is called `webserver.tf`. This file can be called anything you like, but its name is indicative of what we're going to be using it for. There's also a pre-made one in the GitLab repository mentioned above.

### The Provider

The `provider{}` block in Terraform configures which (Cloud) provider you'll be using and how to talk to it. All we need for now is the following code:

provider "aws" {
	region = "ap-southeast-2"
}

You should change `region` to meet your local requirements. There's no point creating resources in Sydney, Australia if you're in New York or Berlin.

This is your first bit of exposure to the configuration language Terraform uses: HashiCorp Configuration Language (HCL). It’s a custom language designed from the ground up to support Terraform’s end goal of being perfectly suited to its job. You can read more about it in the documentation.

### Resources

Using Terraform is simple: you define `resource{}` blocks that outline what you'd like it to create and manage for you. We'll create plenty of resources in this article, so you'll get a feel for how they look and how to define them.

In short, the format is: `resource "type" "internal_name" {}`

The `type` is going to be provider-specific, such as `aws_vpc`. Review the documentation to get an idea of what resources are available.

The `internal_name` is how you reference the resource throughout the rest of your Terraform code base. These names can be referenced from other `.tf` files in the same directory (think of the directory like a scope). This name has no bearing at all on how resources in remote providers, such as AWS, are labelled.

### Networking

We're going to need a few resources in place before we can create an EC2 instance.

Networking takes up most of the configuration compared to other resources. It's one of the most complex sets of resources you'll have to configure, but you mostly only have to do it once.

### The VPC

resource "aws_vpc" "test" {
  cidr_block = "10.1.0.0/16"
  tags = {
	Name = var.tag_name
  }
}

Simple enough. Note the use of `tags = {}` to tag the resource. Not every resource supports tags (an AWS restriction), but for those that do, make sure you're utilising them to keep things organised within your account.
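
As an aside, `var.tag_name` refers to a variable declared at the top of the file; we'll look at variables properly later on. A minimal declaration looks something like this (the default value here is illustrative):

```
variable "tag_name" {
  default = "terraform10x"
}
```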

### The Subnet

resource "aws_subnet" "webserver" {
  vpc_id     = aws_vpc.test.id
  cidr_block = "10.1.1.0/24"
  availability_zone = "ap-southeast-2a"

  tags = {
	Name = "${var.tag_name}-webserver"
  }
}

We've now encountered our first use of a variable: `aws_vpc.test.id`. This is where Terraform shines. Because we're looking up the VPC's ID so the subnet can be created in the correct VPC, Terraform will create the VPC first and then create the subnet.

This is where Terraform's use of a Directed Acyclic Graph (DAG) comes into play, and it's the reason Terraform can create resources in parallel whilst also blocking others until the resources they depend on have been created.

The use of the variable here is simple: `resource_type.name.attribute`. We're asking for an `aws_vpc` called `test` and we want its `id` attribute. We'll see other examples of variables being used below.

### Network ACL (NACL) and Security Group (SG)

resource "aws_network_acl" "webserver" {
  vpc_id = aws_vpc.test.id

  egress {
	protocol   = "-1"
	rule_no    = 200
	action     = "allow"
	cidr_block = "0.0.0.0/0"
	from_port  = 0
	to_port    = 0
  }

  ingress {
	protocol   = "-1"
	rule_no    = 100
	action     = "allow"
	cidr_block = "0.0.0.0/0"
	from_port  = 0
	to_port    = 0
  }

  tags = {
	Name = var.tag_name
  }
}

There’s that variable usage again. We can start to see how Terraform manages its internal DAG using these kinds of references.

resource "aws_security_group" "protected" {
  name        = "protected"
  description = "Only allows HTTP and SSH"
  vpc_id      = aws_vpc.test.id

  ingress {
	# TLS (change to whatever ports you need)
	from_port   = 80
	to_port     = 80
	protocol    = "tcp"
	cidr_blocks = ["0.0.0.0/0"]
  }

  ingress {
	# TLS (change to whatever ports you need)
	from_port   = 22
	to_port     = 22
	protocol    = "tcp"
	cidr_blocks = ["0.0.0.0/0"]
  }

  egress {
	from_port       = 0
	to_port         = 0
	protocol        = "-1"
	cidr_blocks     = ["0.0.0.0/0"]
  }
}

The security group is mostly self-explanatory. We're seeing our first use of a list here: `["0.0.0.0/0"]`.

We're also seeing, for the second time, nested configuration blocks inside the resource: `ingress{}` and `egress{}` are configuration blocks specific to the `aws_security_group` resource. Each resource may have zero or more of such blocks. Review the documentation for the resources you're interested in to see what configuration items they support.

### An Internet Gateway & Route Table

We'll need to be able to route traffic from the EC2 instance to you, the person trying to use the HTTP and SSH services we'll be spinning up. This will require an `aws_internet_gateway` and an `aws_route_table` to send traffic from the subnet to you and allow the EC2 instance to serve traffic over ports 80 and 22 to the public Internet.

resource "aws_internet_gateway" "igw" {
  vpc_id = aws_vpc.test.id

  tags = {
	Name = var.tag_name
  }
}

An extremely simple resource to get set up. So is the route table…

resource "aws_route_table" "default" {
  vpc_id = aws_vpc.test.id

  route {
	cidr_block = "0.0.0.0/0"
	gateway_id = aws_internet_gateway.igw.id
  }

  tags = {
	Name = var.tag_name
  }
}

resource "aws_route_table_association" "default" {
  subnet_id      = aws_subnet.webserver.id
  route_table_id = aws_route_table.default.id
}

Also an easy pair of resources to get going, but here we're using an association resource to pair the `aws_route_table` with the `aws_subnet` we created earlier, ensuring the subnet uses it to route traffic inside our VPC and through to the `aws_internet_gateway` when an Internet-routable IP address is the destination.

### An Elastic IP

We technically don’t need an EIP here. We could have the subnet auto-assign a public IP to our EC2 instance, but that’s boring…

resource "aws_eip" "webserver" {
  instance = aws_instance.webserver.id
  vpc      = true
  depends_on = ["aws_internet_gateway.igw"]
}
```

So what's this `depends_on` here for? I do not believe it's strictly required, but what it does
is simple: it ensures the `aws_internet_gateway` has been created before the EIP is.

This can be useful for other situations, for example an EC2 instance that will try and connect to an
RDS instance on boot – you'll need the RDS instance to come up first. Use `depends_on` in such
situations.
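
As a sketch, assuming a hypothetical `aws_db_instance.app_db` defined elsewhere in the same configuration, it would look like this:

```
resource "aws_instance" "app" {
  # ... the usual instance arguments ...

  # Don't create (and boot) this instance until the database exists.
  depends_on = [aws_db_instance.app_db]
}
```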

### SSH Key & External Data

That's the networking sorted. Before we set up the EC2 instance, let's make sure our SSH public key is
present in our AWS account so we can SSH into the instance after it's come up.

```
resource "aws_key_pair" "webserver" {
  key_name   = "webserver-keypair"
  public_key = "ssh-rsa ..."
}
```

(You'll see a real SSH public key in the code in GitLab. This is mine. I will SSH into your boxes
and troll you if you don't change it! ;-))

This is pretty simple and I hope no explanation is needed.

```
data "local_file" "webserver-install" {
  filename = "${path.root}/webserver.sh"
}
```

So this is interesting because now we're doing something we've not seen before.

Here we're defining a local "block" of data based on a local file, hence `local_file`. It's
essentially a "data resource" we can use throughout our code that gets some information or data from
an external source.

In this case it pulls data from a file. More specifically, it's pulling in a shell script we're
going to execute on our EC2 instance when it boots:

```
#!/bin/bash

apt update -y
apt install nginx -y 
systemctl enable nginx
systemctl start nginx
```

We're just going to update our system's package cache and then install, enable, and start Nginx.
Simple.

It's not the only time we need to pull some custom, external data. We're also using a `data{}`
resource to pull an AMI ID. We'll need one for our EC2 instance.

```
data "aws_ami" "ubuntu" {
  most_recent = true

  filter {
	name   = "name"
	values = ["ubuntu/images/hvm-ssd/ubuntu-bionic-18.04-amd64-server-*"]
  }

  filter {
	name   = "virtualization-type"
	values = ["hvm"]
  }

  owners = ["099720109477"] # Canonical
}
```

This is a specific `data` type: an `aws_ami`. Note how our first `data` type was a `local_file`.
This is going to look for an AWS AMI ID based on the filters we've defined:

- `name` matches `ubuntu/images/hvm-ssd/ubuntu-bionic-18.04-amd64-server-*`
- The `virtualization-type` attribute is `hvm`
- The `owners` include Canonical (the organisation behind Ubuntu)

Let's now put these two data resources to use.

### Compute

```
resource "aws_instance" "webserver" {
  ami                     = data.aws_ami.ubuntu.id
  availability_zone       = var.aws_region
  instance_type           = "t2.large"
  key_name                = aws_key_pair.webserver.key_name
  vpc_security_group_ids  = [aws_security_group.protected.id]
  subnet_id               = aws_subnet.webserver.id
  user_data               = data.local_file.webserver-install.content
  root_block_device       = {
	volume_size = 50
  }

  tags = {
	Name = var.tag_name
  }
}
```

There are two interesting points to be observed here:

1. The use of `var.aws_region`
2. Our `data` types being used in `ami` and `user_data`

`var.aws_region` is a variable I defined at the top of the `.tf` file. We can define custom
variables to help us make code bases more flexible. For example, this variable can be overridden on
the command line to change the region.

Defining a variable is simple:

```
variable "aws_region" {
  default = "ap-southeast-2"
}
```

We also used our `data` types from above, and although I think the AMI one is simple to understand,
the file is more interesting: `data.local_file.webserver-install.content`. It's the use of the
`content` attribute that's interesting, as it demonstrates what can be done using these data types.

### Output

```
output "instance_ip_addr" {
  value = aws_eip.webserver.public_ip
}
```

Our final piece of code will result in Terraform taking an attribute from a resource, in this case
our `aws_eip.webserver`, and printing it to `STDOUT`: `public_ip`. This is useful because we'll want
to grab that IP and test that HTTP and SSH are working.
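
Once we've applied everything (below), you can read outputs back at any time and use the IP to test both services. The address placeholders here are illustrative:

```
terraform output instance_ip_addr
curl -I http://<eip-from-output>/   # expect an Nginx response
ssh ubuntu@<eip-from-output>        # 'ubuntu' is the default user on Ubuntu AMIs
```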

### Plan it Out
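
If this is your first time running Terraform in this directory, initialise it first so Terraform can download the AWS provider plugin (this only needs doing once per directory):

```
terraform init
```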

Before we make this infrastructure live, though, we should use one of Terraform's best features:
`terraform plan`. You'll get something along these lines:

```
Terraform will perform the following actions:

  + aws_internet_gateway.igw
	  id:                                          <computed>
	  owner_id:                                    <computed>
	  tags.%:                                      "1"
	  tags.Name:                                   "test"
	  vpc_id:                                      "${aws_vpc.test.id}"

  + aws_key_pair.webserver
	  id:                                          <computed>
	  fingerprint:                                 <computed>
	  key_name:                                    "webserver-keypair"
	  public_key:                                  "ssh-rsa ..."

...

Plan: 10 to add, 0 to change, 0 to destroy.
```

This plan is ideal because it means we can double-check our work and make sure Terraform is going to
do what we intended with our code. This is a powerful concept when you combine it with code reviews
(because Terraform is just code: Infrastructure As Code).

### Apply It

Now start creating some resources: `terraform apply`. You'll be asked to confirm the action by
typing `yes` (anything else aborts). You can skip this prompt using `-auto-approve`, which works
nicely in CI/CD pipelines.

Once Terraform has finished, it will create a `terraform.tfstate` file. This file is like gold dust,
and you should treasure it, protect it and back it up. In fact, what you actually want to do is use
a remote backend, such as S3, to store the file, and Terraform supports this, and more, out of the box.
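
As a sketch, a remote backend is declared in a `terraform{}` block like the one below; the bucket name is hypothetical, the bucket must already exist, and you'll need to re-run `terraform init` after adding it:

```
terraform {
  backend "s3" {
    bucket = "my-terraform-state"
    key    = "webserver/terraform.tfstate"
    region = "ap-southeast-2"
  }
}
```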

Once everything is in place, run `terraform apply` (or `plan`) again and you'll notice Terraform
knows that there's nothing to do. Perfect.

### Making a Change

Now make a change to the code. Instead of adding anything, we'll do something quick and easy to
demonstrate how Terraform computes changes. Below we've changed the value of the `tag_name` variable
so that it propagates throughout the state.

```
variable "tag_name" {
  default = "terraform10x-changed"
}
```

Now run `terraform plan` and look at what Terraform wants to do: update the tags on every resource
we've applied this variable to. Pretty simple, but powerful stuff.

If you make a more complex change, such as changing our subnet's CIDR range (which you're free and
encouraged to try) you'll notice the changes are substantially more impactful. This is why the
`plan` functionality exists – it enables you to check your understanding of what's going to happen
and the consequences of those changes.

### Blowing it all Away

Now for another key benefit of Terraform: destroying everything takes a single command, making
clean-up easy.

We'll execute `terraform destroy` and, after confirming the action, Terraform will ensure everything
is deleted.

There is one caveat worth knowing about, though: RDS instances. We didn't create one in this guide,
but if you manage one with Terraform and try to destroy it, you'll see Terraform complain that it
cannot. This is due to RDS wanting you to take a final snapshot (or not) of the instance before
deleting it. Terraform can override this behaviour, but I personally recommend deleting RDS
instances manually to ensure you're given time to back up any data needed.
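
For reference, the override is the `skip_final_snapshot` argument on the `aws_db_instance` resource (the resource here is hypothetical, since we didn't create one in this guide):

```
resource "aws_db_instance" "example" {
  # ... the usual RDS arguments ...

  # Allow destroy to proceed without a final snapshot. Use with care.
  skip_final_snapshot = true
}
```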

### Documentation

Terraform's documentation is actually very good. It can [be found
here](https://www.terraform.io/docs/).

A pro tip for finding resource specific documentation is to use Google: `terraform aws_vpc` will get
you a top hit straight to the correct docs. Follow that format and you'll find what you're looking
for quickly.