Let’s write your infrastructure as code - step by step (Part 1)

10 Mar 2016

This is a step-by-step guide to setting up your infrastructure on Amazon using
code (Terraform) for a reproducible, version-controlled stack.

Why?

When discussing infrastructure as code, the most common question is: why?
Amazon gives you a lot of options in the UI. You can do basically everything from the UI without really touching any code or learning any new tool.

For me, the best thing about it is that I have a state of what my infrastructure looks like at any given moment. I can version control it, and I can share the state file with my peers.

Have opinions on why, or why not? Please share them in the comments.

Let’s start now.

Prerequisites

Amazon AWS Account
Access key and secret key

Tools required

Terraform (0.6.12 was used here)
Text Editor of your choice.

Fair Warning

If you follow along with this article and execute the apply commands, you
should know that you are creating resources on your (or your company’s) Amazon
account. Make sure you have permission to do so.

Getting Started

To get started, let’s create a file called ~/.aws/credentials.
Make sure this file is never checked in to source control, or anyone with
access to the repository will be able to access your Amazon account.

In this file, add your profile like so:

[avi]
aws_access_key_id = ACCESS_KEY
aws_secret_access_key = SECRET_KEY

avi can of course be replaced with a profile name of your liking. It’s just
a string, and you can give it any name you want.

Once you have the profile set up in the credentials file, we can continue.

Terraforming

Without going too deep into Terraform (I highly encourage you to read up on
it), it gives you the ability to “describe” your infrastructure as code and
“apply” the changes.

Think of it as git for your infrastructure: you can change resources and you
can “diff” between the code and the version that is currently running.

Now, let’s jump into the code.

Your first terraform file

Creating a directory

$ mkdir ~/Code/terraform-test
$ cd ~/Code/terraform-test

Create a file called main.tf.

Provider

Let’s begin by declaring the provider we will be working with.

provider "aws" {
  region  = "${var.aws_region}"
  profile = "avi"
}

profile here needs to be the same profile name that you declared earlier in
~/.aws/credentials.
region is your AWS region. As you can see, it’s coming from a variable that
we will cover soon.

In this post, I assume you are using us-east-1. If you aren’t, the code samples will not work as-is: you will need to edit the variables with the correct AMI id for your region.
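
As a rough sketch of what that edit could look like, assuming you wanted to run in us-west-2 instead: the aws_amis variable (declared in variables.tf later in this post) gets a second entry, and aws_region points at the new region. The AMI id below is a placeholder, not a real image, so look up the right one for your region. The availability zones in the subnet definitions are also hard-coded to us-east-1 and would need the same treatment.

```hcl
# variables.tf (sketch) - running in a different region
variable "aws_region" {
  description = "AWS region to launch servers."
  default     = "us-west-2" # assumption: your target region
}

variable "aws_amis" {
  default = {
    "us-east-1" = "ami-7ba59311"
    "us-west-2" = "ami-REPLACE_ME" # placeholder, not a real AMI id
  }
}
```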

VPC and more…

Obviously, I can’t cover environments here. But usually you will have
production, staging and test. Each of those will have its own VPC, route
tables, internet gateways and more. For the purpose of this post, I will only
cover production.

Let’s begin by describing our VPC, route tables and internet gateways. Those
are the foundation for our cluster, and everything else depends on them.

resource "aws_vpc" "production" {
  cidr_block = "10.0.0.0/16"
}

resource "aws_internet_gateway" "production" {
  vpc_id = "${aws_vpc.production.id}"
}

resource "aws_route" "internet_access" {
  route_table_id         = "${aws_vpc.production.main_route_table_id}"
  destination_cidr_block = "0.0.0.0/0"
  gateway_id             = "${aws_internet_gateway.production.id}"
}

resource "aws_subnet" "production-1a" {
  availability_zone       = "us-east-1a"
  vpc_id                  = "${aws_vpc.production.id}"
  cidr_block              = "10.0.1.0/24"
  map_public_ip_on_launch = true
}

resource "aws_subnet" "production-1d" {
  availability_zone       = "us-east-1d"
  vpc_id                  = "${aws_vpc.production.id}"
  cidr_block              = "10.0.2.0/24"
  map_public_ip_on_launch = true
}

resource "aws_subnet" "production-1c" {
  availability_zone       = "us-east-1c"
  vpc_id                  = "${aws_vpc.production.id}"
  cidr_block              = "10.0.3.0/24"
  map_public_ip_on_launch = true
}

Now, that’s a lot to take in, so let’s go over what we described here:

A virtual private cloud called “production”
An internet gateway, plus a route that sends all outbound traffic (0.0.0.0/0) through it
Subnets in 3 availability zones, each with its own IP range, all inside the production VPC

All of these terms can be intimidating at first, and I know that the simplicity
of the “git push” to Heroku is in the back of your mind this entire time. But
in a real production environment, you will need fine-grained control over most
of these things. For example, you want to make sure that, at the network level,
services cannot communicate with anything they shouldn’t be communicating
with.

Security groups

Now that we have the subnets and VPC, we want to describe some security groups.

Let’s think about what we need.

Load balancer accessible from the outside
Instances accessible from the load balancer
Services accessible from instances
Instances accessible from “allowed connections”

I like to create elb, internal and external groups to allow those
rules.

resource "aws_security_group" "internal" {
  name        = "internal"
  description = "Internal Connections"
  vpc_id      = "${aws_vpc.production.id}"

  tags {
    Name = "Internal Security Group"
  }

  ingress {
    from_port = 0
    to_port   = 65535
    protocol  = "tcp"
    self      = true
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

resource "aws_security_group" "elb" {
  name        = "elb"
  description = "Load Balancer Security Group"
  vpc_id      = "${aws_vpc.production.id}"

  tags {
    Name = "Load balancer security group"
  }

  ingress {
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  egress {
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

resource "aws_security_group" "external" {
  name        = "external"
  description = "Connection From the world"
  vpc_id      = "${aws_vpc.production.id}"

  ingress {
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = ["YOUR_IP_HERE/32"]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}
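
One item from the list of needs above, “instances accessible from the load balancer”, is not covered by the three groups as written: none of them allows port 80 from the elb group to the instances. A minimal sketch of one way to express it follows. The group name web is hypothetical and not part of the original code.

```hcl
# Sketch only: "web" is a hypothetical group name, not part of the original code.
resource "aws_security_group" "web" {
  name        = "web"
  description = "HTTP from the load balancer"
  vpc_id      = "${aws_vpc.production.id}"

  # Allow port 80 only from members of the elb security group.
  ingress {
    from_port       = 80
    to_port         = 80
    protocol        = "tcp"
    security_groups = ["${aws_security_group.elb.id}"]
  }
}
```

To take effect, the group would also have to be added to the instance’s vpc_security_group_ids, which in turn adds one more resource to the plan.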

Load balancer

Now that our security groups are all described, we can continue with our
web-accessible infrastructure (load balancer and instance).

resource "aws_elb" "web" {
  name            = "web-production"

  subnets         = ["${aws_subnet.production-1a.id}", "${aws_subnet.production-1c.id}", "${aws_subnet.production-1d.id}"]
  security_groups = ["${aws_security_group.elb.id}"]
  instances       = ["${aws_instance.prod-web.id}"]

  tags {
    Name = "prod-web-elb"
  }

  health_check {
    healthy_threshold   = 2
    unhealthy_threshold = 2
    timeout             = 20
    target              = "HTTP:80/"
    interval            = 30
  }

  listener {
    instance_port     = 80
    instance_protocol = "http"
    lb_port           = 80
    lb_protocol       = "http"
  }
}

Here we described a load balancer that is web accessible.

To simplify things, I didn’t add an SSL listener with a certificate, but you
can obviously do that (let me know in the comments if that’s what you need).
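
For reference, here is a rough sketch of what such a listener could look like, assuming a certificate has already been uploaded to IAM. The ARN is a placeholder, and the elb security group would also need an ingress rule for port 443.

```hcl
# Goes inside resource "aws_elb" "web", next to the existing listener block.
listener {
  instance_port      = 80
  instance_protocol  = "http"
  lb_port            = 443
  lb_protocol        = "https"
  # Placeholder ARN: replace with the ARN of your own certificate.
  ssl_certificate_id = "arn:aws:iam::ACCOUNT_ID:server-certificate/CERT_NAME"
}
```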

Instances

As you can see, the instances are identified by aws_instance.prod-web.id, so
let’s describe those instances now.

First, let’s create the keypair for the instance.

$ keyname=the-startup-stack
$ keymail="devops@the-startup-stack.com"
$ ssh-keygen -t rsa -b 4096 -f "$keyname" -C "$keymail"

Now that you have your key ready, let’s describe it in code.

resource "aws_key_pair" "auth" {
  key_name = "${var.key_name}"
  public_key = "${file(var.public_key_path)}"
}

I really like the hostnames on my instances to describe what they are and
give me some context on the instance, so I usually have a minimal
cloud-config script.

Terraform has a great option for this in the form of templates, so let’s start
with those.

Create a file called web_userdata.tpl:

#cloud-config

bootcmd:
 - hostname web.${domain_name}.`curl http://169.254.169.254/latest/meta-data/instance-id`
 - echo 127.0.1.1 web.${domain_name}.`curl http://169.254.169.254/latest/meta-data/instance-id` >> /etc/hosts
 - echo web.${domain_name}.`curl http://169.254.169.254/latest/meta-data/instance-id` > /etc/hostname

preserve_hostname: true

Then, in Terraform, we can use that file as a template:

resource "template_file" "web_userdata" {
  template = "${file("web_userdata.tpl")}"

  vars {
    domain_name = "yourdomain"
  }
}

Now, let’s create the instance:

resource "aws_instance" "prod-web" {
  count     = 1
  user_data = "${template_file.web_userdata.rendered}"

  connection {
    user = "ubuntu"
  }

  tags {
    Name = "prod-web-${count.index + 1}"
  }

  instance_type          = "m3.xlarge"

  key_name               = "${aws_key_pair.auth.id}"
  ami                    = "${lookup(var.aws_amis, var.aws_region)}"

  vpc_security_group_ids = ["${aws_security_group.external.id}", "${aws_security_group.internal.id}"]
  subnet_id              = "${aws_subnet.production-1a.id}"
}
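
count is set to 1 here, which is why the Name tag uses count.index. If you raise it, one possible change on the load balancer side (an assumption, not something this post applies) is to use the splat syntax so that every instance gets attached:

```hcl
# Inside resource "aws_elb" "web": attach all prod-web instances, not just one.
instances = ["${aws_instance.prod-web.*.id}"]
```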

Before we finish up here, we need to declare all the variables we used.

Let’s create a file called variables.tf:

variable "aws_region" {
    description = "AWS region to launch servers."
    default = "us-east-1"
}

variable "aws_amis" {
    default = {
        "us-east-1" = "ami-7ba59311"
    }
}

variable "key_name" {
}

variable "public_key_path" {
}
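
key_name and public_key_path have no defaults, so Terraform prompts for them on each run, as the plan output in this post shows. One option, sketched here under the assumption that the key generated earlier sits in the same directory as main.tf, is a terraform.tfvars file, which Terraform loads automatically:

```hcl
# terraform.tfvars (sketch) - values match the key generated earlier
key_name        = "the-startup-stack"
public_key_path = "the-startup-stack.pub"
```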

Now we can execute terraform plan to check what Terraform will create on our
Amazon account.

If you set up everything correctly, you should see something similar to this:

var.key_name
  Enter a value: the-startup-stack

var.public_key_path
  Enter a value: the-startup-stack.pub

Refreshing Terraform state prior to plan...


The Terraform execution plan has been generated and is shown below.
Resources are shown in alphabetical order for quick scanning. Green resources
will be created (or destroyed and then created if an existing resource
exists), yellow resources are being changed in-place, and red resources
will be destroyed.

Note: You didn't specify an "-out" parameter to save this plan, so when
"apply" is called, Terraform can't guarantee this is what will execute.

+ aws_elb.web
    availability_zones.#:                   "" => "<computed>"
    connection_draining:                    "" => "0"
    connection_draining_timeout:            "" => "300"
    dns_name:                               "" => "<computed>"
    health_check.#:                         "" => "1"
    health_check.0.healthy_threshold:       "" => "2"
    health_check.0.interval:                "" => "30"
    health_check.0.target:                  "" => "HTTP:80/"
    health_check.0.timeout:                 "" => "20"
    health_check.0.unhealthy_threshold:     "" => "2"
    idle_timeout:                           "" => "60"
    instances.#:                            "" => "<computed>"
    internal:                               "" => "<computed>"
    listener.#:                             "" => "1"
    listener.3057123346.instance_port:      "" => "80"
    listener.3057123346.instance_protocol:  "" => "http"
    listener.3057123346.lb_port:            "" => "80"
    listener.3057123346.lb_protocol:        "" => "http"
    listener.3057123346.ssl_certificate_id: "" => ""
    name:                                   "" => "web-production"
    security_groups.#:                      "" => "<computed>"
    source_security_group:                  "" => "<computed>"
    source_security_group_id:               "" => "<computed>"
    subnets.#:                              "" => "<computed>"
    tags.#:                                 "" => "1"
    tags.Name:                              "" => "prod-web-elb"
    zone_id:                                "" => "<computed>"

+ aws_instance.prod-web
    ami:                      "" => "ami-7ba59311"
    availability_zone:        "" => "<computed>"
    ebs_block_device.#:       "" => "<computed>"
    ephemeral_block_device.#: "" => "<computed>"
    instance_state:           "" => "<computed>"
    instance_type:            "" => "m3.xlarge"
    key_name:                 "" => "${aws_key_pair.auth.id}"
    placement_group:          "" => "<computed>"
    private_dns:              "" => "<computed>"
    private_ip:               "" => "<computed>"
    public_dns:               "" => "<computed>"
    public_ip:                "" => "<computed>"
    root_block_device.#:      "" => "<computed>"
    security_groups.#:        "" => "<computed>"
    source_dest_check:        "" => "1"
    subnet_id:                "" => "${aws_subnet.production-1a.id}"
    tags.#:                   "" => "1"
    tags.Name:                "" => "prod-web-1"
    tenancy:                  "" => "<computed>"
    user_data:                "" => "948c5ae186c03822f50780fa376b228673b02f26"
    vpc_security_group_ids.#: "" => "<computed>"

+ aws_internet_gateway.production
    vpc_id: "" => "${aws_vpc.production.id}"

+ aws_key_pair.auth
    fingerprint: "" => "<computed>"
    key_name:    "" => "the-startup-stack"
    public_key:  "" => "YOUR PUBLIC KEY"

+ aws_route.internet_access
    destination_cidr_block:     "" => "0.0.0.0/0"
    destination_prefix_list_id: "" => "<computed>"
    gateway_id:                 "" => "${aws_internet_gateway.production.id}"
    instance_owner_id:          "" => "<computed>"
    origin:                     "" => "<computed>"
    route_table_id:             "" => "${aws_vpc.production.main_route_table_id}"
    state:                      "" => "<computed>"

+ aws_security_group.elb
    description:                          "" => "Load Balancer Security Group"
    egress.#:                             "" => "1"
    egress.2214680975.cidr_blocks.#:      "" => "1"
    egress.2214680975.cidr_blocks.0:      "" => "0.0.0.0/0"
    egress.2214680975.from_port:          "" => "80"
    egress.2214680975.protocol:           "" => "tcp"
    egress.2214680975.security_groups.#:  "" => "0"
    egress.2214680975.self:               "" => "0"
    egress.2214680975.to_port:            "" => "80"
    ingress.#:                            "" => "1"
    ingress.2214680975.cidr_blocks.#:     "" => "1"
    ingress.2214680975.cidr_blocks.0:     "" => "0.0.0.0/0"
    ingress.2214680975.from_port:         "" => "80"
    ingress.2214680975.protocol:          "" => "tcp"
    ingress.2214680975.security_groups.#: "" => "0"
    ingress.2214680975.self:              "" => "0"
    ingress.2214680975.to_port:           "" => "80"
    name:                                 "" => "elb"
    owner_id:                             "" => "<computed>"
    tags.#:                               "" => "1"
    tags.Name:                            "" => "Load balancer security group"
    vpc_id:                               "" => "${aws_vpc.production.id}"

+ aws_security_group.external
    description:                          "" => "Connection From the world"
    egress.#:                             "" => "1"
    egress.482069346.cidr_blocks.#:       "" => "1"
    egress.482069346.cidr_blocks.0:       "" => "0.0.0.0/0"
    egress.482069346.from_port:           "" => "0"
    egress.482069346.protocol:            "" => "-1"
    egress.482069346.security_groups.#:   "" => "0"
    egress.482069346.self:                "" => "0"
    egress.482069346.to_port:             "" => "0"
    ingress.#:                            "" => "1"
    ingress.3452538839.cidr_blocks.#:     "" => "1"
    ingress.3452538839.cidr_blocks.0:     "" => "YOUR_IP_HERE/32"
    ingress.3452538839.from_port:         "" => "22"
    ingress.3452538839.protocol:          "" => "tcp"
    ingress.3452538839.security_groups.#: "" => "0"
    ingress.3452538839.self:              "" => "0"
    ingress.3452538839.to_port:           "" => "22"
    name:                                 "" => "external"
    owner_id:                             "" => "<computed>"
    vpc_id:                               "" => "${aws_vpc.production.id}"

+ aws_security_group.internal
    description:                          "" => "Internal Connections"
    egress.#:                             "" => "1"
    egress.482069346.cidr_blocks.#:       "" => "1"
    egress.482069346.cidr_blocks.0:       "" => "0.0.0.0/0"
    egress.482069346.from_port:           "" => "0"
    egress.482069346.protocol:            "" => "-1"
    egress.482069346.security_groups.#:   "" => "0"
    egress.482069346.self:                "" => "0"
    egress.482069346.to_port:             "" => "0"
    ingress.#:                            "" => "1"
    ingress.3544538468.cidr_blocks.#:     "" => "0"
    ingress.3544538468.from_port:         "" => "0"
    ingress.3544538468.protocol:          "" => "tcp"
    ingress.3544538468.security_groups.#: "" => "0"
    ingress.3544538468.self:              "" => "1"
    ingress.3544538468.to_port:           "" => "65535"
    name:                                 "" => "internal"
    owner_id:                             "" => "<computed>"
    tags.#:                               "" => "1"
    tags.Name:                            "" => "Internal Security Group"
    vpc_id:                               "" => "${aws_vpc.production.id}"

+ aws_subnet.production-1a
    availability_zone:       "" => "us-east-1a"
    cidr_block:              "" => "10.0.1.0/24"
    map_public_ip_on_launch: "" => "1"
    vpc_id:                  "" => "${aws_vpc.production.id}"

+ aws_subnet.production-1c
    availability_zone:       "" => "us-east-1c"
    cidr_block:              "" => "10.0.3.0/24"
    map_public_ip_on_launch: "" => "1"
    vpc_id:                  "" => "${aws_vpc.production.id}"

+ aws_subnet.production-1d
    availability_zone:       "" => "us-east-1d"
    cidr_block:              "" => "10.0.2.0/24"
    map_public_ip_on_launch: "" => "1"
    vpc_id:                  "" => "${aws_vpc.production.id}"

+ aws_vpc.production
    cidr_block:                "" => "10.0.0.0/16"
    default_network_acl_id:    "" => "<computed>"
    default_security_group_id: "" => "<computed>"
    dhcp_options_id:           "" => "<computed>"
    enable_classiclink:        "" => "<computed>"
    enable_dns_hostnames:      "" => "<computed>"
    enable_dns_support:        "" => "<computed>"
    main_route_table_id:       "" => "<computed>"

+ template_file.web_userdata
    rendered:         "" => "<computed>"
    template:         "" => "#cloud-config\n\nbootcmd:\n - hostname web.${domain_name}.`curl http://169.254.169.254/latest/meta-data/instance-id`\n - echo 127.0.1.1 web.${domain_name}.`curl http://169.254.169.254/latest/meta-data/instance-id` >> /etc/hosts\n - echo web.${domain_name}.`curl http://169.254.169.254/latest/meta-data/instance-id` > /etc/hostname\n\npreserve_hostname: true\n"
    vars.#:           "" => "1"
    vars.domain_name: "" => "yourdomain"


Plan: 13 to add, 0 to change, 0 to destroy.

If you execute terraform apply now, Terraform will create all the resources for you.

It will create a load balancer and attach the instance to it. All you need to do now is deploy your code (coming in part 3).
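
To find the load balancer once it is up, a small sketch is to declare an output for its DNS name. The output name elb_dns_name is arbitrary and not part of the original code.

```hcl
# Sketch: "elb_dns_name" is an arbitrary output name.
output "elb_dns_name" {
  value = "${aws_elb.web.dns_name}"
}
```

After an apply, terraform output elb_dns_name prints the address.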

Source Code

You can find the source code on GitHub.

Summing up

You can probably see now that you can pretty easily describe your
infrastructure with code.

In part two of this post, we will create the database and ElastiCache (Redis),
open up security groups for more options, and more.

Would love your feedback and comments, as always.

Avi Zurel