Like many other companies deploying their applications to the cloud, the majority of our estate runs Linux. However, we do need Windows for a couple of purposes, whether that is application testing or Windows-specific features.

We also recently adopted Packer to build our machine images, allowing them to be defined in code (and therefore kept in version control). In AWS, these machine images are called AMIs (Amazon Machine Images). Think of them like “golden images”: a known base to run your applications on top of.

Packer itself can bootstrap machines using shell scripts, single commands, or a configuration management tool to install the base utilities and files needed to prepare the image. We currently use Ansible to do this.

Basics of Packer

Packer can build images for a number of different providers, and not just the likes of AWS, Azure or GCP. You can use it to create Docker images, Vagrant boxes, VMware and Proxmox templates, as well as images for smaller providers like DigitalOcean, Hetzner and a number of others. For the full list, take a look at the builders page in the Packer documentation.

Packer defines the characteristics of a machine image using JSON. The most basic of images could look something like: -

{
  "builders": [
   {
    "type": "amazon-ebs",
    "region": "{{ user `aws_region`}}",
    "profile": "{{ user `aws_profile`}}",
    "source_ami_filter": {
      "filters": {
        "virtualization-type": "hvm",
        "name": "debian-stretch-*",
        "architecture": "x86_64",
        "root-device-type": "ebs"
      },
      "owners": ["379101102735"],
      "most_recent": true
    },
    "instance_type": "t3.micro",
    "ssh_username": "admin",
    "ssh_keypair_name": "{{ user `ssh_key_name`}}",
    "ssh_private_key_file": "{{ user `local_ssh_key_location`}}",
    "ssh_pty": true,
    "ami_name": "debian9-base-{{ timestamp }}",
    "vpc_id": "{{ user `default_vpc_id`}}",
    "subnet_id": "{{ user `subnet_id_1a`}}",
    "security_group_id": "{{ user `default_security_group`}}",
    "tags": {
       "ami_type": "debian9-base",
       "created_by": "Packer"
    }
  }
  ],
  "provisioners": [
   {
    "type": "shell",
    "inline_shebang": "/bin/sh -x",
    "inline": [
      "sleep 30",
      "sudo apt-get update -qy && sudo apt-get dist-upgrade -qy",
      "sudo apt-get install -qy tcpdump telnet iotop htop dnsutils net-tools sysstat vim git wget zsh",
      "wget -O - https://repo.saltstack.com/apt/debian/9/amd64/latest/SALTSTACK-GPG-KEY.pub | sudo apt-key add -",
      "echo 'deb http://repo.saltstack.com/apt/debian/9/amd64/latest stretch main' | sudo tee /etc/apt/sources.list.d/saltstack.list",
      "sudo apt update && sudo apt-get install -y salt-minion nagios-nrpe-server nagios-plugins"
    ]
   }
 ]
}

The above is JSON, but with templated variables (similar in style to Jinja). Anything prefaced with user, e.g. {{ user `ssh_key_name` }}, is a user-defined variable. We also have {{ timestamp }}, which is just a Unix timestamp, so that each generated image has a unique name.
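
User variables can also be declared, with default values, in a variables block at the top of the same template. The snippet below is just a sketch, reusing values from the vars file shown later in this post: -

{
  "variables": {
    "aws_region": "eu-central-1",
    "aws_profile": "testing",
    "ssh_key_name": "",
    "local_ssh_key_location": ""
  },
  "builders": [
    ...
  ]
}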

There are two sections: the builders and the provisioners. The builders spin up the official/existing image with the specified parameters; the provisioners customize the image after it starts.

You can also make use of multiple provisioners. For example, you could start bootstrapping with a couple of shell commands, and then apply the rest of the configuration with Ansible (as sketched below). The various provisioners are detailed in the Packer documentation.
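
Chaining them together would look something like the below, reusing the playbook path and SSH user from the other examples in this post: -

  "provisioners": [
   {
    "type": "shell",
    "inline": [
      "sleep 30",
      "sudo apt-get update -qy && sudo apt-get dist-upgrade -qy"
    ]
   },
   {
    "type": "ansible",
    "playbook_file": "../ansible/base/debian/base.yaml",
    "user": "admin"
   }
  ]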

Typically, with the main cloud providers, you would make use of an official machine image (e.g. the Debian-maintained Stretch 64-bit x86 AMI in AWS), and then apply your customization from there.

To make use of user variables, you can either provide them at the command line, or you can make use of a “vars” file. An example of such a file is below: -

{
 "aws_region": "eu-central-1",
 "aws_profile": "testing",
 "default_vpc_id": "vpc-xxxxxx",
 "subnet_id_1a": "subnet-xxxxxx",
 "default_security_group": "sg-xxxxxxx",
 "windows_security_group": "sg-xxxxxxx",
 "ssh_key_name": "$SSH_KEY_NAME",
 "local_ssh_key_location": "~/.ssh/$SSH_KEY_NAME",
 "environment": "staging",
 "private_domain": "packer.yetiops.net",
 "aws_account": "xxxxxxxxx"
}

Unlike Terraform, Packer does not automatically pick up “vars” files. Instead, you need to specify the file explicitly, like so: -

$ packer build -var-file=vars.json packer-ami-build.json
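
You can also set or override individual variables with the -var flag, which can be combined with a vars file. For example: -

$ packer build -var-file=vars.json -var 'aws_region=eu-west-1' packer-ami-build.json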

After this, you will see Packer build and generate the image: -

$ packer build -var-file=vars.json packer-ami-build.json
==> amazon-ebs: Prevalidating AMI Name: debian9-base-1580820854
    amazon-ebs: Found Image ID: ami-06d77f4fcb1f698eb
==> amazon-ebs: Using existing SSH private key
==> amazon-ebs: Launching a source AWS instance...
==> amazon-ebs: Adding tags to source instance
    amazon-ebs: Adding tag: "Name": "Packer Builder"
    amazon-ebs: Instance ID: i-0bc40cf110e2e886e
==> amazon-ebs: Waiting for instance (i-0bc40cf110e2e886e) to become ready...
==> amazon-ebs: Using ssh communicator to connect: 10.100.1.1
==> amazon-ebs: Waiting for SSH to become available...
==> amazon-ebs: Connected to SSH!
[...]

Beyond the shell provisioner

As mentioned, you can also use a number of different configuration management tools to bootstrap an image ready for use. This could be Puppet Masterless, Salt Masterless, or (as in our scenario) Ansible. More information on them is available in the Packer documentation.

We generally use Salt for our configuration management, but Ansible was chosen here due to the ease of setup and configuration for basic tasks. Salt has, in my experience, a speed advantage over Ansible (which is agentless and relies on SSH). However, that advantage is negated when using Salt Masterless, at which point Ansible seemed like a good choice.

To make use of Ansible within a Packer build, you would define a provisioner like so: -

  "provisioners": [
   {
    "type": "ansible",
    "extra_arguments": [
      "--extra-vars",
      "ansible_python_interpreter=/usr/bin/python"
    ],
    "playbook_file": "../ansible/base/debian/base.yaml",
    "user": "admin"
   }
  ]

As you can see, adding a provisioner does not require a lot of configuration. You can then reference an Ansible playbook that does something like adding a few packages, e.g.: -

- hosts: all
  become: yes
  become_method: sudo
  tasks:
  - name: Ensure gnupg2 is installed to add keys
    package:
      name: gnupg2
      state: present
      update_cache: yes

  - name: Add Salt GPG
    apt_key:
      url: https://repo.saltstack.com/apt/debian/9/amd64/latest/SALTSTACK-GPG-KEY.pub
      state: present

  - name: Add Salt Repo
    apt_repository:
      repo: deb http://repo.saltstack.com/apt/debian/9/amd64/latest stretch main
      state: present
  
  - name: Update cache and upgrade packages
    apt:
      name: "*"
      state: latest
      update_cache: yes

  - name: Dist Upgrade
    apt:
      upgrade: dist

  - name: Get all required base packages
    apt:
      name: "{{ packages }}"
    vars:
      packages:
       - tcpdump
       - telnet
       - iotop
       - htop
       - dnsutils
       - net-tools
       - sysstat
       - vim
       - git
       - wget
       - zsh
       - salt-minion
       - nagios-nrpe-server
       - nagios-plugins
       - exim4

Given the breadth of modules available in Ansible, there is a lot more you could do to an image than the above, but it shows the basics of what can be done using Packer and Ansible together.
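
For example, you could template out files or manage users in the same run. The tasks below are purely illustrative and not part of our base playbook: -

  # Hypothetical extra tasks - adjust to suit your environment
  - name: Deploy a custom MOTD
    copy:
      content: "Built by Packer and Ansible\n"
      dest: /etc/motd

  - name: Ensure a deploy user exists
    user:
      name: deploy
      shell: /bin/zsh
      state: present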

So what about Windows?

Previously, we had Packer building Windows images using PowerShell commands. For example, the below installs Chocolatey (a Windows package manager), restarts Windows to allow Chocolatey to install correctly, installs a few packages, and then transfers some files over, ready to be used by NSClient++ (the Windows equivalent of NRPE, for Nagios remote checks): -

 "provisioners": [
    {
      "type": "powershell",
      "scripts": [
        "../build-files/windows/disable-uac.ps1",
        "../build-files/windows/ChocolateyInstall.ps1"
       ]
    },
    {
      "type": "windows-restart",
      "restart_check_command": "powershell -command \"& {Write-Output 'restarted.'}\""
    },
    {
      "type": "powershell",
      "inline": [
        "choco install -y nscp",
        "choco install -y prometheus-wmi-exporter.install",
        "choco install -y openssh"
      ]
    },
    {
      "type": "file",
      "source": "../build-files/windows/nsclient.ini",
      "destination": "C:/Program Files/NSClient++/nsclient.ini"
    }
  ]

After a while, this kind of configuration can start to become unmanageable. So what can we do?

Ansible has supported managing Windows machines since version 1.7. This gives us a good option for applying configuration, adding packages and updating files on the Windows images before they are used.
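
Outside of Packer, managing a Windows host with Ansible usually means setting WinRM connection variables in your inventory, something like the below (the host name, address and transport are placeholders, not our setup). Within a Packer build, the provisioner takes care of these connection details for you: -

# Example only: host name, address and credentials are placeholders
[windows]
win-host-01 ansible_host=10.0.0.5

[windows:vars]
ansible_user=Administrator
ansible_connection=winrm
ansible_port=5986
ansible_winrm_transport=ntlm
ansible_winrm_server_cert_validation=ignore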

Stumbling blocks

When I first attempted to use Packer with Ansible to manage Windows machines, I could never get Ansible to talk to them correctly. Take the below as an example: -

amazon-ebs output will be in this color.

==> amazon-ebs: Prevalidating AMI Name: win2016-base-1580822124
    amazon-ebs: Found Image ID: ami-0e484c84e6d59f3a3
==> amazon-ebs: Creating temporary keypair: packer_5e396e6c-0cbc-fc22-f978-015e782280b4
==> amazon-ebs: Launching a source AWS instance...
==> amazon-ebs: Adding tags to source instance
    amazon-ebs: Adding tag: "Name": "Packer Builder"
    amazon-ebs: Instance ID: i-021e6d35e7e6e5ae9
==> amazon-ebs: Waiting for instance (i-021e6d35e7e6e5ae9) to become ready...
==> amazon-ebs: Waiting for auto-generated password for instance...
    amazon-ebs: It is normal for this process to take up to 15 minutes,
    amazon-ebs: but it usually takes around 5. Please wait.
    amazon-ebs:  
    amazon-ebs: Password retrieved!
==> amazon-ebs: Using winrm communicator to connect: 10.100.1.1
==> amazon-ebs: Waiting for WinRM to become available...
    amazon-ebs: WinRM connected.
==> amazon-ebs: #< CLIXML
==> amazon-ebs: <Objs Version="1.1.0.1" xmlns="http://schemas.microsoft.com/powershell/2004/04"><Obj S="progress" RefId="0"><TN RefId="0"><T>System.Management.Automation.PSCustomObject</T><T>System.Object</T></TN><MS><I64 N="SourceId">1</I64><PR N="Record"><AV>Preparing modules for first use.</AV><AI>0</AI><Nil /><PI>-1</PI><PC>-1</PC><T>Completed</T><SR>-1</SR><SD> </SD></PR></MS></Obj><Obj S="progress" RefId="1"><TNRef RefId="0" /><MS><I64 N="SourceId">1</I64><PR N="Record"><AV>Preparing modules for first use.</AV><AI>0</AI><Nil /><PI>-1</PI><PC>-1</PC><T>Completed</T><SR>-1</SR><SD> </SD></PR></MS></Obj></Objs>
==> amazon-ebs: Connected to WinRM!
==> amazon-ebs: Provisioning with Ansible...
==> amazon-ebs: Executing Ansible: ansible-playbook --extra-vars packer_build_name=amazon-ebs packer_builder_type=amazon-ebs -o IdentitiesOnly=yes -i /tmp/packer-provisioner-ansible207671579 /home/stuh84/git/Infrastructure/packer/ansible/base/windows/base.yaml -e ansible_ssh_private_key_file=/tmp/ansible-key559438764--extra-vars ansible_shell_type=powershell --extra-vars ansible_shell_executable=None
    amazon-ebs:
    amazon-ebs: PLAY [all] *********************************************************************
    amazon-ebs:
    amazon-ebs: TASK [Install required packages] ***********************************************
    amazon-ebs: fatal: [default]: FAILED! => {"changed": false, "module_stderr": "Warning: Permanently added '[127.0.0.1]:39793' (RSA) to the list of known hosts.\r\nParameter format not correct - ;\r\n", "module_stdout": "", "msg": "MODULE FAILURE\nSee stdout/stderr for the exact error", "rc": 1}
    amazon-ebs:
    amazon-ebs: PLAY RECAP *********************************************************************
    amazon-ebs: default                    : ok=0    changed=0    unreachable=0    failed=1    skipped=0    rescued=0    ignored=0
    amazon-ebs:
==> amazon-ebs: Provisioning step had errors: Running the cleanup provisioner, if present...
==> amazon-ebs: Terminating the source AWS instance...

The provisioner at this point looks almost exactly like the one we use for Linux deployments: -

  "provisioners": [
   {
    "type": "ansible",
    "extra_arguments": [
      "--extra-vars",
      "ansible_shell_type=powershell",
      "--extra-vars",
      "ansible_shell_executable=None"
    ],
    "playbook_file": "../ansible/base/windows/base.yaml",
    "user": "Administrator"
   },

It’s worth mentioning at this point that Ansible does not talk to the AWS Packer builder instance (i.e. the machine you are customizing) directly; instead, it talks to it through a connection that Packer sets up as a kind of proxy. This is the same for both Linux and Windows.

However, there seems to be an issue with this when Ansible tries to use WinRM/PowerShell (i.e. its native method of talking to Windows machines).

Packer connection plugin for Ansible

After some searching around and looking through a number of bug reports, I found reference to an Ansible connection plugin for Packer. This allows Ansible to use the existing Packer connection to the image.

To make use of it, do the following: -

$ sudo mkdir -p /usr/share/ansible/plugins/connection/
$ cd /usr/share/ansible/plugins/connection/
$ wget https://raw.githubusercontent.com/hashicorp/packer/master/test/fixtures/provisioner-ansible/connection_plugins/packer.py

You’ll also need to edit your ansible.cfg file (by default, in the /etc/ansible directory) and make sure that it contains the following line: -

connection_plugins = /usr/share/ansible/plugins/connection
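
If you don’t already have an ansible.cfg, a minimal one would look something like this (connection_plugins lives under the [defaults] section; merge it into any existing settings rather than replacing them): -

# Minimal example ansible.cfg
[defaults]
connection_plugins = /usr/share/ansible/plugins/connection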

To make use of this connection plugin in your Packer file, all you need to do is add the following line: -

    "type": "ansible",
    "extra_arguments": [
      "--connection", "packer",          <---------- This one
      "--extra-vars",
      "ansible_shell_type=powershell",
      "--extra-vars",
      "ansible_shell_executable=None"

When you next attempt to run a Packer build, Ansible should start working through the Packer connection, and then deploy as normal. For example: -

amazon-ebs output will be in this color.

==> amazon-ebs: Prevalidating AMI Name: win2016-base-1580823483
    amazon-ebs: Found Image ID: ami-0e484c84e6d59f3a3
==> amazon-ebs: Creating temporary keypair: packer_5e3973bb-4e9f-1749-d007-8addeee4cf0d
==> amazon-ebs: Launching a source AWS instance...
==> amazon-ebs: Adding tags to source instance
    amazon-ebs: Adding tag: "Name": "Packer Builder"
    amazon-ebs: Instance ID: i-XXXXXXXXXXXXXXX
==> amazon-ebs: Waiting for instance (i-XXXXXXXXXXX) to become ready...
==> amazon-ebs: Waiting for auto-generated password for instance...
    amazon-ebs: It is normal for this process to take up to 15 minutes,
    amazon-ebs: but it usually takes around 5. Please wait.
    amazon-ebs:  
    amazon-ebs: Password retrieved!
==> amazon-ebs: Using winrm communicator to connect: 10.100.1.1
==> amazon-ebs: Waiting for WinRM to become available...
==> amazon-ebs: #< CLIXML
    amazon-ebs: WinRM connected.
==> amazon-ebs: <Objs Version="1.1.0.1" xmlns="http://schemas.microsoft.com/powershell/2004/04"><Obj S="progress" RefId="0"><TN RefId="0"><T>System.Management.Automation.PSCustomObject</T><T>System.Object</T></TN><MS><I64 N="SourceId">1</I64><PR N="Record"><AV>Preparing modules for first use.</AV><AI>0</AI><Nil /><PI>-1</PI><PC>-1</PC><T>Completed</T><SR>-1</SR><SD> </SD></PR></MS></Obj><Obj S="progress" RefId="1"><TNRef RefId="0" /><MS><I64 N="SourceId">1</I64><PR N="Record"><AV>Preparing modules for first use.</AV><AI>0</AI><Nil /><PI>-1</PI><PC>-1</PC><T>Completed</T><SR>-1</SR><SD> </SD></PR></MS></Obj></Objs>
==> amazon-ebs: Connected to WinRM!
==> amazon-ebs: Provisioning with Ansible...
==> amazon-ebs: Executing Ansible: ansible-playbook --extra-vars packer_build_name=amazon-ebs packer_builder_type=amazon-ebs -o IdentitiesOnly=yes -i /tmp/packer-provisioner-ansible213632365 /home/stuh84/git/Infrastructure/packer/ansible/base/windows/base.yaml -e ansible_ssh_private_key_file=/tmp/ansible-key670542534 --connection packer --extra-vars ansible_shell_type=powershell --extra-vars ansible_shell_executable=None
    amazon-ebs:
    amazon-ebs: PLAY [all] *********************************************************************
    amazon-ebs:
    amazon-ebs: TASK [Install required packages] ***********************************************
    amazon-ebs: [WARNING]: Chocolatey was missing from this system, so it was installed during
    amazon-ebs: changed: [default]
    amazon-ebs: this task run.
    amazon-ebs:
    amazon-ebs: TASK [nsclient INI] ************************************************************
    amazon-ebs: changed: [default]
    amazon-ebs:
    amazon-ebs: TASK [Invoke OpenSSD Install script] *******************************************
    amazon-ebs: changed: [default]
    amazon-ebs:
    amazon-ebs: TASK [Enable SSHD] *************************************************************
    amazon-ebs: changed: [default]
    amazon-ebs:
    amazon-ebs: TASK [SSHD Firewall Rule] ******************************************************
    amazon-ebs: changed: [default]
    amazon-ebs:
    amazon-ebs: PLAY RECAP *********************************************************************
    amazon-ebs: default                    : ok=10   changed=9    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0
    amazon-ebs:
==> amazon-ebs: Provisioning with Powershell...
==> amazon-ebs: Provisioning with powershell script: /tmp/powershell-provisioner627331863
    amazon-ebs:
    amazon-ebs: TaskPath                                       TaskName                          State
    amazon-ebs: --------                                       --------                          -----
    amazon-ebs: \                                              Amazon Ec2 Launch - Instance I... Ready
==> amazon-ebs: Stopping the source instance...
    amazon-ebs: Stopping instance
==> amazon-ebs: Waiting for the instance to stop...
==> amazon-ebs: Creating AMI win2016-base-1580823483 from instance i-0e60a616f082a01bd
    amazon-ebs: AMI: ami-XXXXXXXXXXXXXX
==> amazon-ebs: Waiting for AMI to become ready...
==> amazon-ebs: Adding tags to AMI (ami-XXXXXXXXXXX)...
==> amazon-ebs: Tagging snapshot: snap-XXXXXXXXXXX
==> amazon-ebs: Creating AMI tags
    amazon-ebs: Adding tag: "ami_type": "win2016-base"
    amazon-ebs: Adding tag: "created_by": "Packer"
==> amazon-ebs: Creating snapshot tags
==> amazon-ebs: Terminating the source AWS instance...
==> amazon-ebs: Cleaning up any extra volumes...
==> amazon-ebs: No volumes to clean up, skipping
==> amazon-ebs: Deleting temporary keypair...
Build 'amazon-ebs' finished.

An example playbook to apply would be something like the following: -

- hosts: all
  remote_user: Administrator 
  gather_facts: false
  tasks:
  - name: Install required packages
    win_chocolatey:
      name:
      - nscp
      - prometheus-wmi-exporter.install
      - openssh

  - name: nsclient INI
    win_copy: 
      src: ../../../build-files/windows/nsclient.ini
      dest: C:/Program Files/NSClient++/nsclient.ini

  - name: Invoke OpenSSD Install script
    win_shell: powershell.exe -ExecutionPolicy Bypass -File "c:\Program Files\OpenSSH-Win64\install-sshd.ps1"

  - name: Enable SSHD
    win_service:
      name: sshd
      state: started
      start_mode: auto

  - name: SSHD Firewall Rule
    win_firewall_rule:
      name: OpenSSHD
      localport: 22
      action: allow
      direction: in
      protocol: tcp
      state: present
      enabled: yes

An interesting point to note here is that when you use the win_chocolatey module, Ansible will install Chocolatey before attempting to install any packages, meaning you don’t need to worry about installing it first.
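
If you want more control over that bootstrap, you can also manage Chocolatey itself as a package before anything else. The task below is hypothetical, and the version shown is purely illustrative: -

  # Hypothetical: pin Chocolatey itself to a known version
  - name: Ensure Chocolatey is installed at a known version
    win_chocolatey:
      name: chocolatey
      version: '0.10.15'
      state: present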

I also add OpenSSH because I still haven’t entirely gotten PowerShell remoting/WinRM working from Linux. Having SSH available allows me to manage the machine like a Linux box (just with a different shell).

Sysprep and InitializeInstance

To ensure that generated Windows machines do not share GUIDs or other identifying IDs (which can cause issues with joining Active Directory, as well as with licensing), AWS provides Sysprep and InitializeInstance scripts as part of their base Windows AMIs (which all our Windows images use as a base for customization).

I did initially try to execute these with Ansible. However, it appears that in making the image unique, they break the Ansible-over-Packer connectivity. Instead, to finish off any Windows image, I just apply these scripts using the PowerShell provisioner: -

    {
      "type": "powershell",
      "inline": [
        "C:\\ProgramData\\Amazon\\EC2-Windows\\Launch\\Scripts\\InitializeInstance.ps1 -Schedule",
        "C:\\ProgramData\\Amazon\\EC2-Windows\\Launch\\Scripts\\SysprepInstance.ps1 -NoShutdown"
      ]
    }

While I haven’t attempted to build Windows machine images on other platforms/cloud providers, I would assume they provide similar methods to ensure that a unique machine is generated when you later launch an instance from your image. If you do find issues with Ansible dropping connectivity when kicking off these scripts, I would recommend using the PowerShell provisioner instead.

Bonus: Using the images with Terraform

If you are using Terraform to deploy your cloud infrastructure, you can easily make use of the images Packer builds, using the below: -

data "aws_ami" "packer-windows-image" {
  most_recent = true

  filter {
    name   = "name"
    values = ["windows2016-base-*"]
  }

  filter {
    name   = "virtualization-type"
    values = ["hvm"]
  }

  filter {
    name   = "tag:ami_type"
    values = ["windows2016-base"]
  }

  owners = [var.aws_account]
}

resource "aws_instance" "windows-ec2" {
  ami                  = data.aws_ami.packer-windows-image.id
[...]

The above will match the most recent version of your base image, and use it as the data source for your aws_instance resource.
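
For completeness, a fuller (and purely illustrative) version of that resource might look like the below; the instance type, subnet, key and security group values are assumptions for the example: -

resource "aws_instance" "windows-ec2" {
  # Illustrative only: adjust instance_type, networking and key to your environment
  ami                    = data.aws_ami.packer-windows-image.id
  instance_type          = "t3.medium"
  subnet_id              = var.subnet_id_1a
  key_name               = var.ssh_key_name
  vpc_security_group_ids = [var.windows_security_group]

  tags = {
    Name = "windows-ec2"
  }
}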

References

To put this together, I used a few points from this repository, this GitHub issue and this GitHub Gist.