
Ansible Dynamic inventory with static group with dynamic children


I am sure many who work with Terraform and Ansible, or just Ansible, on a daily basis must have come across this question.

Some background:

I create my infrastructure on AWS using Terraform and configure my machines using Ansible. My inventory file contains hardcoded public IP addresses along with some variables. As the business demands, I create and destroy my machines very often.

My question:

I don't want to update my inventory file with new public IP addresses every time I destroy and recreate my instances. So my fundamental requirement is: every time I destroy my machines, I should be able to run my Terraform script to recreate them, and when I run my Ansible playbook, Ansible should pick up the right target machines and run the playbook against them. I need to know what to describe in my inventory file to achieve this automation. A domain name (www.fooexample.com) or static public IP addresses in the inventory file are not an option in my case. I have seen scripts that do it with what looks like a hostname (webserver1).

There are forums that suggest using the ec2.py script, but ec2.py pulls all the public IP addresses associated with the account, and as you can imagine, I only want to target some of the machines with my playbook, not all of them.

Any help regarding this would be appreciated.

Thanks in advance.


Solution

  • I do something similar in GCP but the concept should apply to AWS.

    Starting with Ansible 2.7 there is a new inventory plugin architecture and some inventory plugins to replace the dynamic inventory scripts (such as ec2.py and gcp.py). The AWS plugin documentation is at https://docs.ansible.com/ansible/2.9/plugins/inventory/aws_ec2.html.

    First, you need to tag the groups of hosts you want to target in AWS. You should be able to handle this with Terraform (such as Service = Web).
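    As an illustration, a hypothetical Terraform resource applying such a tag might look like this (the resource name, AMI ID, and instance type are placeholders):

```hcl
resource "aws_instance" "web" {
  count         = 2
  ami           = "ami-0123456789abcdef0" # placeholder AMI ID
  instance_type = "t3.micro"

  tags = {
    Name    = "webserver${count.index + 1}"
    Service = "Web" # the tag the inventory plugin will group on
  }
}
```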

    Next, enable the aws_ec2 plugin in ansible.cfg by adding:

    [inventory]
    enable_plugins = aws_ec2
    

    Now, convert over to using the new plugin instead of ec2.py. This means creating an aws_ec2.yaml file based on the documentation (the plugin only accepts inventory files whose names end in aws_ec2.yaml or aws_ec2.yml). An example might look like:

    plugin: aws_ec2
    regions:
      - us-east-1
    keyed_groups:
      - prefix: tag
        key: tags
    # Set individual variables with compose
    compose:
      ansible_host: public_ip_address
    

    The key parts here are the keyed_groups and compose sections: keyed_groups builds groups from your EC2 tags that you can limit to with -l or --limit, and compose sets the public IP address as the address Ansible connects to.
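    As a rough sketch (this is an approximation, not Ansible's actual implementation), the group name keyed_groups produces for a tag is essentially the prefix, tag key, and tag value joined with underscores, with characters that aren't valid in a group name replaced by underscores:

```python
import re

def keyed_group(prefix: str, tag_key: str, tag_value: str) -> str:
    """Approximate the group name the aws_ec2 plugin generates for an
    EC2 tag (a sketch of the naming scheme, not Ansible's exact code)."""
    raw = f"{prefix}_{tag_key}_{tag_value}"
    # Replace anything that isn't alphanumeric or an underscore
    return re.sub(r"[^A-Za-z0-9_]", "_", raw)

print(keyed_group("tag", "Service", "Web"))    # tag_Service_Web
print(keyed_group("tag", "env-tier", "prod"))  # tag_env_tier_prod
```

    So a tag of Service = Web with the default "tag" prefix yields a group named tag_Service_Web.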

    Considering you had some instances in us-east-1 tagged with Service = Web, you could target them like:

    ansible -i aws_ec2.yaml -m ping -l tag_Service_Web
    

    This would target just those tagged hosts on their public IP addresses. Any dynamic scaling you do (such as increasing the count in Terraform for that resource) will be picked up by the inventory plugin on the next run.

    You can also use the tag in playbooks. If you had a playbook that you always targeted at these hosts you can set hosts: tag_Service_Web in the playbook.
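    For instance, a minimal playbook targeting the keyed group might look like this (the task shown is just a placeholder example):

```yaml
# site.yaml - targets hosts tagged Service = Web via the aws_ec2 plugin
- hosts: tag_Service_Web
  become: true
  tasks:
    - name: Ensure nginx is installed  # example task only
      package:
        name: nginx
        state: present
```

    You would then run it against the dynamic inventory with `ansible-playbook -i aws_ec2.yaml site.yaml`.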

    Bonus:

    I've been experimenting with an Ansible Pull model that automates some of this bootstrapping. The idea is to combine cloud-init with a small script that bootstraps and runs the playbook for that host automatically.

    Example script that cloud-init kicks off:

    #!/bin/bash
    
    set -euo pipefail
    
    # apt/dpkg lock files that unattended-upgrades or the apt-daily
    # timers may still hold shortly after boot
    lock_files=(
        /var/lib/dpkg/lock
        /var/lib/apt/lists/lock
        /var/lib/dpkg/lock-frontend
        /var/cache/apt/archives/lock
        /var/lib/apt/daily_lock
    )
    
    # Don't fail if this host doesn't match any play's host pattern yet
    export ANSIBLE_HOST_PATTERN_MISMATCH="ignore"
    # Prefer the virtualenv's binaries once it has been created below
    export PATH="/tmp/ansible-venv/bin:$PATH"
    
    # Wait until no process is holding any of the apt/dpkg locks
    for file in "${lock_files[@]}"; do
        while fuser "$file" >/dev/null 2>&1; do
            echo "Waiting for lock $file to be available..."
            sleep 5
        done
    done
    
    apt-get update -qy
    apt-get install --no-install-recommends -qy virtualenv python-virtualenv python-nacl python-wheel python-bcrypt
    
    # Build an isolated environment with pinned Ansible and inventory deps
    virtualenv -p /usr/bin/python --system-site-packages /tmp/ansible-venv
    pip install ansible==2.7.10 apache-libcloud==2.3.0 jmespath==0.9.3
    
    # Pull the infrastructure repo and run the playbook for this host
    ansible-pull myplaybook.yaml \
        -U git@github.com:myorg/infrastructure.git \
        -i gcp_compute.yaml \
        --private-key /tmp/ansible-keys/infrastructure_ssh_deploy_key \
        --vault-password-file /tmp/ansible-keys/vault \
        -d /tmp/ansible-infrastructure \
        --accept-host-key
    

    This script is a bit simplified from my actual one (I've left out some domain-specific authentication and key-provisioning steps). But you can adapt it to AWS by doing something like bootstrapping keys from S3, KMS, or another boot-time configuration service. I find that ansible-pull works well when the playbook only takes a minute or two to run and doesn't have any dependencies on external inventory (like references to other groups, such as to gather IP addresses).
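    On the cloud-init side, the user data that kicks off such a script might look something like this (the path is a placeholder; the script body is the bootstrap script above):

```yaml
#cloud-config
write_files:
  - path: /usr/local/bin/bootstrap-ansible.sh
    permissions: "0755"
    content: |
      #!/bin/bash
      # ... bootstrap script body goes here ...
runcmd:
  - [/usr/local/bin/bootstrap-ansible.sh]
```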