amazon-web-services ansible ubuntu-16.04 dpkg

Ansible Script dpkg lock on aws launch ubuntu 16.04

I have a launch script (user data) that runs at startup on AWS with an Ubuntu 16.04 image. When it reaches the step that runs an Ansible playbook, the playbook fails with the error "Could not get lock /var/lib/dpkg/lock". If I log in and run the playbook manually it works, but when it runs from the AWS user data it fails with this error.

This is the full error:

TASK [rabbitmq : install packages (Ubuntu default repo is used)] ***************
task path: /etc/ansible/roles/rabbitmq/tasks/main.yml:50
<localhost> ESTABLISH LOCAL CONNECTION FOR USER: root
<localhost> EXEC /bin/sh -c '( umask 77 && mkdir -p "` echo $HOME/.ansible/tmp/ansible-tmp-1480352390.01-116502531862586 `" && echo ansible-tmp-1480352390.01-116502531862586="` echo $HOME/.ansible/tmp/ansible-tmp-1480352390.01-116502531862586 `" ) && sleep 0'
<localhost> PUT /tmp/tmpGHaVRP TO /.ansible/tmp/ansible-tmp-1480352390.01-116502531862586/apt
<localhost> EXEC /bin/sh -c 'chmod u+x /.ansible/tmp/ansible-tmp-1480352390.01-116502531862586/ /.ansible/tmp/ansible-tmp-1480352390.01-116502531862586/apt && sleep 0'
<localhost> EXEC /bin/sh -c 'LANG=en_US.UTF-8 LC_ALL=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 /usr/bin/python /.ansible/tmp/ansible-tmp-1480352390.01-116502531862586/apt; rm -rf "/.ansible/tmp/ansible-tmp-1480352390.01-116502531862586/" > /dev/null 2>&1 && sleep 0'
fatal: [localhost]: FAILED! => {"cache_update_time": 0, "cache_updated": 
false, "changed": false, "failed": true, "invocation": {"module_args": 
{"allow_unauthenticated": false, "autoremove": false, "cache_valid_time": 
null, "deb": null, "default_release": null, "dpkg_options": "force-
confdef,force-confold", "force": false, "install_recommends": null, "name": 
"rabbitmq-server", "only_upgrade": false, "package": ["rabbitmq-server"], 
"purge": false, "state": "present", "update_cache": false, "upgrade": null}, 
"module_name": "apt"}, "msg": "'/usr/bin/apt-get -y -o \"Dpkg::Options::=--
force-confdef\" -o \"Dpkg::Options::=--force-confold\"     install 
'rabbitmq-server'' failed: E: Could not get lock /var/lib/dpkg/lock - open 
(11: Resource temporarily unavailable)\nE: Unable to lock the administration 
directory (/var/lib/dpkg/), is another process using it?\n", "stderr": "E:     Could 
not get lock /var/lib/dpkg/lock - open (11: Resource temporarily 
unavailable)\nE: Unable to lock the administration directory (/var/lib/dpkg/), 
is another process using it?\n", "stdout": "", "stdout_lines": []}

Solution

I ran into the same lock issue. I found that Ubuntu was installing some packages on first boot, and cloud-init did not wait for that to finish.

I use the following script to check that the lock file has been free for at least 15 consecutive seconds before trying to install anything.

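#!/bin/bash

# Wait until nothing has held the dpkg lock for 15 consecutive seconds.
i=0
while [ "$i" -lt 15 ]
do
  # fuser exits 0 while some process has the lock file open.
  if fuser /var/lib/dpkg/lock >/dev/null 2>&1; then
    i=0   # lock is busy: restart the countdown
  fi
  sleep 1
  i=$((i + 1))
done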

I prefer this over a fixed sleep 5m because, in an autoscaling group, the instance may be removed before it is even provisioned.
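
One caveat with the loop above is that it never gives up if something holds the lock indefinitely. Here is a rough sketch of the same idea with an upper bound on the total wait. The function name wait_for_lock_free and its three arguments (lock file, quiet seconds, maximum seconds) are made up for illustration, and it assumes fuser is installed, as in the original script.

```shell
#!/bin/bash

# wait_for_lock_free LOCKFILE QUIET_SECS MAX_SECS  (hypothetical helper)
# Returns 0 once LOCKFILE has been unused for QUIET_SECS consecutive
# seconds, or 1 if MAX_SECS seconds pass before that happens.
wait_for_lock_free() {
  local lock="$1" quiet="$2" max="$3"
  local idle=0 elapsed=0

  while [ "$idle" -lt "$quiet" ]; do
    # Give up once the overall time budget is spent.
    if [ "$elapsed" -ge "$max" ]; then
      return 1
    fi
    # fuser exits 0 while some process has the file open.
    if fuser "$lock" >/dev/null 2>&1; then
      idle=0                # lock is busy: restart the countdown
    else
      idle=$((idle + 1))    # one more quiet second
    fi
    sleep 1
    elapsed=$((elapsed + 1))
  done
  return 0
}
```

In the user data script this could be called as wait_for_lock_free /var/lib/dpkg/lock 15 600 || exit 1 just before the ansible-playbook line. The 600 second limit is an arbitrary value, so adjust it to however long first-boot package installs usually take on your image.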