I've been trying to debug this for a while now; I thought I had it working, but then I made some other changes and now the problem is back.
Basically, I have Vagrant looping over a list of machine definitions, and while my Ansible inventory looks perfectly fine, I find that only one host is actually being provisioned.
Generated Ansible Inventory -- the SSH ports are all different and the groups are correct:
# Generated by Vagrant
kafka.cp.vagrant ansible_host=127.0.0.1 ansible_port=2200 ansible_user='vagrant' ansible_ssh_private_key_file='/workspace/confluent/cp-ansible/vagrant/.vagrant/machines/kafka.cp.vagrant/virtualbox/private_key' kafka='{"broker": {"id": 1}}'
zk.cp.vagrant ansible_host=127.0.0.1 ansible_port=2222 ansible_user='vagrant' ansible_ssh_private_key_file='/workspace/confluent/cp-ansible/vagrant/.vagrant/machines/zk.cp.vagrant/virtualbox/private_key'
connect.cp.vagrant ansible_host=127.0.0.1 ansible_port=2201 ansible_user='vagrant' ansible_ssh_private_key_file='/workspace/confluent/cp-ansible/vagrant/.vagrant/machines/connect.cp.vagrant/virtualbox/private_key'
[preflight]
zk.cp.vagrant
kafka.cp.vagrant
connect.cp.vagrant
[zookeeper]
zk.cp.vagrant
[broker]
kafka.cp.vagrant
[schema-registry]
kafka.cp.vagrant
[connect-distributed]
connect.cp.vagrant
Generated hosts file -- the IPs and hostnames are correct:
## vagrant-hostmanager-start id: aca1499c-a63f-4747-b39e-0e71ae289576
192.168.100.101 zk.cp.vagrant
192.168.100.102 kafka.cp.vagrant
192.168.100.103 connect.cp.vagrant
## vagrant-hostmanager-end
Ansible Playbook I want to run -- the plays correctly correspond to the groups in my inventory:
- hosts: preflight
  tasks:
    - import_role:
        name: confluent.preflight

- hosts: zookeeper
  tasks:
    - import_role:
        name: confluent.zookeeper

- hosts: broker
  tasks:
    - import_role:
        name: confluent.kafka-broker

- hosts: schema-registry
  tasks:
    - import_role:
        name: confluent.schema-registry

- hosts: connect-distributed
  tasks:
    - import_role:
        name: confluent.connect-distributed
For any code missing here, see Confluent :: cp-ansible.
The following is a sample of my Vagrantfile. (I made a fork, but I'm holding off on committing until I get this working...)
I know that this if index == machines.length - 1 should work according to the Vagrant documentation, and it does start all the machines and then run Ansible only against the last machine; it's just that all the tasks are executed on the first one for some reason.
machines = {
  "zk"      => {"ports" => {2181 => nil}, "groups" => ["preflight", "zookeeper"]},
  "kafka"   => {"memory" => 3072, "cpus" => 2, "ports" => {9092 => nil, 8081 => nil},
                "groups" => ["preflight", "broker", "schema-registry"],
                "vars"   => {"kafka" => "{\"broker\": {\"id\": 1}}"}},
  "connect" => {"ports" => {8083 => nil}, "groups" => ["preflight", "connect-distributed"]}
}

Vagrant.configure("2") do |config|
  if Vagrant.has_plugin?("vagrant-hostmanager")
    config.hostmanager.enabled = true
    config.hostmanager.manage_host = true
    config.hostmanager.ignore_private_ip = false
    config.hostmanager.include_offline = true
  end

  # More info on http://fgrehm.viewdocs.io/vagrant-cachier/usage
  if Vagrant.has_plugin?("vagrant-cachier")
    config.cache.scope = :box
  end

  if Vagrant.has_plugin?("vagrant-vbguest")
    config.vbguest.auto_update = false
  end

  config.vm.box = VAGRANT_BOX
  config.vm.box_check_update = false
  config.vm.synced_folder '.', '/vagrant', disabled: true

  machines.each_with_index do |(machine, machine_conf), index|
    hostname = getFqdn(machine.to_s)

    config.vm.define hostname do |v|
      v.vm.network "private_network", ip: "192.168.100.#{101 + index}"
      v.vm.hostname = hostname

      machine_conf['ports'].each do |guest_port, host_port|
        host_port = guest_port if host_port.nil?
        v.vm.network "forwarded_port", guest: guest_port, host: host_port
      end

      v.vm.provider "virtualbox" do |vb|
        vb.memory = machine_conf['memory'] || 1536 # Give overhead for 1G default java heaps
        vb.cpus = machine_conf['cpus'] || 1
      end

      if index == machines.length - 1
        v.vm.provision "ansible" do |ansible|
          ansible.compatibility_mode = '2.0'
          ansible.limit = 'all'
          ansible.playbook = "../plaintext/all.yml"
          ansible.become = true
          ansible.verbose = "vv"
          # ... defined host and group variables here
        end # Ansible provisioner
      end # if last machine
    end # machine configuration
  end # for each machine
end
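For completeness, here is a sketch of roughly what the elided host and group variable section could look like, reconstructed from the machines hash and the generated inventory above using the provisioner's groups and host_vars options; my actual fork may differ in the details.

# Sketch only (assumption -- my fork may differ): inside the "ansible"
# provisioner block above, build the groups and host vars from the
# machines hash defined at the top of the Vagrantfile.
ansible.groups = machines.each_with_object(Hash.new { |h, k| h[k] = [] }) do |(name, conf), groups|
  # Every group listed for a machine gets that machine's FQDN appended.
  conf["groups"].each { |group| groups[group] << getFqdn(name) }
end
ansible.host_vars = machines
  .select { |_, conf| conf.key?("vars") }               # only kafka defines "vars" here
  .map    { |name, conf| [getFqdn(name), conf["vars"]] }
  .to_h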
I set up an Ansible task like this:
- debug:
    msg: "FQDN: {{ansible_fqdn}}; Hostname: {{inventory_hostname}}; IPv4: {{ansible_default_ipv4.address}}"
With just that task, notice in the output below that ansible_fqdn is always zk.cp.vagrant, which lines up with the fact that only that VM is actually being provisioned by Ansible.
ok: [zk.cp.vagrant] => {
"msg": "FQDN: zk.cp.vagrant; Hostname: zk.cp.vagrant; IPv4: 10.0.2.15"
}
ok: [kafka.cp.vagrant] => {
"msg": "FQDN: zk.cp.vagrant; Hostname: kafka.cp.vagrant; IPv4: 10.0.2.15"
}
ok: [connect.cp.vagrant] => {
"msg": "FQDN: zk.cp.vagrant; Hostname: connect.cp.vagrant; IPv4: 10.0.2.15"
}
Update with a minimal example: hostname -f returns only one host's name across all the VMs, and I assume that's the command gather_facts runs to populate ansible_fqdn:
ansible all --private-key=~/.vagrant.d/insecure_private_key --inventory-file=/workspace/confluent/cp-ansible/vagrant/.vagrant/provisioners/ansible/inventory -a 'hostname -f' -f1
zk.cp.vagrant | SUCCESS | rc=0 >>
kafka.cp.vagrant
connect.cp.vagrant | SUCCESS | rc=0 >>
kafka.cp.vagrant
kafka.cp.vagrant | SUCCESS | rc=0 >>
kafka.cp.vagrant
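Every host comes back with the same name, which made me suspect the SSH connections themselves rather than fact gathering. One way to test that theory (assuming OpenSSH connection multiplexing is the culprit) is to override ssh_args for a single run so no ControlMaster socket is used; each host should then report its own name:

# Assumption: overriding ssh_args disables the shared ControlMaster
# socket, so each host should answer over its own fresh connection.
ANSIBLE_SSH_ARGS='-o ControlMaster=no' \
  ansible all --private-key=~/.vagrant.d/insecure_private_key \
  --inventory-file=/workspace/confluent/cp-ansible/vagrant/.vagrant/provisioners/ansible/inventory \
  -a 'hostname -f' -f1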
It turns out I can get around the problem by removing this section from my ansible.cfg:
[ssh_connection]
control_path = %(directory)s/%%h-%%r
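As far as I can tell, that control_path is the real culprit: every host in the generated inventory has ansible_host=127.0.0.1 and ansible_user='vagrant' and differs only by ansible_port, but the pattern above builds the ControlMaster socket name from just the host (%h) and remote user (%r). All three machines therefore resolve to the same control socket, and OpenSSH multiplexes every "new" connection over whichever VM's master connection was opened first. If you want to keep connection sharing, adding the port to the pattern should give each VM its own socket (my assumption of the fix, though it matches the %%h-%%p-%%r shape Ansible uses by default):

[ssh_connection]
# Include the port (%%p) so hosts that share an IP get separate sockets.
control_path = %(directory)s/%%h-%%p-%%r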