Ansible - splitting YAML-formatted list of vars across hosts.yml and group_vars


I have the following directory structure:

├── ansible.cfg
├── hosts.yml
├── playbook.yml
├── group_vars
│   ├── all.yml
│   └── vm_dns.yml
└── roles
    └── pihole
        ├── handlers
        │   └── main.yml
        └── tasks
            └── main.yml

In ansible.cfg I simply have:

[defaults]
inventory = ./hosts.yml

In group_vars/all.yml I have some generic settings:

---
aptcachetime: 3600
locale: "en_GB.UTF-8"
timezone: "Europe/Paris"

And in hosts.yml I setup my PiHole VMs:

---
all:
  vars:
    ansible_python_interpreter: /usr/bin/python3
vm_dns:
  vars:
    dns_server: true
  hosts:
    vmb-dns:
      pihole:
        dns: 
          - "185.228.168.10"
          - "185.228.169.11"
        network:
          ipv4: "192.168.2.4/24"
          interface: eth0
    vmk-dns: 
      pihole:
        dns: 
          - "185.228.168.10"
          - "185.228.169.11"
        network:
          ipv4: "192.168.3.4/24"
          interface: eth0

At this point, I've not attempted to move any vars to group_vars, and everything works.
Now, I felt I could make the hosts file more readable by breaking out the settings that are the same for all vm_dns hosts into a group_vars file. So I removed all the dns and interface lines from hosts.yml and put them in a group_vars/vm_dns.yml file, like this:

---
pihole:
  dns: 
    - "185.228.168.10"
    - "185.228.169.11"
  network:
    interface: eth0

At this point, hosts.yml contains:

---
all:
  vars:
    ansible_python_interpreter: /usr/bin/python3
vm_dns:
  vars:
    dns_server: true
  hosts:
    vmb-dns:
      pihole:
        network:
          ipv4: "192.168.2.4/24"
    vmk-dns: 
      pihole:
        network:
          ipv4: "192.168.3.4/24"

But when I now run the playbook, as soon as it reaches a task that uses one of the vars that were moved from hosts.yml to group_vars/vm_dns.yml, Ansible fails with AnsibleUndefinedVariable: dict object has no attribute ....

I'm not really sure whether I'm simply misunderstanding the "Ansible way", or whether what I'm trying to do (essentially having different parts of the same list split across hosts and group_vars, I suppose) is just not doable. I thought the "flattening" that Ansible does was supposed to handle this, but it seems Ansible is not incorporating the vars defined in group_vars/vm_dns.yml at all.

I've read the docs on the subject, and found some almost-related posts, but found none demonstrating YAML-formatted lists used across hosts and group_vars in this manner.

Edit: other SO questions and GitHub issues that are actually related to this question:
In Ansible, how to combine variables from separate files into one array?
https://github.com/ansible/ansible/issues/58120
https://docs.ansible.com/ansible/latest/reference_appendices/config.html#default-hash-behaviour


Solution

  • Since you keep a definition of the pihole var at host level in your inventory, that definition wins by default and replaces the one made at group level. See the variable precedence documentation. So when you later try to access e.g. pihole.dns or pihole.network.interface, those keys no longer exist and ansible raises the above error.
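
    To make the replacement concrete, here is roughly what vmb-dns ends up with under the default behavior, given the group_vars/vm_dns.yml and hosts.yml from the question. This is a sketch of the resolved value, not actual ansible output:

    ```yaml
    # Effective value of pihole for vmb-dns: the host-level dict replaced
    # the group-level one as a whole, so dns and network.interface are gone.
    pihole:
      network:
        ipv4: "192.168.2.4/24"
    ```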

    This is the default behavior in ansible: a variable is replaced as a whole by the definition with the highest precedence. You can change this behavior for dicts by setting hash_behaviour=merge in ansible.cfg.

    My personal experiments with this setting were not really satisfactory: it behaved correctly with my own playbooks/roles that were written specifically for it, but started to produce hard-to-trace bugs when including third-party contributions (playbook snippets, roles, custom modules...). So I definitely don't recommend it. Moreover, this setting was deprecated in ansible 2.10, with removal announced for ansible 2.14. If you still want to use it, you should keep the scope of the setting as narrow as possible and certainly not set it at a global level (i.e. surely not in /etc/ansible/ansible.cfg).

    What I generally use nowadays to solve this kind of problem:

    1. Define a variable for each host/group/whatever containing only the specific information. In your case, for your host:
      ---
      pihole_host:
        network:
          ipv4: "192.168.2.4/24"
      
    2. Define the defaults for those settings somewhere. In your case, for your group:
      ---
      pihole_defaults:
        dns: 
          - "185.228.168.10"
          - "185.228.169.11"
        network:
          interface: eth0
      
      (Note that you can define those defaults at different levels, taking advantage of the above order of precedence for vars.)
    3. At a global level (I generally put this in group_vars/all.yml), define your var as the combination of the defaults and the specifics, making sure each part always defaults to an empty dict:
      ---
      # Calculate pihole from group defaults and host specific
      pihole: >-
        {{
          (pihole_defaults | default({}))
          | combine((pihole_host | default({})), recursive=true)
        }}
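
    Putting the three steps together for the inventory from the question, a minimal sketch of hosts.yml could look like this. The names pihole_host and pihole_defaults are only the illustrative names used above, and pihole_defaults is assumed to live in group_vars/vm_dns.yml:

    ```yaml
    ---
    # hosts.yml: each host only carries what is specific to it
    all:
      vars:
        ansible_python_interpreter: /usr/bin/python3
    vm_dns:
      vars:
        dns_server: true
      hosts:
        vmb-dns:
          pihole_host:
            network:
              ipv4: "192.168.2.4/24"
        vmk-dns:
          pihole_host:
            network:
              ipv4: "192.168.3.4/24"
    ```

    With the combine expression in group_vars/all.yml, pihole for vmb-dns should then resolve to something like the following. Note that recursive=true merges nested dicts, while lists such as dns are replaced as a whole by default rather than appended:

    ```yaml
    # Expected combined value for vmb-dns (group defaults + host specifics)
    pihole:
      dns:
        - "185.228.168.10"
        - "185.228.169.11"
      network:
        interface: eth0
        ipv4: "192.168.2.4/24"
    ```

    Tasks in the role can keep referring to pihole.dns or pihole.network.ipv4 as before, since only the place where pihole is assembled has changed.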