How to gather facts about disks using Ansible


I am trying to write an Ansible playbook that identifies newly added disks on a RHEL machine. The plan is to run the playbook and cache the current disks as a fact before creating the new disks. After the new disks are created, the same playbook would be run again to compute the difference between the disks present before and after.

For example, lsblk initially returns the following:

NAME              SIZE  TYPE
sda               100G  disk
├─sda1              1G  part
└─sda2             99G  part
  ├─rhel-root      50G  lvm
  ├─rhel-swap     7.9G  lvm
  └─rhel-home    41.1G  lvm
sr0              1024M  rom

After adding 8 new disks, lsblk returns:

NAME              SIZE  TYPE
sda               100G  disk
├─sda1              1G  part
└─sda2             99G  part
  ├─rhel-root      50G  lvm
  ├─rhel-swap     7.9G  lvm
  └─rhel-home    41.1G  lvm
sdb              18.6G  disk
sdc              18.6G  disk
sdd              18.6G  disk
sde              18.6G  disk
sdf              18.6G  disk
sdg              18.6G  disk
sdh              18.6G  disk
sdi              18.6G  disk
sr0              1024M  rom

Ideally I would be able to gather an initial list of disks of the form:

['sda']

and after creating the disks gather another list of disks of the form:

['sda', 'sdb', 'sdc', 'sdd', 'sde', 'sdf', 'sdg', 'sdh', 'sdi']

Computing the difference between the two lists would yield:

['sdb', 'sdc', 'sdd', 'sde', 'sdf', 'sdg', 'sdh', 'sdi']

which are the 8 newly created disks.

I am trying to avoid using a shell or command module call if possible.


Solution

  • This information is automatically gathered via Ansible's fact gathering mechanism. See Variables discovered from systems: Facts.

    For example:

    #!/usr/bin/env ansible-playbook
    - name: Let's look at some disks
      hosts: localhost
      become: false
      gather_facts: true
      tasks:
      - name: Output disk information
        debug:
          var: hostvars[inventory_hostname].ansible_devices
    

    If we instead use the gather_subset option of the setup module, we can speed up fact gathering by collecting only information about system hardware.

    We can then combine this with the Python keys() method and the Jinja2 list filter to produce your desired output.

    #!/usr/bin/env ansible-playbook
    - name: Let's look at some disks
      hosts: localhost
      become: false
      gather_facts: false
      tasks:
      - name: Collect only facts about hardware
        setup:
          gather_subset:
          - hardware
    
      - name: Output disks
        debug:
          var: hostvars[inventory_hostname].ansible_devices.keys() | list
    

    It is also possible to configure which facts to gather in the Ansible configuration file ansible.cfg using the gather_subset key in the [defaults] section.

    If you want to filter out certain disk types, the easiest way is to use map('regex_search', '*search string*') to extract the values you want. You can then remove the nulls via select('string').

    For example with disks of the form sd*:

    #!/usr/bin/env ansible-playbook
    - name: Let's look at some disks
      hosts: localhost
      become: false
      gather_facts: false
      tasks:
      - name: Collect only facts about hardware
        setup:
          gather_subset:
          - hardware
    
      - name: Output disks
        debug:
          var: hostvars[inventory_hostname].ansible_devices.keys() | map('regex_search', 'sd.*') | select('string') | list
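
    To get from the filtered list to the newly added disks described in the question, the list could be compared against the one recorded earlier. The following is only a sketch: disks_before is a hypothetical variable standing in for however the first run's result was stored, and it uses the match test and the difference filter in place of regex_search.

    ```yaml
    #!/usr/bin/env ansible-playbook
    - name: Sketch of comparing disk lists
      hosts: localhost
      become: false
      gather_facts: false
      vars:
        # Hypothetical: the disk list recorded on the earlier run
        disks_before: ['sda']
      tasks:
      - name: Collect only facts about hardware
        setup:
          gather_subset:
          - hardware

      - name: Output disks that were not present on the earlier run
        debug:
          # Keep device names starting with "sd", then drop the known ones
          msg: "{{ ansible_devices.keys() | select('match', 'sd') | list | difference(disks_before) | sort }}"
    ```

    The final sort is there because difference does not promise any particular ordering. With the lsblk output from the question, this should leave only the eight new sd* devices, since entries such as sr0 are removed by the match test before the comparison.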