Search code examples
jsonmergeansiblejq

merge 2 json arrays using ansible or jq


So I have 2 json array files one with some simple data (could be any number of servers):

    [
    {
    "playbook": "simplerun.yml",
    "server": "abc",
    "status": "success"
    },
    {
    "playbook": "simplerun.yml",
    "server": "def",
    "status": "success"
    }
    etc
    ]

and a master file (hundreds of thousands of servers) with enhanced data about the server (but is called Name)

    [
    {
    "name": "abc",
    "fqdn": "abc.com",
    "env": "UAT",
    "ip_address": "0.0.0.0",
    "owner": "John Doe"
    },
    {
    "name": "def",
    "fqdn": "def.com",
    "env": "PROD",
    "ip_address": "0.0.0.1",
    "owner": "Jane Doe"
    }
    etc
    ]

what I would like to have is

    [
    {
    "playbook": "simplerun.yml",
    "server": "abc",
    "status": "success",
    "fqdn": "abc.com",
    "env": "UAT",
    "ip_address": "0.0.0.0",
    "owner": "John Doe"
    },
    {
    "playbook": "simplerun.yml",
    "server": "def",
    "status": "success",
    "fqdn": "def.com",
    "env": "PROD",
    "ip_address": "0.0.0.1",
    "owner": "Jane Doe"
    }
    ]

I know it must be simple, but

Any help is greatly appreciated.

I have tried combine

     - name: set fact
       set_fact:
         key1: "server"
         key2: "name"
     
     - name: join data
       set_fact:
         joined_data: >- 
           {{ jsonfile1 | map ('combine',
              jsonfile2 |selectattr (key2,'eq', item[key1]|list|first|default ({}))
           }}
       loop: "{{ jsonfile1 }}"
       loop_control:
         loop_var: item

however that only produced the same values for all the entries in the simple data.


Solution

  • Read the files, for example

      names: "{{ lookup('file', 'names.yml') }}"
      pb_status: "{{ lookup('file', 'pb_status.yml') }}"
    

    gives

      names:
      - env: UAT
        fqdn: abc.com
        ip_address: 0.0.0.0
        name: abc
        owner: John Doe
      - env: PROD
        fqdn: def.com
        ip_address: 0.0.0.1
        name: def
        owner: Jane Doe
    
      pb_status:
      - playbook: simplerun.yml
        server: abc
        status: success
      - playbook: simplerun.yml
        server: def
        status: success
    

    There are more options:

    1. Get the list of servers
      servers: "{{ pb_status | map(attribute='server') }}"
    

    gives

      servers:
      - abc
      - def
    

    Select names in the servers' list

      names_sel: "{{ names | selectattr('name', 'in', servers) }}"
    

    gives the same items here in testing, but there are a lot more items in production

      names_sel:
      - env: UAT
        fqdn: abc.com
        ip_address: 0.0.0.0
        name: abc
        owner: John Doe
      - env: PROD
        fqdn: def.com
        ip_address: 0.0.0.1
        name: def
        owner: Jane Doe
    

    Rename the key server to name and merge the lists

      target:
          - {after: name, before: server}
      result: "{{ pb_status | community.general.replace_keys(target=target) |
                    community.general.lists_mergeby(names_sel, 'name') }}"
    

    gives what you want

      result:
      - env: UAT
        fqdn: abc.com
        ip_address: 0.0.0.0
        name: abc
        owner: John Doe
        playbook: simplerun.yml
        status: success
      - env: PROD
        fqdn: def.com
        ip_address: 0.0.0.1
        name: def
        owner: Jane Doe
        playbook: simplerun.yml
        status: success
    

    Example of a complete playbook for testing

    - hosts: localhost
    
      vars:
    
        names: "{{ lookup('file', 'names.yml') }}"
        pb_status: "{{ lookup('file', 'pb_status.yml') }}"
    
        servers: "{{ pb_status | map(attribute='server') }}"
        names_sel: "{{ names | selectattr('name', 'in', servers) }}"
        target:
          - {after: name, before: server}
        result: "{{ pb_status | community.general.replace_keys(target=target) |
                    community.general.lists_mergeby(names_sel, 'name') }}"
    
      tasks:
    
        - debug:
            var: names
        - debug:
            var: pb_status
        - debug:
            var: servers
        - debug:
            var: names_sel
        - debug:
            var: result
    

    1. Group the status items and create a dictionary if there might be more status items for the same server
      pb_status_group: "{{ pb_status | groupby('server') | community.general.dict }}"
    

    gives

      pb_status_group:
        abc:
        - playbook: simplerun.yml
          server: abc
          status: success
        def:
        - playbook: simplerun.yml
          server: def
          status: success
    

    Convert the list of names to a dictionary too. It is expected that the names are unique

      names_dict: "{{ dict(names | json_query('[].[name, @]')) }}"
    

    gives

      names_dict:
        abc:
          env: UAT
          fqdn: abc.com
          ip_address: 0.0.0.0
          name: abc
          owner: John Doe
        def:
          env: PROD
          fqdn: def.com
          ip_address: 0.0.0.1
          name: def
          owner: Jane Doe
    

    After you have converted the lists to dictionaries the search will be more efficient. Select names

      names_sel: "{{ pb_status_group | map('extract', names_dict) }}"
    

    gives the same items here in testing, but there are a lot more items in production

      names_sel:
      - env: UAT
        fqdn: abc.com
        ip_address: 0.0.0.0
        name: abc
        owner: John Doe
      - env: PROD
        fqdn: def.com
        ip_address: 0.0.0.1
        name: def
        owner: Jane Doe
    

    Convert the dictionary pb_status_group to a list and merge the lists by name

      result: "{{ pb_status_group | dict2items(key_name='name', value_name='status') |
                  community.general.lists_mergeby(names_sel, 'name') }}"
    

    gives what want

      result:
      - env: UAT
        fqdn: abc.com
        ip_address: 0.0.0.0
        name: abc
        owner: John Doe
        status:
        - playbook: simplerun.yml
          server: abc
          status: success
      - env: PROD
        fqdn: def.com
        ip_address: 0.0.0.1
        name: def
        owner: Jane Doe
        status:
        - playbook: simplerun.yml
          server: def
          status: success
    

    Example of a complete playbook for testing

    - hosts: localhost
    
      vars:
    
        names: "{{ lookup('file', 'names.yml') }}"
        pb_status: "{{ lookup('file', 'pb_status.yml') }}"
    
        pb_status_group: "{{ pb_status | groupby('server') | community.general.dict }}"
        names_dict: "{{ dict(names | json_query('[].[name, @]')) }}"
        names_sel: "{{ pb_status_group | map('extract', names_dict) }}"
        result: "{{ pb_status_group | dict2items(key_name='name', value_name='status') |
                    community.general.lists_mergeby(names_sel, 'name') }}"
    
      tasks:
    
        - debug:
            var: names
        - debug:
            var: pb_status
        - debug:
            var: pb_status_group
        - debug:
            var: names_dict
        - debug:
            var: names_sel
        - debug:
            var: result