
Get differences and similarities between two dicts/lists


I have two configuration files that are essentially YAML lists of dicts.

config_one: [{"ip": "0.0.0.0/24", "id": 1, "name": "First"},{"ip": "0.0.0.2/24", "id": 2, "name": "Second"},{"ip": "0.0.0.3/24", "id": 3, "name": "Third"}]
config_two: [{"ip": "0.0.0.3/24", "id": 30, "name": "Third"},{"ip": "0.0.0.0/24", "id": 1,"name": "First"}, {"ip": "0.0.0.2/24", "id": 2, "name": "Second"}]

I would like to compare these two config files with each other and write/print the similarities and differences. To make it even more fun, let's say "config_one" is "the truth": if there is a difference, I would also like to print what the entry should be. Something along the lines of:

If there is a match:

"First config - 0.0.0.0/24 - id: 1 can be found in config_two and is in line with expected "First config - 0.0.0.0/24 - id: 1" entry found in config_one"

If there is a difference between the "as is" config_two and the "should be" config_one:

"Third config - 0.0.0.3/24 - id: 30 can be found in config_two but is not in line with expected "Third config - 0.0.0.3/24 - id: 3" entry found in config_one"

I tried playing around with some nested for loops but got stuck: I never found a way to address the keys and values in the second list without ending up in an "endless loop".

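    for i in config_one:
        for j in config_two:
            # Compares every entry with every other entry, so each
            # non-identical pair is reported as a mismatch.
            if i == j:
                print(i['name'] + i['ip'] + str(i['id']) + " matches " + j['name'] + j['ip'] + str(j['id']))
            else:
                print(i['name'] + i['ip'] + str(i['id']) + " does not match, it should be " + j['name'] + j['ip'] + str(j['id']))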
   

Any idea how I could tackle this?


Solution

  • If the configs fit into memory, a dictionary lookup makes this fairly easy. (I'm assuming the individual entries need to be matched up by a key such as "ip" or "name".)

    config_one = [{"ip": "0.0.0.0/24", "id": 1, "name": "First"},{"ip": "0.0.0.2/24", "id": 2, "name": "Second"},{"ip": "0.0.0.3/24", "id": 3, "name": "Third"}]
    config_two = [{"ip": "0.0.0.3/24", "id": 30, "name": "Third"},{"ip": "0.0.0.0/24", "id": 1,"name": "First"}, {"ip": "0.0.0.2/24", "id": 2, "name": "Second"}]
    
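    def compare_configs(config_one, config_two, key):
        """Compare config_two against config_one (the expected state), matching entries on `key`."""
        matches = []
        differences = []
        missing = []
        # Index config_two by the chosen key so each entry can be found with one dict access.
        lookup = {item[key]: item for item in config_two}
    
        for item in config_one:
            if item[key] in lookup:
                other = lookup[item[key]]
                if item == other:
                    # Same key and every other field agrees.
                    matches.append(item)
                else:
                    # Same key, but at least one other field differs.
                    differences.append((item, other))
            else:
                # No entry in config_two has this key.
                missing.append(item)
    
        return matches, differences, missing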
    
    matches, differences, missing = compare_configs(config_one, config_two, "ip")
    print(matches)
    print(differences)
    print(missing)
    

    This is the result:

    [{'ip': '0.0.0.0/24', 'id': 1, 'name': 'First'}, {'ip': '0.0.0.2/24', 'id': 2, 'name': 'Second'}]
    [({'ip': '0.0.0.3/24', 'id': 3, 'name': 'Third'}, {'ip': '0.0.0.3/24', 'id': 30, 'name': 'Third'})]
    []
    

    Here I create three lists: matches, differences, and missing.

    • matches contains the configs that are identical in both lists
    • differences contains (config_one entry, config_two entry) pairs that match on the key but differ in some other value
    • missing contains the configs in config_one that have no entry with the same key in config_two
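
To get from these three lists to the sentences asked for in the question, one option is a small formatting helper. The sketch below is only an illustration: `describe` and `format_report` are hypothetical names that are not part of the answer's code, the wording of the "missing" message is an assumption (the question gives no example for that case), and the sample lists are copied from the result printed above.

```python
def describe(entry):
    # Render one config entry the way the question's sample messages do.
    return f"{entry['name']} config - {entry['ip']} - id: {entry['id']}"


def format_report(matches, differences, missing):
    """Turn the three result lists into human-readable lines (hypothetical helper)."""
    lines = []
    for entry in matches:
        lines.append(
            f'{describe(entry)} can be found in config_two and is in line with '
            f'expected "{describe(entry)}" entry found in config_one'
        )
    # Each difference is an (expected, actual) pair: config_one first, config_two second.
    for expected, actual in differences:
        lines.append(
            f'{describe(actual)} can be found in config_two but is not in line with '
            f'expected "{describe(expected)}" entry found in config_one'
        )
    # Assumed wording: the question does not show a message for this case.
    for entry in missing:
        lines.append(
            f'{describe(entry)} is expected by config_one but cannot be found in config_two'
        )
    return lines


# Sample values copied from the result shown above.
matches = [
    {"ip": "0.0.0.0/24", "id": 1, "name": "First"},
    {"ip": "0.0.0.2/24", "id": 2, "name": "Second"},
]
differences = [
    (
        {"ip": "0.0.0.3/24", "id": 3, "name": "Third"},
        {"ip": "0.0.0.3/24", "id": 30, "name": "Third"},
    )
]
missing = []

for line in format_report(matches, differences, missing):
    print(line)
```

Keeping the comparison and the message formatting in separate functions means the wording can change without touching the matching logic.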