
Difference between multiple elements in list with same string (Python 2.7)


This is a little confusing, so I will try my best to explain my goal. In a nutshell, I'm trying to look at the sublists within a list. Some of those sublists have the same starting element (sublist[0]), and I want to record the differences between each such sublist and the other sublists starting with the same element.

data = [['o1415', '1', '0', '1'], ['o1415', '0', '0', '0'], ['o1414', '0', '0', '0'], ['o1414', '1', '0', '0'], ['o1414', '0', '0', '0'], ['o1408', '0', '0', '1'], ['o1406', '0', '0', '0']]
D_changes = {}

Each sublist has 4 elements: the first is a name, and the 2nd, 3rd, and 4th are digits.

I'm trying to generate a dictionary of the form {name: [list, of, differences]}.

For example, data[0] and data[1] both have 'o1415' as their first element. Since they have the same string for the first element, I want to compare the rest of the two lists with each other. data[0] differs from data[1] at positions [1] and [3], so I want to add 'o1415': ['first', 'third'] to the empty dictionary D_changes.

Another example is 'o1414', which appears in data[2], data[3], and data[4]. Across these lists, only the element in the [1] position differs, so I'd like to add 'o1414': ['first'] to the empty dictionary above.

In the end, I want to obtain a dictionary with this type of content:

desired_changes = {'o1415':['first','third'],'o1414':['first'],'o1408':[],'o1406':[]}

Solution

  • I'll give you a direction more than a full answer.

    First, load up a dict to group like items for further processing; I'll use a defaultdict:

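    from collections import defaultdict

    # Maps each name to the list of its value rows (converted to ints below)
    d = defaultdict(list)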
    
    data = [['o1415', '1', '0', '1'], ['o1415', '0', '0', '0'], ['o1414', '0', '0', '0'], ['o1414', '1', '0', '0'], ['o1414', '0', '0', '0'], ['o1408', '0', '0', '1'], ['o1406', '0', '0', '0']]
    
    for sub in data:
        d[sub[0]].append([int(x) for x in sub[1:]])
    

    Then, for a given key, simply look at the zip of its values, e.g. for 'o1414':

    d['o1414']
    Out[58]: [[0, 0, 0], [1, 0, 0], [0, 0, 0]]
    
    list(zip(*d['o1414']))
    Out[59]: [(0, 1, 0), (0, 0, 0), (0, 0, 0)]
    

    Since every value here is 0 or 1, the values in a slot are all equal exactly when they are all 1 or all 0; otherwise at least one differs. So just do:

    [any(x) and not all(x) for x in zip(*d['o1414'])]
    Out[60]: [True, False, False]
    

    I particularly like the aesthetics of that - any(x) and not all(x). Python can be beautiful sometimes.
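
One caveat worth sketching: the any(x) and not all(x) check above relies on every value being 0 or 1. If a slot could hold other digits, a set-based comparison is one possible alternative. The helper name column_differs below is hypothetical and not part of the original answer.

```python
def column_differs(column):
    """Return True if the values in one slot are not all identical."""
    return len(set(column)) > 1

# Same rows as d['o1414'] in the answer above
rows = [[0, 0, 0], [1, 0, 0], [0, 0, 0]]
flags = [column_differs(col) for col in zip(*rows)]

# With non-binary values, any/all misses a real difference (2 vs 3),
# because both values are truthy; the set-based check catches it.
other_rows = [[2, 5], [3, 5]]
binary_style = [any(col) and not all(col) for col in zip(*other_rows)]
set_style = [column_differs(col) for col in zip(*other_rows)]
```

For the 0/1 data in the question both checks agree, so either one works there.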

    Anyway, True means that you have a differing value in that slot. I'll leave it up to you to do that for all your keys and to get it into the format that you want.
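
The remaining step can be sketched as follows. This is only one way to finish the approach above, assuming the values are always '0' or '1' and that the slot labels are 'first', 'second', and 'third' (the question only shows 'first' and 'third', so 'second' is an assumption). The names SLOT_NAMES and find_changes are hypothetical.

```python
from collections import defaultdict

data = [['o1415', '1', '0', '1'], ['o1415', '0', '0', '0'],
        ['o1414', '0', '0', '0'], ['o1414', '1', '0', '0'],
        ['o1414', '0', '0', '0'], ['o1408', '0', '0', '1'],
        ['o1406', '0', '0', '0']]

# Assumed labels for the three value slots (only 'first' and 'third'
# appear in the question).
SLOT_NAMES = ['first', 'second', 'third']

def find_changes(rows):
    """Map each name to the labels of the slots whose values differ."""
    grouped = defaultdict(list)
    for row in rows:
        # Group the value part of each row under its name, as ints
        grouped[row[0]].append([int(x) for x in row[1:]])

    changes = {}
    for name, value_rows in grouped.items():
        # zip(*value_rows) yields one tuple per slot; a slot differs when
        # it contains a mix of 0s and 1s (assumes 0/1 values only).
        changes[name] = [label
                         for label, column in zip(SLOT_NAMES, zip(*value_rows))
                         if any(column) and not all(column)]
    return changes

D_changes = find_changes(data)
```

A name that appears only once, such as 'o1408', has single-value slots, so nothing can differ and it gets an empty list.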