python python-3.x pandas collections defaultdict

Updating the Key-Value pairs in defaultdict


My dataframe is generated by the following code:

import pandas as pd

newCols = ['Book-1', 'Book-2', 'Similarity Score']

l1 = ['b1', 'b1', 'b2']
l2 = ['b2', 'b3', 'b3']
score1 = [0.95, 0.87, 0.84]

duplicateProductList = pd.DataFrame(columns=newCols)

duplicateProductList['Book-1'] = l1
duplicateProductList['Book-2'] = l2
duplicateProductList['Similarity Score'] = score1

print(duplicateProductList)

I generated a dictionary from the Pandas dataframe duplicateProductList (built above), using the following code:

from collections import defaultdict    

new_dict = {}

my_list = [(i,[a,b]) for i, a,b in zip(duplicateProductList['Book-1'], duplicateProductList['Book-2'], duplicateProductList['Similarity Score'])]
for (key, value) in my_list:
    if key in new_dict:
        new_dict[key].append(value)
    else:
        new_dict[key] = [value]

print(new_dict)

The above code snippet yields the following dictionary:

{'b1':[['b2', 0.95], ['b3', 0.87]], 'b2':[['b3', 0.84]]}

Instead, I want to yield the following dictionary:

{'b1':[['b2', 0.95], ['b3', 0.87]], 'b2':[['b1', 0.95],['b3', 0.84]], 'b3':[['b1', 0.87],['b2', 0.84]]}

Could someone help me modify the code above so that it yields this dictionary, with each pair recorded under both books?


Solution

    >>> import collections
    >>> from pprint import pprint
    >>>
    >>> df  # same data as duplicateProductList in the question
      Book-1 Book-2  Similarity Score
    0     b1     b2              0.95
    1     b1     b3              0.87
    2     b2     b3              0.84
    >>>
    >>> d = collections.defaultdict(list)
    >>> for row in df.itertuples(index=False):
    ...     a, b, c = row
    ...     d[a].append((b, c))  # record b under key a
    ...     d[b].append((a, c))  # and a under key b
    ...
    >>> pprint(d)
    defaultdict(<class 'list'>,
                {'b1': [('b2', 0.95), ('b3', 0.87)],
                 'b2': [('b1', 0.95), ('b3', 0.84)],
                 'b3': [('b1', 0.87), ('b2', 0.84)]})
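The session above stores tuples and leaves the result as a defaultdict. If the exact shape from the question is wanted (lists as values, a plain dict as the result), the same idea can be sketched as a standalone script. The helper name build_similarity_dict is hypothetical and not part of the original answer, and the sketch assumes the three column names used in the question.

```python
from collections import defaultdict

import pandas as pd

# Same sample data as in the question.
duplicateProductList = pd.DataFrame({
    'Book-1': ['b1', 'b1', 'b2'],
    'Book-2': ['b2', 'b3', 'b3'],
    'Similarity Score': [0.95, 0.87, 0.84],
})


def build_similarity_dict(df):
    """Hypothetical helper: map each book to [other_book, score] pairs.

    Every row is recorded twice, once under each of its two books,
    so books that only appear in 'Book-2' also become keys.
    """
    result = defaultdict(list)
    for a, b, score in zip(df['Book-1'], df['Book-2'], df['Similarity Score']):
        result[a].append([b, score])
        result[b].append([a, score])
    # Convert to a plain dict so missing keys raise KeyError again.
    return dict(result)


new_dict = build_similarity_dict(duplicateProductList)
print(new_dict)
```

For the sample data, new_dict should compare equal to the dictionary requested in the question. Converting back to a plain dict at the end is optional; it only stops later lookups of unknown books from silently creating empty entries.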