python pandas dictionary ranking recommendation-engine

Pythonic way to efficiently create dictionaries from Pandas


I have a Pandas dataframe that contains the columns id, date_created, rank_1, rank_2, rank_3. Two rows of the dataframe are shown below.

id    date_created  rank_1            rank_2          rank_3
2223  3/3/21 3:26   www.google.com    www.yahoo.com   www.ford.com
1112  2/25/21 1:35  www.autoblog.com  www.motor1.com  www.webull.com

I am trying to add a new column to this df called rank_dict, which maps the rank_1 URL to 3, the rank_2 URL to 2, and the rank_3 URL to 1. So the ideal result would look like this:

id    date_created  rank_1            rank_2          rank_3          rank_dict
2223  3/3/21 3:26   www.google.com    www.yahoo.com   www.ford.com    {www.google.com:3, www.yahoo.com:2, www.ford.com:1}
1112  2/25/21 1:35  www.autoblog.com  www.motor1.com  www.webull.com  {www.autoblog.com:3, www.motor1.com:2, www.webull.com:1}

I know how to do this if it's not a Pandas df. For example, if I have these lists of keys and values:

keys = ['www.google.com','www.yahoo.com','www.ford.com']

values = [3, 2, 1]

I can do res_dict = dict(zip(keys, values)) to turn it into the dict: {'www.google.com': 3, 'www.yahoo.com': 2, 'www.ford.com': 1}.

But I couldn't figure out an elegant way to perform this dictionary creation in a Pandas df. Could anyone help me?


Solution

  • One way is to apply a function row-wise (axis=1) and use enumerate to get each URL's position k within the row, so that 3-k gives its score:

    df['rank_dict'] = (df.filter(like='rank_')
                         .apply(lambda x: {v:3-k for k,v in enumerate(x)}, axis=1)
                      )
    

    Output:

         id   date_created             rank_1           rank_2          rank_3                                          rank_dict
    0  2223   3/3/21 3:26     www.google.com    www.yahoo.com     www.ford.com  {'www.google.com': 3, 'www.yahoo.com': 2, 'w...'
    1  1112  2/25/21 1:35   www.autoblog.com   www.motor1.com   www.webull.com  {'www.autoblog.com': 3, 'www.motor1.com': 2,...'
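
For anyone who wants to run this end to end, here is a minimal sketch of an alternative that reuses the dict(zip(keys, values)) idea from the question in a list comprehension instead of apply. The sample df is rebuilt by hand from the two rows in the question, and the names rank_cols and scores are made up for illustration. The regex filter is an assumption about the column names (rank_ followed by digits); it is used because like='rank_' would also match rank_dict if the code is run a second time. Row-wise apply tends to be slow on large frames, so this may be faster, but that is worth timing on your own data.

```python
import pandas as pd

# Sample data rebuilt from the two rows shown in the question.
df = pd.DataFrame({
    "id": [2223, 1112],
    "date_created": ["3/3/21 3:26", "2/25/21 1:35"],
    "rank_1": ["www.google.com", "www.autoblog.com"],
    "rank_2": ["www.yahoo.com", "www.motor1.com"],
    "rank_3": ["www.ford.com", "www.webull.com"],
})

# Assumption: rank columns are named rank_<number>. The anchored regex
# keeps rank_dict out of the selection if this code is re-run.
rank_cols = df.filter(regex=r"^rank_\d+$").columns.tolist()

# Scores count down from the number of rank columns: 3, 2, 1 here.
scores = range(len(rank_cols), 0, -1)

# Build one dict per row by zipping that row's URLs with the scores.
df["rank_dict"] = [
    dict(zip(urls, scores))
    for urls in df[rank_cols].itertuples(index=False, name=None)
]
```

For the two sample rows, df['rank_dict'] holds the same dictionaries as the ideal result in the question. Deriving the scores from len(rank_cols) rather than hard-coding 3 means the same code should still work if more rank columns are added.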