Tags: python, python-3.x, dictionary-comprehension

Python3 Dictionary Comprehension


I'm having difficulty working out a dictionary comprehension.

I have a list of dictionaries, where each dictionary contains identical keys with different values:

  list_of_dictionaries = [{k1:v1, k2:v2}, {k1:v3, k2:v4}, {k1:v5, k2:v6}, ...]

I would like to have a single dictionary of lists, where each key has a value which is a list of those values found under that key in the list of dictionaries:

  dictionary_of_lists = {k1:[v1,v3,v5], k2:[v2,v4,v6], ...}

At the moment I'm creating this single, consolidated dictionary by manually entering the keys and using a list comprehension to fetch the values:

dictionary_of_lists = {
   k1:[i[k1] for i in list_of_dictionaries],
   k2:[i[k2] for i in list_of_dictionaries],
   ...
}

It's not so bad with a few keys, but with over twenty it's looking messy with repeated code. I'm struggling to formulate a dictionary comprehension which would achieve the same result. Something like "for each dictionary in this list, add the values corresponding to each key to a list represented by the same key in another dictionary"? I've tried the dict.update() method, which won't allow me to add the values to a list - it erases and 'updates' the value already there instead.


Solution

  • If you are allowed to use pandas, there is a much simpler solution.

    Build a DataFrame from the list of dicts, then convert each column back to a list:

    import pandas as pd
    list_of_dicts = [{'k1':'v1', 'k2':'v2'}, {'k1':'v3', 'k2':'v4'},
                     {'k1':'v5', 'k2':'v6'}, {'k1':'v7', 'k2':'v8'},
                     {'k1':'v9', 'k2':'v10'}]
    df = pd.DataFrame(list_of_dicts)
    k = {c:df[c].tolist() for c in df.columns}
    print(k)
    

    The output of this will be:

    {'k1': ['v1', 'v3', 'v5', 'v7', 'v9'], 'k2': ['v2', 'v4', 'v6', 'v8', 'v10']}
    

    With this approach you can add as many keys as you like and the code stays the same:

    import pandas as pd
    list_of_dicts = [{'k1': 'v1',  'k2': 'v2',  'k3': 'v3'},
                     {'k1': 'v4',  'k2': 'v5',  'k3': 'v6'},
                     {'k1': 'v7',  'k2': 'v8',  'k3': 'v9'},
                     {'k1': 'v10', 'k2': 'v11', 'k3': 'v12'},
                     {'k1': 'v13', 'k2': 'v14', 'k3': 'v15'}]
    df = pd.DataFrame(list_of_dicts)
    k = {c: df[c].tolist() for c in df.columns}
    print(k)
    

    This will result in:

    {'k1': ['v1', 'v4', 'v7', 'v10', 'v13'], 'k2': ['v2', 'v5', 'v8', 'v11', 'v14'], 'k3': ['v3', 'v6', 'v9', 'v12', 'v15']}
    

    One caveat: this assumes every dict has the same keys (k1, k2, k3). If some dicts have only (k1, k2) while others have (k1, k2, k3), pandas does not raise an error; it fills the missing entries with NaN, so the resulting lists will contain NaN placeholders instead of being shorter.
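
If pandas is not an option, or the dicts do not all share the same keys, the same result can probably be reached with the standard library alone. The sketch below is only an illustration: the sample data and the helper name `merge_ragged` are made up for this example and are not part of the original answer. The first part is the dictionary comprehension the question asks for, assuming every dict has the same keys as the first one. The second part uses `collections.defaultdict` for the case where keys differ.

```python
from collections import defaultdict

# Hypothetical sample data mirroring the shape described in the question.
list_of_dictionaries = [
    {"k1": "v1", "k2": "v2"},
    {"k1": "v3", "k2": "v4"},
    {"k1": "v5", "k2": "v6"},
]

# Uniform keys: take the keys from the first dict, then collect each key's
# value from every dict, preserving list order.
dictionary_of_lists = {
    key: [d[key] for d in list_of_dictionaries]
    for key in list_of_dictionaries[0]
}
print(dictionary_of_lists)
# {'k1': ['v1', 'v3', 'v5'], 'k2': ['v2', 'v4', 'v6']}


def merge_ragged(dicts):
    """Group values by key when the dicts do not all share the same keys.

    Hypothetical helper: a key that is missing from a dict is simply
    skipped, so the resulting lists may differ in length.
    """
    merged = defaultdict(list)
    for d in dicts:
        for key, value in d.items():
            merged[key].append(value)
    return dict(merged)


ragged = [{"k1": 1, "k2": 2}, {"k1": 3, "k2": 4, "k3": 5}]
print(merge_ragged(ragged))
# {'k1': [1, 3], 'k2': [2, 4], 'k3': [5]}
```

Note that the comprehension raises an IndexError on an empty list and a KeyError if a later dict lacks one of the first dict's keys, whereas `merge_ragged` tolerates both cases.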