Transform a Counter object into a Pandas DataFrame


I used Counter on a list to compute this variable:

final = Counter(event_container)

print final gives:

Counter({'fb_view_listing': 76, 'fb_homescreen': 63, 'rt_view_listing': 50, 'rt_home_start_app': 46, 'fb_view_wishlist': 39, 'fb_view_product': 37, 'fb_search': 29, 'rt_view_product': 23, 'fb_view_cart': 22, 'rt_search': 12, 'rt_view_cart': 12, 'add_to_cart': 2, 'create_campaign': 1, 'fb_connect': 1, 'sale': 1, 'guest_sale': 1, 'remove_from_cart': 1, 'rt_transaction_confirmation': 1, 'login': 1})

Now I want to convert final into a Pandas DataFrame, but when I run:

final_df = pd.DataFrame(final)

I get an error.

I guess final is not a proper dictionary, so how can I convert final to a dictionary? Or is there another way to convert final to a DataFrame?


Solution

  • You can construct the DataFrame with from_dict, passing orient='index' so that each key becomes a row, and then call reset_index to get a two-column df:

    In [40]:
    from collections import Counter
    import pandas as pd
    d = Counter({'fb_view_listing': 76, 'fb_homescreen': 63, 'rt_view_listing': 50, 'rt_home_start_app': 46, 'fb_view_wishlist': 39, 'fb_view_product': 37, 'fb_search': 29, 'rt_view_product': 23, 'fb_view_cart': 22, 'rt_search': 12, 'rt_view_cart': 12, 'add_to_cart': 2, 'create_campaign': 1, 'fb_connect': 1, 'sale': 1, 'guest_sale': 1, 'remove_from_cart': 1, 'rt_transaction_confirmation': 1, 'login': 1})
    df = pd.DataFrame.from_dict(d, orient='index').reset_index()
    df
    
    Out[40]:
                              index   0
    0                         login   1
    1   rt_transaction_confirmation   1
    2                  fb_view_cart  22
    3                    fb_connect   1
    4               rt_view_product  23
    5                     fb_search  29
    6                          sale   1
    7               fb_view_listing  76
    8                   add_to_cart   2
    9                  rt_view_cart  12
    10                fb_homescreen  63
    11              fb_view_product  37
    12            rt_home_start_app  46
    13             fb_view_wishlist  39
    14              create_campaign   1
    15                    rt_search  12
    16                   guest_sale   1
    17             remove_from_cart   1
    18              rt_view_listing  50
    

    You can rename the columns to something more meaningful:

    In [43]:
    df = df.rename(columns={'index':'event', 0:'count'})
    df
    
    Out[43]:
                              event  count
    0                         login      1
    1   rt_transaction_confirmation      1
    2                  fb_view_cart     22
    3                    fb_connect      1
    4               rt_view_product     23
    5                     fb_search     29
    6                          sale      1
    7               fb_view_listing     76
    8                   add_to_cart      2
    9                  rt_view_cart     12
    10                fb_homescreen     63
    11              fb_view_product     37
    12            rt_home_start_app     46
    13             fb_view_wishlist     39
    14              create_campaign      1
    15                    rt_search     12
    16                   guest_sale      1
    17             remove_from_cart      1
    18              rt_view_listing     50
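
As an alternative to the from_dict approach above, here is a minimal sketch that builds the DataFrame from Counter.most_common(), which gives the rows sorted by count and lets you name the columns in one step. It is not part of the original answer. The data is a small subset of the counts from the question, and the column names 'event' and 'count' simply mirror the rename shown above.

```python
from collections import Counter

import pandas as pd

# A small subset of the counts from the question, for illustration only.
final = Counter({
    'fb_view_listing': 76,
    'fb_homescreen': 63,
    'add_to_cart': 2,
    'login': 1,
})

# most_common() returns a list of (key, count) tuples, highest count first,
# so each tuple becomes one row and the columns can be named directly.
df = pd.DataFrame(final.most_common(), columns=['event', 'count'])
print(df)
```

This skips the reset_index and rename steps, at the cost of fixing the row order to descending count. The error in the question most likely comes from pandas receiving a dict whose values are all scalars without an index, which is why the Counter has to be reshaped into rows first.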