Search code examples
pythonpandasdataframe

Calculate % in Pandas pivot_table?


I have a table with data for who viewed a page and who clicked on it. The following code gets me the pivot table below:

# users who clicked, by hour and weekday
pct = df.pivot_table(columns=['weekday'],index=['hour'], values=['users_who_clicked','users_who_viewed'], aggfunc= sum, fill_value=0, margins=True)
pct

pivot table

I've been googling and searching here for a solid hour but I can't seem to find how to just display the rate: i.e. 'users_who_clicked' divided by 'users_who_viewed'. I don't need to see the other two series.

I feel like if I just create a new column across the whole data set dividing the two it won't aggregate correctly, so the aggregation needs to happen within the table itself. It's easy in Excel, just use a calculated field, so I imagine it must be fairly straightforward in pandas as well. Any thoughts?

Thanks!

p.s. Tried using a lambda function but got a "TypeError: unsupported operand type(s) for +: 'int' and 'str'" error. Tried putting the two tables directly in aggfunc and no luck.


Solution

  • Use DataFrame.xs for select by MultiIndex, rename first level and divide, last append to original DataFrame:

    d = {('users_who_clicked','a'): [5, 6, 7, 8], 
         ('users_who_clicked','b'): [9, 10, 11, 12], 
         ('users_who_viewed','a'): [4, 1, 3, 7], 
         ('users_who_viewed','b'): [1, 3, 7, 3]}
    
    df = pd.DataFrame(data=d)
      
    df1 = (df.xs('users_who_clicked', axis=1, level=0, drop_level=False)
             .rename(columns={'users_who_clicked':'%'}, level=0))
    df2 = (df.xs('users_who_viewed', axis=1, level=0, drop_level=False)
             .rename(columns={'users_who_viewed':'%'}, level=0))
    
    out = pd.concat([df, df1.div(df2)], axis=1)  
    print (out)
      users_who_clicked     users_who_viewed            %          
                      a   b                a  b         a         b
    0                 5   9                4  1  1.250000  9.000000
    1                 6  10                1  3  6.000000  3.333333
    2                 7  11                3  7  2.333333  1.571429
    3                 8  12                7  3  1.142857  4.000000