
Frequency of Value Column Given a Count Column


A dataframe has two columns, ['Value', 'Count']. Value contains non-unique values, and Count contains the number of occurrences of each Value. I want to plot Value vs. the sum of Count (as a percentage of the total). Although this code works, I feel it doesn't use the power of pandas. What am I missing?

import pandas as pd

df = pd.DataFrame({'Value':[1,3,2,1],'Count':[5,2,1,4]})
gdf = df.groupby('Value')
sumdf = pd.DataFrame({'Value':k,'Sum':g['Count'].sum()} for k,g in gdf)
sumdf['Pct'] = sumdf['Sum'] / sumdf['Sum'].sum() * 100
sumdf.plot(x='Value',y='Pct',kind='bar',title='Frequency of Value')

Solution

  • Here's a one-liner:

    ax = (df.groupby('Value')['Count'].sum() / df['Count'].sum() * 100).plot.bar(title='Frequency of Value')
    

    Output: a bar chart titled 'Frequency of Value', with one bar per Value showing its percentage of the total Count.
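
To see what the one-liner computes, it may help to split it into its intermediate steps. The sketch below uses the sample data from the question; the names `sums` and `pct` are only illustrative and do not appear in the original code, and the plotting call is left as a comment so the snippet runs without a display.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({'Value': [1, 3, 2, 1], 'Count': [5, 2, 1, 4]})

# Step 1: total Count per distinct Value (a Series indexed by Value).
# `sums` is an illustrative name, not part of the original answer.
sums = df.groupby('Value')['Count'].sum()

# Step 2: convert each group total to a percentage of the overall total.
pct = sums / df['Count'].sum() * 100

print(pct.round(2))

# Step 3 (requires matplotlib): plot the percentages as a bar chart.
# ax = pct.plot.bar(title='Frequency of Value')
```

Because the groupby result is already a Series indexed by Value, there is no need to rebuild a DataFrame by looping over the groups; the division and the plot both work directly on that Series.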