I have a dataframe that summarizes quantities and values for different categories. I need to visualize how many categories fall into each range of quantities, and the total value of the categories in each range.
Sample dataframe to use:
import pandas as pd

df = pd.DataFrame({'cat': ['A','B','C','D','E','F','G','H','I','J'],
                   'count': [5,10,50,20,3,18,28,93,42,31],
                   'value': [100,245,890,510,85,690,730,2470,1870,1180],
                   })
I created the histogram for counts using this:
df.plot(kind='hist',y='count',bins=[0,20,40,60,80,100])
This shows how the categories ('cat') are distributed across the classes (bins) of the 'count' variable.
Now, for each class, I need the total of 'value' shown on the same chart: either as a number against each histogram bar, or as a line on a secondary y-axis on the right of the same axes.
This would let me show that categories with a count of (say) 0-20 have earned a total value of 1120 [value(A+B+E+F) = 100+245+85+690].
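As a quick check of that per-class total, here is a minimal sketch that computes the number of categories and the summed 'value' for each class. It assumes the same bin edges as the histogram call and uses right=False so that each bin is left-closed like the histogram's bins (a count of exactly 20 then lands in 20-40, not 0-20). The names edges, summary, n_cats and total_value are made up for this illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'cat': ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J'],
    'count': [5, 10, 50, 20, 3, 18, 28, 93, 42, 31],
    'value': [100, 245, 890, 510, 85, 690, 730, 2470, 1870, 1180],
})

# Same edges as the histogram; right=False gives [0, 20), [20, 40), ...
edges = [0, 20, 40, 60, 80, 100]
df['bin'] = pd.cut(df['count'], bins=edges, right=False)

# observed=False keeps empty bins (here [60, 80)) in the result
summary = df.groupby('bin', observed=False).agg(
    n_cats=('cat', 'count'),        # how many categories fall in the bin
    total_value=('value', 'sum'),   # summed 'value' of those categories
)
print(summary)
# first row, [0, 20): 4 categories (A, B, E, F), total value 1120
```

One difference to keep in mind: the histogram's last bin also includes its right edge (100), while the last interval here does not. That does not matter for this sample, since no count equals 100.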
Also, please suggest whether some chart other than a histogram would show this better.
I used the pandas.cut() method to create the bins manually and generated a second dataframe that is an aggregate of the first one.
This is the closest I could get, but it still does not give a clear visualization of what I want to show.
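import matplotlib.pyplot as plt

# bin edges 0,10,...,60; count=93 (cat H) is outside every bin and becomes NaN
df['Bins'] = pd.cut(df['count'],bins=range(0,70,10))
# per bin: number of categories and summed value
df1 = df.groupby('Bins').agg({'Bins':'count','value':'sum'})
df1.plot(kind='bar',subplots=True,figsize=(15,8))
plt.show()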
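One possible way to get both quantities onto a single chart is sketched below: a bar chart of the pre-aggregated category counts, with the summed 'value' written above each bar and also drawn as a line on a secondary y-axis created with twinx(). This is only a sketch under assumptions: it reuses the histogram's bin edges with left-closed bins, and the names edges, labels, summary, n_cats and total_value are hypothetical. In practice either the annotations or the line would probably be enough on its own.

```python
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({
    'cat': ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J'],
    'count': [5, 10, 50, 20, 3, 18, 28, 93, 42, 31],
    'value': [100, 245, 890, 510, 85, 690, 730, 2470, 1870, 1180],
})

# Left-closed bins with readable labels such as "[0, 20)"
edges = [0, 20, 40, 60, 80, 100]
labels = [f'[{lo}, {hi})' for lo, hi in zip(edges[:-1], edges[1:])]
df['bin'] = pd.cut(df['count'], bins=edges, right=False, labels=labels)

# Aggregate once; observed=False keeps the empty bin in the result
summary = df.groupby('bin', observed=False).agg(
    n_cats=('cat', 'count'),
    total_value=('value', 'sum'),
)

x = list(range(len(summary)))  # numeric bar positions shared by both axes
fig, ax = plt.subplots(figsize=(8, 5))
bars = ax.bar(x, summary['n_cats'], color='steelblue')
ax.set_xticks(x)
ax.set_xticklabels(labels)
ax.set_xlabel("range of 'count'")
ax.set_ylabel('number of categories')

# Write the summed value above each bar
for rect, total in zip(bars, summary['total_value']):
    ax.annotate(str(total),
                (rect.get_x() + rect.get_width() / 2, rect.get_height()),
                ha='center', va='bottom')

# Secondary y-axis on the right for the summed value
ax2 = ax.twinx()
ax2.plot(x, summary['total_value'], color='darkorange', marker='o')
ax2.set_ylabel("total 'value'")
fig.tight_layout()
```

Call plt.show() (or fig.savefig(...)) afterwards to display or save the figure. Plotting the aggregated frame as bars, instead of using kind='hist' on the raw data, keeps the bars and the per-bin totals based on exactly the same bins.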