Suppose I have this pandas DataFrame:
         pC  Truth
0  0.601972      0
1  0.583300      0
2  0.595181      1
3  0.418910      1
4  0.691974      1
'pC' is the probability of 'Truth' being 1, and 'Truth' is a binary value. I want to plot a histogram of the probabilities in which each bin shows the proportion of 0s versus 1s.
I tried the following:
df[['pC','Truth']].plot(kind='hist',stacked=True)
It just treats 'Truth' as another set of values to bin between 0 and 1, rather than splitting each 'pC' bin by 'Truth'.
Reproducible example:
import numpy as np
import pandas as pd

shape = 1000
df_t = pd.DataFrame({'pC': np.random.rand(shape),
                     'Truth': np.random.choice([0, 1], size=shape)})
df_t['factor'] = pd.cut(df_t.pC, 5)  # split pC into 5 equal-width bins
How do I do this? Thanks
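I solved this with:
import numpy as np
import pandas as pd

shape = 1000
df_t = pd.DataFrame({'pC': np.random.rand(shape),
                     'Truth': np.random.choice([0, 1], size=shape)})
df_t['factor'] = pd.cut(df_t.pC, 5)  # split pC into 5 equal-width bins
# Count rows per (bin, Truth) pair; values='pC' names the column to count.
df_p = (df_t
        .pivot_table(index='factor', columns='Truth', values='pC',
                     aggfunc=len, fill_value=0)
        .reset_index())
# One stacked bar per bin: the 0 and 1 counts show the split within it.
df_p[['factor', 0, 1]].plot(kind='bar', stacked=True, x='factor');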
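The stacked bars above show raw counts per bin. If literal proportions are wanted instead, so that every bar sums to 1, one possible variant is a row-normalised cross-tabulation. This is only a sketch of an alternative, not the approach used above: the fixed seed, the `rng` generator and the name `props` are assumptions added for illustration.

```python
import numpy as np
import pandas as pd

# Same kind of synthetic data as in the question; the seed is an
# assumption added here so the numbers are repeatable.
rng = np.random.default_rng(0)
shape = 1000
df_t = pd.DataFrame({'pC': rng.random(shape),
                     'Truth': rng.integers(0, 2, size=shape)})
df_t['factor'] = pd.cut(df_t['pC'], 5)  # 5 equal-width bins of pC

# Row-normalised cross-tabulation: each row is one pC bin, and the two
# columns hold the share of Truth == 0 and Truth == 1 within that bin.
props = pd.crosstab(df_t['factor'], df_t['Truth'], normalize='index')
print(props)
```

Calling props.plot(kind='bar', stacked=True) on the result should then draw one bar per bin, each reaching 1, with the split between 0 and 1 visible inside it.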