I would like to create a stacked bar plot from the following dataframe:
VALUE COUNT RECL_LCC RECL_PI
0 1 15686114 3 1
1 2 27537963 1 1
2 3 23448904 1 2
3 4 1213184 1 3
4 5 14185448 3 2
5 6 13064600 3 3
6 7 27043180 2 2
7 8 11732405 2 1
8 9 14773871 2 3
There would be 2 bars in the plot. One for RECL_LCC and other for RECL_PI. There would be 3 sections in each bar corresponding to the unique values in RECL_LCC and RECL_PI i.e 1,2,3 and would sum up the COUNT for each section. So far, I have something like this:
df = df.convert_objects(convert_numeric=True)
sub_df = df.groupby(['RECL_LCC','RECL_PI'])['COUNT'].sum().unstack()
sub_df.plot(kind='bar',stacked=True)
However, I get this plot:
Any idea on how to obtain 2 columns (RECL_LCC and RECL_PI) instead of these 3?
So your problem was that the dtypes were not numeric so no aggregation function will work as they were strings, so you can convert each offending column like so:
df['col'] = df['col'].astype(int)
or just call convert_objects
on the df:
df.convert_objects(convert_numeric=True)