Stacked plot from pandas dataframe


I would like to create a stacked bar plot from the following dataframe:

   VALUE     COUNT  RECL_LCC  RECL_PI
0      1  15686114         3        1
1      2  27537963         1        1
2      3  23448904         1        2
3      4   1213184         1        3
4      5  14185448         3        2
5      6  13064600         3        3
6      7  27043180         2        2
7      8  11732405         2        1
8      9  14773871         2        3

The plot would have 2 bars, one for RECL_LCC and the other for RECL_PI. Each bar would have 3 sections corresponding to the unique values in RECL_LCC and RECL_PI (i.e. 1, 2, 3), and each section would show the summed COUNT for that value. So far, I have something like this:

df = df.convert_objects(convert_numeric=True)    
sub_df = df.groupby(['RECL_LCC','RECL_PI'])['COUNT'].sum().unstack()
sub_df.plot(kind='bar',stacked=True)

However, I get this plot:

[image: the plot produced by the code above]

Any idea on how to obtain 2 columns (RECL_LCC and RECL_PI) instead of these 3?


Solution

  • The problem is that the column dtypes are strings rather than numeric, so aggregation functions such as sum will not work on them. You can convert each offending column like so:

    df['col'] = df['col'].astype(int)
    

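    or call convert_objects on the whole df, assigning the result back because the call returns a new dataframe rather than modifying df in place:

    df = df.convert_objects(convert_numeric=True)

    Note that convert_objects is deprecated in newer pandas versions, where pd.to_numeric is the replacement.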
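
For the plot layout described in the question (one bar per column, three sections per bar), one possible approach is to sum COUNT separately for each column and stack the results. This is only a minimal sketch, not part of the original answer: the helper name stacked_counts is hypothetical, the data is copied from the question, and pd.to_numeric is used here in place of convert_objects.

```python
import pandas as pd

# Data copied from the question.
df = pd.DataFrame({
    "VALUE": [1, 2, 3, 4, 5, 6, 7, 8, 9],
    "COUNT": [15686114, 27537963, 23448904, 1213184, 14185448,
              13064600, 27043180, 11732405, 14773871],
    "RECL_LCC": [3, 1, 1, 1, 3, 3, 2, 2, 2],
    "RECL_PI": [1, 1, 2, 3, 2, 3, 2, 1, 3],
})

# Make sure every column is numeric; values that cannot be parsed become NaN.
df = df.apply(pd.to_numeric, errors="coerce")


def stacked_counts(frame, columns, value="COUNT"):
    """Hypothetical helper: sum `value` per class, separately for each column.

    Returns a table with one row per entry in `columns` (one bar each)
    and one column per class value (one stacked section each).
    """
    sums = {col: frame.groupby(col)[value].sum() for col in columns}
    return pd.DataFrame(sums).T


table = stacked_counts(df, ["RECL_LCC", "RECL_PI"])
print(table)

# Drawing the chart requires matplotlib; with it installed, this gives
# two bars, each split into sections for the class values 1, 2 and 3:
# table.plot(kind="bar", stacked=True)
```

Each row of table becomes one bar, so both bars reach the same total height (the sum of COUNT) but are split differently.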