Tags: python, pandas, bar-chart, seaborn, stacked-chart

Barplot from a dataframe using a column to set the bar colors


I have a dataframe like the following (only a subset is shown):

    Species     Pathway        Number of Gene Families
0   Glovio      ABC                    0.5
1   Glovio      ABC/Synthase           1.0
2   Glovio      Synthase               0.0
3   Glovio      Wzy                   10.0
4   Glovio      Wzy/ABC                0.0
5   n2          ABC                    2.0
6   n2          ABC/Synthase           0.0
7   n2          Synthase               13.0
8   n2          Wzy                    7.0
9   n2          Wzy/ABC                0.0
10  Glokil      ABC                    2.0
11  Glokil      ABC/Synthase           1.0
12  Glokil      Synthase               0.0
13  Glokil      Wzy                    4.0
14  Glokil      Wzy/ABC                0.0

I want to plot a stacked bar plot with one bar per species on the x-axis. The y-axis should show the Number of Gene Families, with each bar's segments colour-coded by Pathway.

I have tried simple things, such as:

df[['Pathway']].plot(kind='bar', stacked=True)

But I get an error stating that:

Empty 'DataFrame': no numeric data to plot

Any ideas?

Thank you!


Solution

  • I would reshape the data so that each Pathway becomes its own numeric column, using set_index().unstack(), and then plot that:

    (df.set_index(['Species','Pathway'])
       ['Number of Gene Families']
       .unstack('Pathway')
       .plot.bar(stacked=True)
    )
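For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuilds the sample rows from the question by hand. The variable names (`species`, `pathways`, `counts`, `wide`), the `Agg` backend and the final `reindex` are my own additions for illustration, not part of the original answer.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so this runs without a display
import pandas as pd

# Sample data copied from the question (long format: one row per Species/Pathway pair).
species = ["Glovio", "n2", "Glokil"]
pathways = ["ABC", "ABC/Synthase", "Synthase", "Wzy", "Wzy/ABC"]
counts = [0.5, 1.0, 0.0, 10.0, 0.0,   # Glovio
          2.0, 0.0, 13.0, 7.0, 0.0,   # n2
          2.0, 1.0, 0.0, 4.0, 0.0]    # Glokil

df = pd.DataFrame({
    "Species": [s for s in species for _ in pathways],
    "Pathway": pathways * len(species),
    "Number of Gene Families": counts,
})

# Reshape to wide format: one row per Species, one numeric column per Pathway.
wide = (df.set_index(["Species", "Pathway"])["Number of Gene Families"]
          .unstack("Pathway"))

# Optional: keep the bars in the order the species appear in the question,
# in case the reshaping reorders the rows.
wide = wide.reindex(species)

# Each Pathway column becomes one coloured segment of every bar.
ax = wide.plot.bar(stacked=True)
ax.set_ylabel("Number of Gene Families")
```

In a notebook the chart shows up on its own; in a script you would typically call `ax.figure.savefig(...)` or `matplotlib.pyplot.show()` afterwards.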
    

    Output: a stacked bar chart with one bar per species (image not preserved).
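As for the error in the question: `df[['Pathway']]` selects only the column of pathway names, which are strings, so pandas finds nothing numeric to draw. The sketch below reproduces that on a tiny hand-made frame. The four rows are a cut-down version of the question's data, and the names `only_labels`, `numeric_columns` and `plot_failed` are purely illustrative. As far as I know, pandas raises a `TypeError` in this situation, though the exact message wording differs between versions.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so this runs without a display
import pandas as pd

# A cut-down version of the question's dataframe.
df = pd.DataFrame({
    "Species": ["Glovio", "Glovio", "n2", "n2"],
    "Pathway": ["ABC", "Wzy", "ABC", "Wzy"],
    "Number of Gene Families": [0.5, 10.0, 2.0, 7.0],
})

# Selecting only 'Pathway' leaves a frame that holds strings and no numbers.
only_labels = df[["Pathway"]]
numeric_columns = only_labels.select_dtypes("number").columns.tolist()

try:
    only_labels.plot(kind="bar", stacked=True)
    plot_failed = False
except TypeError as exc:
    # pandas refuses to plot because there are no numeric columns.
    plot_failed = True
    message = str(exc)
```

The numbers to plot live in 'Number of Gene Families', while 'Pathway' should only decide how those numbers are split into coloured segments, which is what the reshaping step takes care of.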