python pandas stacked-area-chart

How to plot only the top n highest values in a stacked bar chart from a pandas df?


I have a df that looks like:

  CD1  CD2  CD3   ...  FG1  FG2
0 3.8  2.9  0     ...  0.1  0.1
1 0.1  0    4.1   ...  5.2  0
# 35 columns and 2 rows

And I plot a stacked bar chart using:

import numpy as np
import matplotlib.pyplot as plt
colors = plt.cm.jet(np.linspace(0, 1, 35))  # one color per column
df3.plot(kind='barh', stacked=True, figsize=(15, 10), color=colors, width=0.08)

My issue is that this plots all 35 columns, whereas I want to plot only the n columns with the highest values in each row, e.g. only CD1 and CD2 for row 0 and only CD3 and FG1 for row 1:

  CD1  CD2  CD3   ...  FG1  FG2
0 3.8  2.9  -     ...  -     -
1  -    -   4.1   ...  5.2   -

Is there a way to do this?


Solution

  • If I understand the question correctly, you can do this by taking the max of each column and then using nlargest to pick the top 10 columns:

    df.max().nlargest(10)
    

    The result is a Series indexed by column name, so its index can be used to select those columns from df for plotting.
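As a rough sketch of how this could be wired up, the snippet below uses a small made-up frame in place of the 35-column df (the values and n = 2 are assumptions for illustration only). Option 1 is the df.max().nlargest approach from the answer. Option 2 is a different technique, a per-row rank mask, which may be closer to the per-row output shown in the question.

```python
import pandas as pd

# Toy frame standing in for the 35-column df from the question (illustrative values).
df = pd.DataFrame(
    {"CD1": [3.8, 0.1], "CD2": [2.9, 0.0], "CD3": [0.0, 4.1],
     "FG1": [0.1, 5.2], "FG2": [0.1, 0.0]}
)
n = 2  # how many columns/values to keep (assumed for this example)

# Option 1: keep the n columns whose maximum over all rows is largest.
top_cols = df.max().nlargest(n).index
df_top = df[top_cols]

# Option 2: keep the n largest values within each row and mask the rest.
# rank(axis=1) ranks values across columns inside each row; method="first"
# breaks ties by column order so exactly n values survive per row.
ranks = df.rank(axis=1, ascending=False, method="first")
df_row_top = df.where(ranks <= n)                   # non-top values become NaN
df_row_top = df_row_top.dropna(axis=1, how="all")   # drop columns never in a row's top n
```

Either result can then be passed to the original plotting call, for example df_row_top.fillna(0).plot(kind='barh', stacked=True), where filling the masked cells with 0 gives them zero-width segments in the stack.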