Tags: python, pandas, matplotlib, group-by, data-analysis

groupby multiple values, and plotting results


I'm working with some data on fungicide usage in a pandas DataFrame. It has Year, Fungicide and Value (amount used) columns, along with some irrelevant ones. It looks somewhat like:

Year, State,      Fungicide, Value
2011, California, A,         12879
2011, California, B,         29572
2011, Florida,    A,         8645
2011, Florida,    B,         19573
2009, California, A,         8764
2009, California, B,         98643
...

What I want from it is a single plot of total fungicide used over time, with a line plotted for each individual fungicide (in a different colour). I've used .groupby to get the total amount of each fungicide used each year:

apple_fplot = df.groupby(['Year','Fungicide'])['Value'].sum()

This gives me the values I want to plot, something like:

Year, Fungicide, Value
...
2009, A,        128635
      B,        104765
2011, A,        154829
      B,        129865

Now I need to plot it so that each fungicide (A, B, ...) is a separate line on a single plot of Value over time.

Is there a way of doing this without separating it all out? Forgive my ignorance, I'm new to Python and am still getting familiar with it.


Solution

  • For a clean solution that prints the legend and x-ticks properly, unstack the grouped result so that each fungicide becomes its own column, then plot it:

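    import pandas as pd
    apple_fplot = df.groupby(['Year','Fungicide'])['Value'].sum()
    # apple_fplot is a Series, so unstack already gives one column per fungicide
    plot_df = apple_fplot.unstack('Fungicide')
    # yearly periods for clean x-ticks ('Y' replaces the deprecated 'A' alias)
    plot_df.index = pd.PeriodIndex(plot_df.index.tolist(), freq='Y')
    plot_df.plot()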
    

    For subplots, just set the subplots keyword to True:

    plot_df.plot(subplots=True)
    

    This draws each fungicide on its own axes instead of sharing one.
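To see the reshaping step in isolation, here is a minimal self-contained sketch built from the six sample rows shown in the question. It is an illustration under assumptions rather than the answerer's exact code: the name `totals` is introduced here, and it keeps the integer years and sets the ticks by hand instead of converting the index to a PeriodIndex.

```python
import pandas as pd
import matplotlib.pyplot as plt

# The six sample rows from the question (the real data has more rows and columns).
df = pd.DataFrame({
    "Year": [2011, 2011, 2011, 2011, 2009, 2009],
    "State": ["California", "California", "Florida", "Florida",
              "California", "California"],
    "Fungicide": ["A", "B", "A", "B", "A", "B"],
    "Value": [12879, 29572, 8645, 19573, 8764, 98643],
})

# Sum Value per (Year, Fungicide): a Series indexed by a two-level MultiIndex.
totals = df.groupby(["Year", "Fungicide"])["Value"].sum()

# Move the Fungicide level into the columns:
# one row per year, one column per fungicide.
plot_df = totals.unstack("Fungicide")

# DataFrame.plot draws one line per column and labels the legend from the column names.
ax = plot_df.plot(marker="o")
ax.set_xticks(list(plot_df.index))  # tick only at the years present in the data
ax.set_ylabel("Value")

# Call plt.show() (or ax.figure.savefig(...)) to view the figure.
```

With these rows, `plot_df.loc[2011, "A"]` is 12879 + 8645 = 21524. Because the plot is driven by the columns of `plot_df`, a third fungicide in the data would show up as a third line without any change to the plotting code.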