python pandas seaborn reshape percentile

Python: plot a line for each percentile


How can I plot the percentiles computed by pandas.DataFrame.describe as one line per percentile using seaborn?

Currently, I have to create a separate plot for each percentile. Instead, I want a single chart with all the percentiles. https://seaborn.pydata.org/generated/seaborn.lineplot.html has some nice examples using hue and style, but I am not sure how to reshape the data frame so that I can use them.

import pandas as pd
%pylab inline

df = pd.DataFrame({'dt':['2020-01-01', '2020-01-02', '2020-01-03', '2020-01-03'], 'bar':[1,2,3, 4], 'baz':[3,4,5, 6]})
df['dt'] = pd.to_datetime(df['dt'])
display(df)

df = df.groupby(['dt']).describe()
df = df.reset_index()
df = df.set_index(['dt'], drop=False)
display(df)

import seaborn as sns; sns.set()

# this has to be repeated for each column (bar, baz) ...
df_plot = df[['dt']].copy()

# ... and for each percentile
df_plot['metric'] = df['bar']['25%']
sns.lineplot(x='dt', y='metric', data=df_plot)
plt.show()

df_plot['metric'] = df['bar']['50%']
sns.lineplot(x='dt', y='metric', data=df_plot)
plt.show()

df_plot['metric'] = df['bar']['75%']
sns.lineplot(x='dt', y='metric', data=df_plot)
plt.show()

Solution

  • You can simplify all of this using the following:

    import pandas as pd
    import seaborn as sns
    %pylab inline
    
    df = pd.DataFrame({'dt':['2020-01-01', '2020-01-02', '2020-01-03', '2020-01-03'], 'bar':[1,2,3, 4], 'baz':[3,4,5, 6]})
    df = df.groupby(['dt']).describe()
    sns.lineplot(data=df['baz'][['25%', '50%', '75%']])
    

    Result (you may need an extra plt.show(); I don't have pylab installed to test):

    [image: resulting line plot]