python pandas statistics confidence-interval

Calculate confidence interval for each row in a pandas DataFrame


I have a pandas DataFrame where each column is a prediction of a time series. I would like to calculate the mean and a confidence interval around it so that I can plot them. For now I loop over each row, calculate the mean, min, and max, then plot the mean with fill_between(min, max), but I don't think that is the correct way to do it. The DataFrame looks like this:

Pred1 Pred2 Pred3
x1 x1 x1
x2 x2 x2
x3 x3 x3

Except it would be larger: around 50 columns and 300 rows. Any idea how I can do this efficiently? I have to do it on many similar tables.


Solution

  • If I understand correctly, you can avoid the loop by computing the mean, min and max across the columns (axis=1):

    ax = df.mean(axis=1).plot()
    ax.fill_between(df.index, df.min(axis=1), df.max(axis=1), alpha=0.2)
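    The band above spans the full min-to-max range of the predictions, which is not the same thing as a statistical confidence interval. If an interval around the mean is what is wanted, one possible sketch is below. It assumes the columns can be treated as independent predictions of the same quantity and uses a normal approximation (z = 1.96 for roughly 95%); the function name row_confidence_interval and the random sample data are made up for illustration.

```python
import numpy as np
import pandas as pd


def row_confidence_interval(df: pd.DataFrame, z: float = 1.96) -> pd.DataFrame:
    """Per-row mean with a normal-approximation confidence interval.

    z=1.96 gives roughly 95% coverage. This assumes the columns are
    independent predictions of the same quantity (an assumption, not
    something stated in the question).
    """
    mean = df.mean(axis=1)
    # Standard error of the mean across columns (sample std / sqrt(n)).
    sem = df.sem(axis=1)
    return pd.DataFrame(
        {"mean": mean, "lower": mean - z * sem, "upper": mean + z * sem}
    )


# Hypothetical data shaped like the question: 300 rows, 50 prediction columns.
rng = np.random.default_rng(0)
df = pd.DataFrame(
    rng.normal(size=(300, 50)),
    columns=[f"Pred{i + 1}" for i in range(50)],
)
ci = row_confidence_interval(df)
```

    The result can be plotted the same way as before, with ci["mean"].plot() for the line and ax.fill_between(ci.index, ci["lower"], ci["upper"], alpha=0.2) for the band. With around 50 columns the normal approximation is probably adequate; with only a handful of columns, a Student t quantile (for example from scipy.stats.t.ppf) would likely be a better choice than 1.96.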