Search code examples
time-seriesconfidence-intervalfacebook-prophet

Can I calculate the confidence bound of a Prophet model that would contain a certain value?


can I use the y-hat variance, the bounds, and the point estimate from the forecast data frame to calculate the confidence level that would contain a given value?

I've seen that I can change my interval level prior to fitting but programmatically that feels like a LOT of expensive trial and error. Is there a way to estimate the confidence bound using only the information from the model parameters and the forecast data frame?

Something like:

for level in [.05, .1, .15, ... , .95]:
  if value_in_question in (yhat - Z_{level}*yhat_variance/N, yhat + Z_{level}*yhat_variance/N):
        print 'im in the bound level {level}'
# This is sudo code not meant to run in console 

EDIT: working prophet example

# csv from fbprohets working examples https://github.com/facebook/prophet/blob/master/examples/example_wp_log_peyton_manning.csv

import pandas as pd
from fbprophet import Prophet
import os
df = pd.read_csv('example_wp_log_peyton_manning.csv')
m = Prophet()
m.fit(df)
future = m.make_future_dataframe(periods=30)
forecast = m.predict(future)

# the smallest confidence level s.t. the confidence interval of the 30th prediction contains 9

## My current approach
def __probability_calculation(estimate, forecast, j = 30):
    sd_residuals = (forecast.yhat_lower[j] - forecast.yhat[j])/(-1.28)
    for alpha in np.arange(.5, .95, .01):
        z_val = st.norm.ppf(alpha)
        if (forecast.yhat[j]-z_val*sd_residuals < estimate < forecast.yhat[j]+z_val*sd_residuals):
           return alpha

prob = __probability_calculation(9, forecast)

Solution

  • fbprophet uses the numpy.percentile method to estimate the percentiles as you can see here in the source code: https://github.com/facebook/prophet/blob/0616bfb5daa6888e9665bba1f95d9d67e91fed66/python/prophet/forecaster.py#L1448

    How to inverse calculate percentiles for values is already answered here: Map each list value to its corresponding percentile

    Combining everything based on your code example:

    import pandas as pd
    import numpy as np
    import scipy.stats as st
    from fbprophet import Prophet
    
    url = 'https://raw.githubusercontent.com/facebook/prophet/master/examples/example_wp_log_peyton_manning.csv'
    df = pd.read_csv(url)
       
    # put the amount of uncertainty samples in a variable so we can use it later.
    uncertainty_samples = 1000 # 1000 is the default
    m = Prophet(uncertainty_samples=uncertainty_samples)
    m.fit(df)
    future = m.make_future_dataframe(periods=30)
    
    # You need to replicate some of the preparation steps which are part of the predict() call internals
    tmpdf = m.setup_dataframe(future)
    tmpdf['trend'] = m.predict_trend(tmpdf)
    sim_values = m.sample_posterior_predictive(tmpdf)
    

    The sim_values object contains for every datapoint 1000 simulations on which the confidence interval is based.

    Now you can call the scipy.stats.percentileofscore method with any target value

    target_value = 8
    st.percentileofscore(sim_values['yhat'], target_value, 'weak') / uncertainty_samples
    # returns 44.26
    

    To prove this works backwards and forwards you can get the output of the np.percentile method and put it in the scipy.stats.percentileofscore method. This works for an accuracy of 4 decimals:

    ACCURACY = 4
    for test_percentile in np.arange(0, 100, 0.5):
        target_value = np.percentile(sim_values['yhat'], test_percentile)
        if not np.round(st.percentileofscore(sim_values['yhat'], target_value, 'weak') / uncertainty_samples, ACCURACY) == np.round(test_percentile, ACCURACY):
            print(test_percentile)
            raise ValueError('This doesnt work')