machine-learning, pytorch, time-series, forecasting, mse

What is the standard way to evaluate multi-step time series predictions against the true values?


I'm working on a time series forecasting problem where the target column (y) is a stock price to be forecasted.

My test dataset has 65 observations,

and suppose the prediction length (predicted prices for future days) is 20.

So each observation of the test dataset generates 20 predictions, giving an array of shape (65, 20).

Now, to calculate the evaluation metric MAE between (predictions, trues), what is the standard way to do so? Should I take the first 20 predicted values, compare them with the first 20 true values, and calculate the absolute difference between each predicted value and its corresponding true value? Or something else? Can you please give me hints or some code? This is what I have done:

preds = np.reshape(predictions, (65, 20))   # np.reshape already returns an array
trues = np.reshape(values, (65, 20))
preds = mm.inverse_transform(preds)         # undo the scaling fitted by mm
trues = mm.inverse_transform(trues)
preds = preds[:, 1]                         # note: this keeps only the second forecast step
trues = trues[:, 1]
metrics = metric(preds, trues)

I have seen other developers do it like this:

# Flatten the per-batch outputs into single 1D arrays
vals = np.concatenate(values, axis=0).ravel()
preds = np.concatenate(predictions, axis=0).ravel()

Solution

  • Approach 1: Assuming you have a 2D array of predictions and a 2D array of true values, where each row corresponds to a different observation and each column to a different time step in the future, you can calculate the MAE for each observation and then average those values to get the overall MAE.

    Using this code snippet:

    import numpy as np
    from sklearn.metrics import mean_absolute_error
    
    # Assuming preds and trues are your arrays of predictions and true values
    # They should have shape (65, 20)
    
    # Calculate the MAE for each observation
    mae_per_observation = np.mean(np.abs(preds - trues), axis=1)
    
    # Calculate the overall MAE
    overall_mae = np.mean(mae_per_observation)
    
    print(f"Overall MAE: {overall_mae}")
    

    The above code takes the absolute difference between the predicted and true values at every time step of every observation, averages over the time steps to get a per-observation MAE, and then averages those per-observation values to get the overall MAE.
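
    As a quick sanity check, here is a toy example (the numbers are made up purely for illustration) with two observations and three time steps, small enough to verify by hand:

    import numpy as np
    
    # Hypothetical data: 2 observations, 3 forecast steps each
    preds = np.array([[1.0, 2.0, 3.0],
                      [4.0, 5.0, 6.0]])
    trues = np.array([[1.5, 2.0, 2.0],
                      [4.0, 7.0, 6.0]])
    
    # Per-observation MAE: row 0 -> (0.5 + 0.0 + 1.0) / 3 = 0.5
    #                      row 1 -> (0.0 + 2.0 + 0.0) / 3 ≈ 0.667
    mae_per_observation = np.mean(np.abs(preds - trues), axis=1)
    
    # Overall MAE: (0.5 + 0.667) / 2 ≈ 0.583
    overall_mae = np.mean(mae_per_observation)
    print(f"Overall MAE: {overall_mae:.3f}")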

    Approach 2: The other approach you mentioned, where the predictions and true values are flattened into 1D arrays, is also valid. It computes the MAE across all time steps and all observations at once, rather than first averaging within each observation.

    Like so:

    # Flatten the (65, 20) arrays into 1D arrays of 65 * 20 = 1300 values
    preds_1d = preds.ravel()
    trues_1d = trues.ravel()
    
    # sklearn's signature is mean_absolute_error(y_true, y_pred)
    overall_mae = mean_absolute_error(trues_1d, preds_1d)
    
    print(f"Overall MAE: {overall_mae}")
    

    The choice between the two methods depends on what you're interested in understanding from your model.

    1. Per-observation MAE: This method calculates the MAE for each observation (i.e., each forecast window) separately and then averages them. It gives equal weight to each observation, regardless of how many predictions it contains. This is useful if you want to understand how well your model performs on average for each time series. For example, if you forecast multiple stocks and want the model to perform reasonably well for each individual stock, this would be more appropriate.

    2. Overall MAE: This method calculates the MAE across all time steps and all observations at once. It gives equal weight to each individual prediction, regardless of which observation (stock) it belongs to. This is useful if you care about the overall performance of your model across all predictions; for instance, if you use the model to make a large number of trades across a portfolio and care more about the total profit or loss than about the performance on any individual stock, this would be the way to go. (A quick numerical check of how the two relate is sketched below.)
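
    One detail worth pointing out: when every observation has the same forecast length, as in your (65, 20) case, the two approaches yield exactly the same number, because the mean of equal-sized group means equals the overall mean. They only diverge when the forecast windows have different lengths or when some values are masked out. A minimal sketch with hypothetical random data:

    import numpy as np
    from sklearn.metrics import mean_absolute_error
    
    rng = np.random.default_rng(0)
    preds = rng.normal(size=(65, 20))   # hypothetical predictions
    trues = rng.normal(size=(65, 20))   # hypothetical true values
    
    # Approach 1: average of the per-observation MAEs
    per_obs_mae = np.mean(np.abs(preds - trues), axis=1).mean()
    
    # Approach 2: one MAE over all 65 * 20 values
    flat_mae = mean_absolute_error(trues.ravel(), preds.ravel())
    
    print(np.isclose(per_obs_mae, flat_mae))  # True: identical for equal-length rows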