Tags: machine-learning, gradient-descent, amazon-sagemaker

How should I read the sum of the values that RMSProp produces?


I have a 2D time-series dataset whose daily values are integers ranging from 1,000,000 to 2,000,000. The data isn't limited to that range either, since I can aggregate it into weekly sums, which pushes the values above 10,000,000.

I'm able to achieve an RMSE of 0.02 whenever I normalize my data, but when I feed the raw data (in the millions range) into the algorithm, the RMSE comes out somewhere between 30,000 and 150,000.

Why does one version of the run reach a "global minimum" RMSE of 0.02, while the other ends up in a much higher range? I've been testing with AdaDelta.


Solution

  • The definition of RMSE is:

    RMSE = sqrt( (1/n) * Σ (y_i - ŷ_i)² )

    where y_i are the actuals, ŷ_i the predictions, and n the number of observations.

    The scale of this value depends directly on the scale of the predictions and actuals, so it's quite normal that you get a much larger RMSE when you don't normalize the dataset.

    This is why normalization is important, as it lets us compare error metrics across models and datasets.
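Here is a minimal NumPy sketch (not tied to SageMaker or to the asker's model; the value range comes from the question, and the ~50,000 error magnitude is an assumption) showing that the same predictions, evaluated on a min-max-normalized scale, give an RMSE that shrinks by exactly the data range:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical daily values in the 1,000,000 - 2,000,000 range, as in the question.
actuals = rng.integers(1_000_000, 2_000_000, size=365).astype(float)
# Assume the model is off by roughly +/- 50,000 on the raw scale.
predictions = actuals + rng.normal(0, 50_000, size=365)

def rmse(y_true, y_pred):
    return np.sqrt(np.mean((y_true - y_pred) ** 2))

print("RMSE on raw scale:       ", rmse(actuals, predictions))  # ~50,000

# Min-max normalize both series with the same statistics.
lo, hi = actuals.min(), actuals.max()
norm = lambda x: (x - lo) / (hi - lo)

# Same errors, divided by the data range (hi - lo), so a value well below 1.
print("RMSE on normalized scale:", rmse(norm(actuals), norm(predictions)))
```

Because min-max normalization is just an affine rescaling, the normalized RMSE is the raw RMSE divided by the data range; the 0.02 and the 30,000 - 150,000 figures describe the same quality of fit expressed in different units.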