neural-network non-linear-regression

Normalization before data split in Neural Network


I am trying to run an MLP regressor with one hidden layer on my dataset. I am standardizing my data, but I want to be clear on whether it matters if I standardize before or after splitting the dataset into training and test sets. Will my predicted values differ if I standardize before the split?


Solution

  • Yes and no. If the mean and variance of the training set differ from those of the full dataset, standardizing before the split and standardizing after it produce different scaled values, so the predictions can differ.

    That being said, a good training and test set should be similar enough that their data points are distributed in a similar way, in which case standardizing after the split should give nearly the same results as standardizing before it.
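
To make the comparison concrete, here is a minimal sketch using NumPy on synthetic data. The dataset, the 80/20 split, and the `standardize` helper are assumptions for illustration and are not taken from the question. It scales the test features once with statistics from the full dataset (standardizing before the split) and once with statistics from the training set only (standardizing after the split), then reports how far apart the two results are. A commonly recommended practice is the second option, because the test set then has no influence on the scaling used during training.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (assumed for illustration): 1000 samples, 3 features
# with different means and spreads.
X = rng.normal(loc=[5.0, -2.0, 100.0], scale=[1.0, 3.0, 20.0], size=(1000, 3))

# Random 80/20 split into training and test rows.
idx = rng.permutation(len(X))
train_idx, test_idx = idx[:800], idx[800:]
X_train, X_test = X[train_idx], X[test_idx]


def standardize(data, mean, std):
    """Hypothetical helper: subtract the mean and divide by the std, per feature."""
    return (data - mean) / std


# Option A: standardize before the split (statistics from the full dataset).
full_mean, full_std = X.mean(axis=0), X.std(axis=0)
X_test_before = standardize(X_test, full_mean, full_std)

# Option B: standardize after the split (statistics from the training set only,
# reused unchanged for the test set).
train_mean, train_std = X_train.mean(axis=0), X_train.std(axis=0)
X_train_after = standardize(X_train, train_mean, train_std)
X_test_after = standardize(X_test, train_mean, train_std)

# Largest absolute difference between the two versions of the test features.
max_diff = np.abs(X_test_before - X_test_after).max()
print(f"max difference in standardized test features: {max_diff:.4f}")
```

With a random split of a reasonably large dataset, the difference should be small but not zero. It grows when the dataset is small or when the split is not representative. If scikit-learn is in use, fitting `StandardScaler` on the training data and calling `transform` on the test data corresponds to Option B.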