Feature Scaling

I read the following advice in a post:

For feature scaling, you learn the means and standard deviations of the training set, and then:

Standardize the training set using the training set means and standard deviations.
Standardize any test set using the training set means and standard deviations.

My question is: after fitting a model on the scaled training data, should I apply the fitted model to scaled or unscaled test data? Thanks!

Solution

Yes, you should also scale the test data. If you scaled your training data and fitted a model to it, the test set must go through the same preprocessing, using the means and standard deviations learned from the training set. This is standard practice: it ensures the model always receives input in the same form it was trained on.

In Python, the process might look as follows:

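from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
# Learn the means and standard deviations from the training set, then scale it.
X_train = scaler.fit_transform(X_train)
# Reuse the training-set statistics on the test set; do not refit here.
X_test = scaler.transform(X_test)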
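
To make it clearer what the fit and transform steps above are doing, here is a rough pure-Python sketch of the same idea. The helper names fit_scaler and transform and the toy data are hypothetical, made up for illustration; this is not scikit-learn's actual implementation, only an approximation of its default behaviour (population standard deviation, with a zero spread replaced by 1).

```python
from statistics import fmean, pstdev

def fit_scaler(rows):
    """Learn the per-feature mean and standard deviation from the training rows."""
    columns = list(zip(*rows))
    means = [fmean(col) for col in columns]
    # Population standard deviation; a zero spread is replaced by 1.0
    # so that constant features do not cause a division by zero.
    stds = [pstdev(col) or 1.0 for col in columns]
    return means, stds

def transform(rows, means, stds):
    """Standardize rows using statistics that were learned elsewhere."""
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]

# Toy data, purely illustrative.
X_train = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
X_test = [[4.0, 40.0]]

# The statistics come from the training set only ...
means, stds = fit_scaler(X_train)
# ... and the same statistics are applied to both sets.
X_train_scaled = transform(X_train, means, stds)
X_test_scaled = transform(X_test, means, stds)
```

Note that the scaled test set will generally not have zero mean or unit variance, and that is expected: it is expressed in the training set's units, which is what the fitted model was trained on.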

There is a detailed write-up on this topic in another thread that may be of interest to you.