When I use the Keras normalization layer:
tf.keras.layers.Normalization()
where should I apply it?
I adapt it on the training data:
x_train, x_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)
normalization_layer = tf.keras.layers.Normalization()
normalization_layer.adapt(x_train)
Then I can either normalize the data (train and test) before passing it to model.fit and model.evaluate:
x_train = normalization_layer(x_train)
x_test = normalization_layer(x_test)
Or include the layer in the network model, as the first layer:
model = tf.keras.models.Sequential()
model.add(normalization_layer)
...
If I use the latter option:
The adapt method computes the mean and variance of the provided data (in this case, the training data). When this layer is added to a model, it uses those values to normalize the input data. This is simply done by
output = (input - mean)/sqrt(var)
where mean and var are the values computed in the adapt method.
You should only use the training data for the adapt step, since test data should not be used anywhere in creating or training the model.
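To make the arithmetic concrete, here is a minimal pure-Python sketch of the idea, not the Keras implementation itself. The helper names adapt and normalize and the toy data are hypothetical, and it assumes per-feature statistics with population variance.

```python
import math
from statistics import fmean, pvariance


def adapt(rows):
    """Hypothetical stand-in for the adapt step: per-feature mean and population variance."""
    columns = list(zip(*rows))
    means = [fmean(col) for col in columns]
    variances = [pvariance(col) for col in columns]
    return means, variances


def normalize(rows, means, variances):
    """Apply output = (input - mean) / sqrt(var) to each feature of each row."""
    return [
        [(x - m) / math.sqrt(v) for x, m, v in zip(row, means, variances)]
        for row in rows
    ]


# Hypothetical toy data: two features per sample.
x_train = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
x_test = [[4.0, 40.0]]

# The statistics come from the training data only.
means, variances = adapt(x_train)

# The same training statistics are reused for both train and test inputs.
x_train_norm = normalize(x_train, means, variances)
x_test_norm = normalize(x_test, means, variances)
```

Because the test rows are scaled with the training mean and variance, they are not expected to end up with zero mean and unit variance themselves. The same holds whether the layer is applied beforehand or placed as the first layer of the model.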