tensorflow, keras, neural-network, keras-layer, tf.keras

How to write a custom standardization layer with tf.keras


Currently I am building my first neural networks with the TensorFlow and Keras APIs. I want to write a layer that standardizes the input, because doing it in preprocessing could lead to errors when the model is used after training. Therefore I want to pass my training dataset to the layer's __init__ function and compute the mean and standard deviation there.

The problem is that I also want to save and load the model after training. But if I call load_model(modelname), I get an error because the __init__ function expects the training data as an argument. Furthermore, I am not sure whether it is correct to store the mean and std as a tf.Variable, or whether there is a better way to do it so that these values are restored when using load_model/load_weights.

Any answer is appreciated.

The following code represents the basic idea for such a layer.

import numpy as np
import tensorflow as tf

class stdLayer(tf.keras.layers.Layer):
    def __init__(self, train_x, trainable=False, **kwargs):
        super(stdLayer, self).__init__(trainable=trainable, **kwargs)
        # compute per-feature statistics once, from the training data
        means = np.mean(train_x, axis=0)
        stds = np.std(train_x, axis=0)
        self.means = tf.Variable(means,
                                 dtype=tf.float32,
                                 name="means",
                                 trainable=False)
        self.stds = tf.Variable(stds,
                                dtype=tf.float32,
                                name="stds",
                                trainable=False)

    def call(self, inputs):
        # standardize: subtract the mean and divide by the standard deviation
        return (inputs - self.means) / self.stds
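
To make the problem concrete, this is roughly how I use the layer and where loading breaks (the file name and layer sizes are just examples):

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=train_x.shape[1:]),
    stdLayer(train_x),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(1),
])
model.save("my_model.h5")

# This fails: Keras tries to recreate stdLayer from its config,
# but __init__ expects train_x, which is not stored in the file.
model = tf.keras.models.load_model("my_model.h5",
                                   custom_objects={"stdLayer": stdLayer})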

Solution

  • Maybe you could just have a BatchNormalization layer at the start of your model.

    In all cases, though, your production data needs to be in the same format as your training data.
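
    A minimal sketch of the BatchNormalization idea (the feature count and layer sizes are placeholders):

    import tensorflow as tf

    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(10,)),      # placeholder feature count
        tf.keras.layers.BatchNormalization(),    # learns running mean/variance during training
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(1),
    ])

    At inference time the layer normalizes with the moving statistics it accumulated during training, and those statistics are saved and restored with the model.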


    Alternatively, instead of passing the training data, pass only means and stds.

    import numpy as np
    import tensorflow as tf

    class StdLayer(tf.keras.layers.Layer):
        def __init__(self, means, stds, trainable=False, **kwargs):
            super(StdLayer, self).__init__(trainable=trainable, **kwargs)
            # np.asarray also accepts the plain lists that come back
            # from a deserialized config
            self.means_init = np.asarray(means)
            self.stds_init = np.asarray(stds)

        def build(self, input_shape):
            # if you put trainable=False here, they will never be trainable
            self.means = self.add_weight(name='means',
                                         shape=self.means_init.shape,
                                         initializer='zeros',
                                         trainable=True)
            self.stds = self.add_weight(name='stds',
                                        shape=self.stds_init.shape,
                                        initializer='ones',
                                        trainable=True)

            # overwrite the dummy initializers with the precomputed statistics
            tf.keras.backend.set_value(self.means, self.means_init)
            tf.keras.backend.set_value(self.stds, self.stds_init)

            self.built = True

        def call(self, inputs):
            return (inputs - self.means) / self.stds

        def compute_output_shape(self, input_shape):
            return input_shape

        def get_config(self):
            # keys must match the __init__ arguments so that
            # from_config can recreate the layer
            config = {
                'means': self.means_init,
                'stds': self.stds_init,
            }
            base_config = super(StdLayer, self).get_config()
            return dict(list(base_config.items()) + list(config.items()))


    To be honest, I'm not sure whether NumPy arrays can be passed to config like this.
    You could maybe use __init__(self, means=None, stds=None, ...) and put an if inside build around the set_value calls. If you find another way to calculate the shapes in add_weight, you can get rid of the config vars entirely.
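
    As a rough usage sketch (the data, shapes, and file name here are made up, and whether the arrays survive the config round trip is exactly the uncertainty mentioned above):

    import numpy as np
    import tensorflow as tf

    # hypothetical training data: 100 samples, 10 features
    train_x = np.random.rand(100, 10).astype(np.float32)
    means = np.mean(train_x, axis=0)
    stds = np.std(train_x, axis=0)

    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(10,)),
        StdLayer(means, stds),
        tf.keras.layers.Dense(1),
    ])
    model.save("model_with_std.h5")

    # custom layers must be registered via custom_objects on reload
    restored = tf.keras.models.load_model(
        "model_with_std.h5",
        custom_objects={"StdLayer": StdLayer},
        compile=False)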