python, tensorflow, neural-network, keras-layer

How to keep the values of tensors from each epoch in one layer and pass them to the next epoch in TensorFlow


I have a general question.

I am developing a new layer to incorporate into an autoencoder. To be more specific, the layer is something like the KCompetitive class over here. What I want is to save the output of this layer in a variable, let's call it previous_mat_values, and then pass it to that same layer again in the next epoch.

To put it another way, I want to be able to save the output of this layer from epoch 1 in one variable, and then in epoch 2 use that same matrix again.

This raises the question of what the values of this matrix should be in the first epoch, since there is no output of that layer yet. We can initialize an array with the same shape as the weight matrix but filled with zeros, like this:

previous_mat_values = tf.zeros_like(weight_tensor)

So the steps are like this:

  1. In the first epoch, previous_mat_values and weight_mat are passed to the layer.

    1.a At the end of that layer's function, the final value, which we call modified_weight_mat, is saved into previous_mat_values:

    previous_mat_values = modified_weight_mat

  2. In the second epoch, previous_mat_values and weight_mat are passed to the layer again; this time, however, previous_mat_values holds the values saved in the first epoch (see the sketch after this list).
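
Roughly, the whole thing amounts to a loop like the one below. Here layer_fn is just a placeholder name for whatever the layer computes, and num_epochs and weight_mat are assumed to exist; this only illustrates the intended data flow, it is not actual training code.

previous_mat_values = tf.zeros_like(weight_mat)         # starting point for epoch 1
for epoch in range(num_epochs):
    modified_weight_mat = layer_fn(weight_mat, previous_mat_values)
    previous_mat_values = modified_weight_mat            # reused in the next epoch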

I don't have any problem passing weight_mat and doing the related computations. The only issue is how to save the values of previous_mat_values in each epoch and pass them to the next epoch.

I was thinking of creating a global tensor variable in that layer's class and initializing it with zeros, but I don't think that would keep the values from the previous epoch into the second epoch.

Do you have any idea how I can implement this?

Please let me know if my explanations are not clear.

Update 1:

This is the implementation of the layer:

# imports assumed for this snippet (tf.keras)
import warnings
from tensorflow.keras import backend as K
from tensorflow.keras.layers import Layer

class KCompetitive(Layer):
    '''Applies K-Competitive layer.
    # Arguments
    '''
    def __init__(self, topk, ctype, **kwargs):
        self.topk = topk
        self.ctype = ctype
        self.uses_learning_phase = True
        self.supports_masking = True
        super(KCompetitive, self).__init__(**kwargs)

    def call(self, x):
        if self.ctype == 'ksparse':
            return K.in_train_phase(self.kSparse(x, self.topk), x)
        elif self.ctype == 'kcomp':
            return K.in_train_phase(self.k_comp_tanh(x, self.topk), x)
        else:
            warnings.warn("Unknown ctype, using no competition.")
            return x

    def get_config(self):
        config = {'topk': self.topk, 'ctype': self.ctype}
        base_config = super(KCompetitive, self).get_config()
        return dict(list(base_config.items()) + list(config.items()))

    def k_comp_tanh(self, x, topk, factor=6.26):
        # Some modification on x; for illustration, x simply becomes x + 1
        x = x + 1
        res = x
        return res

Update 2

For more clarification I will add this:

Sample 1 of data:

x_prev = zero                  # no previous sample yet
mask = tf.greater(x, x_prev)   # x here relates to sample 1
x_modified = x[mask]
x_prev = x_modified

Sample 2 of data:

mask = tf.greater(x, x_prev)   # x here relates to sample 2; x_prev comes from the previous sample
x_modified = x[mask]
x_prev = x_modified
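
For concreteness, a runnable eager-mode sketch of this per-sample update could look like the following. The sample values are made up, and x_prev is kept at the full shape by zeroing out the non-selected entries (instead of literally dropping them), so that consecutive comparisons have matching shapes:

import tensorflow as tf

x_prev = tf.zeros((4,))                                # nothing seen yet before sample 1
for x in [tf.constant([0.5, -1.0, 2.0, 0.0]),          # sample 1
          tf.constant([1.5,  0.5, 1.0, 3.0])]:         # sample 2
    mask = tf.greater(x, x_prev)                       # compare with the previous sample
    x_modified = tf.where(mask, x, tf.zeros_like(x))   # keep the shape, zero out the rest
    x_prev = x_modified                                # carried over to the next sample

print(x_prev.numpy())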

Solution

  • I'm not sure if this is what you mean, but you can have a variable in your layer that simply gets updated with the previous value of another variable on each training step, something along these lines:

    import tensorflow as tf
    
    class MyLayer(tf.keras.layers.Layer):
        def __init__(self, units, **kwargs):
            super(MyLayer, self).__init__(**kwargs)
            self.units = units
    
        def build(self, input_shape):
            self.w = self.add_weight(shape=(input_shape[-1], self.units),
                                    initializer='random_normal',
                                    trainable=self.trainable,
                                    name='W')
            self.w_prev = self.add_weight(shape=self.w.shape,
                                          initializer='zeros',
                                          trainable=False,
                                          name='W_prev')
    
        def call(self, inputs, training=False):
            # Only update value of w_prev on training steps
            deps = []
            if training:
                deps.append(self.w_prev.assign(self.w))
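            # In graph mode, control_dependencies ensures the assign op runs
            # before the output is computed; in eager mode it already ran above.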
            with tf.control_dependencies(deps):
                return tf.matmul(inputs, self.w)
    

    Here is a usage example:

    import tensorflow as tf
    import numpy as np
    
    tf.random.set_seed(0)
    np.random.seed(0)
    # Make a random linear problem
    x = np.random.rand(50, 3)
    y = x @ np.random.rand(3, 2)
    # Make model
    model = tf.keras.Sequential()
    my_layer = MyLayer(2, input_shape=(3,))
    model.add(my_layer)
    model.compile(optimizer='SGD', loss='mse')
    # Train
    cbk = tf.keras.callbacks.LambdaCallback(
        on_batch_begin=lambda batch, logs: (tf.print('batch:', batch),
                                            tf.print('w_prev:', my_layer.w_prev, sep='\n'),
                                            tf.print('w:', my_layer.w, sep='\n')))
    model.fit(x, y, batch_size=10, epochs=1, verbose=0, callbacks=[cbk])
    

    Output:

    batch: 0
    w_prev:
    [[0 0]
     [0 0]
     [0 0]]
    w:
    [[0.0755531341 0.0211461019]
     [-0.0209847465 -0.0518018603]
     [-0.0618413948 0.0235136505]]
    batch: 1
    w_prev:
    [[0.0755531341 0.0211461019]
     [-0.0209847465 -0.0518018603]
     [-0.0618413948 0.0235136505]]
    w:
    [[0.0770048052 0.0292659812]
     [-0.0199236758 -0.04635958]
     [-0.060054455 0.0332755931]]
    batch: 2
    w_prev:
    [[0.0770048052 0.0292659812]
     [-0.0199236758 -0.04635958]
     [-0.060054455 0.0332755931]]
    w:
    [[0.0780589 0.0353098139]
     [-0.0189863108 -0.0414136574]
     [-0.0590113513 0.0387929156]]
    batch: 3
    w_prev:
    [[0.0780589 0.0353098139]
     [-0.0189863108 -0.0414136574]
     [-0.0590113513 0.0387929156]]
    w:
    [[0.0793346688 0.042034667]
     [-0.0173048507 -0.0330933407]
     [-0.0573575757 0.0470812619]]
    batch: 4
    w_prev:
    [[0.0793346688 0.042034667]
     [-0.0173048507 -0.0330933407]
     [-0.0573575757 0.0470812619]]
    w:
    [[0.0805450454 0.0485667922]
     [-0.0159637 -0.0261840075]
     [-0.0563304275 0.052557759]]
    

    EDIT: I'm still not 100% sure how exactly you need this to work, but here is something that might work for you:

    import warnings
    import tensorflow as tf
    from tensorflow.keras import backend as K    # K and Layer assumed to come from tf.keras
    from tensorflow.keras.layers import Layer
    
    class KCompetitive(Layer):
        '''Applies K-Competitive layer.
        # Arguments
        '''
        def __init__(self, topk, ctype, **kwargs):
            self.topk = topk
            self.ctype = ctype
            self.uses_learning_phase = True
            self.supports_masking = True
            self.x_prev = None
            super(KCompetitive, self).__init__(**kwargs)
    
        def call(self, x):
            if self.ctype == 'ksparse':
                return K.in_train_phase(self.kSparse(x, self.topk), x)
            elif self.ctype == 'kcomp':
                return K.in_train_phase(self.k_comp_tanh(x, self.topk), x)
            else:
                warnings.warn("Unknown ctype, using no competition.")
                return x
    
        def get_config(self):
            config = {'topk': self.topk, 'ctype': self.ctype}
            base_config = super(KCompetitive, self).get_config()
            return dict(list(base_config.items()) + list(config.items()))
    
        def k_comp_tanh(self, x, topk, factor=6.26):
            if self.x_prev is None:
                self.x_prev = self.add_weight(shape=x.shape,
                                              initializer='zeros',
                                              trainable=False,
                                              name='X_prev')
            # Some modification on x; for illustration, x simply becomes x + 1.
            # assign stores the value in x_prev (so it is still there in the next
            # epoch) and returns it, so it can also be used as the layer output.
            x_modified = self.x_prev.assign(x + 1)
            return x_modified
    

    Here is an example of usage:

    import numpy as np
    import tensorflow as tf
    
    tf.random.set_seed(0)
    np.random.seed(0)
    # Make model
    model = tf.keras.Sequential()
    model.add(tf.keras.Input(batch_shape=(3, 4)))
    my_layer = KCompetitive(2, 'kcomp')
    print(my_layer.x_prev)
    # None
    model.add(my_layer)
    # The variable gets created after it is added to a model
    print(my_layer.x_prev)
    # <tf.Variable 'k_competitive/X_prev:0' shape=(3, 4) dtype=float32, numpy=
    # array([[0., 0., 0., 0.],
    #        [0., 0., 0., 0.],
    #        [0., 0., 0., 0.]], dtype=float32)>
    model.compile(optimizer='SGD', loss='mse')
    
    # "Train"
    x = tf.zeros((3, 4))
    cbk = tf.keras.callbacks.LambdaCallback(
        on_epoch_begin=lambda batch, logs:
            tf.print('initial x_prev:', my_layer.x_prev, sep='\n'),
        on_epoch_end=lambda batch, logs:
            tf.print('final x_prev:', my_layer.x_prev, sep='\n'),)
    model.fit(x, x, epochs=1, verbose=0, callbacks=[cbk])
    # initial x_prev:
    # [[0 0 0 0]
    #  [0 0 0 0]
    #  [0 0 0 0]]
    # final x_prev:
    # [[1 1 1 1]
    #  [1 1 1 1]
    #  [1 1 1 1]]
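
    To see the value actually carry over from one epoch into the next (which is what the question asks about), the same fit can be run with epochs=2 instead of 1. Since the layer assigns x + 1 and x is all zeros, the expected trace is:

    model.fit(x, x, epochs=2, verbose=0, callbacks=[cbk])
    # initial x_prev (epoch 1): all zeros
    # final x_prev (epoch 1):   all ones
    # initial x_prev (epoch 2): all ones   <- the values saved in epoch 1
    # final x_prev (epoch 2):   all ones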