Tags: tensorflow, keras, tf.keras, keras-layer

How do I tell Keras to merge and train layers that do not descend from the input?


Consider


import tensorflow as tf

units = 11

entrada = tf.keras.Input(name="entrada", shape=(units,))
unidad = tf.Variable([[1.0]])  # + 0.0 * entrada[:, :1]
denseSoftmax = tf.keras.layers.Dense(units, name="denseSoftmax", activation="softmax")
softMaxOutput = denseSoftmax(unidad)  # applied to the variable, not the input
finalproduct = tf.keras.layers.Multiply()([entrada, softMaxOutput])
modelo = tf.keras.Model(entrada, finalproduct)
modelo.summary()

This example produces a model without trainable parameters, because the denseSoftmax layer does not act on the input. If I fake a dependency by uncommenting + 0.0 * entrada[:, :1], then it produces the expected graph:

 Layer (type)                   Output Shape         Param #     Connected to                     
==================================================================================================
 entrada (InputLayer)           [(None, 11)]         0           []                               
 tf.__operators__.getitem (Slic  (None, 1)           0           ['entrada[0][0]']                
 ingOpLambda)                                                                                     
 tf.math.multiply (TFOpLambda)  (None, 1)            0           ['tf.__operators__.getitem[0][0]'
 tf.__operators__.add (TFOpLamb  (None, 1)           0           ['tf.math.multiply[0][0]']       
 denseSoftmax (Dense)           (None, 11)           22          ['tf.__operators__.add[0][0]']   
 multiply (Multiply)            (None, 11)           0           ['entrada[0][0]',                
                                                                  'denseSoftmax[0][0]']        

But faking a zero-valued link to the input seems as bad as adding a constant branch to the set of input layers.

Is there a way to tell Keras that it should follow the subgraph for a series of layers that will be merged with the resulting output, but do not depend on the input?
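For reference, here is a sketch of the fake-link workaround written out in full (this is just the commented-out term from the code above, enabled). The 0.0 * entrada[:, :1] term is numerically inert, but it gives Keras a graph edge from the input to the variable, so the Dense branch gets traced:

```python
import tensorflow as tf

units = 11

entrada = tf.keras.Input(name="entrada", shape=(units,))
unidad = tf.Variable([[1.0]])

# Multiplying a slice of the input by 0.0 and adding it changes nothing
# numerically, but makes this branch descend from the input, so Keras
# includes the Dense layer (and its 22 parameters) in the traced graph.
linked = unidad + 0.0 * entrada[:, :1]

denseSoftmax = tf.keras.layers.Dense(units, name="denseSoftmax", activation="softmax")
softMaxOutput = denseSoftmax(linked)

finalproduct = tf.keras.layers.Multiply()([entrada, softMaxOutput])
modelo = tf.keras.Model(entrada, finalproduct)
modelo.summary()
```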


Solution

  • Is the following what you want?

    import tensorflow as tf

    class CustomModel(tf.keras.Model):
        def __init__(self, units) -> None:
            super().__init__()
            self.entrada = tf.keras.layers.InputLayer(input_shape=(units,))
            self.unidad = tf.Variable([[1.0]])  # tracked and trained automatically
            self.denseSoftmax = tf.keras.layers.Dense(units, name="denseSoftmax", activation="softmax")
            self.finalproduct = tf.keras.layers.Multiply()

        def call(self, inputs):
            x = self.entrada(inputs)
            # The Dense layer runs on the free variable, not on the input;
            # in a subclassed model its weights are tracked and trained anyway.
            softMaxOutput = self.denseSoftmax(self.unidad)
            y = self.finalproduct([x, softMaxOutput])
            return y

    units = 11
    modelo = CustomModel(units=units)
    modelo.build(input_shape=(None, units))
    modelo.summary()
    
    Model: "custom_model"
    _________________________________________________________________
     Layer (type)                Output Shape              Param #
    =================================================================
     input_1 (InputLayer)        [(None, 11)]              0
    
     denseSoftmax (Dense)        multiple                  22
    
     multiply (Multiply)         multiple                  0
    
    =================================================================
    Total params: 23
    Trainable params: 23
    Non-trainable params: 0
    _________________________________________________________________