Tags: python, tensorflow, keras-layer, tf.keras

Keras: problem with gradients, custom layer won't work in a Sequential model


This is the error message I get with the code below:
ValueError: No gradients provided for any variable: ['Variable:0'].
It is raised in model.fit(), right after the layer goes through its whole build().

It prints the input and the scalar after going through build() and before raising the error, but both print as symbolic tensors with no concrete values:

Tensor("IteratorGetNext:0", shape=(None, 1), dtype=float32)  
<tf.Variable 'Variable:0' shape=(1,) dtype=float32>

My goal was to write a (basic) custom layer and insert it into a (basic) model. My custom layer works properly on its own (see the standalone check after the layer code below), but I can't get the model to fit. The layer takes a tensor and multiplies it by a scalar: I want my model to give me input * (a scalar I chose early on).

So far I've gotten plenty of errors and warnings about the dtype of various tensors (I had int32 instead of float32), so I wrote plenty of casts. I also had a more complex model, but I stripped it to the bone to debug (it didn't help much…).

I tried with and without a build(), with and without using to_categorical on the labels, with vector inputs and scalar inputs, and other probably insignificant variations.

Here is the code of the layer:

from tensorflow.python.keras import layers
import tensorflow as tf
from tensorflow.python.ops import math_ops
from tensorflow.python.framework import tensor_shape
import h5py
import numpy as np


class MyBasicLayer(layers.Layer):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)  # forward kwargs (e.g. input_shape) to the base Layer
        self._set_dtype_policy('float32')
        # single trainable scalar weight, initialized to zero
        self.w = self.add_weight(shape=(1,), initializer='zeros', trainable=True)

    def build(self, input_shape):
        input_shape = tensor_shape.TensorShape(input_shape)
        if tensor_shape.dimension_value(input_shape[-1]) is None:
            raise ValueError('The last dimension of the inputs to `MyBasicLayer` should be defined. Found `None`.')
        super().build(input_shape)

    def call(self, inputs):
        print(inputs)
        print(self.w)
        return tf.math.multiply(tf.dtypes.cast(inputs, dtype='float32'), self.w)
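
For reference, here is the kind of standalone check I mean (a minimal sketch; since the weight is initialized to zeros, the output is all zeros until training):

layer = MyBasicLayer()
print(layer(tf.constant([[3.0]])))  # tf.Tensor([[0.]], shape=(1, 1), dtype=float32)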

And here is the code of the model:

import numpy as np
import tensorflow as tf
import os
from tensorflow.keras import Sequential
from my_basic_layer import MyBasicLayer
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.utils import to_categorical
from tensorflow.python.keras.layers import Activation
from tensorflow.keras import activations



k = 2.

# load the dataset
inset = np.array([[i] for i in range(40)], dtype='float32')
outset = inset * k
#outset = to_categorical(outset, num_classes =256)

# define the model
model = Sequential()
model.add(MyBasicLayer(input_shape=(1,))) #input_shape=(4,)
#model.add(Activation(activations.softmax))

# compile the model
model.compile()

# fit the model
model.fit(inset, outset)
model.summary()

Maybe relevant, for all I know:
I wanted to have a model.summary() before the compilation but I got
This model has not yet been built. Build the model first by calling build() or calling fit() with some data, or specify an input_shape argument in the first layer(s) for automatic build.
even after adding the famous input_shape argument in the first layer.
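
For reference, explicitly building the model is supposed to make a pre-compilation summary possible; a minimal sketch, assuming a single-feature input:

# build with a known input shape (batch dimension left as None), then summarize
model.build(input_shape=(None, 1))
model.summary()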

Thank you


Solution

  • Posting the solution here (answer section), even though it is present in the comments section, for the benefit of the community.

    The error ValueError: No gradients provided for any variable: ['Variable:0'] in the above case occurs because no loss function was provided when the model was compiled; without a loss there is nothing to differentiate, so no gradients can be computed for the trainable variable.

    So, replacing

    model.compile()
    

    with

    model.compile(loss='categorical_crossentropy')
    

    will fix the error.
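
    Note that with loss='categorical_crossentropy' the model compiles and fits, but the targets here are continuous values (outset = inset * k) rather than one-hot class vectors, so the loss comes out as nan, as the output further below shows. If the goal is for the layer to actually learn the multiplier, a regression loss is a better match; a minimal sketch, assuming the same inset and outset as above:

    from tensorflow.keras.optimizers import Adam

    # compile with a regression loss; mse suits the continuous targets
    model.compile(optimizer=Adam(learning_rate=0.1), loss='mse')

    # fit long enough for the single weight to move toward k
    model.fit(inset, outset, epochs=100, verbose=0)
    print(model.layers[0].w)  # expected to end up close to [2.]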

    For the sake of completeness, a simple working example using the custom layer is shown below:

    from tensorflow.python.keras import layers
    import tensorflow as tf
    from tensorflow.python.ops import math_ops
    from tensorflow.python.framework import tensor_shape
    import h5py
    import numpy as np
    
    
    class MyBasicLayer(layers.Layer):
        def __init__(self, **kwargs):
            super().__init__(**kwargs)  # forward kwargs (e.g. input_shape) to the base Layer
            self._set_dtype_policy('float32')
            # single trainable scalar weight, initialized to zero
            self.w = self.add_weight(shape=(1,), initializer='zeros', trainable=True)
    
        def build(self, input_shape):
            input_shape = tensor_shape.TensorShape(input_shape)
            if tensor_shape.dimension_value(input_shape[-1]) is None:
                raise ValueError('The last dimension of the inputs to `MyBasicLayer` should be defined. Found `None`.')
            super().build(input_shape)
    
        def call(self, inputs):
            print(inputs)
            print(self.w)
            return tf.math.multiply(tf.dtypes.cast(inputs, dtype='float32'), self.w)
    
    import numpy as np
    import tensorflow as tf
    import os
    from tensorflow.keras import Sequential
    from tensorflow.keras.optimizers import Adam
    from tensorflow.keras.utils import to_categorical
    from tensorflow.python.keras.layers import Activation
    from tensorflow.keras import activations
    
    
    
    k = 2.
    
    # load the dataset
    inset = np.array([[i] for i in range(40)], dtype='float32')
    outset = inset * k
    #outset = to_categorical(outset, num_classes =256)
    
    # define the model
    model = Sequential()
    model.add(MyBasicLayer(input_shape=(1,))) #input_shape=(4,)
    
    # compile the model
    model.compile(loss='categorical_crossentropy')
    
    # fit the model
    model.fit(inset, outset)
    model.summary()
    

    Output of the above code is shown below:

    Tensor("IteratorGetNext:0", shape=(None, 1), dtype=float32)
    <tf.Variable 'Variable:0' shape=(1,) dtype=float32>
    Tensor("IteratorGetNext:0", shape=(None, 1), dtype=float32)
    <tf.Variable 'Variable:0' shape=(1,) dtype=float32>
    2/2 [==============================] - 0s 2ms/step - loss: nan
    Model: "sequential_3"
    _________________________________________________________________
    Layer (type)                 Output Shape              Param #   
    =================================================================
    my_basic_layer_3 (MyBasicLay multiple                  1         
    =================================================================
    Total params: 1
    Trainable params: 1
    Non-trainable params: 0
    

    Hope this helps. Happy Learning!