python python-3.x tensorflow valueerror

"ValueError: Unknown layer: ... " when calling copy.deepcopy(network) using Tensorflow


I am currently designing a NoisyNet in Tensorflow, for which I need to define a custom layer. When I copy a model containing that custom layer, Python raises the error ValueError: Unknown layer: NoisyLayer. The implementation of the layer is provided here.

The goal is to create a second, independent instance of an existing network. For that purpose I use net_copy = copy.deepcopy(net_original) (with copy imported via import copy), which works as long as the model to be copied does not contain the custom layer mentioned above. I saw that saving and loading offer a way to specify custom objects (such as custom layers), but I have not found an equivalent that works for copy.deepcopy().

I am using Tensorflow 1.12.0 in Python3.

Again, the custom layer is provided under the link above. The network that uses the custom layer looks as follows:

class Network:
    def __init__(self, actionspace_size, learning_rate, gradient_momentum, gradient_min):
        frames_input = keras.layers.Input((84, 84, 4))
        actions_input = keras.layers.Input((actionspace_size,))

        conv1 = keras.layers.Conv2D(16, (8, 8), strides=(4, 4), activation="relu")(frames_input)
        conv2 = keras.layers.Conv2D(32, (4, 4), strides=(2, 2), activation="relu")(conv1)

        flattened = keras.layers.Flatten()(conv2)

        # NoisyNet        
        hidden = NoisyLayer(activation=tf.nn.relu)(inputs=flattened, resample_noise_flag=True)
        output = NoisyLayer(in_shape=(1,256), out_units=actionspace_size)(inputs=hidden, resample_noise_flag=True)

        filtered_output = keras.layers.merge.Multiply()([output, actions_input])

        self.model = keras.models.Model(inputs=[frames_input, actions_input], outputs=filtered_output)

        self.model.compile(loss='mse', optimizer=keras.optimizers.RMSprop(lr=learning_rate, rho=gradient_momentum, epsilon=gradient_min))

When calling

q_net = Network(actionspace_size, learning_rate, gradient_momentum, gradient_min)
target_net = copy.deepcopy(q_net)

the following error arises:

Traceback (most recent call last):
  File "DQN_tf_NoisyNet.py", line 315, in <module>
    main()
  File "DQN_tf_NoisyNet.py", line 252, in main
    target_net = copy.deepcopy(q_net)
  File "/usr/lib/python3.5/copy.py", line 182, in deepcopy
    y = _reconstruct(x, rv, 1, memo)
  File "/usr/lib/python3.5/copy.py", line 299, in _reconstruct
    y.__setstate__(state)
  File "/usr/local/lib/python3.5/dist-packages/keras/engine/network.py", line 1266, in __setstate__
    model = saving.unpickle_model(state)
  File "/usr/local/lib/python3.5/dist-packages/keras/engine/saving.py", line 435, in unpickle_model
    return _deserialize_model(f)
  File "/usr/local/lib/python3.5/dist-packages/keras/engine/saving.py", line 225, in _deserialize_model
    model = model_from_config(model_config, custom_objects=custom_objects)
  File "/usr/local/lib/python3.5/dist-packages/keras/engine/saving.py", line 458, in model_from_config
    return deserialize(config, custom_objects=custom_objects)
  File "/usr/local/lib/python3.5/dist-packages/keras/layers/__init__.py", line 55, in deserialize
    printable_module_name='layer')
  File "/usr/local/lib/python3.5/dist-packages/keras/utils/generic_utils.py", line 145, in deserialize_keras_object
    list(custom_objects.items())))
  File "/usr/local/lib/python3.5/dist-packages/keras/engine/network.py", line 1022, in from_config
    process_layer(layer_data)
  File "/usr/local/lib/python3.5/dist-packages/keras/engine/network.py", line 1008, in process_layer
    custom_objects=custom_objects)
  File "/usr/local/lib/python3.5/dist-packages/keras/layers/__init__.py", line 55, in deserialize
    printable_module_name='layer')
  File "/usr/local/lib/python3.5/dist-packages/keras/utils/generic_utils.py", line 138, in deserialize_keras_object
    ': ' + class_name)
ValueError: Unknown layer: NoisyLayer

I know that neither the network itself nor the deepcopy approach is the problem, since both work fine as soon as I replace the custom NoisyLayers with standard dense layers.

Does anyone know how to copy a Tensorflow model including custom layers? Thanks in advance!


Solution

  • Found a solution:

    The problem, again, was that Tensorflow/Keras didn't know how to interpret the custom layer. To tell it which class a layer name refers to, one can use Keras's CustomObjectScope and copy the model within that scope as follows:

    # Import
    import copy
    from keras.utils import CustomObjectScope
    
    # Copy
    with CustomObjectScope({"MyCustomLayer": MyCustomLayer}):
        model_copy = copy.deepcopy(model)
    

    This takes care of the copying part. However, it only works out of the box as long as the custom layer's constructor (__init__(...)) takes no custom parameters.

    I guess this is the case because, behind the scenes, copy.deepcopy() seems to temporarily save and then reload the original model using some pickle functionality (the traceback passes through saving.unpickle_model), so the values of any further constructor parameters have to be made available as well.

    If the beginning of the custom class looks as follows, where output_dim is one of the custom parameters referred to above:

    class MyCustomLayer(keras.layers.Layer):
    
        def __init__(self, output_dim, **kwargs):
            self.output_dim = output_dim
            super(MyCustomLayer, self).__init__(**kwargs)
    

    then one has to add a function to the class MyCustomLayer that also takes care of making the custom constructor parameters persistent for saving and loading (while copying):

    def get_config(self):
        config = super(MyCustomLayer, self).get_config()

        # Specify here all the values for the constructor's parameters
        config['output_dim'] = self.output_dim

        return config
    

    These two steps solved the problem in my case.