I am currently designing a NoisyNet in Tensorflow, for which I need to define a custom layer. When copying a model containing that custom layer, Python raises the error ValueError: Unknown layer: NoisyLayer. The implementation of the layer is provided here.
The goal is to copy one network, creating a second instance of it. For that purpose, I use the command net_copy = copy.deepcopy(net_original), which works as long as I don't include the custom layer referred to above in the model to be copied.
I saw that for saving and loading there is a way of specifying custom objects (such as custom layers), but so far I couldn't find a similar option that works for copy.deepcopy(), where copy is imported via import copy.
I am using Tensorflow 1.12.0 in Python3.
Again, the custom layer is provided under the link above. The network that uses the custom layer looks as follows:
import keras
import tensorflow as tf

class Network:
    def __init__(self, actionspace_size, learning_rate, gradient_momentum, gradient_min):
        frames_input = keras.layers.Input((84, 84, 4))
        actions_input = keras.layers.Input((actionspace_size,))
        conv1 = keras.layers.Conv2D(16, (8, 8), strides=(4, 4), activation="relu")(frames_input)
        conv2 = keras.layers.Conv2D(32, (4, 4), strides=(2, 2), activation="relu")(conv1)
        flattened = keras.layers.Flatten()(conv2)
        # NoisyNet
        hidden = NoisyLayer(activation=tf.nn.relu)(inputs=flattened, resample_noise_flag=True)
        output = NoisyLayer(in_shape=(1,256), out_units=actionspace_size)(inputs=hidden, resample_noise_flag=True)
        filtered_output = keras.layers.merge.Multiply()([output, actions_input])
        self.model = keras.models.Model(inputs=[frames_input, actions_input], outputs=filtered_output)
        self.model.compile(loss='mse', optimizer=keras.optimizers.RMSprop(lr=learning_rate, rho=gradient_momentum, epsilon=gradient_min))
When calling
q_net = Network(actionspace_size, learning_rate, gradient_momentum, gradient_min)
target_net = copy.deepcopy(q_net)
the following error arises:
Traceback (most recent call last):
File "DQN_tf_NoisyNet.py", line 315, in <module>
main()
File "DQN_tf_NoisyNet.py", line 252, in main
target_net = copy.deepcopy(q_net)
File "/usr/lib/python3.5/copy.py", line 182, in deepcopy
y = _reconstruct(x, rv, 1, memo)
File "/usr/lib/python3.5/copy.py", line 299, in _reconstruct
y.__setstate__(state)
File "/usr/local/lib/python3.5/dist-packages/keras/engine/network.py", line 1266, in __setstate__
model = saving.unpickle_model(state)
File "/usr/local/lib/python3.5/dist-packages/keras/engine/saving.py", line 435, in unpickle_model
return _deserialize_model(f)
File "/usr/local/lib/python3.5/dist-packages/keras/engine/saving.py", line 225, in _deserialize_model
model = model_from_config(model_config, custom_objects=custom_objects)
File "/usr/local/lib/python3.5/dist-packages/keras/engine/saving.py", line 458, in model_from_config
return deserialize(config, custom_objects=custom_objects)
File "/usr/local/lib/python3.5/dist-packages/keras/layers/__init__.py", line 55, in deserialize
printable_module_name='layer')
File "/usr/local/lib/python3.5/dist-packages/keras/utils/generic_utils.py", line 145, in deserialize_keras_object
list(custom_objects.items())))
File "/usr/local/lib/python3.5/dist-packages/keras/engine/network.py", line 1022, in from_config
process_layer(layer_data)
File "/usr/local/lib/python3.5/dist-packages/keras/engine/network.py", line 1008, in process_layer
custom_objects=custom_objects)
File "/usr/local/lib/python3.5/dist-packages/keras/layers/__init__.py", line 55, in deserialize
printable_module_name='layer')
File "/usr/local/lib/python3.5/dist-packages/keras/utils/generic_utils.py", line 138, in deserialize_keras_object
': ' + class_name)
ValueError: Unknown layer: NoisyLayer
I know that neither the network itself nor the deepcopy approach is the problem, since both work fine again as soon as I replace the custom NoisyLayers with standard dense layers.
Does anyone know how to copy a Tensorflow model including custom layers? Thanks in advance!
Found a solution:
The problem, again, was that Tensorflow/Keras didn't know how to interpret the custom layer. To tell it how to interpret the layer, one can use Keras's CustomObjectScope and copy the model within that scope as follows:
# Import
import copy
from keras.utils import CustomObjectScope
# Copy
with CustomObjectScope({"MyCustomLayer": MyCustomLayer}):
    model_copy = copy.deepcopy(model)
This takes care of the copying part. However, it only works out of the box as long as the custom layer's constructor (__init__(...)) takes no custom parameters.
I assume this is the case because, behind the scenes, copy.deepcopy() serializes the model and rebuilds it from its config through Keras's pickling support (the traceback above goes through __setstate__, unpickle_model and model_from_config), so the values of any additional constructor parameters have to be made available as well.
Suppose the beginning of the custom class looks as follows, where output_dim is one of the custom parameters referred to above:
class MyCustomLayer(keras.layers.Layer):
    def __init__(self, output_dim, **kwargs):
        self.output_dim = output_dim
        super(MyCustomLayer, self).__init__(**kwargs)
Then one has to add a method to the class MyCustomLayer that makes the custom constructor parameters persistent for saving and loading (and therefore for copying):
    def get_config(self):
        config = super(MyCustomLayer, self).get_config()
        # Specify here all the values for the constructor's parameters
        config['output_dim'] = self.output_dim
        return config
These two steps solved the problem in my case.