I have a convolutional autoencoder model. While an autoencoder typically focuses on reconstructing the input without using any label information, I want to use the class label to perform class-conditional scaling/shifting after convolutions. I am curious whether utilizing the label in this way might help produce better reconstructions.
import tensorflow as tf
from tensorflow.keras import layers

num_filters = 32
input_img = layers.Input(shape=(28, 28, 1))  # input image
label = layers.Input(shape=(10,))  # one-hot class label
# separate scale value for each of the filter dimensions
scale = layers.Dense(num_filters, activation=None)(label)
# conv_0 produces something of shape (None, 14, 14, 32)
conv_0 = layers.Conv2D(num_filters, (3, 3), strides=2, activation=None, padding='same')(input_img)
# TODO: Need help here. Multiply conv_0 by scale along the filter dimension.
# The result should still have shape (None, 14, 14, 32).
# Essentially, each 14x14 feature map gets its own scalar multiplier.
In the example above, the output of the convolutional layer has shape (14, 14, 32) and the scale tensor has shape (32,). I want each filter dimension of the convolutional output to be multiplied by the corresponding scale value. For example, if these were NumPy arrays I could do something like conv_0[:, :, i] * scale[i] for i in range(32).
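For concreteness, here is a minimal NumPy sketch of that loop, using random arrays as stand-ins for the real tensors (illustration only, not part of the model):

import numpy as np

conv_out = np.random.rand(14, 14, 32)  # stand-in for one conv_0 activation
scale = np.random.rand(32)             # stand-in for the per-filter scales

# loop version: each 14x14 feature map gets its own scalar multiplier
scaled = np.empty_like(conv_out)
for i in range(32):
    scaled[:, :, i] = conv_out[:, :, i] * scale[i]

# broadcasting over the trailing axis gives the same result without a loop
assert np.allclose(scaled, conv_out * scale)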
I looked at tf.keras.layers.Multiply, but based on the documentation I believe it takes a list of input tensors that all have the same shape. How do I work around this?
You don't have to loop. Make the two tensors broadcast-compatible and multiply them:

# scale has shape (None, 32); expand it to (None, 1, 1, 32) so it
# broadcasts against conv_0, which has shape (None, 14, 14, 32)
out = layers.Multiply()([conv_0, tf.expand_dims(tf.expand_dims(scale, axis=1), axis=1)])
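As a quick sanity check, here is a minimal sketch that wires this into a model and confirms the output shape (the decoder and training loop are omitted; this only verifies the broadcast multiply):

import tensorflow as tf
from tensorflow.keras import layers

num_filters = 32
input_img = layers.Input(shape=(28, 28, 1))
label = layers.Input(shape=(10,))
scale = layers.Dense(num_filters, activation=None)(label)
conv_0 = layers.Conv2D(num_filters, (3, 3), strides=2, activation=None, padding='same')(input_img)
# (None, 32) -> (None, 1, 1, 32), then broadcast-multiply with (None, 14, 14, 32)
out = layers.Multiply()([conv_0, tf.expand_dims(tf.expand_dims(scale, axis=1), axis=1)])

model = tf.keras.Model(inputs=[input_img, label], outputs=out)
print(model.output_shape)  # (None, 14, 14, 32)

The same trick should work for the class-conditional shift mentioned in the question: produce a second Dense output and combine it with layers.Add() after the same pair of expand_dims calls.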