I have a GAN set up with TensorFlow and Keras, and training has been smooth. The only problem I am having is that when I try to convert the output tensors into images, I get an error.
Upon further inspection of the output tensor, I find that the image in RGB format contains negative numbers for channel values.
arr = generated_image.numpy()  # shape: (1, 128, 128, 3), a batch of one image
# Examining the first few pixels of the first row
print(arr[0][0][0])  # array([0.80051363, 0.55783302, 0.34086022])
print(arr[0][0][1])  # array([0.1622794 , 0.40731752, 0.41627714])
print(arr[0][0][2])  # array([-22.26079941, -17.90978622, -17.2147789 ])
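A quick way to confirm the range problem is to check the array's minimum and maximum directly. This is a minimal sketch using a small hypothetical array whose values mirror the pixels printed above, not the actual generator output:

```python
import numpy as np

# Hypothetical stand-in for a few pixels of the generated image;
# the negative channel values mirror the ones printed above.
arr = np.array([
    [0.80051363, 0.55783302, 0.34086022],
    [0.1622794,  0.40731752, 0.41627714],
    [-22.26079941, -17.90978622, -17.2147789],
], dtype=np.float64)

# A valid float image must lie entirely in [0, 1].
in_range = (arr >= 0).all() and (arr <= 1).all()
print(arr.min(), arr.max())
print(in_range)  # False: the tensor is not a valid [0, 1] float image
```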
After looking around on the site a bit, I found this question addressing the error I am having (traceback below). It said the array should be converted to 0-255 uint8 values rather than floats. That is why I believe the negative numbers in the array are the cause of the error.
KeyError                                  Traceback (most recent call last)
/opt/conda/lib/python3.7/site-packages/PIL/Image.py in fromarray(obj, mode)
   2927     try:
-> 2928         mode, rawmode = _fromarray_typemap[typekey]
   2929     except KeyError as e:

KeyError: ((1, 1, 3), '<f8')

The above exception was the direct cause of the following exception:

TypeError                                 Traceback (most recent call last)
/tmp/ipykernel_33/14769953.py in <module>
----> 1 img = Image.fromarray(arr[0])

/opt/conda/lib/python3.7/site-packages/PIL/Image.py in fromarray(obj, mode)
   2928         mode, rawmode = _fromarray_typemap[typekey]
   2929     except KeyError as e:
-> 2930         raise TypeError("Cannot handle this data type: %s, %s" % typekey) from e
   2931     else:
   2932         rawmode = mode

TypeError: Cannot handle this data type: (1, 1, 3), <f8
How did my output tensor include negative numbers and still be a valid image that matplotlib.pyplot.imshow() could display?
How can I adjust the GAN to output the image entirely with floats between 0 and 1?
Does this even matter? Is there another image encoding that is being used here? If so, how do I extract images from it?
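For context on both questions: matplotlib's imshow clips float RGB data to the [0, 1] range before rendering, which is why the out-of-range tensor still displayed, while PIL's Image.fromarray has no typemap entry for float64 RGB (the `<f8` in the error) and expects uint8 for mode "RGB". A hedged sketch of the conversion, using NumPy only (the `to_uint8_image` helper is hypothetical, not from the question's code):

```python
import numpy as np

def to_uint8_image(float_img):
    """Clip a float array to [0, 1] and rescale to 0-255 uint8,
    the layout PIL's Image.fromarray expects for mode 'RGB'."""
    clipped = np.clip(float_img, 0.0, 1.0)
    return (clipped * 255).astype(np.uint8)

# Hypothetical pixel data, including out-of-range values
arr = np.array([[[0.8, 0.5, -22.3], [1.7, 0.0, 0.4]]], dtype=np.float64)
img = to_uint8_image(arr)
print(img.dtype, img.min(), img.max())  # uint8 0 255
# PIL would now accept it: Image.fromarray(img) for an (H, W, 3) array
```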
The generator:
generator = keras.Sequential([
    keras.layers.Dense(8*8*64, use_bias=False, input_shape=(100,)),
    keras.layers.BatchNormalization(),
    keras.layers.LeakyReLU(),
    keras.layers.Reshape((8, 8, 64)),
    keras.layers.Conv2DTranspose(32, (3, 3), strides=(4, 4), use_bias=False, padding="same"),
    keras.layers.BatchNormalization(),
    keras.layers.LeakyReLU(),
    keras.layers.Conv2DTranspose(16, (3, 3), strides=(2, 2), use_bias=False, padding="same"),
    keras.layers.BatchNormalization(),
    keras.layers.LeakyReLU(),
    keras.layers.Conv2DTranspose(3, (3, 3), strides=(2, 2), use_bias=False, padding="same"),
])
Thanks to @Dr.Snoopy, I have found an answer.

The last layer of my model was missing an activation function to constrain the output to the [0, 1] range, which is why I was getting unbounded values. Simply adding a sigmoid activation to the final layer solved the problem.

keras.layers.Conv2DTranspose(3, (3, 3), strides=(2, 2), use_bias=False, padding="same", activation="sigmoid")
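As an aside, the common DCGAN convention is a tanh final activation with training images scaled to [-1, 1], in which case the generator output must be rescaled before display. A minimal sketch of that rescaling in NumPy (the helper name is hypothetical):

```python
import numpy as np

def tanh_output_to_uint8(x):
    """Map a tanh generator's [-1, 1] output back to 0-255 uint8
    (-1 -> 0, 0 -> 127, 1 -> 255, with truncation toward zero)."""
    return ((x + 1.0) * 127.5).astype(np.uint8)

fake = np.array([[-1.0, 0.0, 1.0]])
print(tanh_output_to_uint8(fake))
```

Either activation works; what matters is that the conversion to uint8 matches the range the final layer actually produces.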