
Conv2D produces weird output


I'm trying to apply a Laplace filter to an image via TensorFlow's tf.nn.conv2d, but the output is completely wrong and I don't have a clue what I did wrong.

I load my picture via:

import numpy as np
import tensorflow as tf

file = tf.io.read_file("corgi.jpg")
uint_image = tf.io.decode_jpeg(file, 1)
image = tf.cast(uint_image, tf.float32)
kernel = tf.constant(np.array([[1, 1, 1],
                               [1, -8, 1],
                               [1, 1, 1]]), dtype=tf.float32)
convoluted_image = convoluteTest(image, kernel)
rs_convoluted_image = tf.reshape(convoluted_image,
                                 [tf.shape(image)[0] - tf.shape(kernel)[0] + 1,
                                  tf.shape(image)[1] - tf.shape(kernel)[1] + 1, 1])
casted_image = tf.cast(rs_convoluted_image, tf.uint8)
encoded = tf.io.encode_jpeg(casted_image)
tf.io.write_file("corgi-tensor-laplace.jpg", encoded)

But the image tensor can't be passed directly to tf.nn.conv2d, since that function requires a 4-D input tensor of shape [batch, height, width, channels].
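For context, the shape change conv2d needs can be sketched in plain NumPy (the names here are illustrative, not from the question's code):

```python
import numpy as np

# A single-channel image as a 2-D array (height x width).
img = np.zeros((480, 640), dtype=np.float32)

# Add batch and channel axes: [H, W] -> [1, H, W, 1], the layout
# tf.nn.conv2d expects for its input.
batched = img[np.newaxis, :, :, np.newaxis]
print(batched.shape)  # (1, 480, 640, 1)
```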

This function here reshapes and applies my laplace filter:

def convoluteTest(image_tensor, kernel_tensor):
    # Add batch and channel dimensions: [H, W] -> [1, H, W, 1]
    shape = tf.shape(image_tensor)
    reshaped_image_tensor = tf.reshape(image_tensor, [1, shape[0], shape[1], 1])
    # Kernel layout for conv2d: [filter_height, filter_width, in_channels, out_channels]
    k = tf.shape(kernel_tensor)[0]
    reshaped_kernel_tensor = tf.reshape(kernel_tensor, [k, k, 1, 1])
    return tf.nn.conv2d(reshaped_image_tensor, reshaped_kernel_tensor,
                        strides=[1, 1, 1, 1], padding='VALID')
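As a sanity check on what the convolution itself should produce, here is a naive NumPy sketch of a 'VALID' sliding-window convolution (tf.nn.conv2d computes a correlation without flipping the kernel, which makes no difference for this symmetric Laplace kernel; `conv2d_valid` is my own helper, not a TensorFlow API):

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Naive 'VALID' 2-D correlation, matching tf.nn.conv2d's sliding window."""
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1
    ow = image.shape[1] - kw + 1
    out = np.empty((oh, ow), dtype=np.float32)
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

laplace = np.array([[1, 1, 1],
                    [1, -8, 1],
                    [1, 1, 1]], dtype=np.float32)

# A flat region gives zero response (the kernel's weights sum to zero),
# and 'VALID' padding shrinks each dimension by kernel_size - 1.
flat = np.full((5, 5), 7.0, dtype=np.float32)
print(conv2d_valid(flat, laplace).shape)  # (3, 3)
```

The zero response on flat regions is also why the filtered image is mostly dark: the interesting values sit near zero, with edges producing large positive and negative spikes.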

Original Picture:


Failed laplace:


Update:

Greyish output:


What did I do wrong? I can't wrap my head around this...


Solution

  • I believe the problem is that casted_image = tf.cast(rs_convoluted_image, tf.uint8) mangles data outside of [0, 255]: the cast does not clamp, so the negative Laplacian responses and any values above 255 are corrupted rather than saturated to black or white.

    I think you are missing a normalization step back to the [0, 255] range before casting to uint8.

    Try

    normalized_convolved = (rs_convoluted_image - tf.reduce_min(rs_convoluted_image)) / (
        tf.reduce_max(rs_convoluted_image) - tf.reduce_min(rs_convoluted_image))
    normalized_convolved = normalized_convolved * 255

    casted_image = tf.cast(normalized_convolved, tf.uint8)
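The normalization step can be verified in plain NumPy (the helper name `normalize_to_uint8` is mine; TensorFlow's reduce_min/reduce_max compute the same statistics):

```python
import numpy as np

def normalize_to_uint8(x):
    """Min-max rescale an array onto [0, 255], then cast safely to uint8."""
    x = x.astype(np.float32)
    lo, hi = x.min(), x.max()
    scaled = (x - lo) / (hi - lo) * 255.0
    return scaled.astype(np.uint8)

# Laplacian output ranges far outside [0, 255] in both directions;
# after normalization the full range maps onto displayable grey levels.
response = np.array([[-512.0, 0.0],
                     [128.0, 510.0]])
out = normalize_to_uint8(response)
print(out.min(), out.max())  # 0 255
```

Because the minimum and maximum are recomputed per image, overall brightness will vary from picture to picture; that is the trade-off of min-max normalization and a likely cause of the greyish look rather than a bug.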