Hope everyone's day (or night) is going well.
I've been playing around with a Caffe model I came across, and I've been having some trouble working with the output array. I haven't worked with segmentation before so this may be a simple fix for someone more knowledgeable on the subject.
The model is based on the paper Deep Joint Task Learning for Generic Object Extraction. I have converted the model to CoreML format.
The issue I have is this:
When I try to create a PIL image from the output, I get what looks like random noise. I think it's a simple issue of the numpy array being mis-shaped or the pixels being in the wrong order. The output array has shape (2500, 1), and it's supposed to be a 50x50 black and white image.
Code looks like this:
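import numpy
from PIL import Image

# model is the converted CoreML model, loaded earlier
image = Image.open('./1.jpg')
image = image.resize((55, 55), Image.ANTIALIAS)
predictions = model.predict({'data_55': image}, useCPUOnly=False)
predictions = predictions['fc8_seg']  # output array of shape (2500, 1)
reshape_array = numpy.reshape(predictions, (50, 50))
output_image = Image.fromarray(reshape_array, '1')  # the result looks like noise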
I've tried both F and C orders in the numpy reshape and can't get anything other than noise. I'm using one of the test images provided in the original repo, so the input shouldn't be the problem. As a side note, the values in the array look like this:
[[ 4.55798066e-08 5.40980977e-07 2.13476710e-06 ..., 6.66990445e-08
6.81615759e-08 3.21255470e-07]
[ 2.69358861e-05 1.94866928e-07 4.71876803e-07 ..., 1.25911642e-10
3.14572794e-08 1.61371077e-08]
Any thoughts or answers would be much appreciated and helpful. Thanks ahead of time!
Looks like I was able to figure this out. It wasn't an issue with the order of the array, but with the values and data type. Here is the code I put together to get a proper image from the output.
predictions = model.predict({'data_55': image} , useCPUOnly = True) # Run the prediction
map_final = predictions['fc8_seg'][0,0,:,:] # fc8_seg is the output of the neural network
map_final = map_final.reshape((50,50)) # Reshape the output from shape (2500) to (50, 50)
map_final = numpy.flip(map_final, 1) # Flip axis 1 to unmirror the image
# Scale the values in the array to a range between 0 and 255
map_final -= map_final.min()
map_final /= map_final.max()
map_final = numpy.ceil(map_final*255)
map_final_unint8 = map_final.astype(numpy.uint8) # Convert to uint8, the data type PIL's 'L' mode expects
pil_image = Image.fromarray(map_final_unint8, mode = 'L') # Create the PIL image
With that, the output image looks just as it should!
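For anyone who wants to try the scaling step without the model, here is a minimal sketch on synthetic data. The function name `scale_to_uint8` and the random `fake_output` array are made up for illustration and are not part of the original code. It uses rounding instead of `numpy.ceil`, guards against a constant array (which would otherwise divide by zero), and leaves out the flip, so treat it as a variation on the approach above rather than the exact same thing.

```python
import numpy as np
from PIL import Image


def scale_to_uint8(score_map):
    """Linearly rescale a float array to integers in 0..255 (uint8)."""
    # Convert to float64; the caller's array is never modified in place.
    arr = np.asarray(score_map, dtype=np.float64)
    lo, hi = arr.min(), arr.max()
    if hi == lo:
        # Constant map: avoid dividing by zero and return all black.
        return np.zeros(arr.shape, dtype=np.uint8)
    scaled = (arr - lo) / (hi - lo)  # now in the range 0..1
    return np.round(scaled * 255).astype(np.uint8)


# Hypothetical stand-in for the network output: 2500 tiny float values.
rng = np.random.default_rng(0)
fake_output = rng.random((2500, 1), dtype=np.float32) * 1e-5

pixels = scale_to_uint8(fake_output.reshape(50, 50))
img = Image.fromarray(pixels)  # a 2-D uint8 array is read as mode 'L'
```

The point is the same as in the answer: values around 1e-8 carry no visible contrast until they are stretched to the 0..255 range and stored as uint8, which is the layout PIL expects for a grayscale image.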