computer-vision, artificial-intelligence, faster-rcnn

What kind of images are used for training in Mask R-CNN (only 8-bit, or also 16-bit or any bit depth)?


I have a question about the images used to train Mask R-CNN. Does Mask R-CNN accept only 8-bit images for training? If it also accepts 16-bit or 32-bit images, how does that help training? Visualization is usually done with 8-bit images, so I am unsure how processing 16-bit images would help with classification and mapping.


Solution

  • As long as all input images share the same data type and a consistent intensity range, it should be fine. For example, if you prefer 8-bit images, rescale the 16-bit and 32-bit images to 8 bit, i.e. the input images should be of type uint8 with values in [0, 255]. This kind of preprocessing is needed for both training and inference with most machine learning models.

    In one of the examples by matterport/Mask_RCNN, the input images are of type uint8.

    Alternatively, why not cast the images to float and scale them to the range [0, 1]? That keeps the finer intensity resolution of 16-bit and 32-bit images instead of quantizing it down to 256 levels. Hope this helps.
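
Both options can be sketched with NumPy. This is only a minimal sketch, not code from matterport/Mask_RCNN: the helper names `to_uint8` and `to_unit_float` are made up for illustration, the input is assumed to have an integer dtype (e.g. uint8, uint16, uint32), and the scaling assumes the full range of that dtype rather than the actual min/max of each image.

```python
import numpy as np


def _normalize(image):
    """Map an integer image onto [0, 1] using the full range of its dtype."""
    info = np.iinfo(image.dtype)  # raises ValueError for non-integer dtypes
    return (image.astype(np.float64) - info.min) / (info.max - info.min)


def to_uint8(image):
    """Hypothetical helper: rescale any integer image to uint8 in [0, 255]."""
    return np.round(_normalize(image) * 255).astype(np.uint8)


def to_unit_float(image):
    """Hypothetical helper: rescale any integer image to float32 in [0, 1]."""
    return _normalize(image).astype(np.float32)


# A tiny 16-bit example image: black, mid-gray, white.
img16 = np.array([[0, 32768, 65535]], dtype=np.uint16)

img8 = to_uint8(img16)        # dtype uint8, values in [0, 255]
imgf = to_unit_float(img16)   # dtype float32, values in [0, 1]
```

Whichever variant you pick, apply the same function to every image at both training and inference time so that the intensity range stays consistent.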