Tags: tensorflow, machine-learning, object-detection, tensorboard, training-data

Images get rotated during training


I am trying to train an ssd_mobilenet_v2_keras model for object detection on a dataset of roughly 6000 images. The problem is that some images appear to be randomly rotated during training (at least, that is how they look in TensorBoard). This is the configuration I am using in the pipeline.config file:

train_config {
  batch_size: 32
  data_augmentation_options {
    random_horizontal_flip {
    }
  }

  data_augmentation_options {
    random_rgb_to_gray {
        probability: 0.25
    }
  }

  data_augmentation_options {
    random_jpeg_quality {
        random_coef: 0.8
        min_jpeg_quality: 50
        max_jpeg_quality: 100
    }
  }

  sync_replicas: true
  optimizer {
    adam_optimizer: {
      epsilon: 1e-7
      learning_rate: {
        cosine_decay_learning_rate {
          learning_rate_base: 1e-3
          total_steps: 50000
          warmup_learning_rate: 2.5e-4
          warmup_steps: 5000
        }
      }
    }
    use_moving_average: false
  }
  fine_tune_checkpoint: "pre-trained-models/ssd_mobilenet_v2_320x320_coco17_tpu-8/checkpoint/ckpt-0"
  num_steps: 50000
  startup_delay_steps: 0.0
  replicas_to_aggregate: 8
  max_number_of_boxes: 100
  unpad_groundtruth_tensors: false
  fine_tune_checkpoint_type: "detection"
  fine_tune_checkpoint_version: V2
}

I also tried removing the random horizontal flip (I knew it probably would not solve anything, but I gave it a try), and nothing changed: I still see some training images rotated in TensorBoard, and when I run the evaluation some of the images are rotated there as well. The XML files with the bounding box coordinates are of course not "rotated", so the ground truth shown in TensorBoard is completely wrong: the object is in one position and the ground truth box is somewhere else (the right position if the image were not rotated).


Solution

  • This may be long overdue, but I had the same problem and it took me weeks to figure out, so I want to share my solution. Sorry for the rough formatting, I'm still learning.

    The cause is the Exif orientation metadata: smartphones and digital cameras record how the camera was held in an Exif tag instead of rotating the pixels, and software on a computer does not always read that tag the same way.

    1. First, check the orientation stored in each image.
        # Determine image orientation using Exif metadata
        from pathlib import Path
        from PIL import Image

        IMAGE_PATHS = 'path/to/your/images'

        # Meaning of the most common values of the Exif orientation tag (0x0112)
        ORIENTATIONS = {
            1: "Normal (0 degrees)",
            3: "Upside down (180 degrees)",
            6: "Rotated 90 degrees clockwise",
            8: "Rotated 90 degrees counterclockwise",
            # Add more cases as needed for other orientation values
        }

        for image_path in Path(IMAGE_PATHS).glob('*.jpg'):
            with Image.open(image_path) as img:
                # getexif() returns an empty mapping when the image has no Exif metadata
                exif_data = img.getexif()

                if not exif_data:
                    print(image_path.name, "No Exif data found in the image.")
                elif 0x0112 not in exif_data:
                    print(image_path.name, "No orientation tag found in Exif data.")
                else:
                    # Read the orientation value and print its meaning
                    orientation = exif_data[0x0112]
                    print(image_path.name,
                          ORIENTATIONS.get(orientation, f"Other orientation value: {orientation}"))
    

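    If any image reports something other than 'Normal', its pixels are stored rotated even though it looks upright on your screen. That is because Windows and most other image viewers apply the Exif orientation tag when they display the image, while the training pipeline apparently reads the stored pixels as they are. The bounding boxes were drawn on the upright view, so they no longer match.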
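
    To see the mismatch described above on a single file, the stored size can be compared with the size after the Exif orientation is applied. This is only a sketch, assuming Pillow 6.0 or newer; `raw_and_displayed_size` is a hypothetical helper name, not part of any library. Note that it only reveals 90-degree cases, since a 180-degree rotation leaves width and height unchanged.

```python
from PIL import Image, ImageOps


def raw_and_displayed_size(image_path):
    """Return ((w, h) as stored, (w, h) after applying the Exif orientation).

    Hypothetical helper for illustration only.
    """
    with Image.open(image_path) as img:
        # Size of the pixel data exactly as it is stored in the file
        raw_size = img.size
        # exif_transpose() returns a copy rotated/flipped according to the
        # Exif orientation tag, i.e. roughly what an image viewer displays
        displayed_size = ImageOps.exif_transpose(img).size
    return raw_size, displayed_size
```

    If the two sizes differ for an image, the viewer you annotated in and the stored pixels disagree about which way is up.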

    2. Correct the Exif orientation metadata
        import os
        import cv2
        import numpy as np
        from PIL import Image

        image_dir = 'path/to/your/image'  # folder must contain images only

        for file_name in os.listdir(image_dir):
            file_path = os.path.join(image_dir, file_name)

            # Load the stored pixels as an RGB array (the Exif orientation is not applied here)
            with Image.open(file_path) as img:
                image_np = np.array(img.convert('RGB'))

            # shape is (height, width, channels). This assumes every photo is meant to be
            # portrait, so an array that is wider than it is tall is still stored sideways.
            if image_np.shape[0] >= image_np.shape[1]:
                continue

            # Rotate 90 degrees clockwise, which is the correction for Exif orientation 6
            image_np = np.rot90(image_np, 3).copy()

            # cv2.imwrite expects BGR and writes no Exif data, so the orientation tag is dropped
            cv2.imwrite(file_path, cv2.cvtColor(image_np, cv2.COLOR_RGB2BGR))
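
    The rotation step above relies on every photo being portrait. As a possible alternative, not the approach used in this answer, Pillow's `ImageOps.exif_transpose` can apply whatever orientation the tag actually holds. The sketch below assumes Pillow 6.0 or newer and JPEG files; `normalize_orientation` is a hypothetical helper name and `quality=95` is an arbitrary choice.

```python
from pathlib import Path

from PIL import Image, ImageOps

ORIENTATION_TAG = 0x0112  # Exif tag number for image orientation


def normalize_orientation(image_dir, pattern="*.jpg"):
    """Rewrite images whose Exif orientation is not 'normal' so that the
    stored pixels are upright. Returns the names of the rewritten files.

    Hypothetical helper for illustration only.
    """
    fixed = []
    for path in sorted(Path(image_dir).glob(pattern)):
        with Image.open(path) as img:
            # A missing tag is treated as 1, i.e. already upright
            orientation = img.getexif().get(ORIENTATION_TAG, 1)
            if orientation == 1:
                continue
            # Returns a rotated/flipped copy with the pixels upright
            upright = ImageOps.exif_transpose(img)
        # Overwrite the original file; no Exif data is passed to save(),
        # so the written JPEG carries no orientation tag
        upright.save(path, quality=95)
        fixed.append(path.name)
    return fixed
```

    Calling `normalize_orientation('path/to/your/images')` once before generating the training records should be enough. Keep a backup of the originals, because the files are overwritten and re-encoding a JPEG loses a little quality.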
    

    After that, run the training again and check 'train_input_image' in TensorBoard to verify that the training images are now oriented correctly. Hope this helps.