swift firebase firebase-mlkit

Firebase ML Kit misaligned bounding box


I'm trying to use the new Detect and Track Objects feature of ML Kit on iOS, but I've run into a roadblock with the object detection bounding box.

Using a Lego figure as an example: the image orientation is always converted to .up, as per the documentation, yet the bounding box appears to be rotated 90 degrees relative to the correct dimensions. Similar behaviour occurs with other objects too, with the box being offset.

let options = VisionObjectDetectorOptions()
options.detectorMode = .singleImage
options.shouldEnableMultipleObjects = false

let objectDetector = Vision.vision().objectDetector(options: options)

let image = VisionImage(image: self.originalImage)

objectDetector.process(image) { detectedObjects, error in
  // Bail out if the detector reported an error.
  guard error == nil else {
    print(error as Any)
    return
  }
  guard let detectedObjects = detectedObjects, !detectedObjects.isEmpty else {
    print("No objects detected")
    return
  }

  // Only the first detected object is drawn.
  guard let primaryObject = detectedObjects.first else { return }
  print(primaryObject)

  let objectFrame = primaryObject.frame
  print(objectFrame)

  self.imageView.image = self.drawOccurrencesOnImage([objectFrame], self.originalImage)
}

And the function that draws the red box:

private func drawOccurrencesOnImage(_ occurrences: [CGRect], _ image: UIImage) -> UIImage? {
    let imageSize = image.size
    // A scale of 0.0 makes the context use the device's main screen scale.
    let scale: CGFloat = 0.0
    UIGraphicsBeginImageContextWithOptions(imageSize, false, scale)
    // End the context on every exit path, including a nil result below.
    defer { UIGraphicsEndImageContext() }

    image.draw(at: CGPoint.zero)
    let ctx = UIGraphicsGetCurrentContext()

    // Outline each rect in red.
    ctx?.addRects(occurrences)
    ctx?.setStrokeColor(UIColor.red.cgColor)
    ctx?.setLineWidth(20)
    ctx?.strokePath()

    return UIGraphicsGetImageFromCurrentImageContext()
}


The image dimensions, according to image.size, are (3024.0, 4032.0) and the box frame is (1274.0, 569.0, 1299.0, 2023.0). Any insight into this behaviour would be much appreciated.
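A minimal sketch of a sanity check on those numbers, assuming the frame is reported in the same top-left-origin coordinate space as image.size. The helper names are hypothetical and not part of ML Kit. The check only rules out an out-of-bounds frame; it does not by itself explain the offset.

```swift
import Foundation

// Values reported in the question.
let reportedImageSize = CGSize(width: 3024.0, height: 4032.0)
let reportedFrame = CGRect(x: 1274.0, y: 569.0, width: 1299.0, height: 2023.0)

/// Hypothetical helper: true when `frame` lies entirely inside an image of
/// `imageSize` whose origin is the top-left corner.
func frameLiesWithinImage(_ frame: CGRect, imageSize: CGSize) -> Bool {
    let imageBounds = CGRect(origin: .zero, size: imageSize)
    return imageBounds.contains(frame)
}

/// Hypothetical helper: fraction of the image area covered by `frame`.
func coveredFraction(_ frame: CGRect, imageSize: CGSize) -> CGFloat {
    return (frame.width * frame.height) / (imageSize.width * imageSize.height)
}

print(frameLiesWithinImage(reportedFrame, imageSize: reportedImageSize))
print(coveredFraction(reportedFrame, imageSize: reportedImageSize))
```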


Solution

  • It turned out I was not scaling the image properly, which caused the misalignment.

    This function ended up fixing my problem.

    public func updateImageView(with image: UIImage) {
      let orientation = UIApplication.shared.statusBarOrientation
      var scaledImageWidth: CGFloat = 0.0
      var scaledImageHeight: CGFloat = 0.0
      switch orientation {
      case .portrait, .portraitUpsideDown, .unknown:
        // Fit the view's width and derive the height from the aspect ratio.
        scaledImageWidth = imageView.bounds.size.width
        scaledImageHeight = image.size.height * scaledImageWidth / image.size.width
      case .landscapeLeft, .landscapeRight:
        // Fit the view's height first; the width calculation depends on it.
        scaledImageHeight = imageView.bounds.size.height
        scaledImageWidth = image.size.width * scaledImageHeight / image.size.height
      }
      DispatchQueue.global(qos: .userInitiated).async {
        // Scale image while maintaining aspect ratio so it displays better in the UIImageView.
        // `scaledImage(with:)` is not a UIKit method; it is assumed to be a
        // UIImage extension defined elsewhere in the project.
        let scaledImage = image.scaledImage(
          with: CGSize(width: scaledImageWidth, height: scaledImageHeight)
        )
        // Fall back to the original image if scaling fails.
        let finalImage = scaledImage ?? image
        DispatchQueue.main.async {
          self.imageView.image = finalImage
          self.processImage(finalImage)
        }
      }
    }
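The size arithmetic in the function above can be isolated and checked on its own. The following is a minimal sketch, not the author's code: `aspectScaledSize` is a hypothetical helper, the 375 x 667 view size is an assumed example value, and the `scaledImage(with:)` extension is only one possible stand-in for the helper the answer calls, whose real implementation is not shown.

```swift
import Foundation

/// Hypothetical helper: aspect-preserving size that fits the view's width
/// (portrait) or the view's height (landscape).
func aspectScaledSize(for imageSize: CGSize, in bounds: CGSize, isLandscape: Bool) -> CGSize {
    guard imageSize.width > 0, imageSize.height > 0 else { return .zero }
    if isLandscape {
        let height = bounds.height
        return CGSize(width: imageSize.width * height / imageSize.height, height: height)
    } else {
        let width = bounds.width
        return CGSize(width: width, height: imageSize.height * width / imageSize.width)
    }
}

#if canImport(UIKit)
import UIKit

extension UIImage {
    /// Assumed stand-in for the `scaledImage(with:)` helper used in the answer.
    /// Redraws the image into a new bitmap of the requested size.
    func scaledImage(with size: CGSize) -> UIImage? {
        guard size.width > 0, size.height > 0 else { return nil }
        let renderer = UIGraphicsImageRenderer(size: size)
        return renderer.image { _ in
            self.draw(in: CGRect(origin: .zero, size: size))
        }
    }
}
#endif

// Example with the image size from the question and an assumed view size.
let photoSize = CGSize(width: 3024.0, height: 4032.0)
let portraitSize = aspectScaledSize(
    for: photoSize, in: CGSize(width: 375.0, height: 667.0), isLandscape: false)
print(portraitSize)
```

Because detection then runs on the same scaled image that the image view displays, the frames the detector returns are in that image's coordinate space rather than in the 3024 x 4032 space of the original photo.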