
VNRectangleObservation corners compressed in x-axis on iPhone


I'm capturing video via my device's camera and feeding it to the Vision framework to perform rectangle detection. The code looks something like this (condensed for brevity; the omitted lines are not relevant to this question):

func captureOutput(_ output: AVCaptureOutput,
                   didOutput sampleBuffer: CMSampleBuffer,
                   from connection: AVCaptureConnection) {

    // Get a CIImage from the buffer
    guard let buffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return }
    let image = CIImage(cvImageBuffer: buffer)

    // Set up corner detector
    let handler = VNImageRequestHandler(ciImage: image, orientation: .up, options: [:])
    let request = VNDetectRectanglesRequest()

    // Perform corner detection
    do {
        try handler.perform([request])
        guard let observation = request.results?.first as? VNRectangleObservation else {
            print("error at \(#line)")
            return
        }
        handleCorners(observation)
    } catch {
        print("Error: \(error)")
        return
    }
}

This works just fine on an iPad Air 2, and I can use the corners in the observation object to draw a nice overlay. But on an iPhone X the corners in the x-axis are "compressed".

For example, if I capture an image with a business card that occupies almost the entire width of the screen, I would expect observation.topLeft to have an x value close to zero. Instead it's nearly 0.15. The same is true for the right-hand corners (expected: ~1.0, actual: ~0.85).

Any idea why this might be the case? The CIImage extent property is the same on both devices. It's just that Vision's corners are compressed in the x-axis.


Solution

  • I had a pretty similar problem detecting rectangles in real time using ARKit. After some investigation I saw this answer and figured out that: "The problem is that ARKit provides the image buffer (frame.capturedImage) with a camera resolution of 1920 x 1440. The screen of the iPhone X is 375 x 812 points. It seems like ARKit can see more than it can display on the phone screen." So I corrected the capturedImage size using the screen proportion, and this "solution" fixed my problem.
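    To sketch what "correcting using the screen proportion" can look like: Vision normalizes corners against the full camera buffer, but an aspect-filled preview crops the buffer's width on a tall screen, so buffer-normalized x values appear squeezed toward the center. The helper below remaps a buffer-normalized x value into the visible (on-screen) coordinate space. This is my own illustrative code, not the answer's exact fix — the function name and the assumption that the preview uses aspect-fill (height fills, width cropped) are mine.

    ```swift
    import CoreGraphics

    /// Remaps an x value that Vision normalized against the full camera
    /// buffer into the coordinate space of an aspect-filled preview.
    /// Assumes the preview fills the view's height and crops its width.
    func screenNormalizedX(_ bufferX: CGFloat,
                           imageSize: CGSize,  // buffer size in display orientation, e.g. 1440 x 1920
                           viewSize: CGSize)   // e.g. 375 x 812 on iPhone X
                           -> CGFloat {
        let imageAspect = imageSize.width / imageSize.height  // 0.75 for 1440 x 1920
        let viewAspect = viewSize.width / viewSize.height     // ~0.46 for 375 x 812
        // Fraction of the buffer's width that is actually visible on screen.
        let visibleFraction = viewAspect / imageAspect
        // Re-center around 0.5, expand by the visible fraction, re-center back.
        return (bufferX - 0.5) / visibleFraction + 0.5
    }
    ```

    With a 1440 x 1920 buffer and a 375 x 812 screen, only about 62% of the buffer's width is visible, so a corner reported at x ≈ 0.15 maps back to roughly the left edge of the screen — consistent with the ~0.15 vs. ~0 discrepancy described in the question. The y axis needs no correction in this configuration because the full buffer height is shown.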