
Why is my Image CIPerspectiveCorrection different on each device?


I want to apply CIPerspectiveCorrection to an image taken from an AVCaptureSession. But the resulting corrected image is different on each device, even though the same code runs on all of them. This is my code to get the points:

targetRectLayer is the layer I draw a rectangle into, to highlight the detected rectangle; scannerView is the view in which I show the video session.

let request = VNDetectRectanglesRequest { req, error in
    DispatchQueue.main.async {
        if let observation = req.results?.first as? VNRectangleObservation {
            let points = self.targetRectLayer.drawTargetRect(observation: observation, previewLayer: self.previewLayer, animated: false)
            let size = self.scannerView.frame.size
            self.trackedTopLeftPoint = CGPoint(x: points.topLeft.x / size.width, y: points.topLeft.y / size.height )
            self.trackedTopRightPoint = CGPoint(x: points.topRight.x / size.width, y: points.topRight.y / size.height )
            self.trackedBottomLeftPoint = CGPoint(x: points.bottomLeft.x / size.width, y: points.bottomLeft.y / size.height )
            self.trackedBottomRightPoint = CGPoint(x: points.bottomRight.x / size.width, y: points.bottomRight.y / size.height )
        } else {
            _ = self.targetRectLayer.drawTargetRect(observation: nil, previewLayer: self.previewLayer, animated: false)
        }
    }
}
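The divisions by the view size above just map layer points into unit coordinates so that later math is independent of the concrete view size. Factored out as a small helper (the function name is my own, not part of the original code), the step looks like this:

```swift
import Foundation

// Map a point in view/layer coordinates to unit (0...1) coordinates,
// so the downstream math does not depend on the concrete view size.
func normalized(_ point: CGPoint, in size: CGSize) -> CGPoint {
    return CGPoint(x: point.x / size.width, y: point.y / size.height)
}

let size = CGSize(width: 375, height: 812)            // e.g. an iPhone X view
let topLeft = normalized(CGPoint(x: 75, y: 203), in: size)
// topLeft is now (0.2, 0.25), regardless of the device's point size
```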

And

func drawTargetRect(observation: VNRectangleObservation?, previewLayer: AVCaptureVideoPreviewLayer?, animated: Bool) -> (topLeft: CGPoint, topRight: CGPoint, bottomLeft: CGPoint, bottomRight: CGPoint) {
    guard let observation = observation, let previewLayer = previewLayer else {
        draw(path: nil, animated: false)
        return (topLeft: CGPoint.zero, topRight: CGPoint.zero, bottomLeft: CGPoint.zero, bottomRight: CGPoint.zero)
    }

    let convertedTopLeft: CGPoint = previewLayer.layerPointConverted(fromCaptureDevicePoint: CGPoint(x: observation.topLeft.x, y: 1 - observation.topLeft.y))
    let convertedTopRight: CGPoint = previewLayer.layerPointConverted(fromCaptureDevicePoint: CGPoint(x: observation.topRight.x, y: 1 - observation.topRight.y))
    let convertedBottomLeft: CGPoint = previewLayer.layerPointConverted(fromCaptureDevicePoint: CGPoint(x: observation.bottomLeft.x, y: 1 - observation.bottomLeft.y))
    let convertedBottomRight: CGPoint = previewLayer.layerPointConverted(fromCaptureDevicePoint: CGPoint(x: observation.bottomRight.x, y: 1 - observation.bottomRight.y))

    let rectanglePath = UIBezierPath()
    rectanglePath.move(to: convertedTopLeft)
    rectanglePath.addLine(to: convertedTopRight)
    rectanglePath.addLine(to: convertedBottomRight)
    rectanglePath.addLine(to: convertedBottomLeft)
    rectanglePath.close()

    draw(path: rectanglePath, animated: animated)

    return (topLeft: convertedTopLeft, topRight: convertedTopRight, bottomLeft: convertedBottomLeft, bottomRight: convertedBottomRight)
}

This is where I map the tracked points to their new positions to continue towards the CIPerspectiveCorrection:

let imageTopLeft: CGPoint = CGPoint(x: image.size.width * trackedBottomLeftPoint.x, y: trackedBottomLeftPoint.y * image.size.height)
let imageTopRight: CGPoint = CGPoint(x: image.size.width * trackedTopLeftPoint.x, y: trackedTopLeftPoint.y * image.size.height)
let imageBottomLeft: CGPoint = CGPoint(x: image.size.width * trackedBottomRightPoint.x, y: trackedBottomRightPoint.y * image.size.height)
let imageBottomRight: CGPoint = CGPoint(x: image.size.width * trackedTopRightPoint.x, y: trackedTopRightPoint.y * image.size.height)

When applying the CIPerspectiveCorrection, the cartesian coordinate system is taken care of with:

extension UIImage {
    func cartesianForPoint(point: CGPoint, extent: CGRect) -> CGPoint {
        return CGPoint(x: point.x, y: extent.height - point.y)
    }
}
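For reference, the filter application itself can be sketched roughly as follows (my own helper, not the original code; the four points must already be flipped into Core Image's bottom-left-origin cartesian system, which is what cartesianForPoint above produces, and in the app the result would be wrapped back into a UIImage via UIImage(ciImage:)):

```swift
import CoreImage

// Sketch: crop a quad out of a CIImage with CIPerspectiveCorrection.
// The corner points are expected in Core Image's cartesian coordinate
// system (origin at the bottom-left of the image extent).
func perspectiveCorrected(_ image: CIImage,
                          topLeft: CGPoint, topRight: CGPoint,
                          bottomLeft: CGPoint, bottomRight: CGPoint) -> CIImage? {
    guard let filter = CIFilter(name: "CIPerspectiveCorrection") else { return nil }
    filter.setValue(image, forKey: kCIInputImageKey)
    filter.setValue(CIVector(cgPoint: topLeft), forKey: "inputTopLeft")
    filter.setValue(CIVector(cgPoint: topRight), forKey: "inputTopRight")
    filter.setValue(CIVector(cgPoint: bottomLeft), forKey: "inputBottomLeft")
    filter.setValue(CIVector(cgPoint: bottomRight), forKey: "inputBottomRight")
    return filter.outputImage
}
```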

All other math is based on those four self.tracked…Points set here. Here are some pictures of the differences I mean; note the areas that were not cut off. The highlighted target rect is drawn perfectly on the edges of the document. These results are consistent on each device. All pictures were taken in portrait, running iOS 12.1.3.

Why are these images different? The calculations use proportional values, no hardcoded values, and are independent of the aspect ratio.

iPhone X, see space left and right:

[image: iphonex]

iPad Air 2, see space top and bottom:

[image: ipadair2]

iPhone 5s, this is how I expected all of them:

[image: iphone5s]


Solution

  • I found a misconfiguration of the videoGravity property. It was set to .resizeAspectFill, which scales the video until it fills the view and therefore pushes one of the video's sides past the view's border. Since every device leaves a different amount of space to fill, the cropped-away area differed per device, which produced the described behavior when taking a picture of the video stream at full resolution rather than just the visible part. I focused too much on the math and missed that possibility.

    The perfect fit on the iPhone 5s was pure coincidence.

    This is what the corrected code looks like:

    // Set up the view's preview layer
    let previewLayer = AVCaptureVideoPreviewLayer(session: newCaptureSession)
    self.previewLayer = previewLayer
    previewLayer.videoGravity = .resize
    self.scannerView.layer.addSublayer(previewLayer)
    self.scannerView.layer.addSublayer(targetRectLayer)
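The per-device differences can be reproduced with a little aspect-ratio math (a sketch with assumed sizes: typical view sizes in points and a 1080×1920 portrait frame; the helper name is my own). With .resizeAspectFill, the video is scaled until it covers the view, so the fraction of the frame that is actually visible depends on how the view's aspect ratio compares to the frame's:

```swift
import Foundation

// Fraction of the capture frame that is visible in the preview when
// videoGravity == .resizeAspectFill: the video is scaled by the LARGER
// of the two axis ratios, so the other axis overflows the view.
func visibleFraction(view: CGSize, video: CGSize) -> (x: CGFloat, y: CGFloat) {
    let scale = max(view.width / video.width, view.height / video.height)
    return (x: view.width / (video.width * scale),
            y: view.height / (video.height * scale))
}

// iPhone X: 375x812 pt view, 1080x1920 px portrait frame.
// Only ~82% of the frame's width is on screen -> extra space left and
// right in the full-resolution corrected image.
let iPhoneX = visibleFraction(view: CGSize(width: 375, height: 812),
                              video: CGSize(width: 1080, height: 1920))

// iPad Air 2: 768x1024 pt view, same frame.
// Only 75% of the frame's height is on screen -> extra space top and bottom.
let iPadAir2 = visibleFraction(view: CGSize(width: 768, height: 1024),
                               video: CGSize(width: 1080, height: 1920))
```

With .resize (or points converted via metadata-aware APIs), no part of the frame is hidden, so the normalized points line up with the full-resolution capture on every device.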