Tags: ios, swift, apple-vision

Wrong Vision Framework Landmarks Coordinates


I'm trying to capture face landmarks with Vision Framework to show them on screen, but the eyes always appear a little higher than expected, like the Tim Cook image below.

Tim Cook's photo with Vision Coordinates

Here is my capturing code:

guard let pixelBuffer = CMSampleBufferGetImageBuffer(cmSampleBuffer) else { return }
var requests: [VNRequest] = []
    
let requestLandmarks = VNDetectFaceLandmarksRequest { request, _ in
    guard let results = request.results as? [VNFaceObservation],
          let firstFace = results.first else { return }

    completion(self.drawFaceWithLandmarks(face: firstFace))
}

requests.append(requestLandmarks)
    
let handler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer, orientation: .leftMirrored)
do {
    try handler.perform(requests)
} catch {
    print(error)
}

Here is how I'm converting the Vision coordinates:

let transform = CGAffineTransform(scaleX: 1, y: -1).translatedBy(x: 0, y: -UIScreen.main.bounds.height)
let translate = CGAffineTransform.identity.scaledBy(x: UIScreen.main.bounds.width, y: UIScreen.main.bounds.height)
let facebounds = face.boundingBox.applying(translate).applying(transform)

let eyePathPoints = eye.normalizedPoints
    .map({ eyePoint in
        CGPoint(
            x: eyePoint.x * facebounds.width + facebounds.origin.x,
            y: (1-eyePoint.y) * facebounds.height + facebounds.origin.y)
        })
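For comparison, the same mapping can be collapsed into a single affine transform. This is only a sketch (the helper name `landmarkPoints(for:boundingBox:in:)` is illustrative, not from the code above), and it assumes the target size has the same aspect ratio as the captured image — scaling to `UIScreen.main.bounds` when the camera frame has a different aspect ratio is a common cause of a vertical offset like the one described:

```swift
import UIKit
import Vision

// Sketch: map one landmark region's bounding-box-relative normalized points
// into top-left-origin coordinates of `targetSize`. Assumes `targetSize`
// matches the captured image's aspect ratio.
func landmarkPoints(for region: VNFaceLandmarkRegion2D,
                    boundingBox: CGRect,
                    in targetSize: CGSize) -> [CGPoint] {
    // Chained CGAffineTransform calls apply to points in reverse order of the
    // lines below: scale by the bounding box, offset by its origin, scale to
    // the target size, then flip y for UIKit's top-left origin.
    let transform = CGAffineTransform(scaleX: 1, y: -1)
        .translatedBy(x: 0, y: -targetSize.height)
        .scaledBy(x: targetSize.width, y: targetSize.height)
        .translatedBy(x: boundingBox.minX, y: boundingBox.minY)
        .scaledBy(x: boundingBox.width, y: boundingBox.height)
    return region.normalizedPoints.map { $0.applying(transform) }
}
```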

Has anyone ever faced a similar problem?


Solution

  • I had the same doubt about the face landmark detection; it would be strange if the Vision framework only worked correctly on a real device. I ran a small piece of code that draws the left and right eye corners on a static image, and in my case it drew them correctly.

    import UIKit
    import Vision
    
    class ViewController: UIViewController {
        
        override func viewDidLoad() {
            super.viewDidLoad()
            
            guard let inputImage = UIImage(named: "imFace") else { return }
            
            detectFacialLandmarks(in: inputImage) { [weak self] observations in
                DispatchQueue.main.async {
                    guard let strongSelf = self,
                          let markedImage = strongSelf.drawLandmarks(on: inputImage, observations: observations) else { return }
                    strongSelf.saveImageToPhotos(markedImage)
                    // Optionally update UI or display the image with landmarks
                }
            }
        }
        
        func detectFacialLandmarks(in image: UIImage, completion: @escaping ([VNFaceObservation]) -> Void) {
            guard let cgImage = image.cgImage else {
                completion([])
                return
            }
            
            let request = VNDetectFaceLandmarksRequest { request, error in
                guard error == nil,
                      let observations = request.results as? [VNFaceObservation] else {
                    completion([])
                    return
                }
                completion(observations)
            }
            
            let handler = VNImageRequestHandler(cgImage: cgImage, options: [:])
            do {
                try handler.perform([request])
            } catch {
                // Make sure the completion handler still runs if Vision throws.
                print(error)
                completion([])
            }
        }
    
        func drawLandmarks(on image: UIImage, observations: [VNFaceObservation]) -> UIImage? {
            let renderer = UIGraphicsImageRenderer(size: image.size)
            let resultImage = renderer.image { context in
                // Draw the original image
                image.draw(at: CGPoint.zero)
                
                // Set the drawing context and color for landmarks
                let context = context.cgContext
                context.setStrokeColor(UIColor.red.cgColor)
                context.setLineWidth(2.0)
                
                observations.forEach { observation in
                    if let landmarks = observation.landmarks {
                        // Example: Draw the face contour
                        if let faceContour = landmarks.faceContour {
                            drawPoints(faceContour.normalizedPoints, boundingBox: observation.boundingBox, in: context, imageSize: image.size)
                        }
                        if let leftEye = landmarks.leftEye {
                            drawPoints(leftEye.normalizedPoints, boundingBox: observation.boundingBox, in: context, imageSize: image.size)
                        }
                        if let rightEye = landmarks.rightEye {
                            drawPoints(rightEye.normalizedPoints, boundingBox: observation.boundingBox, in: context, imageSize: image.size)
                        }
                        // Add more landmarks to draw here (e.g., eyes, nose, etc.)
                    }
                }
            }
            
            return resultImage
        }
    
    func drawPoints(_ points: [CGPoint], boundingBox: CGRect, in context: CGContext, imageSize: CGSize) {
        // The chained calls below apply to points in reverse order: scale by
        // the bounding-box size, offset by its origin, then scale to the image
        // size with a y-flip for UIKit's top-left origin. (Scaling by the box
        // must happen before translating by its origin.)
        let transform = CGAffineTransform(scaleX: imageSize.width, y: -imageSize.height)
            .translatedBy(x: 0, y: -1)
            .translatedBy(x: boundingBox.minX, y: boundingBox.minY)
            .scaledBy(x: boundingBox.width, y: boundingBox.height)
            
            let convertedPoints = points.map { $0.applying(transform) }
            
            guard let firstPoint = convertedPoints.first else { return }
            
            context.beginPath()
            context.move(to: firstPoint)
            convertedPoints.dropFirst().forEach { context.addLine(to: $0) }
            context.closePath()
            context.strokePath()
        }
    
    
    func saveImageToPhotos(_ image: UIImage) {
        UIImageWriteToSavedPhotosAlbum(image, nil, nil, nil)
        print("Saved the image")
    }
    }
    

    Left and right eye corners drawn on this image
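As a side note, `VNFaceLandmarkRegion2D` also provides `pointsInImage(imageSize:)`, which performs the bounding-box and image scaling in one call and returns points in image coordinates (origin at the bottom-left), leaving only the y-flip to do by hand. A sketch of `drawPoints` rewritten around it, under the same setup as the code above:

```swift
import UIKit
import Vision

// Sketch: draw one landmark region using Vision's built-in conversion.
// pointsInImage(imageSize:) returns points already mapped into image
// coordinates (bottom-left origin), so the only remaining step is flipping
// y for UIKit's top-left origin.
func drawPoints(_ region: VNFaceLandmarkRegion2D, in context: CGContext, imageSize: CGSize) {
    let points = region.pointsInImage(imageSize: imageSize)
        .map { CGPoint(x: $0.x, y: imageSize.height - $0.y) }
    guard let first = points.first else { return }
    context.beginPath()
    context.move(to: first)
    points.dropFirst().forEach { context.addLine(to: $0) }
    context.closePath()
    context.strokePath()
}
```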