How exactly is facial recognition done in this framework? The docs state that it is part of the framework:
Face Detection and Recognition
However, it is not clear which classes/methods allow us to do so. The closest thing I've found is VNFaceObservation, which is lacking significant detail.
Is it more of a manual process where we must supply our own trained models somehow? If so, how?
The technical details of the Vision framework are not documented, even though from the WWDC video they appear to be using deep learning.
Here is some sample code to locate an eye in your image:
import Vision

let request = VNDetectFaceLandmarksRequest()
let handler = VNImageRequestHandler(cvPixelBuffer: buffer, orientation: orientation)
try! handler.perform([request])

guard let face = request.results?.first as? VNFaceObservation,
      let leftEye = face.landmarks?.leftEye else { return }

// Landmark points are normalized to the face bounding box, and the box is
// normalized to the image with the origin at the bottom-left, so flip y
// if you want UIKit-style coordinates.
let box = face.boundingBox
let points = leftEye.normalizedPoints.map { point -> CGPoint in
    let x = box.minX + box.width * point.x
    let y = 1 - (box.minY + box.height * point.y)
    return CGPoint(x: x, y: y)
}
That will give you the eye's points, the same ones you can see linked together in the WWDC video.
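If you need pixel coordinates, for example to draw the eye on top of the original image, here is a minimal sketch. `imageSize` is an assumed variable holding the size of the image the request ran on; note that VNFaceLandmarkRegion2D also provides pointsInImage(imageSize:), which converts landmarks to image coordinates in Vision's bottom-left-origin space:

// imageSize is assumed to be the CGSize of the image you processed.
let pixelPoints = points.map { point in
    CGPoint(x: point.x * imageSize.width, y: point.y * imageSize.height)
}

// Or let Vision convert the landmarks itself (origin at the bottom-left):
let eyeImagePoints = leftEye.pointsInImage(imageSize: imageSize)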
You might want to watch the WWDC video until they improve the documentation. Otherwise, Xcode autocomplete is your best friend.
Core ML is a different thing. It's not specifically targeted at faces: you can use your own models and predict whatever you want. So if you have a face recognition model, go for it! The Vision framework has some support for Core ML models through VNCoreMLModel.
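For example, here is a minimal sketch of driving your own Core ML model through Vision. FaceClassifier is a hypothetical model class you would have trained and added to your project yourself; the rest is the standard VNCoreMLModel / VNCoreMLRequest flow:

import Vision
import CoreML

// FaceClassifier is a placeholder for your own compiled Core ML model class.
guard let visionModel = try? VNCoreMLModel(for: FaceClassifier().model) else { return }

let recognitionRequest = VNCoreMLRequest(model: visionModel) { request, _ in
    // For a classifier model, the results are VNClassificationObservation values.
    guard let results = request.results as? [VNClassificationObservation],
          let best = results.first else { return }
    print("Identity: \(best.identifier), confidence: \(best.confidence)")
}

let recognitionHandler = VNImageRequestHandler(cvPixelBuffer: buffer, orientation: orientation)
try? recognitionHandler.perform([recognitionRequest])

Vision takes care of scaling the pixel buffer to whatever input size the model expects, which is the main convenience over calling the Core ML model directly.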