
iOS 17.0 - Record live camera feed along with overlay layers


I am building an iOS object detection app, and so far so good: I can see the detected objects on a separate layer added on top of the VideoPreviewLayer.

 func startVideo() {
        videoCapture = VideoCapture()
        videoCapture.delegate = self
        videoCapture.setUp(sessionPreset: .photo) { success in
            // .hd4K3840x2160 or .photo (4032x3024)  Warning: 4k may not work on all devices i.e. 2019 iPod
            if success {
                // Add the video preview into the UI.
                if let previewLayer = self.videoCapture.previewLayer {
                    self.view!.layer.addSublayer(previewLayer)
                    self.videoCapture.previewLayer?.frame = self.view!.bounds  // resize preview layer
                }
                
                // Add the bounding box layers to the UI, on top of the video preview.
                for box in self.boundingBoxViews {
                    box.addToLayer(self.view!.layer)
                }
                
                // Once everything is set up, we can start capturing live video.
                self.videoCapture.start()
            }
        }
    }

However, I want to record the screen image when a particular object appears. Pretty straightforward, I thought: compare the detected object class and snapshot the UIView. This doesn't work.

func snapScreen() {
    let bounds = UIScreen.main.bounds
    UIGraphicsBeginImageContextWithOptions(bounds.size, false, 0.0)
    let context = UIGraphicsGetCurrentContext()
    //self.view!.drawHierarchy(in: bounds, afterScreenUpdates: true)  // also tried this

    self.view!.layer.render(in: context!)
    let img = UIGraphicsGetImageFromCurrentImageContext()
    saveScreenImage(image: img!)
    UIGraphicsEndImageContext()
}

I am triggering this just after the bounding boxes are added on top of the previewLayer. The videoPreviewLayer is not captured; only the boundingBoxLayer is.

The documentation for CALayer.render(in:) says the layer and its sublayers are rendered into the context.

Since this didn't work, I thought the videoPreviewLayer might be missing from the sublayers, so I iterated through all of them, but that doesn't work either.

func snapScreen() {
    let bounds = UIScreen.main.bounds
    UIGraphicsBeginImageContextWithOptions(bounds.size, false, 0.0)
    let context = UIGraphicsGetCurrentContext()
    for layer in self.view!.layer.sublayers! {
        layer.render(in: context!)
    }
    let img = UIGraphicsGetImageFromCurrentImageContext()
    saveScreenImage(image: img!)
    UIGraphicsEndImageContext()
}
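For what it's worth, the preview layer does appear in the hierarchy. A quick diagnostic sketch (not in the original code; `dumpSublayers` is a hypothetical helper) to confirm that the layer is present rather than missing:

```swift
import AVFoundation
import UIKit

// Diagnostic only: list the view's sublayers and flag the preview layer.
// If the preview layer is listed, the problem is not a missing layer but
// that its video content is composited by the capture pipeline rather
// than drawn by Core Animation, so render(in:) produces nothing for it.
func dumpSublayers() {
    for layer in self.view!.layer.sublayers ?? [] {
        print(type(of: layer), layer.frame)
        if layer is AVCaptureVideoPreviewLayer {
            print("-> preview layer is present in the hierarchy")
        }
    }
}
```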

Next I thought I should capture the image in the AVCapturePhotoCaptureDelegate and draw the layers on top of the photo.

  1. This doesn't work either; only the previewLayer is captured.
  2. It is also slow because of the overhead of drawing the image, and once again only the previewLayer is saved, not the composited image.
extension CameraViewController: AVCapturePhotoCaptureDelegate {
    func photoOutput(_ output: AVCapturePhotoOutput, didFinishProcessingPhoto photo: AVCapturePhoto, error: Error?) {
        if let error = error {
            print("error occurred: \(error.localizedDescription)")
            return
        }
        DispatchQueue.main.async {
            if let dataImage = photo.fileDataRepresentation() {
                print(UIImage(data: dataImage)?.size as Any)
                let dataProvider = CGDataProvider(data: dataImage as CFData)
                let cgImageRef: CGImage! = CGImage(jpegDataProviderSource: dataProvider!, decode: nil, shouldInterpolate: true, intent: .defaultIntent)
                let image = UIImage(cgImage: cgImageRef, scale: 0.5, orientation: .right)
                let bounds = UIScreen.main.bounds
                UIGraphicsBeginImageContextWithOptions(bounds.size, false, 0.0)
                let context = UIGraphicsGetCurrentContext()
                // Draw the captured photo first...
                image.draw(at: CGPoint(x: 0, y: 0))
                // ...then render the view's layers on top.
                context?.saveGState()
                self.view!.layer.render(in: context!)
                context?.restoreGState()
                let newImg = UIGraphicsGetImageFromCurrentImageContext()
                // Save to camera roll
                UIImageWriteToSavedPhotosAlbum(newImg!, nil, nil, nil)
                UIGraphicsEndImageContext()
            } else {
                print("AVCapturePhotoCaptureDelegate Error")
            }
        }
    }
}

What I ideally want: [image]

What I am getting in the CapturePhotoDelegate: [image]

What I am getting in snapScreen: [image]

If someone has an idea what I am doing wrong, please let me know. The one aspect I may be missing is that in snapScreen() I am not accessing the image buffer, since it's already loaded in the view; I may be wrong there.


Solution

  • The problem is using the AVCaptureVideoPreviewLayer in the overall scheme of things. So what I did was get rid of the preview layer and use a plain UIView instead.

    The view's contents are constructed from the frames delivered to the video output's sample-buffer delegate, so the snapScreen method now works.

    func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) {
        guard let imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else {
            NSLog("Failed to get video buffer image")
            return
        }
        lastFrame = sampleBuffer
        predict(sampleBuffer: imageBuffer)
    }

    Then I update the UIView before processing every frame for results.

    private func updatePreviewOverlayViewWithLastFrame() {
        guard let lastFrame = lastFrame,
              let imageBuffer = CMSampleBufferGetImageBuffer(lastFrame)
        else {
            return
        }
        self.updatePreviewOverlayViewWithImageBuffer(imageBuffer)
        self.removeDetectionAnnotations()
    }
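    The `updatePreviewOverlayViewWithImageBuffer(_:)` helper referenced above is not shown in the post; a minimal sketch of what it roughly does, assuming `previewOverlayView` is a `UIImageView` pinned over the camera area (the names and the orientation handling are assumptions, not the original code):

```swift
import AVFoundation
import UIKit

// Sketch of the helper: convert the frame's CVImageBuffer into a UIImage
// and display it in an ordinary UIImageView, which render(in:) CAN capture.
private func updatePreviewOverlayViewWithImageBuffer(_ imageBuffer: CVImageBuffer) {
    let ciImage = CIImage(cvPixelBuffer: imageBuffer)
    // In practice, cache this CIContext; creating one per frame is expensive.
    let ciContext = CIContext()
    guard let cgImage = ciContext.createCGImage(ciImage, from: ciImage.extent) else {
        return
    }
    // Rotate to portrait; adjust the orientation for your camera position.
    let image = UIImage(cgImage: cgImage, scale: 1.0, orientation: .right)
    previewOverlayView.image = image
}
```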
    

    Thanks to Google MLKit for this code. Things are now working fine. The general rule of thumb I learned: don't use AVCaptureVideoPreviewLayer if you need to capture frames.
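With the camera frames drawn into an ordinary `UIImageView`, a plain view snapshot now includes both the video frame and the bounding boxes. A sketch of the now-working `snapScreen`, using `UIGraphicsImageRenderer` (iOS 10+) in place of the older begin/end image-context calls (assumes the same `saveScreenImage` helper as above):

```swift
import UIKit

// Snapshot the whole view hierarchy. Because the video frame now lives in
// a UIImageView rather than an AVCaptureVideoPreviewLayer, render(in:)
// captures it along with the bounding-box layers.
func snapScreen() {
    guard let view = self.view else { return }
    let renderer = UIGraphicsImageRenderer(bounds: view.bounds)
    let image = renderer.image { context in
        view.layer.render(in: context.cgContext)
    }
    saveScreenImage(image: image)
}
```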