Tags: swift, video-processing, pose-detection

Storing AVDepthData for processing in Swift


I'm trying to use VNDetectHumanBodyPose3DRequest to get 3D keypoints from videos taken in my app at 30 fps. Since the request takes too long to run in real time, I'm saving the CMSampleBuffers from the camera to a video file using an AVAssetWriter and then processing the frames after the recording. I'd like to get better results by incorporating the AVDepthData for each frame into the request, but I'm not sure how to store the depth data so I can process it after recording.
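For context, a rough sketch of that pipeline, assuming an AVAssetWriterInput already configured for the camera's video settings; the helper names are placeholders, not from the original post:

```swift
import AVFoundation
import Vision

// Recording side: append each camera CMSampleBuffer to the asset writer input.
func append(_ sampleBuffer: CMSampleBuffer, to writerInput: AVAssetWriterInput) {
    guard writerInput.isReadyForMoreMediaData else { return }
    _ = writerInput.append(sampleBuffer)
}

// Processing side, after recording: run the 3D body pose request on each frame
// read back from the saved video. Without depth, Vision has to estimate scale
// from the image alone.
func detectBodyPose(in frame: CVPixelBuffer) throws -> [VNHumanBodyPose3DObservation] {
    let request = VNDetectHumanBodyPose3DRequest()
    let handler = VNImageRequestHandler(cvPixelBuffer: frame, options: [:])
    try handler.perform([request])
    return request.results ?? []
}
```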

I can't store the depth data in an array since it takes too much memory. I also tried using CGImageDestination to save each frame as an HEIC file with the depth data encoded, but saving each frame is too slow. I'm thinking I could encode each AVDepthData as a frame in a separate video and then convert each frame back, but I'm not sure how to do this. Does anyone know of a way to go about this, or have any resources to point me towards? Thanks.
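For reference, the HEIC-per-frame attempt mentioned above might look roughly like this; the names are placeholders, and this is the slow path being described, not a recommendation:

```swift
import AVFoundation
import ImageIO
import UniformTypeIdentifiers

// Write one frame as an HEIC file with the frame's depth attached as auxiliary data.
func writeHEIC(_ cgImage: CGImage, depthData: AVDepthData, to frameURL: URL) {
    guard let destination = CGImageDestinationCreateWithURL(
        frameURL as CFURL, UTType.heic.identifier as CFString, 1, nil) else { return }

    CGImageDestinationAddImage(destination, cgImage, nil)

    // Ask the depth data for its auxiliary-data dictionary and the matching
    // auxiliary type (depth vs. disparity), then attach it to the destination.
    var auxType: NSString?
    if let auxInfo = depthData.dictionaryRepresentation(forAuxiliaryDataType: &auxType),
       let auxType = auxType {
        CGImageDestinationAddAuxiliaryDataInfo(destination, auxType as CFString, auxInfo as CFDictionary)
    }
    _ = CGImageDestinationFinalize(destination)
}
```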


Solution

  • I realized I can write the depthMap in each AVDepthData to a file on disk using Data and FileManager. Then, after recording, I read the data from the file, create dictionaries as described here, and recreate the AVDepthData by passing the dictionary into its initializer.
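A minimal sketch of that approach, under a couple of assumptions: the depth map is a single-plane pixel buffer (e.g. kCVPixelFormatType_DepthFloat16), and the dictionary keys (kCGImageAuxiliaryDataInfoData plus kCGImageAuxiliaryDataInfoDataDescription with Width/Height/BytesPerRow/PixelFormat) match the format in the linked description. The helper names and the idea of persisting the layout alongside the raw bytes are illustrative, not from the original answer.

```swift
import AVFoundation
import CoreVideo
import ImageIO

// Flatten one AVDepthData's depth map into raw bytes plus the layout needed to
// rebuild it later. Assumes a single-plane pixel buffer from the depth camera.
func serializeDepthMap(_ depthData: AVDepthData) -> (bytes: Data, layout: [String: Any]) {
    let map = depthData.depthDataMap
    CVPixelBufferLockBaseAddress(map, .readOnly)
    defer { CVPixelBufferUnlockBaseAddress(map, .readOnly) }

    let bytesPerRow = CVPixelBufferGetBytesPerRow(map)
    let height = CVPixelBufferGetHeight(map)
    let bytes = Data(bytes: CVPixelBufferGetBaseAddress(map)!, count: bytesPerRow * height)

    // Assumed key names for the data-description dictionary.
    let layout: [String: Any] = [
        "Width": CVPixelBufferGetWidth(map),
        "Height": height,
        "BytesPerRow": bytesPerRow,
        "PixelFormat": CVPixelBufferGetPixelFormatType(map)
    ]
    return (bytes, layout)
}

// During recording, append each frame's bytes to a file on disk with
// FileManager/FileHandle (omitted). After recording, read the bytes back and
// rebuild the AVDepthData from the dictionary representation:
func makeDepthData(bytes: Data, layout: [String: Any]) throws -> AVDepthData {
    // Depending on the format in the linked description, a metadata entry
    // (kCGImageAuxiliaryDataInfoMetadata) may also be required.
    let info: [AnyHashable: Any] = [
        kCGImageAuxiliaryDataInfoData as String: bytes,
        kCGImageAuxiliaryDataInfoDataDescription as String: layout
    ]
    return try AVDepthData(fromDictionaryRepresentation: info)
}
```

The rebuilt AVDepthData can then be handed to Vision together with the corresponding video frame; on recent SDKs, VNImageRequestHandler has initializers that accept an AVDepthData alongside the pixel buffer (check availability for your deployment target).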