In a recent project, I need to access every frame of my video individually using AV Foundation. If possible, I would also like to access them randomly (like an array).
I tried researching this but didn't find anything useful.
Note: Is there any good documentation for getting familiar with AV Foundation?
You can enumerate the frames of your video serially using AVAssetReader, like this:
import AVFoundation
let asset = AVAsset(URL: inputUrl)
let reader = try! AVAssetReader(asset: asset)
let videoTrack = asset.tracksWithMediaType(AVMediaTypeVideo)[0]
// read video frames as BGRA
let trackReaderOutput = AVAssetReaderTrackOutput(track: videoTrack, outputSettings:[String(kCVPixelBufferPixelFormatTypeKey): NSNumber(unsignedInt: kCVPixelFormatType_32BGRA)])
// the output must be attached and reading started before any sample can be copied
reader.addOutput(trackReaderOutput)
reader.startReading()
while let sampleBuffer = trackReaderOutput.copyNextSampleBuffer() {
    print("sample at time \(CMSampleBufferGetPresentationTimeStamp(sampleBuffer))")
    if let imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) {
        // process each CVPixelBufferRef here
        // see CVPixelBufferGetWidth, CVPixelBufferLockBaseAddress, CVPixelBufferGetBaseAddress, etc
    }
}
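To make the "process each CVPixelBufferRef" step a bit more concrete, here is a minimal sketch of reading one pixel out of a frame. It is written in current Swift syntax rather than the older spelling used in the snippet above, and it assumes the buffer really is non-planar 32BGRA, as requested in the output settings. The function name bgraPixel is made up for illustration.

```swift
import CoreVideo

/// Hypothetical helper: returns the B, G, R, A bytes of the pixel at (x, y).
/// Assumes a non-planar kCVPixelFormatType_32BGRA buffer (4 bytes per pixel).
/// Returns nil for out-of-range coordinates or a missing base address.
func bgraPixel(in pixelBuffer: CVPixelBuffer, x: Int, y: Int)
    -> (b: UInt8, g: UInt8, r: UInt8, a: UInt8)? {
    // The base address is only valid while the buffer is locked.
    CVPixelBufferLockBaseAddress(pixelBuffer, .readOnly)
    defer { CVPixelBufferUnlockBaseAddress(pixelBuffer, .readOnly) }

    let width = CVPixelBufferGetWidth(pixelBuffer)
    let height = CVPixelBufferGetHeight(pixelBuffer)
    guard x >= 0, x < width, y >= 0, y < height,
          let base = CVPixelBufferGetBaseAddress(pixelBuffer) else { return nil }

    // Rows can be padded, so step by bytesPerRow instead of width * 4.
    let bytesPerRow = CVPixelBufferGetBytesPerRow(pixelBuffer)
    let pixel = base.advanced(by: y * bytesPerRow + x * 4)
        .assumingMemoryBound(to: UInt8.self)
    return (b: pixel[0], g: pixel[1], r: pixel[2], a: pixel[3])
}
```

Inside the loop you would call it as bgraPixel(in: imageBuffer, x: 0, y: 0). For whole-frame processing it is cheaper to lock once and walk the rows yourself than to call a per-pixel helper like this.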
Random access is more complicated. You could use an AVPlayer + AVPlayerItemVideoOutput to get frames from any time t, using copyPixelBufferForItemTime, as described in this answer, but the subtlety lies in how you choose that t.
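A rough sketch of the AVPlayer + AVPlayerItemVideoOutput route, in current Swift syntax (where the call is spelled copyPixelBuffer(forItemTime:itemTimeForDisplay:)). FrameGrabber and frame(at:completion:) are hypothetical names, not AV Foundation API. Treat it as a starting point only: it assumes the player item is already ready to play, and the output can still return nil right after a seek, in which case you would need to retry.

```swift
import AVFoundation

/// Hypothetical helper: keeps a paused player around and asks its video
/// output for the frame at a requested time.
final class FrameGrabber {
    private let player: AVPlayer
    private let output: AVPlayerItemVideoOutput

    init(asset: AVAsset) {
        let item = AVPlayerItem(asset: asset)
        // Ask for BGRA buffers, matching the serial reader's output settings.
        let videoOutput = AVPlayerItemVideoOutput(pixelBufferAttributes: [
            kCVPixelBufferPixelFormatTypeKey as String: kCVPixelFormatType_32BGRA
        ])
        item.add(videoOutput)
        output = videoOutput
        player = AVPlayer(playerItem: item)
    }

    /// Seeks exactly to `time`, then copies whatever buffer the output has for it.
    /// Calls back with nil if the seek was interrupted or no buffer is available yet.
    func frame(at time: CMTime, completion: @escaping (CVPixelBuffer?) -> Void) {
        player.seek(to: time, toleranceBefore: .zero, toleranceAfter: .zero) { [output] finished in
            guard finished else { completion(nil); return }
            completion(output.copyPixelBuffer(forItemTime: time, itemTimeForDisplay: nil))
        }
    }
}
```

Zero tolerances make the seek frame-accurate at the cost of speed, which is usually what you want when you care about landing on one particular frame.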
If you want to sample the video at uniform intervals, then that's easy, but if you want to land on the same frames/presentation time stamps that the serial AVAssetReader code sees, then you will probably have to preprocess the file with AVAssetReader, to build a frame number -> presentation timestamp map. This can be fast if you skip decoding by using nil output settings in AVAssetReaderTrackOutput.
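The preprocessing pass could look roughly like this, again in current Swift syntax. presentationTimes(of:) and FrameMapError are hypothetical names. The sort at the end is my assumption: with nil output settings the samples are vended as stored, which may be decode order rather than presentation order for files with reordered frames.

```swift
import AVFoundation

/// Hypothetical error for the sketch below.
enum FrameMapError: Error { case cannotStartReading }

/// Builds a frame number -> presentation timestamp map: index n holds the
/// timestamp of the n-th video frame. Nothing is decoded along the way.
func presentationTimes(of asset: AVAsset) throws -> [CMTime] {
    // Synchronous track lookup; newer SDKs prefer the async loadTracks(withMediaType:).
    guard let track = asset.tracks(withMediaType: .video).first else { return [] }

    let reader = try AVAssetReader(asset: asset)
    // nil output settings: samples come back in their stored (compressed) format.
    let output = AVAssetReaderTrackOutput(track: track, outputSettings: nil)
    reader.add(output)
    guard reader.startReading() else {
        throw reader.error ?? FrameMapError.cannotStartReading
    }

    var times: [CMTime] = []
    while let sample = output.copyNextSampleBuffer() {
        // Skip buffers that carry no media samples.
        guard CMSampleBufferGetNumSamples(sample) > 0 else { continue }
        times.append(CMSampleBufferGetPresentationTimeStamp(sample))
    }
    // Assumption: sort so that indices follow presentation order.
    return times.sorted { CMTimeCompare($0, $1) < 0 }
}
```

With the map in hand, times[n] is the t to request for frame n. For uniform sampling no map is needed: something like CMTime(value: CMTimeValue(n), timescale: 30) gives the n-th tick of a 30 Hz grid.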