Tags: ios, swift, avfoundation, avplayer, avplayeritem

Processing subtitles while video is playing in iOS


I need to do some text processing on the subtitles of a local (or remote) video file as they appear on screen. Currently I am using AVPlayerItem, and by inspecting its asset I can see that there is a closed-caption track, but I cannot get the actual subtitle text. So the question is: how can I get the actual text of the subtitle track as it is displayed (preferably in Swift)?


Solution

  • The presentation at http://www.slideshare.net/invalidname/stupid-videotricksdc2014 should give you everything you need for this (see link to sample code in the comments), since the AVSubtitleReader example does exactly what you’re asking.

    Once you see that there is a text track (closed-caption, subtitle, whatever), you need to get that track, open it with an AVAssetReader, and then start reading sample buffers. Each one will have a presentation timestamp and duration, and then a CMBlockBuffer. The contents of that buffer depend on what kind of track you’re reading, and these are defined in the QuickTime File Format documentation. In the case of text, it’s just a Pascal string: a big-endian UInt16 that tells you how long the text is, followed by the text itself. Granted, the required endian-flips and data copying are more natural in C than in Swift, but it’s certainly doable. Or, just leave this part in its own C or Obj-C file and call it from Swift.
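    The steps above can be sketched in Swift roughly as follows. This is a minimal, untested outline, not the AVSubtitleReader sample itself; `subtitleText(from:)` and `readSubtitles(from:)` are hypothetical names, and the parsing assumes the plain QuickTime `'text'` sample layout (2-byte big-endian length, then the text bytes):

    ```swift
    import AVFoundation

    // Parse one QuickTime 'text' sample: a big-endian UInt16 length,
    // followed by that many bytes of text. (Hypothetical helper.)
    func subtitleText(from data: Data) -> String? {
        guard data.count >= 2 else { return nil }
        let bytes = [UInt8](data)
        let length = Int(bytes[0]) << 8 | Int(bytes[1])   // big-endian UInt16
        guard data.count >= 2 + length, length > 0 else { return nil }
        return String(bytes: bytes[2 ..< 2 + length], encoding: .utf8)
    }

    // Walk every sample in the first text-like track of a local movie file.
    func readSubtitles(from url: URL) throws {
        let asset = AVAsset(url: url)
        // Prefer a subtitle track; fall back to closed captions or plain text.
        guard let track = asset.tracks(withMediaType: .subtitle).first
            ?? asset.tracks(withMediaType: .closedCaption).first
            ?? asset.tracks(withMediaType: .text).first else { return }

        let reader = try AVAssetReader(asset: asset)
        let output = AVAssetReaderTrackOutput(track: track, outputSettings: nil)
        reader.add(output)
        reader.startReading()

        while let sample = output.copyNextSampleBuffer() {
            let pts = CMSampleBufferGetPresentationTimeStamp(sample)
            let duration = CMSampleBufferGetDuration(sample)
            guard let block = CMSampleBufferGetDataBuffer(sample) else { continue }

            // Copy the block buffer's bytes out into a Data we can parse.
            var totalLength = 0
            var dataPointer: UnsafeMutablePointer<CChar>? = nil
            CMBlockBufferGetDataPointer(block, atOffset: 0,
                                        lengthAtOffsetOut: nil,
                                        totalLengthOut: &totalLength,
                                        dataPointerOut: &dataPointer)
            guard let ptr = dataPointer else { continue }
            let data = Data(bytes: ptr, count: totalLength)

            if let text = subtitleText(from: data) {
                print("\(CMTimeGetSeconds(pts))s (+\(CMTimeGetSeconds(duration))s): \(text)")
            }
        }
    }
    ```

    Note that this reads the track offline rather than in sync with playback; to react as each line appears, you would compare the presentation timestamps against the player's current time (e.g. via a periodic time observer on the AVPlayer).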