Search code examples
ffmpegvideo-streamingred5rtmpwowza

Extract frames as images from an RTMP stream in real-time


I am streaming short videos (4 or 5 seconds) encoded in H264 at 15 fps in VGA quality from different clients to a server using RTMP which produced an FLV file. I need to analyse the frames from the video as images as soon as possible so I need the frames to be written as PNG images as they are received.

Currently I use Wowza to receive the streams and I have tried using the transcoder API to access the individual frames and write them to PNGs. This partially works but there is about a second delay before the transcoder starts processing and when the stream ends Wowza flushes its buffers causing the last second not to get transcoded meaning I can lose the last 25% of the video frames. I have tried to find a workaround but Wowza say that it is not possible to prevent the buffer getting flushed. It is also not the ideal solution because there is a 1 second delay before I start getting frames and I have to re-encode the video when using the transcoder which is computationally expensive and unnecessarily for my needs.

I have also tried piping a video in real-time to FFmpeg and getting it to produce the PNG images but unfortunately it waits until it receives the entire video before producing the PNG frames.

How can I extract all of the frames from the stream as close to real-time as possible? I don’t mind what language or technology is used as long as it can run on a Linux server. I would be happy to use FFmpeg if I can find a way to get it to write the images while it is still receiving the video or even Wowza if I can find a way not to lose frames and not to re-encode.

Thanks for any help or suggestions.


Solution

  • I finally found a way to do this with FFmpeg. The trick was to disable audio, use a different flv meta data analyser and to reduce the duration that FFmpeg waits for before processing. My FFmpeg command now starts like this:

    ffmpeg -an -flv_metadata 1 -analyzeduration 1 ...
    

    This starts producing frames within a second of receiving an input from a pipe so writes the streamed frames pretty close to real-time.