
Using ffmpeg to extract multiple images from video and get timestamp of extracted images


I'm using ffmpeg to extract one frame (as a JPEG) every five minutes from videos, and redirecting the console output to a text file in order to get the exact timestamps of the extracted frames.

The command I'm using is:

ffmpeg -i input.avi -ss 00:10:00 -vframes 10 -vf showinfo,fps=fps=1/300 %03d.jpg &> output.txt

Here -ss 00:10:00 lets me skip ahead 10 minutes in the video before starting, and -vframes 10 limits the output to the first 10 frames (1 frame per 5 minutes).

This almost works, except that the command logs information for every frame, including those that were not written as a JPEG. Here's a three-line sample of the output:

[Parsed_showinfo_0 @ 0x2219020] n:11427 pts:11429 pts_time:599.979 pos:48892180 fmt:yuv420p sar:1/1 s:640x480 i:P iskey:0 type:P checksum:6309A75D plane_checksum:[15A29007 1617E1FE D93A3549] mean:[146 125 153 ] stdev:[17.6 1.0 2.1 ]
[Parsed_showinfo_0 @ 0x2219020] n:11428 pts:11430 pts_time:600.031 pos:48898094 fmt:yuv420p sar:1/1 s:640x480 i:P iskey:0 type:B checksum:815D031A plane_checksum:[E004E973 E28CE2D5 F56636B4] mean:[146 125 153 ] stdev:[17.6 1.0 2.1 ]
[Parsed_showinfo_0 @ 0x2219020] n:11429 pts:11431 pts_time:600.084 pos:48892448 fmt:yuv420p sar:1/1 s:640x480 i:P iskey:0 type:P checksum:6CE2D3C5 plane_checksum:[E983BD86 38B9E198 93B13498] mean:[146 125 153 ] stdev:[17.6 1.0 2.1 ]

I would expect the middle line, with pts_time:600.031, to be the first frame extracted as an image, but I have no way to distinguish it from the frames on either side, for which no images were extracted.

Does anyone know of a way to resolve this?

Thank you!


Solution

  • In answer to my own question, I've now found a workaround, though I'm not exactly sure how it works. By adding a select filter within -vf, ahead of showinfo, and also adding the -vsync 0 option, like so:

    ffmpeg -i input.avi -vframes 10 -vf '[in]select=not(mod(n\,300*19.05))[s1];[s1]showinfo[out]' -vsync 0 %02d.jpg >& output.txt
    

    ...the command now outputs only the desired 10 frames. Here's a sample of the stderr output for the first two frames:

    [Parsed_showinfo_1 @ 0x21b1b60] n:0 pts:2 pts_time:0.104992 pos:10248 fmt:yuv420p sar:1/1 s:640x480 i:P iskey:1 type:I checksum:F6FDCFBF plane_checksum:[5FB6331C 9D9D7F99 44FB1D0A] mean:[183 126 155 ] stdev:[19.6 0.8 2.8 ]
    frame=    1 fps=0.0 q=4.1 size=N/A time=00:00:00.15 bitrate=N/A    
    frame=    1 fps=1.0 q=4.1 size=N/A time=00:00:00.15 bitrate=N/A    
    frame=    1 fps=0.7 q=4.1 size=N/A time=00:00:00.15 bitrate=N/A    
    frame=    1 fps=0.5 q=4.1 size=N/A time=00:00:00.15 bitrate=N/A    
    frame=    1 fps=0.4 q=4.1 size=N/A time=00:00:00.15 bitrate=N/A    
    frame=    1 fps=0.3 q=4.1 size=N/A time=00:00:00.15 bitrate=N/A    
    frame=    1 fps=0.3 q=4.1 size=N/A time=00:00:00.15 bitrate=N/A    
    frame=    1 fps=0.2 q=4.1 size=N/A time=00:00:00.15 bitrate=N/A    
    frame=    1 fps=0.2 q=4.1 size=N/A time=00:00:00.15 bitrate=N/A    
    [Parsed_showinfo_1 @ 0x21b1b60] n:1 pts:5717 pts_time:300.121 pos:24474150 fmt:yuv420p sar:1/1 s:640x480 i:P iskey:0 type:P checksum:BAECD030 plane_checksum:[F609470E 45F694CE 4BFCF445] mean:[148 126 152 ] stdev:[17.7 0.8 2.3 ]
    frame=    2 fps=0.4 q=2.1 size=N/A time=00:05:00.17 bitrate=N/A    
    frame=    2 fps=0.4 q=2.1 size=N/A time=00:05:00.17 bitrate=N/A    
    frame=    2 fps=0.3 q=2.1 size=N/A time=00:05:00.17 bitrate=N/A    
    frame=    2 fps=0.3 q=2.1 size=N/A time=00:05:00.17 bitrate=N/A    
    frame=    2 fps=0.3 q=2.1 size=N/A time=00:05:00.17 bitrate=N/A    
    frame=    2 fps=0.3 q=2.1 size=N/A time=00:05:00.17 bitrate=N/A    
    frame=    2 fps=0.2 q=2.1 size=N/A time=00:05:00.17 bitrate=N/A    
    frame=    2 fps=0.2 q=2.1 size=N/A time=00:05:00.17 bitrate=N/A    
    frame=    2 fps=0.2 q=2.1 size=N/A time=00:05:00.17 bitrate=N/A   
    

    I'm still unsure exactly why this works, or why each frame= ... line is repeated so many times in the output, but it seems to do the job!
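For anyone who wants the timestamps as a clean list: in the workaround command showinfo sits after select, so it only logs frames that select let through, whereas in the original command it sat before fps and logged every decoded frame. The repeated frame= lines are ffmpeg's periodic progress updates rather than per-frame records, so only the Parsed_showinfo lines matter. Below is a minimal sketch of pulling the image/timestamp pairs out of the saved log. It assumes the images were written as %02d.jpg starting at 01, so showinfo's n:0 corresponds to 01.jpg; frame_times is a made-up helper name, and the embedded log is just an abbreviated copy of the sample above.

```shell
#!/usr/bin/env bash
# Sketch: print "image-file timestamp" pairs from a saved ffmpeg log.
# Assumption: showinfo ran after select, so its n counts only saved frames,
# and the images are named %02d.jpg starting at 01 (n:0 -> 01.jpg).

# Abbreviated copy of the sample log; in practice point at output.txt instead.
log=$(mktemp)
trap 'rm -f "$log"' EXIT
cat > "$log" <<'EOF'
[Parsed_showinfo_1 @ 0x21b1b60] n:0 pts:2 pts_time:0.104992 pos:10248 fmt:yuv420p
frame=    1 fps=0.0 q=4.1 size=N/A time=00:00:00.15 bitrate=N/A
[Parsed_showinfo_1 @ 0x21b1b60] n:1 pts:5717 pts_time:300.121 pos:24474150 fmt:yuv420p
EOF

# frame_times is a hypothetical helper, not part of ffmpeg.
frame_times() {
  awk '/Parsed_showinfo/ {
    n = ""; t = ""
    # Frame index: " n:" followed by optional padding and digits.
    if (match($0, / n: *[0-9]+/)) {
      n = substr($0, RSTART, RLENGTH)
      sub(/.*: */, "", n)
    }
    # Presentation time in seconds; "pts_time:" is 9 characters long.
    if (match($0, /pts_time:[0-9.]+/)) {
      t = substr($0, RSTART + 9, RLENGTH - 9)
    }
    # Skip showinfo lines that carry no frame record.
    if (n != "" && t != "") printf "%02d.jpg %s\n", n + 1, t
  }' "$1"
}

frame_times "$log"
# Prints:
# 01.jpg 0.104992
# 02.jpg 300.121
```

Running frame_times output.txt against the real log should give one line per saved image, as long as the numbering assumption above holds for your command.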