video, ffmpeg, video-encoding

Concatenating videos, audio/video out of sync: Non-monotonous DTS in output


I am trying to join two videos with ffmpeg, but the video and audio in the output are out of sync (and play fast-forwarded). The goal is to put intro.mp4 before the original file, clip.flv.

My approach is to

  1. Change the format of clip.flv to clip.mp4
ffmpeg -i clip.flv -q 0 -c copy clip.mp4
  2. Concat intro.mp4 with clip.mp4
ffmpeg -f concat -safe 0 -i filesToJoin.txt -c copy combinedvideo.mp4
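
The contents of filesToJoin.txt are not shown in the question. A minimal sketch of the list file the concat demuxer reads, assuming both files sit in the current directory and the intro should play first:

```shell
# Write the list file read by the concat demuxer (-f concat).
# Order matters: entries are joined top to bottom, so the intro goes first.
# Assumption: both files are in the current directory.
printf "file '%s'\n" intro.mp4 clip.mp4 > filesToJoin.txt

# Show the result for a quick visual check.
cat filesToJoin.txt
```

Each line uses the `file '<path>'` directive; printf repeats the format once per argument, so adding more clips only means appending more file names.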

I see this in the output log for command #2:

[mp4 @ 0x3ebcd60] Non-monotonous DTS in output stream 0:0; previous: 392311, current: 391925; changing to 392312. This may result in incorrect timestamps in the output file.
frame= 1566 fps=0.0 q=-1.0 Lsize=    8711kB time=00:00:48.86 bitrate=1460.2kbits/s speed= 272x    
video:7363kB audio:1294kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.619341%

Files Metadata

Original source file clip.flv metadata

  Metadata:
    encoder         : Lavf57.83.100
  Duration: 00:00:40.09, start: 0.010000, bitrate: 1632 kb/s
    Stream #0:0: Video: h264 (High), yuv420p(progressive), 1280x720, 1500 kb/s, 30 fps, 30 tbr, 1k tbn, 60 tbc
    Stream #0:1: Audio: aac (LC), 44100 Hz, stereo, fltp, 160 kb/s

Intro file intro.mp4 metadata

  Metadata:
    major_brand     : mp42
    minor_version   : 0
    compatible_brands: mp42mp41
    creation_time   : 2020-06-02T10:36:51.000000Z
  Duration: 00:00:13.21, start: 0.000000, bitrate: 484 kb/s
    Stream #0:0(eng): Video: h264 (Main) (avc1 / 0x31637661), yuv420p(tv, bt709), 1280x720 [SAR 1:1 DAR 16:9], 130 kb/s, 30 fps, 30 tbr, 30k tbn, 60k tbc (default)
    Metadata:
      creation_time   : 2020-06-02T10:36:51.000000Z
      handler_name    : Alias Data Handler
      encoder         : AVC Coding
    Stream #0:1(eng): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 317 kb/s (default)
    Metadata:
      creation_time   : 2020-06-02T10:36:52.000000Z
      handler_name    : Alias Data Handler

File clip.mp4 metadata

  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    encoder         : Lavf57.83.100
  Duration: 00:00:40.01, start: 0.000000, bitrate: 1635 kb/s
    Stream #0:0(und): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 1280x720, 1500 kb/s, 30 fps, 30 tbr, 16k tbn, 60 tbc (default)
    Metadata:
      handler_name    : VideoHandler
    Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 160 kb/s (default)
    Metadata:
      handler_name    : SoundHandler
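
The dumps above can be reduced to just the fields that matter for a stream-copy concat. This is a rough sketch using ffprobe; `show_params` is a hypothetical helper name, and the field selection is my own guess at what is relevant here (codec, profile, timebase, frame rate, audio sample rate):

```shell
# Print the per-stream parameters that should match across inputs
# before concatenating with -c copy.
# show_params is a hypothetical helper, not part of ffmpeg.
show_params() {
    if ! command -v ffprobe >/dev/null 2>&1; then
        echo "ffprobe not found" >&2
        return 127
    fi
    if [ ! -f "$1" ]; then
        echo "missing file: $1" >&2
        return 1
    fi
    echo "== $1"
    ffprobe -v error \
        -show_entries stream=codec_type,codec_name,profile,time_base,r_frame_rate,sample_rate \
        -of default=noprint_wrappers=1 "$1"
}

# Compare the two files that go into the concat step.
for f in intro.mp4 clip.mp4; do
    show_params "$f" || true
done
```

With the files from this question, the output should make it easy to see side by side what the full dumps already show: the H.264 profile (Main vs High), the video timebase (30k vs 16k tbn) and the audio sample rate (48000 Hz vs 44100 Hz) all differ.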

Things I have tried

  • Both videos have the same frame rate but different timescales (tbn), so I tried changing the timescale of one video to match the other before combining them, with no luck. For the timescale change I used this command:

ffmpeg -i clip.mp4 -video_track_timescale 30000 clip_ts30000.mp4

  • I have looked at similar questions on SO, with no luck.
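
Changing only the video timescale leaves the other differences visible in the metadata (audio at 48000 Hz vs 44100 Hz, H.264 Main vs High profile). One thing that might be worth trying is re-encoding both inputs with identical settings before the stream-copy concat. The sketch below is an assumption, not a confirmed fix: `normalize` and the `_norm.mp4` output names are hypothetical, the target values are taken from intro.mp4, and it assumes an ffmpeg build with libx264.

```shell
# Re-encode one input to a common set of parameters.
# normalize is a hypothetical helper; $1 = input file, $2 = output file.
normalize() {
    ffmpeg -i "$1" \
        -c:v libx264 -profile:v main -pix_fmt yuv420p -r 30 \
        -c:a aac -ar 48000 -ac 2 \
        -video_track_timescale 30000 \
        "$2"
}

# Apply the same settings to both inputs so the encoder parameters match.
for f in intro.mp4 clip.flv; do
    if command -v ffmpeg >/dev/null 2>&1 && [ -f "$f" ]; then
        normalize "$f" "${f%.*}_norm.mp4"
    else
        echo "skipped: $f"
    fi
done
```

The list file for the concat step would then point at intro_norm.mp4 and clip_norm.mp4 instead of the original files. Quality and bitrate are left at the encoder defaults here and would need tuning.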

Solution

  • Update: as a workaround, I converted both videos to an intermediary format (.MTS), concatenated them, and then converted the result back to .mp4 as the final output.

    # Step 1: Convert to MTS format
    ffmpeg -i clip.flv -q 0 clip.MTS
    ffmpeg -i intro.mp4 -q 0 intro.MTS
    
    # Step 2: Concat clip.MTS and intro.MTS
    ffmpeg -f concat -i filesToSync.txt -c copy out.MTS
    
    # Step 3: Convert the output back to mp4
    ffmpeg -i out.MTS -q 0 out.mp4
    

    This workaround works for this specific use case, with one quirk: applying further ffmpeg operations to the final output file gives the error "unspecified pixel format".