c++ video ffmpeg frame-rate transcoding

C++, FFmpeg transcoding: time_base differs depending on the container


I am transcoding video (mkv and mp4). When I transcode mkv to mkv, the output is fine (the output fps and duration match the input). When I transcode mkv to mp4, the output fps is half the input fps and the output duration is twice the input duration.

I transcode only the video; audio packets are written as they are read from the input file.

The video stream and encoder context are created like this:

out_stream = avformat_new_stream(ofmt_ctx, NULL);
avcodec_parameters_copy(out_stream->codecpar, in_codecpar);
out_stream->codecpar->codec_tag = 0;
codec_encode  = avcodec_find_encoder(out_stream->codecpar->codec_id);
context_encode = avcodec_alloc_context3(codec_encode);
context_encode->width       = width;
context_encode->height      = height;
context_encode->pix_fmt     = codec_encode->pix_fmts[0];
context_encode->time_base   = av_inv_q(in_stream->r_frame_rate);
out_stream->time_base       = context_encode->time_base;
out_stream->r_frame_rate    = in_stream->r_frame_rate;

Transcoding (simplified):

int64_t i = 0;  // running frame counter, used as pts
while (true) {
    av_read_frame(ifmt_ctx, pkt);
    in_stream = ifmt_ctx->streams[pkt->stream_index];
    pkt->stream_index = stream_mapping[pkt->stream_index];
    pCodecCtx = ifmt_ctx->streams[pkt->stream_index]->codec;
    pCodec = avcodec_find_decoder(pCodecCtx->codec_id);
    error = avcodec_open2(pCodecCtx, pCodec, nullptr);
    if (pkt->stream_index == AVMEDIA_TYPE_VIDEO) {
        ....
        avcodec_decode_video2(pCodecCtx, frame, &frameFinished, pkt);
        ....
        // manipulate the frame
        ....
        frame->pts = i;  // pts in units of context_encode->time_base (one tick per frame)
        avcodec_send_frame(context_encode, frame);
        while ((ret = avcodec_receive_packet(context_encode, pkt_encode)) >= 0) {
            if (ret == AVERROR(EAGAIN) || ret == AVERROR_EOF)
                break;
            // video: encoder time base -> out_stream time base
            av_packet_rescale_ts(pkt_encode, context_encode->time_base, out_stream->time_base);
            av_interleaved_write_frame(ofmt_ctx, pkt_encode);
            av_packet_unref(pkt_encode);
        }
        i++;
    } else {
        // audio: input stream time base -> out_stream time base
        av_packet_rescale_ts(pkt, in_stream->time_base, out_stream->time_base);
        av_interleaved_write_frame(ofmt_ctx, pkt);
    }
    av_packet_unref(pkt);
}

Mediainfo of output mkv transcoded video (mkv -> mkv):

  • Frame rate : 23.976 (24000/1001) FPS

Mediainfo of output mp4 transcoded video (mkv -> mp4):

  • Frame rate : 11.988 (12000/1001) FPS
  • Original frame rate : 23.976 (24000/1001) FPS

When the video context is created, the time_base values are (both mkv -> mp4 and mkv -> mkv):

FPS input: (24000/1001)
FPS output: (24000/1001)
context_decode->time_base (1001 / 48000)
context_encode->time_base (1001 / 24000)
in_stream->time_base (1 / 1000)
in_stream->codec->time_base (1001 / 48000)
out_stream->time_base (1001 / 24000)
out_stream->codec->time_base (0 / 1)

When a video frame is written, the time_base values are (mkv -> mp4):

context_encode->time_base (1001 / 24000)
out_stream->time_base (1 / 48000)

But for mkv -> mkv:

context_encode->time_base (1001 / 24000) 
out_stream->time_base (1 / 1000)
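
The arithmetic behind these values can be sketched without FFmpeg. The block below is a minimal illustration, not FFmpeg code: TimeBase is a hypothetical stand-in for AVRational, and rescale_ts is a simplified stand-in for the conversion av_packet_rescale_ts performs (round to nearest, non-negative timestamps only). It assumes the printed out_stream->time_base really is the time base the muxer uses for the video stream.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for AVRational: one tick lasts num/den seconds.
struct TimeBase {
    int64_t num;
    int64_t den;
};

// Simplified stand-in for the timestamp conversion (round to nearest).
// Only non-negative timestamps are handled in this sketch.
int64_t rescale_ts(int64_t ts, TimeBase from, TimeBase to) {
    assert(ts >= 0);
    const int64_t numerator = ts * from.num * to.den;
    const int64_t denominator = from.den * to.num;
    return (numerator + denominator / 2) / denominator;
}

// Presentation time in seconds of a timestamp expressed in time base tb.
double to_seconds(int64_t ts, TimeBase tb) {
    return static_cast<double>(ts) * static_cast<double>(tb.num) /
           static_cast<double>(tb.den);
}
```

With the encoder time base 1001/24000, frame 24 becomes rescale_ts(24, {1001, 24000}, {1, 1000}) = 1001 ticks for mkv and rescale_ts(24, {1001, 24000}, {1, 48000}) = 48048 ticks for mp4. Both correspond to 1.001 s, so a correct rescale preserves the presentation time as long as the destination time base is the one the written stream actually uses.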

Output of av_dump_format:

Input #0, matroska,webm, from '24fps2.mkv':
 - Stream #0:0: Video: h264 (High), yuv420p(tv, bt709, progressive), 1920x1080 [SAR 1:1 DAR 16:9], 23.98 fps, 23.98 tbr, 1k tbn, 47.95 tbc (default)
 - Stream #0:1: Audio: aac (LC), 48000 Hz, stereo, fltp (default)

Output #0, mp4, to 'temp_read.mp4':
 - Stream #0:0: Video: h264 (High), yuv420p(tv, bt709, progressive), 1920x1080 [SAR 1:1 DAR 16:9], q=2-31, 23.98 tbr, 23.98 tbn
 - Stream #0:1: Audio: aac (LC), 48000 Hz, stereo, fltp

But if I manually set time_base to about half the frame duration of the input video (500/24001, roughly 1/48 s):

AVRational temp;
temp.num = 500;
temp.den = 24001;
context_encode->time_base   = temp;
out_stream->time_base       = context_encode->time_base;
out_stream->r_frame_rate    = in_stream->r_frame_rate;

When the video stream and context are created, the time_base values are (mkv -> mp4):

context_encode->time_base (500 / 24001)
out_stream->time_base (500 / 24001)

When a video frame is written, the time_base values are (mkv -> mp4):

context_encode->time_base (500 / 24001)
out_stream->time_base (1 / 48000)

And the video FPS and duration are correct:

  • Frame rate : 23.976 (24000/1001) FPS

What is wrong with time_base and av_packet_rescale_ts in this case, and how can it be fixed?


Solution

  • The problem was a mix-up of time bases. Audio packets were rescaled with:

    av_packet_rescale_ts(pkt, in_stream->time_base, out_stream->time_base);
    

    Video packets were rescaled with:

    av_packet_rescale_ts(pkt_encode, context_encode->time_base, out_stream->time_base);
    

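    But the same out_stream variable was used for both streams, so packets could be rescaled to the time base of a stream other than the one they were written to. I replaced out_stream->time_base with ofmt_ctx->streams[pkt->stream_index]->time_base, so each packet is rescaled to the time base of its own output stream, and now it works fine. This likely explains why mkv -> mkv looked correct: out_stream->time_base was 1/1000 there, whereas for mp4 it was 1/48000, which matches the audio sample rate.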
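
The effect of the shared variable can be reproduced with plain arithmetic. This is a minimal sketch under stated assumptions, not FFmpeg code: TimeBase and rescale_ts are hypothetical stand-ins for AVRational and the conversion done by av_packet_rescale_ts, the output is assumed to have video at index 0 and audio at index 1, and the 1/24000 video time base is an assumption (the question only shows the 1/48000 value).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for AVRational: one tick lasts num/den seconds.
struct TimeBase {
    int64_t num;
    int64_t den;
};

// Simplified stand-in for the timestamp conversion (round to nearest).
// Only non-negative timestamps are handled in this sketch.
int64_t rescale_ts(int64_t ts, TimeBase from, TimeBase to) {
    assert(ts >= 0);
    const int64_t numerator = ts * from.num * to.den;
    const int64_t denominator = from.den * to.num;
    return (numerator + denominator / 2) / denominator;
}

// Frame rate a player derives when consecutive frames are ticks_per_frame
// apart in a stream whose time base is stream_tb.
double apparent_fps(int64_t ticks_per_frame, TimeBase stream_tb) {
    return static_cast<double>(stream_tb.den) /
           static_cast<double>(ticks_per_frame * stream_tb.num);
}

// Assumed output layout: index 0 = video (1/24000, an assumption),
// index 1 = audio (1/48000, the value printed in the question).
const std::vector<TimeBase> kOutputTimeBases = {{1, 24000}, {1, 48000}};
const TimeBase kEncoderTimeBase = {1001, 24000};

// Buggy variant: a single shared time base that ended up being the audio one.
int64_t ticks_per_frame_shared() {
    const TimeBase shared = kOutputTimeBases[1];
    return rescale_ts(1, kEncoderTimeBase, shared);
}

// Fixed variant: look the time base up by the packet's own stream index.
int64_t ticks_per_frame_by_index(std::size_t stream_index) {
    return rescale_ts(1, kEncoderTimeBase, kOutputTimeBases[stream_index]);
}
```

Under these assumptions the shared variant advances 2002 ticks per frame, which the video stream reads as 24000/2002 ≈ 11.988 fps, the halved rate Mediainfo reported. The per-index lookup advances 1001 ticks per frame, which is 24000/1001 ≈ 23.976 fps.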