Just a quick question about encoding and muxing a video file with ffmpeg. My muxer is working, and I'm now trying to get my packets to carry the correct PTS/DTS.
This is the portion of my code that encodes an AVFrame and muxes it to an output file:
int ret;
int got_packet = 0;
AVPacket pkt = { 0 };

av_init_packet(&pkt);
pkt.data = NULL;
pkt.size = 0;

/* encode the image */
ret = avcodec_encode_video2(cc, &pkt, frame, &got_packet);
if (ret < 0)
{
    fprintf(stderr, "error encoding video frame: %s\n", av_err2str(ret));
    exit(EXIT_FAILURE);
}

if (got_packet)
{
    /* convert timestamps from the codec time base to the stream time base */
    av_packet_rescale_ts(&pkt, cc->time_base, st->time_base);

    /* pts/dts are int64_t; PRId64 (from <inttypes.h>) prints them portably */
    fprintf(stderr, "\npkt.pts = %" PRId64 "\n", pkt.pts);
    fprintf(stderr, "pkt.dts = %" PRId64 "\n", pkt.dts);
    fprintf(stderr, "writing frame\n");
    ret = av_interleaved_write_frame(fmt_ctx, &pkt);
    av_packet_unref(&pkt);
}
else
{
    ret = 0;
}
...
I'm then getting an output of the following:
pkt.pts = 0
pkt.dts = 0
writing frame
pkt.pts = 1502
pkt.dts = 0
writing frame
pkt.pts = 3003
pkt.dts = 1502
writing frame
pkt.pts = 4505
pkt.dts = 3003
writing frame
...
My goal is to have my PTS and DTS both follow the pattern: 1502, 3003, 4505, 6006, 7508, ...
But it seems that the first DTS value is repeated once, which leaves every DTS out of sync with its corresponding PTS. It's also worth mentioning that the codec context was configured to have no B-frames, so only I- and P-frames are present here.
Does anyone with more experience have some insight into this?
Addition:
I ran the following command in a terminal to check whether my DTS and PTS values were consistent with my print statements:
sudo ./ffprobe -show_packets -print_format json mux_test.ts | less
And I got the following:
{
"packets": [
{
"codec_type": "video",
"stream_index": 0,
"pts": 0,
"pts_time": "0.000000",
"dts": -1501,
"dts_time": "-0.016678",
"duration": 1501,
"duration_time": "0.016678",
"convergence_duration": "N/A",
"convergence_duration_time": "N/A",
"size": "55409",
"pos": "564",
"flags": "K"
},
{
"codec_type": "video",
"stream_index": 0,
"pts": 1502,
"pts_time": "0.016689",
"dts": 0,
"dts_time": "0.000000",
"duration": 1501,
"duration_time": "0.016678",
"convergence_duration": "N/A",
"convergence_duration_time": "N/A",
"size": "46574",
"pos": "60160",
"flags": "_"
},
{
"codec_type": "video",
"stream_index": 0,
"pts": 3003,
"pts_time": "0.033367",
"dts": 1502,
"dts_time": "0.016689",
"duration": 1501,
"duration_time": "0.016678",
"convergence_duration": "N/A",
"convergence_duration_time": "N/A",
"size": "2544",
"pos": "110356",
"flags": "_"
},
...
This doesn't show my first DTS value repeated, but it still shows my DTS one cycle behind my PTS.
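For reference, the spacing of these timestamps can be reproduced with plain integer arithmetic. The sketch below assumes an encoder time base of 1001/60000 (about 59.94 fps) and a stream time base of 1/90000; neither is stated above, both are inferred from the ffprobe duration of 1501 ticks lasting 0.016678 s. rescale_ts is a hypothetical helper, not an FFmpeg function. It rounds to nearest with ties away from zero, which is my understanding of what av_packet_rescale_ts does by default.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: rescale a non-negative timestamp from the time base
 * src_num/src_den to the time base dst_num/dst_den, rounding to the nearest
 * integer with ties rounded up. Intended for small values only (no overflow
 * protection). */
static int64_t rescale_ts(int64_t ts, int src_num, int src_den,
                          int dst_num, int dst_den)
{
    assert(ts >= 0 && src_num > 0 && src_den > 0 && dst_num > 0 && dst_den > 0);

    /* ts * (src_num / src_den) / (dst_num / dst_den), kept in integers */
    int64_t num = ts * src_num * dst_den;
    int64_t den = (int64_t)src_den * dst_num;

    return (num + den / 2) / den;
}
```

Under those assumptions, rescale_ts(n, 1001, 60000, 1, 90000) for n = 1 to 5 gives 1502, 3003, 4505, 6006 and 7508, so the alternating steps of 1502 and 1501 come from rounding 1501.5 ticks per frame.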
After debugging the API, I managed to come to a conclusion.
For DTS values to be valid, they must increase monotonically at a consistent rate (assuming the time base and frame rate are not somehow altered during the muxing), so the absolute values matter less than that steady progression.
This block of code is straight from the ffmpeg library. It is found in mpegvideo_enc.c on lines 2074-2080 (reformatted for clarity):
...
pkt->pts = s->current_picture.f->pts;
if (!s->low_delay && s->pict_type != AV_PICTURE_TYPE_B)
{
    if (!s->current_picture.f->coded_picture_number)
    {
        pkt->dts = pkt->pts - s->dts_delta;
    }
    else
    {
        pkt->dts = s->reordered_pts;
    }
    s->reordered_pts = pkt->pts;
...
As you can see, only the first frame, whose coded_picture_number is 0, enters the if (!s->current_picture.f->coded_picture_number) branch. Every subsequent frame takes the else branch, which sets the current DTS equal to the previous PTS value.
Therefore, this behavior seems to be normal when muxing output from the MPEG-2 encoder: the DTS should trail the PTS by one cycle.
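To make the pattern concrete, here is a minimal sketch of that assignment logic outside of FFmpeg. DtsState and next_dts are hypothetical names used only for illustration; the fields mirror the ones in the excerpt above, and dts_delta is assumed to be 0 when the first packet is produced.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the encoder fields used by the excerpt. */
typedef struct {
    int64_t dts_delta;            /* offset applied to the first packet only */
    int64_t reordered_pts;        /* PTS of the previously coded picture */
    int     coded_picture_number; /* 0 for the first coded picture */
} DtsState;

/* Return the DTS for a packet with the given PTS, mirroring the excerpt:
 * the first packet gets pts - dts_delta, every later packet gets the
 * previous packet's PTS. */
static int64_t next_dts(DtsState *s, int64_t pts)
{
    int64_t dts;

    assert(s != NULL);

    if (s->coded_picture_number == 0)
        dts = pts - s->dts_delta;
    else
        dts = s->reordered_pts;

    s->reordered_pts = pts;       /* remembered for the next packet */
    s->coded_picture_number++;
    return dts;
}
```

Starting from a zero-initialized DtsState and feeding it the PTS values 0, 1502, 3003, 4505 gives the DTS values 0, 0, 1502, 3003, which matches the printed output in the question.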