I am sending H.264-encoded video over RTP/UDP using the following GStreamer CLI pipeline:
gst-launch-1.0 videotestsrc is-live=true ! video/x-raw,framerate=30/1 ! timeoverlay ! videoconvert ! x264enc ! h264parse ! rtph264pay pt=96 ! udpsink host=127.0.0.1 port=5000
Note that the timeoverlay element will come in handy later on!
On the receiver side I use the following pipeline (shown in gst-launch form for illustration only; it is not actually launched from the CLI):
gst-launch-1.0 -v udpsrc port=5000 ! application/x-rtp,clock-rate=90000,payload=96 ! queue ! appsink
Note that udpsrc's do-timestamp property is set to true by default, so it is the element that actually timestamps the buffers here.
Using GStreamer's appsink element, I extract each GstSample*, stamp it manually with the current Unix time, and keep it in a queue. This timestamp has nothing to do with the presentation timestamp of the sample's buffer (as in GST_BUFFER_PTS(buffer)); rather, it is stored as follows:
struct MyGstSample {
    GstSample* sample_;   // sample pulled from the appsink
    long unix_time_ns_;   // Unix time at arrival, in nanoseconds
};
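For reference, here is a minimal sketch of how such a Unix timestamp could be taken at the moment a sample is pulled from the appsink. It assumes std::chrono is acceptable for this purpose; unix_time_now_ns is a hypothetical helper name, not a GStreamer API, and the original code may obtain the time differently.

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical helper (not part of GStreamer): current Unix time in
// nanoseconds. system_clock follows wall-clock time, so the value can jump
// if the system clock is adjusted (for example by NTP).
inline std::int64_t unix_time_now_ns() {
    using namespace std::chrono;
    return duration_cast<nanoseconds>(
               system_clock::now().time_since_epoch())
        .count();
}
```

A 64-bit integer is used here because nanoseconds since the epoch do not fit in 32 bits.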
When a specific Unix time interval is requested, I feed the corresponding GstSample* samples from the queue into a separate recording pipeline through GStreamer's appsrc and record the interval to an .mp4 file. The recording pipeline (for illustration only) is:
gst-launch-1.0 appsrc ! application/x-rtp, media=video, clock-rate=90000, encoding-name=H264, payload=96 ! rtph264depay ! h264parse ! mp4mux ! queue ! filesink
I can't seem to write the requested interval precisely; you can tell from the time shown in the recorded video, thanks to timeoverlay. For example, if the user requests running time 5 s to 15 s (for simplicity), I only get 10 s to 15 s. I guess this has something to do with key frames.
Is there a way to keep the frames encoded and still ensure precise on-demand video recording?
So the problem does relate to key frames: the recording pipeline only starts writing video from the first key frame it finds, and all delta frames before it are discarded. The reason I was missing so many seconds of the requested interval is that the sender pipeline was sending only one key frame every 300 frames. This is controlled by a property of x264enc.
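To make the arithmetic concrete, here is a small sketch of how the key frame spacing translates into the worst-case amount of video lost at the start of a recording. keyframe_interval_seconds is a hypothetical helper introduced only for illustration, and it assumes a constant frame rate.

```cpp
// Hypothetical helper: time in seconds between two consecutive key frames
// when the encoder emits one key frame every `key_int_frames` frames at a
// constant `fps`. This is also the worst-case lead-in that is dropped when
// recording can only start at a key frame.
constexpr double keyframe_interval_seconds(int key_int_frames, int fps) {
    return static_cast<double>(key_int_frames) / fps;
}
```

With one key frame every 300 frames at 30 fps the interval is 10 s, which is consistent with losing several seconds at the start of the requested range.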
Adjusting the sender pipeline to send a key frame every 30 frames (once per second at 30 fps):
gst-launch-1.0 videotestsrc is-live=true ! video/x-raw,framerate=30/1 ! timeoverlay ! videoconvert ! x264enc key-int-max=30 ! h264parse ! rtph264pay pt=96 ! udpsink host=127.0.0.1 port=5000
Next, we need to figure out which buffers (GstBuffer*) correspond to key frames. A buffer is a key frame when it does not have GST_BUFFER_FLAG_DELTA_UNIT set, which can be checked with GST_BUFFER_FLAG_IS_SET(buf, GST_BUFFER_FLAG_DELTA_UNIT).
On the receiver's side, if you probe the buffers coming out of udpsrc, you will find that GST_BUFFER_FLAG_DELTA_UNIT is never set, because the data is still application/x-rtp. You need to add rtph264depay and check the buffers coming out of it instead; there you will start seeing both key frames and delta frames. With that said, the receiver's side should be as follows:
gst-launch-1.0 -v udpsrc port=5000 ! application/x-rtp,clock-rate=90000,payload=96 ! rtph264depay ! queue ! appsink
Finally, for the recording pipeline to work, you need to obtain the GstCaps* of rtph264depay's src pad on the receiver's side and use it as the caps for the appsrc in the recording pipeline, since it contains important codec-specific data. Otherwise h264parse will not be able to function properly. Here is the pipeline (for illustration only; you still need to set the caps on the appsrc):
gst-launch-1.0 appsrc ! h264parse ! mp4mux ! queue ! filesink
With that said, whenever we receive a requested interval, we search for the nearest key frame in the queue and start feeding the samples to the recording pipeline from there.
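The key frame search could look roughly like the sketch below. It is written against a simplified stand-in type rather than real GstSample* objects so that it stays self-contained; QueuedSample and find_start_index are hypothetical names. It also assumes that "nearest" means the last key frame at or before the requested start, so that the whole interval is covered at the cost of a short extra lead-in.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>
#include <optional>

// Simplified stand-in for a queued sample. In the real code this would wrap
// a GstSample*, and is_keyframe would be derived from the depayloaded buffer
// as !GST_BUFFER_FLAG_IS_SET(buf, GST_BUFFER_FLAG_DELTA_UNIT).
struct QueuedSample {
    std::int64_t unix_time_ns;
    bool is_keyframe;
};

// Hypothetical helper: index of the sample to start feeding from.
// Returns the last key frame at or before start_ns. If every key frame is
// later than start_ns, falls back to the first key frame in the queue.
// Returns std::nullopt if the queue holds no key frame at all.
// Assumes the queue is ordered by unix_time_ns.
std::optional<std::size_t> find_start_index(
    const std::deque<QueuedSample>& queue, std::int64_t start_ns) {
    std::optional<std::size_t> first_key;
    std::optional<std::size_t> best;
    for (std::size_t i = 0; i < queue.size(); ++i) {
        if (!queue[i].is_keyframe) continue;
        if (!first_key) first_key = i;
        if (queue[i].unix_time_ns <= start_ns) {
            best = i;  // still at or before the start; keep looking
        } else {
            break;     // later key frames are past the requested start
        }
    }
    return best ? best : first_key;
}
```

Feeding then starts at the returned index and continues until a sample's timestamp passes the end of the requested interval. With a key frame every 30 frames at 30 fps, the extra lead-in before the requested start is at most about one second.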