I'm trying to build a real-time video streaming application in C++ based on the H.265 codec. Real-time performance matters a great deal for my application, and I have built a single-threaded program to test H.265 codecs. The program has a simple pipeline:
I have tried the X265/DE265 combination, and the `AV_CODEC_ID_HEVC` encoder and decoder combination in avcodec. In both cases the decoder does not decode "right away" after the first frame's data arrives; it waits until about 30 frames of data have arrived before it starts outputting decoded results. The situation looks like this:
**encoding** **decoding**
frame 1: succeeded -> no frame decoded
frame 2: succeeded -> no frame decoded
frame 3: succeeded -> no frame decoded
...
frame 30: succeeded -> no frame decoded
frame 31: succeeded -> frame 1 outputted
frame 32: succeeded -> frame 2 outputted
...
This results in a delay of 1 to 2 seconds relative to the encoder. I'm wondering why this happens and whether there is a way to avoid it.
Thank you!
One reason can be forward referencing in B-slices.
For instance, by choosing a Group of Pictures (GOP) of size 32 with a hierarchical structure, you may impose a decoding delay of about 1 second (assuming 25 fps).
More precisely, the reconstruction of your second frame (the first frame is Intra, hence independently decodable) may indirectly depend on your 32nd frame.
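To make the arithmetic concrete, here is a minimal sketch of that delay estimate. The function name and its parameters are made up for illustration and are not part of any codec API; it only assumes that a frame cannot be reconstructed until the later frame it depends on has been captured.

```cpp
#include <cassert>

// Seconds a frame must wait for a later frame it references (directly or
// indirectly) to be captured. Frame indices are 1-based display positions.
// Hypothetical helper for illustration only.
double forward_reference_delay_seconds(int frame_index, int referenced_index, double fps) {
    assert(fps > 0.0);
    assert(referenced_index >= frame_index);
    const int frames_to_wait = referenced_index - frame_index;
    return frames_to_wait / fps;
}
```

With the figures above, `forward_reference_delay_seconds(2, 32, 25.0)` comes to 1.2 seconds, i.e. 30 frame intervals, which is in line with the roughly 30-frame lag described in the question.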
This coding mode is usually called Random Access. Look it up. You can avoid the delay by using the LowDelayP mode, or All Intra, neither of which references future frames. In other words, the delay depends on your GOP structure.
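One way to see how the GOP structure drives the delay is to compare coding order with display order. The sketch below is a simplified model, not an encoder API: `max_reorder_delay_frames` is a hypothetical helper, and it assumes a frame can only be decoded once every frame that precedes it in coding order has been captured.

```cpp
#include <algorithm>
#include <vector>

// Given the display index of each frame, listed in coding (bitstream) order,
// return the largest number of frame intervals any frame has to wait for a
// later-captured frame that precedes it in coding order.
// Hypothetical helper for illustration only.
int max_reorder_delay_frames(const std::vector<int>& coding_order) {
    int latest_needed = -1;  // highest display index seen so far in coding order
    int worst = 0;
    for (int display_index : coding_order) {
        latest_needed = std::max(latest_needed, display_index);
        worst = std::max(worst, latest_needed - display_index);
    }
    return worst;
}
```

For an illustrative hierarchical GOP of 8 with coding order `{0, 8, 4, 2, 1, 3, 6, 5, 7}` (an assumed order in the style of Random Access configurations), this returns 7. For LowDelayP or All Intra, coding order equals display order, e.g. `{0, 1, 2, 3, 4}`, and it returns 0.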