Detect MPEG4/H264 I-Frame (IDR) in RTP stream


I need to detect an MPEG4 I-Frame in an RTP packet. I know how to strip the RTP header and get the MPEG4 frame out of it, but I can't figure out how to identify the I-Frame.

Does it have a specific signature/header?


Solution

  • OK, so I figured it out for an H.264 stream.

    How to detect I-Frame:

    • remove RTP header
    • check the value of the first byte in h264 payload
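    • if the value is 124 (0x7C), the packet is a fragment of an I-Frame in my stream (on its own this byte only marks a fragmented NAL unit; the EDIT below checks the NAL type as well)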
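
    To see what that first payload byte means, it can be split into the three fields of the NAL unit header: F (1 bit), NRI (2 bits) and type (5 bits). A minimal sketch follows; the class and method names are made up for illustration. For 0x7C it gives NRI 3 and type 28, which is a fragmentation unit (FU-A), so the byte by itself does not say what kind of NAL unit is being fragmented.

```csharp
using System;

public static class NalHeaderSketch
{
    // Splits the first H.264 payload byte into its three fields:
    // F (forbidden zero bit), NRI (2 bits) and type (low 5 bits).
    public static (int ForbiddenBit, int Nri, int Type) Decode(byte first)
    {
        int forbiddenBit = (first >> 7) & 0x01;
        int nri = (first >> 5) & 0x03;
        int type = first & 0x1F;
        return (forbiddenBit, nri, type);
    }
}
```

    For example, NalHeaderSketch.Decode(0x7C) yields F = 0, NRI = 3, type = 28.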

    I can't figure it out for the MPEG4-ES stream... any suggestions?

    EDIT: H264 IDR

    This works for my H.264 stream (fmtp:96 packetization-mode=1; profile-level-id=420029;). Pass in the byte array that holds the H.264 payload received over RTP. If you want to pass the whole RTP packet instead, set RTPHeaderBytes to the length of the RTP header so that it is skipped. With my stream this always catches the I-Frame, because there the IDR is the only frame that gets fragmented. For a fragmented IDR the check matches the first fragment only (the one with the start bit set). If the I-Frame (IDR) is not fragmented, fragment_type is 5, so the code returns true for both fragmented and unfragmented IDRs. I use this (simplified) piece of code in my server, and it works like a charm.

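    public static bool isH264iFrame(byte[] paket)
    {
        // Offset of the H.264 payload; set this to the RTP header length
        // if the whole RTP packet is passed in.
        int RTPHeaderBytes = 0;

        // Two bytes are read below, so reject anything shorter.
        if (paket == null || paket.Length < RTPHeaderBytes + 2)
        {
            return false;
        }

        // Low 5 bits of the first byte: NAL unit type, or 28/29 for FU-A/FU-B.
        int fragment_type = paket[RTPHeaderBytes + 0] & 0x1F;
        // Second byte (FU header): type of the fragmented NAL unit and start bit.
        int nal_type = paket[RTPHeaderBytes + 1] & 0x1F;
        int start_bit = paket[RTPHeaderBytes + 1] & 0x80;

        // First fragment of a fragmented IDR slice.
        bool fragmentedIdrStart = (fragment_type == 28 || fragment_type == 29)
                                  && nal_type == 5 && start_bit == 128;

        // fragment_type == 5 is an unfragmented IDR slice.
        return fragmentedIdrStart || fragment_type == 5;
    }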
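
    If the whole RTP packet is passed in, the value for RTPHeaderBytes is not always 12: the header grows with the CSRC list and with a header extension. The following is a rough sketch of how the offset could be computed from the RFC 3550 header layout; the class and method names are hypothetical, and padding at the end of the packet is not handled.

```csharp
using System;

public static class RtpHeaderSketch
{
    // Returns the offset of the first payload byte in an RTP packet,
    // or -1 if the packet is too short or carries no payload.
    public static int PayloadOffset(byte[] packet)
    {
        const int FixedHeaderBytes = 12;
        if (packet == null || packet.Length < FixedHeaderBytes)
        {
            return -1;
        }

        int csrcCount = packet[0] & 0x0F;            // CC: number of 4-byte CSRC entries
        bool hasExtension = (packet[0] & 0x10) != 0; // X bit

        int offset = FixedHeaderBytes + 4 * csrcCount;

        if (hasExtension)
        {
            // Extension header: 16-bit profile id, then 16-bit length
            // counted in 32-bit words (not including these 4 bytes).
            if (packet.Length < offset + 4)
            {
                return -1;
            }
            int extensionWords = (packet[offset + 2] << 8) | packet[offset + 3];
            offset += 4 + 4 * extensionWords;
        }

        return offset < packet.Length ? offset : -1;
    }
}
```

    The returned value can be used in place of the hard-coded 0 in RTPHeaderBytes, after checking that it is not -1.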
    

    Here's the table of NAL unit types:

     Type  Name
        0  [unspecified]
        1  Coded slice (non-IDR picture)
        2  Data Partition A
        3  Data Partition B
        4  Data Partition C
        5  IDR (Instantaneous Decoding Refresh) Picture
        6  SEI (Supplemental Enhancement Information)
        7  SPS (Sequence Parameter Set)
        8  PPS (Picture Parameter Set)
        9  Access Unit Delimiter
       10  End of Sequence
       11  End of Stream
       12  Filler Data
    13-23  [extended]
    24-31  [unspecified in H.264; used by the RTP payload format, e.g. 28 = FU-A, 29 = FU-B]
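
    For logging or debugging, the table can be turned into a small lookup. This is only an illustrative sketch with made-up names; it covers the types listed above plus the two fragmentation unit types (28 and 29) that the code earlier checks for.

```csharp
using System;

public static class NalUnitTypeSketch
{
    // Maps the low 5 bits of the first payload byte to a readable name.
    public static string Name(int type)
    {
        switch (type)
        {
            case 1: return "Coded slice (non-IDR)";
            case 2: return "Data Partition A";
            case 3: return "Data Partition B";
            case 4: return "Data Partition C";
            case 5: return "IDR slice";
            case 6: return "SEI";
            case 7: return "SPS";
            case 8: return "PPS";
            case 9: return "Access Unit Delimiter";
            case 10: return "End of Sequence";
            case 11: return "End of Stream";
            case 12: return "Filler Data";
            case 28: return "FU-A (fragmentation unit)";
            case 29: return "FU-B (fragmentation unit)";
            default: return "other (" + type + ")";
        }
    }

    // Convenience wrapper: names the type found in a raw header byte.
    public static string NameFromHeaderByte(byte first)
    {
        return Name(first & 0x1F);
    }
}
```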
    

    EDIT 2: MPEG4 I-VOP

    I forgot to update this... Thanks to Che and the ISO/IEC 14496-2 document, I managed to work this out! Che was right, but his answer was not very precise, so here is, in short, how to find I, P and B frames (I-VOP, P-VOP, B-VOP):

    1. A VOP (Video Object Plane, i.e. a frame) starts with the code 000001B6 (hex). It is the same for all MPEG4 frames (I, P, B).
    2. A lot more information follows, which I am not going to describe here (see the ISO/IEC doc), but (as Che said) we only need the top 2 bits of the byte that immediately follows the B6 byte. Those 2 bits give the VOP_CODING_TYPE; see the table:

      VOP_CODING_TYPE (binary)  Coding method
                            00  intra-coded (I)
                            01  predictive-coded (P)
                            10  bidirectionally-predictive-coded (B)
                            11  sprite (S)
      

    So, to find an I-Frame, look for a packet that starts with the four bytes 000001B6 and whose next byte has its top two bits set to 00. This finds I-Frames in an MPEG4 stream with the Simple video object type (I'm not sure about Advanced Simple).
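
    The same check can be sketched in code. The class and method names below are hypothetical. One assumption differs from the description above: instead of requiring the start code at the very beginning of the buffer, the sketch scans for it, since other headers may sit in front of the VOP start code in the same packet.

```csharp
using System;

public static class Mpeg4VopSketch
{
    // Returns the VOP_CODING_TYPE (0 = I, 1 = P, 2 = B, 3 = S) of the first
    // VOP start code (00 00 01 B6) found in the buffer, or -1 if none is found.
    public static int FirstVopCodingType(byte[] data)
    {
        if (data == null)
        {
            return -1;
        }

        // The loop bound keeps data[i + 4] (the byte after the start code) in range.
        for (int i = 0; i + 4 < data.Length; i++)
        {
            if (data[i] == 0x00 && data[i + 1] == 0x00 &&
                data[i + 2] == 0x01 && data[i + 3] == 0xB6)
            {
                // Top two bits of the byte following B6.
                return (data[i + 4] >> 6) & 0x03;
            }
        }

        return -1;
    }

    public static bool IsIVop(byte[] data)
    {
        return FirstVopCodingType(data) == 0;
    }
}
```

    If a VOP is split across several RTP packets, only the packet carrying the start code can be classified this way; for the others the sketch returns -1.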

    For anything else, check the ISO/IEC 14496-2 document; it has everything you want to know about MPEG4. :)