
How does this code extract the signature?


I have to debug an old PHP script from a developer who has left the company. I understand most of the code, except for the following function. My question: what does...

if($seq == 0x03 || $seq == 0x30)

...mean in the context of extracting the signature from an X.509 certificate?

public function extractSignature($certPemString) {

    $bin = $this->ConvertPemToBinary($certPemString);

    if(empty($certPemString) || empty($bin))
    {
        return false;
    }    

    $bin = substr($bin,4);

    while(strlen($bin) > 1) 
    {            
        $seq = ord($bin[0]); 
        if($seq == 0x03 || $seq == 0x30) 
        {            
            $len = ord($bin[1]);
            $bytes = 0;

            if ($len & 0x80)
            {
                $bytes = ($len & 0x0f);
                $len = 0;
                for ($i = 0; $i < $bytes; $i++)
                {
                    $len = ($len << 8) | ord($bin[$i + 2]);
                }
            }

            if($seq == 0x03)
            {
                return substr($bin,3 + $bytes, $len);
            }
            else 
            {
                $bin = substr($bin,2 + $bytes + $len);                  
            }                                                    
        }
        else 
        {                            
            return false;                
        }
    }
    return false;
}

Solution

  • An X.509 certificate stores its data in a series of sections called Tag-Length-Value (TLV) triplets. Each section starts with a Tag byte, which indicates the data type of the section. You can see a list of these data types here.

    0x03 is the Tag byte for the BIT STRING data type, and 0x30 is the Tag byte for the SEQUENCE data type.

    So this code is designed to handle the BIT STRING and SEQUENCE data types. If you look at this part:

    if($seq == 0x03)
    {
        return substr($bin,3 + $bytes, $len);
    }
    else // $seq == 0x30
    {
        $bin = substr($bin,2 + $bytes + $len);                  
    }
    

    you can see that the function is designed to skip over Sequences (0x30), until it finds a Bit String (0x03), at which point it returns the value of the Bit String.

    You might be wondering why the offset is 3 for a Bit String and 2 for a Sequence. In both cases the code skips the Tag byte and the first Length byte. A Bit String needs one more byte skipped, because its first value byte is a special extra field which indicates how many bits are unused in the last byte of the data. (For example, if you're sending 13 bits of data, it will take up 2 bytes = 16 bits, and the "unused bits" field will be 3.)
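
    To make the "unused bits" byte concrete, here is a small sketch that picks apart a hand-built Bit String. The 13-bit value is a made-up example, not something taken from a real certificate:

    ```php
    <?php
    // Hypothetical 13-bit value 1011001110001, padded with 3 zero bits
    // so that it fills 2 whole bytes: 10110011 10001000 = 0xB3 0x88.
    $tlv = "\x03\x03\x03\xB3\x88";

    $tag        = ord($tlv[0]);   // 0x03 = BIT STRING
    $length     = ord($tlv[1]);   // 3 value bytes: 1 unused-bits byte + 2 data bytes
    $unusedBits = ord($tlv[2]);   // 3 padding bits at the end of the last data byte
    $data       = substr($tlv, 3, $length - 1);

    printf("tag=0x%02x length=%d unused=%d data=%s\n",
        $tag, $length, $unusedBits, bin2hex($data));
    // prints: tag=0x03 length=3 unused=3 data=b388
    ```

    Note that the Length (3) counts the unused-bits byte as well, so only Length - 1 bytes are actual data.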

    Next issue: the Length field. When the length of the Value is less than 128 bytes, the length is simply specified using a single byte (the most significant bit will be 0). If the length is 128 or greater, then the first length byte has bit 7 set, and the remaining 7 bits indicate how many following bytes contain the length (in big-endian order). More description here. The parsing of the length field happens in this section of the code:

    $len = ord($bin[1]);
    $bytes = 0;
    
    if ($len & 0x80)
    {
        // length is greater than 127!
        $bytes = ($len & 0x0f);
        $len = 0;
        for ($i = 0; $i < $bytes; $i++)
        {
             $len = ($len << 8) | ord($bin[$i + 2]);
        }
    }
    

    After that, $bytes contains the number of extra bytes used by the length field, and $len contains the length of the Value field (in bytes).
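
    As a worked example, assume the two headers below. Both are hand-written for illustration; the second has the shape you would typically see for a value between 256 and 65535 bytes long:

    ```php
    <?php
    // Short form: tag 0x30, then one length byte 0x0d (top bit clear) = 13.
    $short    = "\x30\x0d";
    $shortLen = ord($short[1]);

    // Long form: 0x82 has the top bit set, and its low bits say that
    // 2 more length bytes follow: 0x03 0x4a, read in big-endian order.
    $long     = "\x30\x82\x03\x4a";
    $lenBytes = 2;
    $longLen  = (ord($long[2]) << 8) | ord($long[3]);

    printf("short=%d long=%d (using %d extra length bytes)\n",
        $shortLen, $longLen, $lenBytes);
    // prints: short=13 long=842 (using 2 extra length bytes)
    ```

    In the second case $bytes would be 2 and the whole header is 4 bytes long. That is presumably why the function starts with substr($bin,4): it assumes the certificate's outer Sequence has exactly such a 4-byte header.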

    Did you spot the error in the code? Remember,

    If the length is 128 or greater, then the first length byte has bit 7 set, and the remaining 7 bits indicate how many following bytes contain the length.

    but the code says $bytes = ($len & 0x0f), which only takes the lower 4 bits of the byte! It should be:

    $bytes = ($len & 0x7f);
    

    Of course, this error only matters for extremely long messages: the code works fine as long as the length value fits within 0x0f = 15 bytes, meaning the data has to be less than 256^15 bytes. That's about a trillion yottabytes, which ought to be enough for anybody.
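
    Putting the pieces together, here is a hedged sketch of how the same walk could be written with the corrected mask and a few bounds checks. The names readDerHeader and extractSignatureFromDer are my own, the input is assumed to be the already-decoded binary (DER) certificate, and the 4-byte cap on the length field is an arbitrary sanity limit for this sketch. It relies on the X.509 layout in which the outer Sequence holds two Sequences (the signed data and the algorithm identifier) followed by the signature Bit String:

    ```php
    <?php
    // Hypothetical helper: $pos points at a Tag byte. Returns
    // [valueLength, headerLength], or null if the header is malformed.
    function readDerHeader(string $der, int $pos): ?array
    {
        if ($pos + 1 >= strlen($der)) {
            return null;                      // no room for a length byte
        }
        $first = ord($der[$pos + 1]);
        if (($first & 0x80) === 0) {
            return [$first, 2];               // short form
        }
        $extra = $first & 0x7f;               // all 7 low bits, not 0x0f
        if ($extra === 0 || $extra > 4 || $pos + 2 + $extra > strlen($der)) {
            return null;                      // indefinite, oversized or truncated
        }
        $len = 0;
        for ($i = 0; $i < $extra; $i++) {
            $len = ($len << 8) | ord($der[$pos + 2 + $i]);
        }
        return [$len, 2 + $extra];
    }

    // Hypothetical rewrite of the walk: skip Sequences, return the first Bit String.
    function extractSignatureFromDer(string $der): ?string
    {
        if ($der === '' || ord($der[0]) !== 0x30) {
            return null;
        }
        $header = readDerHeader($der, 0);
        if ($header === null) {
            return null;
        }
        $pos = $header[1];                    // step inside the outer Sequence
        $end = strlen($der);

        while ($pos < $end) {
            $tag    = ord($der[$pos]);
            $header = readDerHeader($der, $pos);
            if ($header === null) {
                return null;
            }
            [$len, $headerLen] = $header;

            if ($tag === 0x03) {
                if ($len < 1 || $pos + $headerLen + $len > $end) {
                    return null;
                }
                // Skip the unused-bits byte; the data is $len - 1 bytes long.
                return substr($der, $pos + $headerLen + 1, $len - 1);
            }
            if ($tag !== 0x30) {
                return null;                  // same policy as the original code
            }
            $pos += $headerLen + $len;        // skip the whole Sequence
        }
        return null;
    }

    // Toy structure, not a real certificate: two small Sequences, then a Bit String.
    $toy = "\x30\x10"
         . "\x30\x03\x02\x01\x05"
         . "\x30\x02\x05\x00"
         . "\x03\x05\x00\xDE\xAD\xBE\xEF";
    echo bin2hex(extractSignatureFromDer($toy)), "\n";   // prints: deadbeef
    ```

    Unlike the original, this reads the outer header instead of assuming it is 4 bytes long, and it returns Length - 1 bytes so that the unused-bits byte is not counted as data. As far as I can tell, the original gets away with passing the full $len to substr only because the signature is the last thing in the certificate, so substr simply stops at the end of the string.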