
What does `openssl x509 -hash` calculate the hash of?


In the following command, openssl x509 -in example.crt -hash -noout outputs 8927dc31.

openssl req -out example.crt -keyout example.key -newkey rsa:2048 -nodes -x509 -subj '/C=US/CN=example.com' -days 3650
openssl x509 -in example.crt -hash -noout  # 8927dc31

openssl-x509(1) just says it's the "hash" of the subject name.

       -subject_hash
           Outputs the "hash" of the certificate subject name. This is used in OpenSSL to form an index to allow certificates in a
           directory to be looked up by subject name.

       -issuer_hash
           Outputs the "hash" of the certificate issuer name.

       -hash
           Synonym for "-subject_hash" for backward compatibility reasons.
  • What is the "hash" function? (sha1? md5?)
  • What exactly is "the subject name"? (Subject: C = US, CN = example.com in openssl x509 -in example.crt -text?)
  • Can I reproduce the same hash value with the command line?

Solution

  • The first 4 bytes (8 hex digits) of the SHA-1 hash of the ASN.1-encoded subject value (the issuer value for -issuer_hash), printed in reverse byte order.

    You can reproduce the hash with the following command:

    echo '
      310b30 09060355
    04060c02 75733114
    30120603 5504030c
    0b657861 6d706c65
    2e636f6d
    ' | xxd -r -p | sha1sum
    # => 31dc2789c1e1182fbfbb64ee0a0c9a6e11276f97  -
    

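    The first 4 bytes are 31 dc 27 89. openssl assembles them into a 32-bit integer with the first byte as the least significant one, which reverses the byte order [1] (31 dc 27 89 -> 89 27 dc 31), and then prints 8927dc31. As far as I can tell from X509_NAME_hash() in the OpenSSL source, this is done with explicit shifts, so the result is the same on little-endian (including x86_64) and big-endian CPUs.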
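
The same calculation can be sketched in Python using only the standard library. This is an illustration under assumptions, not OpenSSL's code: the function name name_hash is made up here, and the input is the hex string from the command above.

```python
import hashlib

# Hex dump of the encoded subject, copied from the echo | xxd command above.
SUBJECT_HEX = """
  310b30 09060355
04060c02 75733114
30120603 5504030c
0b657861 6d706c65
2e636f6d
"""


def name_hash(encoded_name: bytes) -> str:
    """Return the 8-hex-digit hash for an already encoded name.

    Hypothetical helper: SHA-1 the bytes, take the first 4 digest bytes,
    read them as a little-endian 32-bit integer, and print that as hex.
    """
    digest = hashlib.sha1(encoded_name).digest()
    value = int.from_bytes(digest[:4], "little")
    return f"{value:08x}"


# Strip all whitespace from the dump before decoding it.
subject = bytes.fromhex("".join(SUBJECT_HEX.split()))

print(name_hash(subject))  # should match the output of openssl x509 -hash
print(name_hash(b""))      # empty subject: eea339da
```

Reading the 4 bytes with "little" is what produces the flipped order, independent of the machine running the script.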

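    The ASN.1-encoded subject value 310b30... can be worked out from the subject field that Wireshark shows when opening example.crt. Note that the bytes hashed are OpenSSL's canonical form of the name rather than the DER stored in the certificate: the string values are lowercased and encoded as UTF8String (0c 02 75 73 is "us"), and the outer SEQUENCE header is omitted.

    [Screenshot: finding out the ASN.1-encoded subject value with Wireshark]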
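
As an alternative to reading the bytes off a packet dissector, here is a rough Python sketch that builds the same byte string for simple names. It rests on several assumptions: only the C and CN attributes are known, values are plain ASCII, every length fits in one byte, and the helper names (tlv, canonical_rdn, canonical_name) are invented for this example. It only mimics the normalisation visible in the hex dump above (lowercase, UTF8String, no outer SEQUENCE) and is not a full reimplementation of OpenSSL's canonicalisation.

```python
# OID content bytes for the two attributes used in this example.
OIDS = {
    "C": bytes.fromhex("550406"),   # 2.5.4.6 countryName
    "CN": bytes.fromhex("550403"),  # 2.5.4.3 commonName
}


def tlv(tag: int, content: bytes) -> bytes:
    """Encode one DER tag-length-value triple (short-form length only)."""
    if len(content) >= 128:
        raise ValueError("long-form lengths are not handled in this sketch")
    return bytes([tag, len(content)]) + content


def canonical_rdn(attr: str, value: str) -> bytes:
    """Encode one attribute as SET { SEQUENCE { OID, UTF8String } }."""
    # Trim and collapse whitespace, then lowercase (ASCII input assumed).
    text = " ".join(value.split()).lower()
    atv = tlv(0x30, tlv(0x06, OIDS[attr]) + tlv(0x0C, text.encode("utf-8")))
    return tlv(0x31, atv)


def canonical_name(pairs) -> bytes:
    """Concatenate the encoded attributes without an outer SEQUENCE header."""
    return b"".join(canonical_rdn(attr, value) for attr, value in pairs)


print(canonical_name([("C", "US"), ("CN", "example.com")]).hex())
```

For '/C=US/CN=example.com' this prints the same 35 bytes that are fed to xxd above. An empty list of pairs yields zero bytes, which is consistent with the empty-subject case.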

    If the subject is empty (-subj '/'), the hash is the sha1 of empty data.

    openssl req -out example.crt -keyout example.key -newkey rsa:2048 -nodes -x509 -subj '/' -days 3650
    openssl x509 -in example.crt -hash -noout  # eea339da
    sha1sum </dev/null
    # => da39a3ee5e6b4b0d3255bfef95601890afd80709  -
    # da 39 a3 ee ... -> flip bytes: ee a3 39 da: eea339da
    

    [1]: This looks unnatural to me. I think the value should have been ntohl()ed, i.e. read in network byte order.