
Explanation of the UPER encoding of an extensible sequence


I have the following ASN.1 SEQUENCE type and value, which produce the encoding below in UPER. I'm interested in why the encoding looks the way it does, particularly the start, which is consistent across multiple compilers and encoders (both open source and commercial). From reading the spec, I don't understand how it's possible.

The first byte is 1110 0000, and I don't understand why the encoding starts with 111. From looking at the sequence and value, it feels like it should start with 11: the first bit is the extension addition presence flag and the second bit is the first boolean. I don't know where the third 1 comes from. From reading the X.691 specification, the next thing should be the normally small length determinant of the bitfield of extension additions. Since the number of extension additions is 3, the next bit should be zero to indicate a bitfield of fewer than 64 bits, followed by the bitfield. Yet in every implementation the third bit is 1. Is there something I'm missing from the spec that would explain why the third bit is 1 and not 0?

Edit: The second byte is 0111 0000, which would imply a bitfield length of maybe one (1) or three (11). Given the recent answer indicating that there are in fact only two extension additions, I'm confused as to what the length determinant is supposed to be here for the extensions bitfield.

Schema And Value

  G ::= SEQUENCE {
              a BOOLEAN,
              ...,
              [[
              b BOOLEAN
              ]],
              [[
              c BOOLEAN
              ]],
              ...,
              d BOOLEAN
            }

value ::= G {
  a TRUE,
  b TRUE,
  c TRUE,
  d TRUE
}

encoding (hexadecimal):

E070180018000A

Solution

  • The components "a" and "d" constitute the extension root of your sequence type. Note that the second ellipsis marks the end of the extension additions and the beginning of the second part of the extension root.

    In this sequence type, there are two extension addition groups. The first group contains "b" and the second group contains "c". The first bit of the PER encoding is '1' because the sequence value contains one or more extensions. The second bit of the PER encoding contains the encoding of component "a" and the third bit contains the encoding of component "d" (not "b").

    Here is an explanation of this encoding:

    offset (bits), length (bits), description
    
     0, 1, preamble
        0, 1, extension bit ('1': the sequence value is extended)
     1, 1, root component 'a' ('1': TRUE)
     2, 1, root component 'd' ('1': TRUE)
     3, 1, format of the length of the extension bitmap ('0': a "normally small" length)
     4, 6, length of the extension bitmap ('000001': the length is 1+1=2 bits) --see note below
    10, 2, extension bitmap ('11')
    12, 8, length of the first extension ('00000001': the length is 1 octet)
    20, 8, first extension
        20, 1, component 'b' ('1': TRUE)
        21, 7, padding ('0000000')
    28, 8, length of the second extension ('00000001': the length is 1 octet)
    36, 8, second extension
        36, 1, component 'c' ('1': TRUE)
        37, 7, padding ('0000000')
    44, 4, final padding of the complete encoding ('0000')
    
    total length: 48 bits
    
    11100000 01110000 00011000 00000000 00011000 00000000
    E0 70 18 00 18 00
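
The field-by-field breakdown above can be replayed with a small script. This is only a sketch: the `BitReader` class and `decode_g` function are hypothetical helpers written for this answer, the field layout is hard-coded for type G, and it is not a general UPER decoder.

```python
class BitReader:
    """Reads big-endian bit fields from a byte string (hypothetical helper)."""

    def __init__(self, data: bytes):
        self.bits = "".join(f"{byte:08b}" for byte in data)
        self.pos = 0

    def read(self, n: int) -> int:
        chunk = self.bits[self.pos:self.pos + n]
        if len(chunk) != n:
            raise ValueError("ran past the end of the encoding")
        self.pos += n
        return int(chunk, 2)


def decode_g(data: bytes) -> dict:
    r = BitReader(data)
    extended = r.read(1)                # extension bit
    value = {
        "a": bool(r.read(1)),           # root component a
        "d": bool(r.read(1)),           # root component d (second part of the root)
    }
    if not extended:
        return value
    if r.read(1) != 0:                  # '0' selects the short "normally small" form
        raise NotImplementedError("bitmap longer than 64 bits")
    bitmap_len = r.read(6) + 1          # stored as length minus one
    present = [r.read(1) for _ in range(bitmap_len)]
    for name, flag in zip(("b", "c"), present):
        if not flag:
            continue
        octets = r.read(8)              # length of the extension, in octets
        if octets >= 128:
            raise NotImplementedError("only the one-octet length form is handled")
        start = r.pos
        value[name] = bool(r.read(1))   # each group holds a single BOOLEAN
        r.pos = start + 8 * octets      # skip the padding inside the extension
    return value


result = decode_g(bytes.fromhex("E07018001800"))
print(result)
```

Reading component "d" immediately after "a", before any extension data, is what accounts for the third '1' bit.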
    

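    Note: A "normally small length" is a length determinant optimized for small values. For a length in the range 1..64, it is encoded as a single '0' bit followed by the length minus one in a 6-bit unsigned integer field. Larger lengths use a '1' bit followed by a general length determinant, which does not arise here.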
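
As a quick check of the note, here is a minimal sketch of the short form of that length determinant. The function name `normally_small_length` is made up for illustration, and only the 1..64 case is covered.

```python
def normally_small_length(n: int) -> str:
    """Return the bits of a normally small length for 1 <= n <= 64."""
    if not 1 <= n <= 64:
        raise ValueError("the short form only covers lengths 1..64")
    # One '0' format bit, then (n - 1) as a 6-bit unsigned integer.
    return "0" + format(n - 1, "06b")


# Two extension additions (the groups holding "b" and "c") give a 2-bit bitmap.
print(normally_small_length(2))
```

For a 2-bit bitmap this yields the seven bits found at offsets 3 to 9 of the encoding.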