Explaining the SSL ClientHello SNI extension syntax defined in RFC 6066 (Server Name Indication)

RFC 6066 defines Server Name Indication as an extension of type server_name. The extension_data field of this extension SHALL contain a ServerNameList, where:

      struct {
          NameType name_type;
          select (name_type) {
              case host_name: HostName;
          } name;
      } ServerName;

      enum {
          host_name(0), (255)
      } NameType;

      opaque HostName<1..2^16-1>;

      struct {
          ServerName server_name_list<1..2^16-1>
      } ServerNameList;

A step-by-step explanation of this data structure would be helpful. Here is also some sample code showing how to read the extension data:

private static List<SNIServerName> exploreSNIExt(ByteBuffer input,
        int extLen) throws IOException {

    Map<Integer, SNIServerName> sniMap = new LinkedHashMap<>();

    int remains = extLen;
    if (extLen >= 2) {     // "server_name" extension in ClientHello
        int listLen = getInt16(input);     // length of server_name_list
        if (listLen == 0 || listLen + 2 != extLen) {
            throw new SSLProtocolException(
                    "Invalid server name indication extension");
        }

        remains -= 2;     // 0x02: the length field of server_name_list
        while (remains > 0) {
            int code = getInt8(input);      // name_type
            int snLen = getInt16(input);    // length field of server name
            if (snLen > remains) {
                throw new SSLProtocolException(
                        "Not enough data to fill declared vector size");
            }
            byte[] encoded = new byte[snLen];
            input.get(encoded);

            SNIServerName serverName;
            switch (code) {
                case StandardConstants.SNI_HOST_NAME: // 0x00
                    if (encoded.length == 0) {
                        throw new SSLProtocolException(
                                "Empty HostName in server name indication");
                    }
                    serverName = new SNIHostName(encoded);
                    break;
                default:
                    serverName = new UnknownServerName(code, encoded);
            }
            // check for duplicated server name type
            if (sniMap.put(serverName.getType(), serverName) != null) {
                throw new SSLProtocolException(
                        "Duplicated server name of type "
                        + serverName.getType());
            }

            remains -= encoded.length + 3;  // NameType: 1 byte
            // HostName length: 2 bytes
        }
    } else if (extLen == 0) {     // "server_name" extension in ServerHello
        throw new SSLProtocolException(
                "Not server name indication extension in client");
    }

    if (remains != 0) {
        throw new SSLProtocolException(
                "Invalid server name indication extension");
    }

    return Collections.<SNIServerName>unmodifiableList(
            new ArrayList<>(sniMap.values()));
}

Byte reader:

private static int getInt16(ByteBuffer input) {
    return ((input.get() & 0xFF) << 8) | (input.get() & 0xFF);
}

This is a good example of how the data should be read. For instance, the extension type is obtained by reading 2 bytes, so another question is: which RFC defines that?

Solution

If you already have source code implementing it, what more do you need to know?

The format used for the abstract schema is derived from XDR, but it is defined specifically in each TLS specification. For the latest one, RFC 8446 (TLS 1.3), see Section 3, "Presentation Language".

So if we go piece by piece:

  struct {
      NameType name_type;
      select (name_type) {
          case host_name: HostName;
      } name;
  } ServerName;

See https://www.rfc-editor.org/rfc/rfc8446#section-3.6. This defines a structure:

called "ServerName"
whose first component is called name_type and is of a type called NameType (defined later)
whose second and last component is called name and is a variant (https://www.rfc-editor.org/rfc/rfc8446#section-3.8): its type depends on the content of the preceding name_type. If name_type has the value host_name, then this second component is of type HostName (defined later)

enum {
      host_name(0), (255)
} NameType;

See https://www.rfc-editor.org/rfc/rfc8446#section-3.5. This defines an enumeration with only one defined value (0), whose name is host_name.

The (255) is used solely to force the width: the values 0 to 255 fit in one byte, so this type occupies one byte on the wire. As explained in the specification:

One may optionally specify a value without its associated tag to force the width definition without defining a superfluous element.

So on the wire you send the single byte 0, and host_name is simply the name that other parts of the specification use to refer to that value.
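As a rough illustration of the one-byte width, here is a minimal Java sketch. The NameTypeSketch enum and its read method are hypothetical names, not part of any library, and the sketch only assumes that the next byte in the buffer is a NameType.

```java
import java.nio.ByteBuffer;

/** Hypothetical mirror of the NameType enum: exactly one byte on the wire. */
enum NameTypeSketch {
    HOST_NAME(0);

    final int wireValue;

    NameTypeSketch(int wireValue) {
        this.wireValue = wireValue;
    }

    /** Consumes one byte; returns null for values that RFC 6066 does not define. */
    static NameTypeSketch read(ByteBuffer in) {
        int value = in.get() & 0xFF; // unsigned 0..255, the range forced by "(255)"
        for (NameTypeSketch t : values()) {
            if (t.wireValue == value) {
                return t;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        ByteBuffer in = ByteBuffer.wrap(new byte[] {0x00});
        System.out.println(read(in)); // prints HOST_NAME
    }
}
```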

  opaque HostName<1..2^16-1>;

In https://www.rfc-editor.org/rfc/rfc8446#section-3.2 we have:

Single-byte entities containing uninterpreted data are of type opaque.

And in https://www.rfc-editor.org/rfc/rfc8446#section-3.4, <> is used to define a variable length vector (or single dimensional array, or list).

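So HostName is a vector of between 1 and 2¹⁶-1 bytes (the bounds count bytes, not elements), each element being of type "opaque", that is, a single byte. On the wire the vector is preceded by its actual length, encoded in as many bytes as are needed to hold the maximum length, so 2 bytes here.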

Note that there is further explanation as text in the RFC about SNI:

"HostName" contains the fully qualified DNS hostname of the server, as understood by the client. The hostname is represented as a byte string using ASCII encoding without a trailing dot.
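A minimal Java sketch of reading such a vector follows. It relies on the rule from Section 3.4 that a variable-length vector is preceded by its byte length, which for a maximum of 2¹⁶-1 takes 2 bytes (big-endian). HostNameSketch and readHostName are hypothetical names chosen for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class HostNameSketch {
    /** Reads opaque HostName<1..2^16-1>: a 2-byte big-endian length, then that many bytes. */
    static String readHostName(ByteBuffer in) {
        int length = ((in.get() & 0xFF) << 8) | (in.get() & 0xFF);
        if (length < 1 || length > in.remaining()) {
            throw new IllegalArgumentException("bad HostName length: " + length);
        }
        byte[] raw = new byte[length];
        in.get(raw);
        // RFC 6066 says the hostname is a byte string using ASCII encoding.
        return new String(raw, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        // 0x00 0x03 is the length (3), followed by the ASCII bytes of "a.b".
        ByteBuffer in = ByteBuffer.wrap(new byte[] {0x00, 0x03, 'a', '.', 'b'});
        System.out.println(readHostName(in)); // prints a.b
    }
}
```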

  struct {
      ServerName server_name_list<1..2^16-1>
  } ServerNameList;

Same case as the first one, but using a variable-length array as above:

a ServerNameList is a structure
whose only element is a variable-length array occupying between 1 and 2¹⁶-1 bytes
each element of this array being of type ServerName, as defined previously

Said differently:

a ServerNameList structure is a list of elements, each one being of type ServerName
this list cannot be empty, since its length is at least 1 byte (and at most 2¹⁶-1 bytes)
a ServerName encodes a name_type (which can only be "host_name", i.e. value 0) and a name of type HostName, which is a non-empty list of up to 2¹⁶-1 bytes encoding a hostname as explained in https://www.rfc-editor.org/rfc/rfc6066#section-3
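
To make the layout concrete, the following sketch encodes a ServerNameList containing a single host_name entry and prints the bytes as hex. ServerNameListSketch, encode and toHex are hypothetical names; the sketch assumes an ASCII hostname short enough for every length to fit in 2 bytes, and it does no validation.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class ServerNameListSketch {
    /** Encodes a ServerNameList holding a single host_name entry. */
    static byte[] encode(String hostName) {
        byte[] name = hostName.getBytes(StandardCharsets.US_ASCII);
        int entryLength = 1 + 2 + name.length; // name_type + HostName length + HostName bytes
        ByteBuffer out = ByteBuffer.allocate(2 + entryLength); // big-endian by default
        out.putShort((short) entryLength);     // length of server_name_list
        out.put((byte) 0);                     // name_type = host_name(0)
        out.putShort((short) name.length);     // length of HostName
        out.put(name);                         // HostName bytes
        return out.array();
    }

    /** Renders bytes as lowercase hex, two digits per byte. */
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(toHex(encode("example.com")));
        // prints 000e00000b6578616d706c652e636f6d
    }
}
```

For "example.com" the output breaks down as 000e (list length, 14), 00 (name_type host_name), 000b (HostName length, 11), then the 11 ASCII bytes of the name. These 16 bytes are the extension data that exploreSNIExt in the question would be given, with extLen equal to 16.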