I'm trying to extract the host name (as a string) from a still-encrypted HTTPS request.
The protocol sends the host name unencrypted, but I can't find the correct way to extract it. Different domains have different lengths, and not all requests look alike.
google.com (first TCP message)
�� T}@����5�;�;��O��KG��Y� ����~fM�FRH�N��7s�6w��[���ک�>�,�0�̨̩̪�+�/��$�(k�#�'g�
�9� �3��=<5/�u
google.com
3th2http/1.11
*(
+-3&$
cϼB�Y�j¬��b*a$��n$���}�X�.u�
example.com (first TCP message)
�ę`�ۜ����z#�X��I�&���~�� ��Ao�)���쿂�7�-�������`�l>�,�0�̨̩̪�+�/��$�(k�#�'g�
�9� �3��=<5/�uexample.com
3th2http/1.11
*(
+-3&$ a�
���a桵.3�*L_��d�N�yK
*r��
Any ideas?
This is most likely the TLS protocol with the Server Name Indication (SNI) extension, so you need a parser that understands the initial TLS packet. That is not too hard if you only implement the handshake protocol, specifically the ClientHello message: skip over its length-prefixed fields (session ID, cipher suites, compression methods), then walk the extension list until you reach the server_name extension. See https://www.rfc-editor.org/rfc/rfc5246 and https://www.rfc-editor.org/rfc/rfc6066#section-3.
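
As a rough illustration of that parsing, here is a minimal Python sketch. The function name extract_sni is made up for this example, and the code assumes the whole ClientHello arrives in a single TLS record within the first TCP message, which is not guaranteed (a large ClientHello can be split across segments). It follows the field layout from RFC 5246 and RFC 6066 and returns None for anything it cannot parse.

```python
import struct
from typing import Optional


def extract_sni(data: bytes) -> Optional[str]:
    """Hypothetical helper: return the SNI host name from a raw ClientHello, or None."""
    try:
        # TLS record header: content type (1), version (2), length (2)
        if data[0] != 0x16:  # 0x16 = handshake record
            return None
        pos = 5
        # Handshake header: message type (1), length (3)
        if data[pos] != 0x01:  # 0x01 = ClientHello
            return None
        pos += 4
        pos += 2 + 32  # client_version (2) + random (32)
        pos += 1 + data[pos]  # session_id: 1-byte length + body
        (cipher_len,) = struct.unpack_from("!H", data, pos)
        pos += 2 + cipher_len  # cipher_suites: 2-byte length + body
        pos += 1 + data[pos]  # compression_methods: 1-byte length + body
        (ext_total,) = struct.unpack_from("!H", data, pos)
        pos += 2
        end = min(pos + ext_total, len(data))
        # Each extension: type (2), length (2), body
        while pos + 4 <= end:
            ext_type, ext_len = struct.unpack_from("!HH", data, pos)
            pos += 4
            if ext_type == 0x0000:  # server_name extension
                # Body: list length (2), name type (1), name length (2), name
                name_type = data[pos + 2]
                (name_len,) = struct.unpack_from("!H", data, pos + 3)
                if name_type == 0:  # 0 = host_name
                    name = data[pos + 5 : pos + 5 + name_len]
                    if len(name) != name_len:
                        return None  # truncated input
                    return name.decode("ascii")
            pos += ext_len
    except (IndexError, struct.error, UnicodeDecodeError):
        return None  # input too short or malformed
    return None
```

Call it with the raw bytes of the first message from the client, e.g. extract_sni(first_message). Note that the input must be the original bytes as read from the socket; a dump that has been decoded to text like the ones in the question has already lost information.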