python http network-programming tcp pcap

Printing HTTP header on terminal


I am currently writing a packet sniffer using Python and pcap, following this tutorial: https://www.binarytides.com/code-a-packet-sniffer-in-python-with-pcapy-extension/

I am able to parse the IP and TCP headers to get values such as the source address, port number, etc. I only need HTTP requests/responses, so I filter the packets to keep only the ones with port number 80.

However, I am confused about how to print the actual values of the HTTP headers. Where and how am I supposed to get the image below to appear in my macOS terminal?

(image not available)

Thanks in advance.


Solution

  • If you have the HTTP response packet data in a variable http_response_data of Python type bytes, you can get the response headers in one line:

    headers_text = http_response_data.partition(b'\r\n\r\n')[0].decode('utf-8')
    print(headers_text)
    

    HTTP header fields are normally plain ASCII, so 'utf-8' usually works. If decoding fails, try 'iso-8859-1', which accepts any byte value, or another code page such as 'cp852'.

    This works because the HTTP headers are separated from the body by an empty line, i.e. the byte sequence \r\n\r\n, while individual header lines are separated by a single \r\n.

    Here is a small example that applies the solution above to an HTTP response received as bytes from TCP port 80 of Google's server, using the standard socket library.

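    import socket

    # Open a plain TCP connection to port 80.
    s = socket.socket()
    s.connect(('google.com', 80))
    # HTTP/1.1 requires a Host header; Connection: close asks the server to close after replying.
    s.sendall(b'GET / HTTP/1.1\r\nHost: google.com\r\nConnection: close\r\n\r\n')
    s.shutdown(socket.SHUT_WR)
    http_response_data = s.recv(8192)  # first chunk of the TCP response, stored as bytes
    s.close()
    # Everything before the first empty line is the status line plus the headers.
    headers_text = http_response_data.partition(b'\r\n\r\n')[0].decode('utf-8')
    print(headers_text)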
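
A single recv call returns only the bytes that have arrived so far, so a slow or large response may not contain the complete header block yet. Below is a minimal sketch of reading until the empty line shows up. The helper name read_header_bytes and the limit parameter are hypothetical and not part of the answer above. A local socket.socketpair() stands in for a real server.

```python
import socket

def read_header_bytes(sock, limit=65536):
    """Read from sock until the empty line that ends the headers, EOF, or limit bytes."""
    buf = b''
    while b'\r\n\r\n' not in buf and len(buf) < limit:
        chunk = sock.recv(4096)
        if not chunk:  # the peer closed the connection
            break
        buf += chunk
    # Keep only what precedes the first CRLF CRLF (status line + header lines).
    return buf.partition(b'\r\n\r\n')[0]

# Local demonstration: the "response" is sent in two pieces on a connected socket pair.
a, b = socket.socketpair()
a.sendall(b'HTTP/1.1 200 OK\r\nContent-Ty')
a.sendall(b'pe: text/html\r\n\r\n<html>')
a.close()
print(read_header_bytes(b).decode('iso-8859-1'))
b.close()
```

With a real connection, the same helper could be called on s in place of the single s.recv(8192).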
    

    PS:

    1. Your headers might not be UTF-8 encoded. In that case, replace .decode('utf-8') with another encoding or code page, for example .decode('iso-8859-1') or .decode('cp852').
    2. headers_text also contains the status line, such as HTTP/1.1 200 OK. If you only need the key: value lines, use this instead:
    headers_text = http_response_data.partition(b'\r\n\r\n')[0].partition(b'\r\n')[2].decode('utf-8')
    print(headers_text)
    
    3. According to your tutorial, the TCP data is available as bytes in the data variable at the end of the body of the if protocol == 6 : block. Use it as http_response_data in my solution.
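
In a sniffer, many TCP segments on port 80 carry no HTTP headers at all: bare ACKs have an empty payload, and later segments of a large body contain only body bytes. The following is a rough sketch of a filter-and-parse step for the data variable mentioned above. The helper name parse_http_head is hypothetical, and the check for "HTTP/" in the first line is a simple heuristic, not a full HTTP parser.

```python
def parse_http_head(payload, encoding='iso-8859-1'):
    """Return (start_line, headers) if payload begins an HTTP message, else None."""
    head, sep, _body = payload.partition(b'\r\n\r\n')
    if not sep:
        return None  # empty segment, or the headers are not complete in this packet
    start_line, _, header_block = head.partition(b'\r\n')
    # A response starts with "HTTP/"; a request ends with the version, e.g. "GET / HTTP/1.1".
    if not (start_line.startswith(b'HTTP/') or b' HTTP/' in start_line):
        return None  # probably a later segment of a body, not the start of a message
    headers = {}
    for line in header_block.split(b'\r\n'):
        name, colon, value = line.partition(b':')
        if colon:
            # Header names are case-insensitive; a repeated name overwrites the earlier value here.
            headers[name.decode(encoding).strip().lower()] = value.decode(encoding).strip()
    return start_line.decode(encoding), headers

# Example with a hard-coded payload standing in for the tutorial's data variable.
sample = b'HTTP/1.1 200 OK\r\nContent-Type: text/html\r\nContent-Length: 5\r\n\r\nhello'
parsed = parse_http_head(sample)
if parsed:
    start_line, headers = parsed
    print(start_line)
    for name, value in headers.items():
        print(f'{name}: {value}')
```

Decoding with 'iso-8859-1' is chosen here because it never raises on arbitrary bytes. Swap in 'utf-8' if you prefer strict decoding.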