
Packet sniffer in python3


Browsing the web, I found a lot of information about packet sniffers. However, all of the code and libraries seem to target Python 2 only. I am trying to make a simple packet sniffer in Python 3 for testing purposes.

I grabbed the code from http://www.binarytides.com/python-packet-sniffer-code-linux/ and tried to convert it to Python 3. However, there is a problem with the way Python 2 and Python 3 handle the result of the struct.unpack function.

Here is a snippet of their code (slightly modified for python3) that grabs the ethernet header and prints out the MAC address.

import socket
import sys
from struct import unpack

def eth_addr (a) :
  a = str(a) # added because TypeError occurs with ord() without it
  b = "%.2x:%.2x:%.2x:%.2x:%.2x:%.2x" % (ord(a[0]) , ord(a[1]) , ord(a[2]), ord(a[3]), ord(a[4]) , ord(a[5]))
  return b
 
#create a AF_PACKET type raw socket (thats basically packet level)
#define ETH_P_ALL    0x0003          /* Every packet (be careful!!!) */
try:
    s = socket.socket( socket.AF_PACKET , socket.SOCK_RAW , socket.ntohs(0x0003))
except OSError as msg:
    # In Python 3 the exception is not iterable; use its errno/strerror attributes
    print('Socket could not be created. Error Code : ' + str(msg.errno) + ' Message ' + str(msg.strerror))
    sys.exit()
 
# receive a packet
while True:
    packet = s.recvfrom(65565)
     
    #packet string from tuple
    packet = packet[0]
     
    #parse ethernet header
    eth_length = 14
     
    eth_header = packet[:eth_length]
    eth = unpack('!6s6sH' , eth_header)
    eth_protocol = socket.ntohs(eth[2])
    print('Destination MAC : ' + eth_addr(packet[0:6]) + ' Source MAC : ' + eth_addr(packet[6:12]) + ' Protocol : ' + str(eth_protocol))

Inserting print statements shows that the unpacked header differs between Python 2 and Python 3: in Python 3 the data is still a bytes object. If I try to decode it, I get an error about invalid "utf-8" data.

How can I get the MAC address to format properly in python3?

Thanks


Solution

  • Remove the a = str(a) line and the ord() calls:

    def eth_addr (a) :
      b = "%.2x:%.2x:%.2x:%.2x:%.2x:%.2x" % (a[0] , a[1] , a[2], a[3], a[4] , a[5])
      return b
    

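    In Python 3, subscripting a bytes object yields an integer, so you do not need to call ord() on the result. Calling str() on a bytes object is also incorrect here: without an encoding argument it does not decode anything, it just returns the printable representation (something like "b'\x00\x1a...'"), so ord() ends up formatting the characters of that representation instead of the address bytes. Decoding the data with .decode() fails for a different reason: a MAC address is arbitrary binary data, not valid UTF-8 text.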
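
    As a rough illustration of the points above, here is a minimal sketch that can be run without a raw socket. The 14-byte sample_header is made up for the example (it is not a captured packet), and the join-based eth_addr is just one possible way to write the formatter, not the original author's version.

```python
import struct

def eth_addr(raw: bytes) -> str:
    """Format raw address bytes as a colon-separated hex string."""
    # Iterating over (or subscripting) bytes yields ints in Python 3,
    # so each octet can be formatted directly without ord().
    return ":".join("%02x" % octet for octet in raw)

# Hypothetical Ethernet header for illustration only:
# destination MAC, source MAC, EtherType 0x0800.
sample_header = bytes.fromhex("001a2b3c4d5e" "a1b2c3d4e5f6" "0800")

# '!' selects network byte order; '6s' fields come back as bytes objects.
dst, src, eth_type = struct.unpack("!6s6sH", sample_header)

print("Destination MAC :", eth_addr(dst))
print("Source MAC :", eth_addr(src))
print("EtherType : 0x%04x" % eth_type)

# str() on bytes returns the repr text; it does not decode the data.
print(str(dst[:2]))
```

    Note that because the '!' format already converts the 16-bit field from network byte order, eth_type holds 0x0800 here without a further socket.ntohs() call.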