python, regex, port, tcpdump

Python Regular Expressions - Parse Out Port From tcpdump


I am attempting to parse out the information given when running "tcpdump -nNqt".

Example output looks like this:

IP 10.0.0.11.60446 > 10.0.0.232.22: tcp 0
IP 10.0.0.232.22 > 10.0.0.11.60446: tcp 176
IP 10.0.0.232.22 > 10.0.0.11.60446: tcp 80

So far I have been able to extract:

First IP / Second IP

(?<=IP\s)\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b
(?<=\s>\s)\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b

TCP or UDP / Size

(?<=:\s)(.{1,3})
(?<=tcp |udp )(\d+)
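
For reference, here is a minimal sketch of how these four patterns behave with `re.search` on the first sample line. The dictionary keys and the helper name `extract_fields` are illustrative names only, not part of any existing script.

```python
import re

# First sample line from the tcpdump output above.
line = "IP 10.0.0.11.60446 > 10.0.0.232.22: tcp 0"

# The four working patterns, keyed by illustrative field names.
patterns = {
    "first_ip": r"(?<=IP\s)\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b",
    "second_ip": r"(?<=\s>\s)\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b",
    "protocol": r"(?<=:\s)(.{1,3})",
    "size": r"(?<=tcp |udp )(\d+)",
}


def extract_fields(text):
    """Run each pattern separately and collect the first match of each."""
    found = {}
    for name, pattern in patterns.items():
        m = re.search(pattern, text)
        found[name] = m.group(0) if m else None
    return found


print(extract_fields(line))
```

Each lookbehind here has a fixed width (`IP\s` is three characters, `tcp ` and `udp ` are four each), which is why these compile without complaint.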

I have not been able to parse out the port numbers, which are the final digits after each IP address. My non-working attempt looks like this:

(?<=\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\.)\d{,6}

What is wrong with my expression? Is there another way of doing this that I am not seeing?


Solution

  • I'm not sure why you are taking the parts you need one at a time. You could just take them all in one go (I also collapsed your IP pattern a bit):

    IP (?P<IP1>(?:\d{1,3}\.){3}\d{1,3})\.(?P<Port1>\d+) > (?P<IP2>(?:\d{1,3}\.){3}\d{1,3})\.(?P<Port2>\d+): (?:tc|ud)p (?P<size>\d+)
    


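    import re
    
    # One pattern captures both IPs, both ports, and the size in a single pass.
    reg = re.compile(r"IP (?P<IP1>(?:\d{1,3}\.){3}\d{1,3})\.(?P<Port1>\d+) > (?P<IP2>(?:\d{1,3}\.){3}\d{1,3})\.(?P<Port2>\d+): (?:tc|ud)p (?P<size>\d+)")
    
    # Sample lines from the question; replace with your real tcpdump output.
    input_lines = [
        "IP 10.0.0.11.60446 > 10.0.0.232.22: tcp 0",
        "IP 10.0.0.232.22 > 10.0.0.11.60446: tcp 176",
    ]
    
    for line in input_lines:
        m = reg.match(line)
        if m is None:
            continue  # skip lines that are not in the expected format
        print(m.group("IP1"))
        print(m.group("Port1"))
        print(m.group("IP2"))
        print(m.group("Port2"))
        print(m.group("size"))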
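
As for what is wrong with the original expression: Python's built-in `re` module only accepts fixed-width lookbehinds. The lookbehind in the attempt contains `\d{1,3}`, which can match one to three characters, so the pattern is rejected at compile time with a "look-behind requires fixed-width pattern" error. (The third-party `regex` module does allow variable-width lookbehinds, if installing it is an option.) The sketch below shows the failure and one possible workaround that stays with `re`: match the IP normally and put only the port in a capturing group. The names `broken`, `port_re`, and `ports` are illustrative.

```python
import re

# The attempt from the question: the lookbehind has variable width.
broken = r"(?<=\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\.)\d{,6}"

try:
    re.compile(broken)
    compile_error = None
except re.error as exc:
    # The built-in re module refuses variable-width lookbehinds.
    compile_error = str(exc)

print(compile_error)

# Workaround: consume the four dotted octets as ordinary (non-captured)
# text and capture only the digits that follow them.
port_re = re.compile(r"\b(?:\d{1,3}\.){4}(\d+)\b")

line = "IP 10.0.0.11.60446 > 10.0.0.232.22: tcp 0"
ports = port_re.findall(line)  # source port first, then destination port
print(ports)
```

With a single capturing group, `findall` returns just the captured ports, so the lookbehind is not needed at all.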