Python Regular Expressions - Parse Out Port From tcpdump

I am attempting to parse out the information given when running "tcpdump -nNqt".

Example output looks like this:

IP 10.0.0.11.60446 > 10.0.0.232.22: tcp 0
IP 10.0.0.232.22 > 10.0.0.11.60446: tcp 176
IP 10.0.0.232.22 > 10.0.0.11.60446: tcp 80

So far I have been able to extract:

First IP / Second IP

(?<=IP\s)\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b
(?<=\s>\s)\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b

TCP or UDP / Size

(?<=:\s)(.{1,3})
(?<=tcp |udp )(\d+)

I have not been able to parse out the port numbers which are the final digits at the end of the IP. My nonworking attempt looks like this:

(?<=\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\.)\d{,6}

What is wrong with my expression? Is there another way of doing this that I am not seeing?

Solution

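Your lookbehind fails because Python's re module only accepts fixed-width lookbehinds, and \d{1,3} can match anywhere from one to three characters, so the pattern is rejected at compile time. That said, I'm not sure why you are taking the parts you need one at a time. You could just take them all in one go (I also collapsed your IP pattern a bit):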

IP (?P<IP1>(?:\d{1,3}\.){3}\d{1,3})\.(?P<Port1>\d+) > (?P<IP2>(?:\d{1,3}\.){3}\d{1,3})\.(?P<Port2>\d+): (?:tc|ud)p (?P<size>\d+)

import re

reg = re.compile(r"IP (?P<IP1>(?:\d{1,3}\.){3}\d{1,3})\.(?P<Port1>\d+) > (?P<IP2>(?:\d{1,3}\.){3}\d{1,3})\.(?P<Port2>\d+): (?:tc|ud)p (?P<size>\d+)")

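# Sample lines from the question; in practice, read tcpdump's output instead.
input_lines = [
    "IP 10.0.0.11.60446 > 10.0.0.232.22: tcp 0",
    "IP 10.0.0.232.22 > 10.0.0.11.60446: tcp 176",
]

for line in input_lines:
    m = reg.match(line)
    if m is None:
        continue  # skip lines that do not have the expected shape
    print(m.group("IP1"))
    print(m.group("Port1"))
    print(m.group("IP2"))
    print(m.group("Port2"))
    print(m.group("size"))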
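
If you do want the ports on their own, here is a minimal sketch of why the lookbehind attempt from the question does not compile, along with one possible workaround. Python's re module requires a lookbehind to have a fixed width, so the sketch matches the IP normally and captures only the port. The port_re pattern and the variable names are my own illustration, not part of the code above.

```python
import re

# The lookbehind from the question: each \d{1,3} can match 1 to 3
# characters, so the lookbehind has no fixed width and re rejects it.
bad_pattern = r"(?<=\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\.)\d{,6}"
try:
    re.compile(bad_pattern)
    lookbehind_error = None
except re.error as exc:
    lookbehind_error = str(exc)

# Workaround (illustrative): consume the four dotted octets without a
# lookbehind and capture only the digits that follow them.
port_re = re.compile(r"(?:\d{1,3}\.){4}(\d+)")

sample = "IP 10.0.0.11.60446 > 10.0.0.232.22: tcp 0"  # line from the question
ports = port_re.findall(sample)

print(lookbehind_error)
print(ports)
```

Because the pattern has a single capturing group, findall returns just the port strings, source port first and destination port second.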