Tags: python, regex, regex-lookarounds

Regex to match only the second ip address in a range


I'm trying to match only the second valid IP address in a string that contains a range of IP addresses. Sometimes the range is written without a space between the addresses, and sometimes it has one or more spaces around the hyphen. Also, sometimes the second IP isn't valid, so it shouldn't match.

import re

test = '''
1.0.0.0-1.0.0.240
2.0.0.0 - 1.0.0.241
3.0.0.0 -1.0.0.242
4.0.0.0- 1.0.0.243
5.0.0.0  -  1.0.0.244
6.0.0.0 -  1.0.0.245
7.0.0.0 -  1.0.0.2456 #NOT VALID SO DONT MATCH
'''

pattern = r"(?<=-\s)\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}"
r = re.compile(pattern, re.DOTALL)
print(r.findall(test))

My attempt only matches 1.0.0.241 and 1.0.0.243.


Solution

  • Change regex pattern to the following:

    pattern = r"(?<=[-\s])((?:\d{1,3}\.){3}\d{1,3})$"
    r = re.compile(pattern, re.M)
    print(r.findall(test))
    

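    • (?<=[-\s]) - lookbehind assertion that requires a single - or whitespace character immediately before the IP address (which is enough in your case). Your original (?<=-\s) required a - followed by exactly one whitespace character, which is why it only matched the lines written as "- " directly before the address.
    • (?:\d{1,3}\.){3} - matches the first 3 octets of the IP address, each followed by .
    • \d{1,3} - matches the last octet
    • $ - matches at the end of each line in multi-line text (enabled by re.M), so 1.0.0.2456 and any address followed by more text on the same line are rejected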

    Output:

    ['1.0.0.240', '1.0.0.241', '1.0.0.242', '1.0.0.243', '1.0.0.244', '1.0.0.245']
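
Note that \d{1,3} only checks the number of digits, so a line ending in something like 1.0.0.999 would still match. If that matters for your data, one option is to pass the regex matches through the standard-library ipaddress module. This is a minimal sketch that reuses the pattern above. The extra 8.0.0.0 line and the is_valid_ipv4 helper are hypothetical additions for illustration, not part of the original question:

```python
import ipaddress
import re

test = '''
1.0.0.0-1.0.0.240
2.0.0.0 - 1.0.0.241
3.0.0.0 -1.0.0.242
4.0.0.0- 1.0.0.243
5.0.0.0  -  1.0.0.244
6.0.0.0 -  1.0.0.245
7.0.0.0 -  1.0.0.2456 #NOT VALID SO DONT MATCH
8.0.0.0 - 1.0.0.999
'''
# The last line is a made-up example: three digits per octet, but 999 > 255.

# Same pattern as in the answer above.
pattern = re.compile(r"(?<=[-\s])((?:\d{1,3}\.){3}\d{1,3})$", re.M)


def is_valid_ipv4(text):
    """Return True if text parses as an IPv4 address (hypothetical helper)."""
    try:
        ipaddress.IPv4Address(text)
    except ValueError:
        return False
    return True


candidates = pattern.findall(test)  # the regex alone still accepts 1.0.0.999
valid = [ip for ip in candidates if is_valid_ipv4(ip)]
print(valid)
```

The regex does the positional work (second address, end of line) and ipaddress does the range check, which keeps the pattern short instead of encoding 0-255 in the regex itself.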