I am having trouble writing a correct regex. Can someone help me?
I have output from two network devices:
1
VRF NAME1 (VRF Id = 2); default RD 9200:1; default VPNID <not set>
Old CLI format, supports IPv4 only
Flags: 0xC
Interfaces:
Gi1/1/1 Gi1/1/4
2
VRF NAME2 (VRF Id = 2); default RD 101:2; default VPNID <not set>
Interfaces:
Gi0/0/3 Gi0/0/4 Gi0/1/4
I need to extract the interface names from both.
I have regex:
rx = re.compile("""
VRF\s(.+?)\s\(.*RD\s(.*);.*[\n\r]
^.*$[\n\r]
^.*$[\n\r]
^.*$[\n\r]
(^.*)
""",re.MULTILINE|re.VERBOSE)
But it only works for the first output: it skips 4 lines, and the 5th line is exactly what I need. However, many routers return output like 2. The question is how to ignore an unknown number of lines and, for example, find the line containing the word "Interfaces:" and extract the line right after it.
EDIT: the answer has been corrected after more input was provided.
There are many ways to solve this. Look at regex101. The regex
(?s)VRF\s([^\s]+)\s.*?(?:RD\s([\d.]+:\d|<not\sset>));.*?Interfaces:(?:\r*\n)\s*(.*?)(?:\r*\n)
reads in a complete record and captures the name, the RD value, and the line following "Interfaces:".
Explanation:
(?s) # single line mode: make "." read anything,
# including line breaks
VRF # every record starts with VRF
\s # read " "
([^\s]+) # group 1: capture NAME VRF
\s # read " "
.*? # lazy read anything
(?: # start non-capture group
RD\s # read "RD "
( # group 2
[\d.]+:\d # number or ip, followed by ":" and a digit
| # OR
<not\sset> # value "<not set>"
) # group 2 end
) # non-capture group end
; # read ";"
.*? # lazy read anything
Interfaces: # read "Interfaces:"
(?:\r*\n) # read newline
\s* # read spaces
(.*?) # group 3: read line after "Interfaces:"
(?:\r*\n) # read newline
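The commented breakdown above can also be kept inside the pattern itself. The following is a minimal sketch, assuming Python's `re.VERBOSE` and `re.DOTALL` flags in place of the inline `(?s)`; the names `RECORD` and `SAMPLE` are made up for illustration, and the sample text is the first output from the question.

```python
import re

# Same pattern as above, compiled with re.VERBOSE so the comments can live
# inside the pattern; re.DOTALL plays the role of the inline (?s) flag.
RECORD = re.compile(r"""
    VRF\s([^\s]+)\s                      # group 1: VRF name
    .*?RD\s([\d.]+:\d|<not\sset>);       # group 2: RD value or "<not set>"
    .*?Interfaces:\r*\n                  # skip ahead to the "Interfaces:" line
    \s*(.*?)\r*\n                        # group 3: the line after "Interfaces:"
""", re.VERBOSE | re.DOTALL)

# Hypothetical sample: the first device output from the question.
SAMPLE = (
    "VRF NAME1 (VRF Id = 2); default RD 9200:1; default VPNID <not set>\n"
    "Old CLI format, supports IPv4 only\n"
    "Flags: 0xC\n"
    "Interfaces:\n"
    "Gi1/1/1 Gi1/1/4\n"
)

m = RECORD.search(SAMPLE)
if m:
    print(m.groups())
```

In verbose mode literal spaces in the pattern are ignored, which is why the pattern spells them as `\s`.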
Let's look at a test script. I've shortened the records in the script a bit, but the point still stands.
$ cat test.py
import os
import re
pattern = r"(?s)VRF\s([^\s]+)\s.*?(?:RD\s([\d.]+:\d|<not\sset>));.*?Interfaces:(?:\r*\n)\s*(.*?)(?:\r*\n)"
text = '''\
VRF BLA1 (VRF Id = 2); default RD 9200:1; default VPNID <not set>
Old CLI format, supports IPv4 only
Flags: 0xC
Interfaces:
Gi1/1/1.451 Gi1/1/4.2019
Address family ipv4 unicast (Table ID = 0x2):
VRF label allocation mode: per-prefix
Address family ipv6 unicast not active
Address family ipv4 multicast not active
VRF BLA2 (VRF Id = 1); default RD <not set>; default VPNID <not set>
New CLI format, supports multiple address-families
Flags: 0x1808
Interfaces:
Gi0
Address family ipv4 unicast (Table ID = 0x1):
Flags: 0x0
Address family ipv6 unicast (Table ID = 0x1E000001):
Flags: 0x0
Address family ipv4 multicast not active\
'''
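# Split before each record header ("VRF <name> (") rather than on os.linesep,
# so the script does not depend on blank lines or on the platform's newline.
for rec in re.split(r"\r?\n(?=VRF\s\S+\s\()", text):
    m = re.match(pattern, rec)
    if m:
        print("%s\tRD: %s\tInterfaces: %s" % (m.group(1), m.group(2), m.group(3)))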
which results in:
$ python test.py
BLA1 RD: 9200:1 Interfaces: Gi1/1/1.451 Gi1/1/4.2019
BLA2 RD: <not set> Interfaces: Gi0
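As one of the "many ways" mentioned earlier, the original question (skip an unknown number of lines, then take the line after "Interfaces:") can also be handled line by line instead of with a single large regex. This is only a sketch of that alternative, not the method used above: `HEADER`, `parse_vrfs` and `SAMPLE` are hypothetical names, and it assumes the interface names always sit on the single line directly after "Interfaces:".

```python
import re

# Matches a record header such as
# "VRF NAME1 (VRF Id = 2); default RD 9200:1; default VPNID <not set>"
HEADER = re.compile(r"^VRF\s+(\S+)\s+\(.*?RD\s+([^;]+);")

def parse_vrfs(text):
    """Return one dict per VRF record with its name, RD and interface list."""
    records = []
    lines = text.splitlines()
    for i, line in enumerate(lines):
        m = HEADER.match(line)
        if m:
            # A new record starts here.
            records.append({"name": m.group(1), "rd": m.group(2), "interfaces": []})
        elif line.strip() == "Interfaces:" and records and i + 1 < len(lines):
            # Assumption: the next line holds the space-separated interface names.
            records[-1]["interfaces"] = lines[i + 1].split()
    return records

# Hypothetical sample: both device outputs from the question, back to back.
SAMPLE = """\
VRF NAME1 (VRF Id = 2); default RD 9200:1; default VPNID <not set>
Old CLI format, supports IPv4 only
Flags: 0xC
Interfaces:
Gi1/1/1 Gi1/1/4
VRF NAME2 (VRF Id = 2); default RD 101:2; default VPNID <not set>
Interfaces:
Gi0/0/3 Gi0/0/4 Gi0/1/4
"""

for rec in parse_vrfs(SAMPLE):
    print(rec["name"], rec["rd"], rec["interfaces"])
```

Because each record is detected by its header line, the number of lines between the header and "Interfaces:" does not matter. A VRF with no interfaces would need an extra check, since the line after "Interfaces:" would then belong to the next section.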