I want to extract the lines between a specified start pattern (inclusive) and end pattern (exclusive).
My code below extracts some lines, but not the first line that matches the start pattern. In my desired output, I also want that first matching line.
import re
import xlsxwriter
linenum = 0
myline = []
pattern_start = re.compile(r"^vsi ipcbb")
pattern_stop = re.compile(r"^vsi ipcbb-ipran")
with open(r'readline.txt', 'rt') as myfile:
    for row in myfile:
        if pattern_start.search(row) != None:
            for line in myfile:
                linenum += 1
                if pattern_stop.search(line) != None:
                    break
                myline.append((linenum, line.rstrip('\n')))
with xlsxwriter.Workbook('readline.xlsx') as workbook:
    worksheet = workbook.add_worksheet('VSI')
    for row_num, data in enumerate(myline):
        worksheet.write_row(row_num + 0, 0, data)
Input file (readline.txt):
!Last configuration was updated at 2021-04-22 05:52:21 UTC by
!Last configuration was saved at 2021-04-22 19:00:49 UTC by
!PdtPrivateInfo = System current forwarding-mode: compatible
!MKHash 0000000000000000
vsi ipcbb-RAC_YBPNM01H-00 static
 description *** M-ipcbb-RAC_YBPNM01H(via RAG_MBSPM01H&RAG_YBPNM01H) ***
 tnl-policy TE
 diffserv-mode pipe af1 green
#
vsi ipcbb-ipran-RSG_NKY2M-00 static
 description *** IPCBB-IPRAN VLAN61 Inherit(RAG_NKY2M01H-RAG_NKY2M02H) ***
 tnl-policy TE
 diffserv-mode pipe af1 green
#
Current output:
 description *** M-ipcbb-RAC_YBPNM01H(via RAG_MBSPM01H&RAG_YBPNM01H) ***
 tnl-policy TE
 diffserv-mode pipe af1 green
#
Desired output:
vsi ipcbb-RAC_YBPNM01H-00 static
 description *** M-ipcbb-RAC_YBPNM01H(via RAG_MBSPM01H&RAG_YBPNM01H) ***
 tnl-policy TE
 diffserv-mode pipe af1 green
#
You can work with a boolean mode flag like extract_on, which signals whether you are currently between start and stop and should extract the line.
Also, the line matching can be done using the re.match function, which returns either a match object or None.
import re
pattern_start = re.compile(r"^vsi ipcbb")
pattern_stop = re.compile(r"^vsi ipcbb-ipran")
i = 0
extract_on = False
extracts = []
with open(r'readline.txt', 'rt') as myfile:
    for line in myfile:
        i += 1  # line counting starts with 1
        if pattern_start.match(line):
            extract_on = True
        if pattern_stop.search(line):
            extract_on = False
        if extract_on:
            extracts.append((i, line.rstrip('\n')))
for line in extracts:
    print(line)
Given your input, it ignores the first 4 lines, extracts the middle 5, and ignores the last 5. The printout of the extracted lines, including their position in the file, is:
(5, 'vsi ipcbb-RAC_YBPNM01H-00 static')
(6, ' description *** M-ipcbb-RAC_YBPNM01H(via RAG_MBSPM01H&RAG_YBPNM01H) ***')
(7, ' tnl-policy TE')
(8, ' diffserv-mode pipe af1 green')
(9, '#')
I left out the XLSX writing, which I assume works as expected.
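If you want to reuse the flag approach, it could be wrapped in a small function. This is only a sketch: the name extract_between and the shortened sample list are made up for illustration and are not part of your code. Note that the stop pattern is checked after the start pattern on purpose, because a line such as "vsi ipcbb-ipran-..." matches both patterns and has to switch the flag off again.

```python
import re


def extract_between(lines, start, stop):
    """Return (line_number, text) pairs from start (inclusive) to stop (exclusive).

    Hypothetical helper: `lines` is any iterable of strings, `start` and
    `stop` are compiled regular expressions.
    """
    extract_on = False
    extracts = []
    for i, line in enumerate(lines, start=1):  # line counting starts with 1
        if start.match(line):
            extract_on = True
        # Checked second: a stop line that also matches the start pattern
        # must end up switching extraction off.
        if stop.match(line):
            extract_on = False
        if extract_on:
            extracts.append((i, line.rstrip('\n')))
    return extracts


# Shortened stand-in for the contents of readline.txt
sample = [
    "!MKHash 0000000000000000",
    "vsi ipcbb-RAC_YBPNM01H-00 static",
    " tnl-policy TE",
    "#",
    "vsi ipcbb-ipran-RSG_NKY2M-00 static",
    " tnl-policy TE",
    "#",
]

result = extract_between(
    sample,
    re.compile(r"^vsi ipcbb"),
    re.compile(r"^vsi ipcbb-ipran"),
)
for item in result:
    print(item)
```

With a real file you would pass the open file object instead of the list, since iterating over it also yields one line at a time.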