I am working with a STEP file that I want to parse, extracting information and storing it in arrays so that I can perform mathematical operations on it later in the program.
Below is an example of the data I am working with (ADVANCED_FACE references FACE_OUTER_BOUND, which appears later in the data file):
#12 = ADVANCED_FACE ( 'NONE', ( #194 ), #326, .F. ) ;
...
#194 = FACE_OUTER_BOUND ( 'NONE', #159, .T. ) ;
Here's what I have come up with so far:
import re

with open('TestSlot.STEP', 'r') as step_file:
    data = step_file.readlines()

NF = 0
faces = []
for line in data:
    line = line.strip()
    if re.search("ADVANCED_FACE", line):
        NF = NF + 1
        advface = re.compile(r'#\d+')
        advfaceresult = advface.match(line)
        faces.append(advfaceresult.group())

print("Face IDs =", faces)
print("Number of faces, NF =", NF)
This gives the output:
Face IDs = ['#12', '#73', '#99', '#131', '#181', '#214', '#244',
'#273', '#330', '#358']
Number of faces, NF = 10
How would I go about stripping the regex match so only the number is appended to the list?
You can use a capturing group in the regex and convert the captured string '12' to the integer 12 before appending it to the faces list:
advface = re.compile(r'#(\d+)')  # the parentheses capture only the digits
advfaceresult = advface.match(line)
faces.append(int(advfaceresult.group(1)))
The result will be Face IDs = [12, ...]
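For reference, here is a minimal self-contained sketch of the capturing-group approach. It runs on the two sample lines from the question instead of TestSlot.STEP, so the list `sample_lines` and the name `face_id` are assumptions for illustration only. The pattern is compiled once outside the loop, and a `None` check guards against lines that do not start with an entity ID.

```python
import re

# Sample lines copied from the question; a real run would read them from the file.
sample_lines = [
    "#12 = ADVANCED_FACE ( 'NONE', ( #194 ), #326, .F. ) ;",
    "#194 = FACE_OUTER_BOUND ( 'NONE', #159, .T. ) ;",
]

# Compile once, outside the loop; group 1 holds only the digits after '#'.
face_id = re.compile(r'#(\d+)')

faces = []
for line in sample_lines:
    line = line.strip()
    if 'ADVANCED_FACE' in line:
        m = face_id.match(line)
        if m is not None:  # skip lines that do not start with an entity ID
            faces.append(int(m.group(1)))

print("Face IDs =", faces)
print("Number of faces, NF =", len(faces))
```

Counting with `len(faces)` avoids keeping a separate NF counter in sync with the list.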
Alternatively, you can read the whole file at once and collect every face ID with re.findall:
import re

ifile = r'TestSlot.STEP'
with open(ifile) as f:
    text = f.read()  # read the whole file as one string

# findall returns only the captured group, i.e. the digits after '#'
faces_txt = re.findall(r'#(\d+) = ADVANCED_FACE.*;', text)
faces = [int(face) for face in faces_txt]  # convert to int
print('Face IDs =', faces)
print('Number of faces, NF =', len(faces))
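Since each ADVANCED_FACE refers to other entities such as FACE_OUTER_BOUND, the same capturing-group idea can be extended to record those references as integers too. The following is only a rough sketch: it assumes every entity sits on a single line, as in the sample shown in the question (real STEP files may wrap records across lines), and the names `sample_text`, `entity`, `ref`, and `entities` are hypothetical.

```python
import re

# Sample text copied from the question; a real run would use the file contents.
sample_text = """#12 = ADVANCED_FACE ( 'NONE', ( #194 ), #326, .F. ) ;
#194 = FACE_OUTER_BOUND ( 'NONE', #159, .T. ) ;
"""

# Group 1: entity ID, group 2: entity type, group 3: everything inside the outer parentheses.
# Assumption: one entity per line, terminated by ';'.
entity = re.compile(r"^#(\d+)\s*=\s*([A-Z_0-9]+)\s*\((.*)\)\s*;", re.MULTILINE)
ref = re.compile(r"#(\d+)")

# Map each entity ID to its type and the list of IDs it references.
entities = {}
for m in entity.finditer(sample_text):
    ent_id = int(m.group(1))
    refs = [int(r) for r in ref.findall(m.group(3))]
    entities[ent_id] = (m.group(2), refs)

# Keep only the faces, with the IDs each one refers to.
face_refs = {i: refs for i, (kind, refs) in entities.items() if kind == "ADVANCED_FACE"}

print(face_refs)
print(entities[194])
```

With the references stored as integers, looking up the bound of a face becomes a plain dictionary access such as `entities[194]`.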