I have a string say like this:
ARAN22 SKY BYT and TRO_PAN
In the above string The first alphabet can be A or S or T or N and the two numbers after RAN can be any two digit. However the rest will be always same and last three characters will be always like _PAN.
So the few possibilities of the string are :
SRAN22 SK BYT and TRO_PAN
TRAN25 SK BYT and TRO_PAN
NRAN25 SK BYT and TRO_PAN
So I was trying to extract the string every time in python using regex as follows:
import re
pattern = "([ASTN])RAN" + "\w+\s+" +"_PAN"
pat_check = re.compile(pattern, flags=re.IGNORECASE)
sample_test_string = 'NRAN28 SK BYT and TRO_PAN'
re.match(pat_check, sample_test_string)
here string can be anything like the above examples I gave there.
But its not working as I am not getting the string name ( the sample test string) which I should. Not sure what I am doing wrong. Any help will be very much appreciated.
You are using \w+\s+
, which will match one or more word (0-9A-Za-z_
) characters, followed by one or more space characters. So it will match the two digits and space after RAN
but then nothing more. Since the next characters are not _PAN
, the match will fail. You need to use [\w\s]+
instead:
pattern = "([ASTN])RAN" + "[\w\s]+" +"_PAN"