Matching a String in Python using regex


I have a string like this:

  ARAN22 SKY BYT and TRO_PAN

In the above string, the first letter can be A, S, T or N, and the two characters after RAN can be any two digits. The rest is always the same, and the string always ends with _PAN.

So a few of the possible strings are:

  SRAN22 SK BYT and TRO_PAN
  TRAN25 SK BYT and TRO_PAN
  NRAN25 SK BYT and TRO_PAN

So I tried to extract the string in Python using a regex, as follows:

  import re

  pattern = "([ASTN])RAN" + "\w+\s+" + "_PAN"
  pat_check = re.compile(pattern, flags=re.IGNORECASE)

  sample_test_string = 'NRAN28 SK BYT and TRO_PAN'
  re.match(pat_check, sample_test_string)

Here the string can be any of the examples I gave above.

But it's not working: I am not getting back the matched string (the sample test string) as I expected. I am not sure what I am doing wrong. Any help would be very much appreciated.


Solution

  • You are using \w+\s+, which matches one or more word characters (0-9, A-Z, a-z, _) followed by one or more whitespace characters. So it matches the two digits and the space after RAN, but nothing more. Since the next characters are not _PAN, the match fails. Use [\w\s]+ instead, which matches any mix of word and whitespace characters:

    pattern = "([ASTN])RAN" + r"[\w\s]+" + "_PAN"
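
As a quick check, here is a minimal sketch that runs the original pattern and the suggested one against the sample strings from the question. The names broken, fixed and extract are made up for illustration and are not part of the original code. Note that re.match returns a match object rather than a string, so the matched text has to be read with .group(0).

```python
import re

# Original attempt: \w+\s+ consumes "28 " and then requires "_PAN" right away.
broken = re.compile(r"([ASTN])RAN\w+\s+_PAN", flags=re.IGNORECASE)

# Suggested fix: [\w\s]+ accepts any mix of word and whitespace characters.
fixed = re.compile(r"([ASTN])RAN[\w\s]+_PAN", flags=re.IGNORECASE)

# Sample strings taken from the question.
samples = [
    "SRAN22 SK BYT and TRO_PAN",
    "TRAN25 SK BYT and TRO_PAN",
    "NRAN28 SK BYT and TRO_PAN",
]


def extract(text):
    """Return the matched text, or None when the fixed pattern does not match."""
    m = fixed.match(text)
    return m.group(0) if m else None


for s in samples:
    # First column: whether the original pattern fails; second: the extracted string.
    print(broken.match(s) is None, extract(s))
```

If only the leading letter is needed, it is available as group(1), since the pattern wraps [ASTN] in parentheses.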