I am trying to use regex to capture the text on the line that follows a specific label. My raw data looks like the below:
Total current charges (please see Current account details) $38,414.69
ID Number
1001166UNBEB
ACCOUNT SUMMARY
SVL0
BALANCE OVERDUE - PLEASE PAY IMMEDIATELY $42,814.80
I want to get the ID Number (1001166UNBEB in this sample).
My attempt is here:
ID_num = re.compile(r'[^ID Number[\r\n]+([^\r\n]+)]{12}')
The ID number is always 12 characters long and always appears on the line after "ID Number", which is why I am specifying the length in my expression and trying to capture what follows it.
But this is not working as desired.
Would anyone help me, please?
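Your regex is not working because of the [ ] wrapped around the whole pattern. Square brackets define a character set, so the pattern matches individual characters from that set instead of the literal text "ID Number". Remove the outer brackets and use ( ) around the part you want to capture.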
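Your pattern would then look like: r'ID Number[\r\n]+([^\r\n]{12})'
Note that {12} goes on the character set inside the group. Written as ([^\r\n]+){12}, it repeats the whole group 12 times and the group keeps only the last repetition. A leading ^ would also need re.MULTILINE, because "ID Number" is not at the start of the string.
But you can simplify your pattern to: ID Number[\s]+(\w+)
\r\n will be matched by \s, and digits and letters (plus the underscore) by \w.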
import re
s = """
Total current charges (please see Current account details) $38,414.69
ID Number
1001166UNBEB
ACCOUNT SUMMARY
SVL0
BALANCE OVERDUE - PLEASE PAY IMMEDIATELY $42,814.80
"""
print(re.findall(r"ID Number[\s]+(\w+)", s))
# ['1001166UNBEB']
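
If you also want to enforce the 12-character length mentioned in the question, something like the following might work. It is only a sketch: the helper name extract_id_number is made up for illustration, and it assumes the ID consists solely of letters and digits and sits on the line directly after "ID Number".

```python
import re

# ^ with re.MULTILINE anchors "ID Number" at the start of any line.
# (\w{12}) captures exactly 12 word characters, and \b rejects longer runs.
ID_PATTERN = re.compile(r"^ID Number[ \t]*[\r\n]+(\w{12})\b", re.MULTILINE)


def extract_id_number(text):
    """Return the 12-character ID following an 'ID Number' line, or None."""
    match = ID_PATTERN.search(text)
    return match.group(1) if match else None


sample = """
Total current charges (please see Current account details) $38,414.69
ID Number
1001166UNBEB
ACCOUNT SUMMARY
"""
print(extract_id_number(sample))
# 1001166UNBEB
```

Using re.search with a None check, instead of re.findall, makes the "no ID found" case explicit, which may be easier to handle when some documents lack the field.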