regex match after decimal number (python re)


Looking to cleave off trailing text, if any, from a string that has '@' immediately followed by a decimal number using re.split in Python. Examples:

myString1 = 'random text @2.25 possibly more text'

myString2 = 'random text @-1.50 possibly more text'

myString3 = 'random text @-.50'

Desired output:

1: 'random text @2.25'

2: 'random text @-1.50'

3: 'random text @-.50'

What I tried:

test = re.split('(?<=@[0-9]+.[0-9]+)', myString)[0]

(?<=...) is a look-behind, so the split should happen right after the match. Then @[0-9] matches the at sign and the first digit. Since the number could be more than one digit long before the decimal point, I add the '+', as in [0-9]+, and this fails with a 'look-behind' error.
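For reference, here is a minimal sketch that reproduces the failure described above. The variable names `my_string` and `message` are illustrative. In CPython's `re` module the pattern is rejected when it is compiled, because a look-behind must match a fixed number of characters and `[0-9]+` does not.

```python
import re

my_string = 'random text @2.25 possibly more text'

try:
    # The '+' quantifiers make the look-behind variable-width.
    re.split(r'(?<=@[0-9]+.[0-9]+)', my_string)
    message = None
except re.error as exc:
    # Raised at compile time, before any splitting happens.
    message = str(exc)

print(message)
```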

Is what I'm trying to do possible with re.split?


Solution

  • Rather than trying to split here, I would instead do an re.search for the appropriate pattern:

    import re

    inp = ["random text @2.25 possibly more text", "random text @-1.50 possibly more text", "random text @-.50"]
    output = [re.search(r'^.*?@-?\d*(?:\.\d+)?', x).group() for x in inp]
    print(output)  # ['random text @2.25', 'random text @-1.50', 'random text @-.50']
    

    Here is an explanation of the regex pattern:

    • ^ from the start of the string
    • .*? lazily match any characters, up to the first
    • @ literal at sign
    • -? optional minus sign
    • \d* zero or more digits
    • (?:\.\d+)? optional decimal component
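The breakdown above can also be written directly into the pattern with `re.VERBOSE`. The following is a sketch only. The names `AT_NUMBER` and `trim_after_number` are hypothetical, and returning the input unchanged when there is no '@' is an assumption, not something the question specifies. Without such a guard, `re.search(...)` returns `None` for a string with no '@', and calling `.group()` on it raises an `AttributeError`.

```python
import re

# Same pattern as above, spelled out piece by piece with re.VERBOSE.
AT_NUMBER = re.compile(r'''
    ^.*?          # from the start, lazily match everything up to the first
    @             # literal at sign
    -?            # optional minus sign
    \d*           # zero or more digits
    (?:\.\d+)?    # optional decimal component
''', re.VERBOSE)


def trim_after_number(text):
    """Hypothetical helper: keep the text up to and including the '@' number."""
    match = AT_NUMBER.search(text)
    # Assumption: strings with no '@' are returned unchanged.
    return match.group() if match else text


print(trim_after_number('random text @2.25 possibly more text'))  # random text @2.25
print(trim_after_number('no marker here'))  # no marker here
```

Because both the digits and the decimal component are optional, this pattern also matches a bare '@' with no number after it. If that matters for your data, the numeric part would need to be made mandatory.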