
Regex to extract a substring from a string in Python


How can I extract the following substring from a string using re in Python?

string1 = "fgdshdfgsLooking: 3j #123"
substring = "Looking: 3j #123"

string2 = "Looking: avb456j #13fgfddg"
substring = "Looking: avb456j #13"

I tried:

re.search(r'Looking: (.*#\d+)$', string1)

Solution

  • Your regex is mostly correct. You just need to remove the end-of-line anchor $: in cases like string2, extra text follows the part you want, so the match does not end at the end of the string and the anchored pattern finds nothing.

    import re
    
    string1 = 'fgdshdfgsLooking: 3j #123'
    string2 = 'Looking: avb456j #13fgfddg'
    
    pattern = r'Looking: (.*?#\d+)'
    
    match1 = re.search(pattern, string1)
    match2 = re.search(pattern, string2)
    
    print('String1:', string1, '|| Substring1:', match1.group(0))
    print('String2:', string2, '|| Substring2:', match2.group(0))
    
    

    Output:

    String1: fgdshdfgsLooking: 3j #123 || Substring1: Looking: 3j #123
    String2: Looking: avb456j #13fgfddg || Substring2: Looking: avb456j #13
    

    This should work. I've also made the part before # lazy by adding ? after .*, so it matches as few characters as possible and expands only as needed. That avoids matching everything up to a second #, in case a second # followed by a few digits appears somewhere further along in the string.
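
To make the difference between greedy and lazy matching concrete, here is a minimal sketch. The input string `text` is made up for illustration (it is not from the question) and contains a second `#` followed by digits plus trailing text.

```python
import re

# Hypothetical input: a second '#<digits>' appears later, followed by more text.
text = "xxLooking: 3j #123 and later #456 trailing"

# Greedy .* grabs as much as possible, then backtracks to the LAST '#<digits>'.
greedy = re.search(r'Looking: .*#\d+', text)

# Lazy .*? grabs as little as possible, so it stops at the FIRST '#<digits>'.
lazy = re.search(r'Looking: .*?#\d+', text)

# The pattern from the question, with the $ anchor, finds nothing here
# because text continues after the digits.
anchored = re.search(r'Looking: (.*#\d+)$', text)

print(greedy.group(0))  # Looking: 3j #123 and later #456
print(lazy.group(0))    # Looking: 3j #123
print(anchored)         # None
```

Note that `group(0)` returns the whole match, including the `Looking: ` prefix, whereas `group(1)` would return only the part inside the parentheses. Since `re.search` returns `None` when nothing matches, it is worth checking the result before calling `.group()` on inputs that might not contain the pattern.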
