regex sed regex-group

Regex is not working properly with sed but works in regex101


I am trying to strip everything up to and including the first space character on each line. For example, 50G This is a Test (0000) 1234p (String).ext should become This is a Test (0000) 1234p (String).ext

So I am using this simple regex - ^.+?\s(.*) - where I capture everything after the first space character in a group and then substitute the whole match with that first group.

The problem is that it works well in regex101 - https://regex101.com/r/1dAUcO/1 - but when I try the same regex in a terminal with sed, it returns a different output. Here's the sed command - echo "50G This is a Test (0000) 1234p (String).ext" | sed -E 's|^.+?\s(.*)|\1|g'


Solution

  • You are using this sed:

    sed -E 's|^.+?\s(.*)|\1|g'
    

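    Your intention with .+? is a lazy match, but sed (even in ERE mode) doesn't support lazy quantifiers. In practice (for example with GNU sed) the trailing ? just makes the greedy .+ optional, so the match typically runs up to the last space on the line instead of the first, and only (String).ext is left. Note also that \s is a GNU extension rather than part of POSIX ERE.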

    If Perl is an option, the same expression works as is, because Perl supports lazy quantifiers:

    echo "50G This is a Test (0000) 1234p (String).ext" |
    perl -pe 's|^.+?\s(.*)|\1|g'
    
    This is a Test (0000) 1234p (String).ext
    

    However, I would strongly recommend cut for this: you don't need a regex at all, and splitting a line on a delimiter is exactly what cut is made for:

    echo "50G This is a Test (0000) 1234p (String).ext" |
    cut -d " " -f2-
    
    This is a Test (0000) 1234p (String).ext
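
If you would rather stay with sed, a common workaround for the missing lazy quantifier is a negated character class. The sketch below is not from the answer above, and the function name strip_first_field is only a hypothetical wrapper for illustration. It assumes the separator is a literal space, not a tab or other whitespace.

```shell
# Hypothetical helper: remove everything up to and including the first space.
# [^ ]+ cannot cross a space, so the match must stop at the first one,
# which gives the effect the lazy .+? was meant to have.
strip_first_field() {
  sed -E 's/^[^ ]+ //'
}

echo "50G This is a Test (0000) 1234p (String).ext" | strip_first_field
```

This should print This is a Test (0000) 1234p (String).ext. Lines without any space pass through unchanged, because the pattern does not match them.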