
Replace regex with captured group ONLY


I'm trying to understand why the following does not return what I expect (or want :)):

sed -r 's/^(.*?)(Some text)?(.*)$/\2/' list_of_values

or Perl:

perl -lpe 's/^(.*?)(Some text)?(.*)$/$2/' list_of_values

I want the result to be just Some text; if nothing was captured in $2, it should be EMPTY.

I did notice that with perl it does work if Some text is at the start of the line/string (which baffles me...). (Also noticed that removing ^ and $ has no effect)

Basically, I'm trying to get what grep would return with the --only-matching option as discussed here. Only I want/need to use sub/replace in the regex.

EDITED (added sample data)

Sample input:

$ cat -n list_of_values
     1  Black
     2  Blue
     3  Brown
     4  Dial Color
     5  Fabric
     6  Leather and Some text after that ....
     7  Pearl Color
     8  Stainless Steel
     9  White
    10  White Mother-of-Pearl Some text stuff

Desired output:

$ perl -ple '$_ = /(Some text)/ ? $1 : ""' list_of_values | cat -n
     1
     2
     3
     4
     5
     6  Some text
     7
     8
     9
    10  Some text

Solution

  • First of all, this shows how to duplicate grep -o using Perl.


    You're asking why applying s/^(.*?)(Some text)?(.*)$/$2/ against

    foo Some text bar
    012345678901234567
    

    results in just an empty string instead of

    Some text
    

    Well,

    • At position 0, ^ matches 0 characters.
    • At position 0, (.*?) matches 0 characters.
    • At position 0, (Some text)? matches 0 characters: the text there is "foo", and the group is optional, so it is skipped and $2 stays undefined.
    • At position 0, (.*) matches 17 characters.
    • At position 17, $ matches 0 characters.
    • Match succeeds, so the engine never has to backtrack and let (.*?) grow until (Some text) could match.

    You could use

    s{^ .*? (?: (Some[ ]text) .* | $ )}{ $1 // "" }xse;
    

    or

    s{^ .*? (?: (Some[ ]text) .* | $ )}{$1}xs;     # Warns if warnings are on.
    

    Far simpler:

    $_ = /Some text/ ? $& : "";
    

    I question your use of -p. Are you sure you want a line of output for each line of input? It seems to me you'd rather have

    perl -nle'print $& if /Some text/'