
How to embed s/\A{/LB/ in the EXPR of s{...}{EXPR}e?


I've read 'Quote and Quote-like Operators' in man perlop but can't figure out how to use s/.../EXPR/e if EXPR contains code which itself contains a substitution involving both \A and the outer substitution delimiters:

perl -E 'say "{foo}" =~ s{.+}{local $_=$&; s/\{/LB/;$_ }er;'

works and prints LBfoo}, I guess as expected.

If I try to anchor using \A it won't parse:

perl -E 'say "{foo}" =~ s{.+}{local $_=$&; s/\A\{/LB/;$_ }er;'

gets Unescaped left brace in regex is illegal here in regex; marked by <-- HERE in m/\A{ <-- HERE /

Adding additional backslashes before \A does not help.

Adding additional backslashes before \{ does not seem to work either:

perl -E 'say "{foo}" =~ s{.+}{local $_=$&; s/\A\\{/LB/;$_ }er;'

gets Substitution replacement not terminated (reasonably enough), but

perl -E 'say "{foo}" =~ s{.+}{local $_=$&; s/\A\\\{/LB/;$_ }er;'

compiles but does not match (output is {foo} not LBfoo}).

Adding additional backslashes alternates between the "not terminated" error and silently not matching.

Can anyone spot the problem?


Solution

  • You have the following broken code:

    s{.+}{local $_=$&; s/\A\{/LB/;$_ }er
    

    Let's start by simplifying that to the following:

    s{.+}{ $& =~ s/\A\{/LB/r }er;
    

    One of the first things Perl needs to do to parse this is to find the end of the operator. The delimiters of the outer substitution are { and }, so the \{ in the replacement expression is escaping a delimiter. In the same way that "a\"b" produces the string a"b, your substitution has the following replacement expression:

     $& =~ s/\A{/LB/r 
    

    Note that the \ is gone, since it was used to escape a delimiter.
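The same delimiter rule can be seen in isolation with the `q` operator. This is only an illustrative sketch: the variable names are made up, and `q{}` stands in for the replacement part of `s{...}{...}e` on the assumption that both strip a backslash that precedes their own delimiter.

```perl
use strict;
use warnings;

# With { } as delimiters, the backslash in front of the brace is consumed
# by the quote parser. The backslash in front of A is left alone.
my $braces = q{\A\{};    # three characters: backslash, A, {

# With < > as delimiters, { is not a delimiter, so \{ survives untouched.
my $angles = q<\A\{>;    # four characters: backslash, A, backslash, {

print length($braces), " $braces\n";
print length($angles), " $angles\n";
```

Only the first form loses the backslash, which is why the inner regex ends up seeing `\A{`.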

    So we're left with \A{, and that isn't legal. (A backslash-letter escape followed by { is either already meaningful, as in \x{...}, \N{...} and \b{...}, or reserved so that it can be given a meaning in the future.) This is why your program doesn't work. The simple solution is to avoid writing a literal { at all by using \x7B instead.

    s{.+}{ $& =~ s/\A\x7B/LB/r }er;
    

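    You could also use a different delimiter for the outer substitution, so that \{ is no longer an escaped delimiter and reaches the inner regex intact.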

    s<.+>< $& =~ s/\A\{/LB/r >er;
    

    Alternatively, there are hacks you could use.

    s{.+}{ $& =~ s/^\{/LB/r }er;     # Inner subst is `s/^{/LB/r`
    s{.+}{ $& =~ s/\A \{/LB/xr }er;  # Inner subst is `s/\A {/LB/xr`
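As a quick check, the four variants above can be run side by side. This is a minimal sketch assuming Perl 5.14 or later (needed for the /r flag). The `$in` and `@results` names are made up for the example.

```perl
use v5.14;
use strict;
use warnings;

my $in = "{foo}";

my @results = (
    # Hex escape instead of a literal brace.
    $in =~ s{.+}{ $& =~ s/\A\x7B/LB/r }er,
    # Different outer delimiters, so \{ is passed through unchanged.
    $in =~ s<.+>< $& =~ s/\A\{/LB/r >er,
    # ^ instead of \A; the inner regex becomes ^{ after unescaping.
    $in =~ s{.+}{ $& =~ s/^\{/LB/r }er,
    # /x with a space, so { no longer directly follows \A.
    $in =~ s{.+}{ $& =~ s/\A \{/LB/xr }er,
);

print "$_\n" for @results;
```

If the explanation above holds, every line printed is LBfoo}.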