regex perl

Why does a match against regex $ return 1 when the input string contains a newline?


Why does the command

perl -e "print qq/a\n/ =~ /$/"

print 1?

As far as I know, Perl treats $ as matching both the position before \n and the position at the end of the whole string in multi-line mode, which I believe is the default (no modifier is applied).


Solution

  • The match operator returns 1 as the true value because the pattern matched. The print outputs that value.

    The $ is an anchor, which is a specific sort of zero-width assertion. It matches a condition in the pattern but consumes no text. Since you have nothing else in the pattern, the /$/ matches any target string including the empty string. It will always return true.
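
    As a minimal sketch of the zero-width behavior, the following reports where a bare /$/ first matches in a few strings. The helper name dollar_offset is hypothetical and exists only for this illustration; $-[0] is Perl's built-in offset of the start of the last successful match.

    ```perl
    use strict;
    use warnings;

    # Hypothetical helper: offset where /$/ first matches, or undef on failure.
    sub dollar_offset {
        my ($s) = @_;
        return $s =~ /$/ ? $-[0] : undef;
    }

    # /$/ consumes nothing, so it succeeds on every string, even an empty one.
    for my $s ("", "a", "a\n") {
        (my $shown = $s) =~ s/\n/\\n/g;   # make the newline visible when printing
        printf "%-5s matches at offset %d\n", "\"$shown\"", dollar_offset($s);
    }
    ```

    The offsets come out as 0, 1, and 1: for "a\n" the first successful position is just before the trailing newline, not after it.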

    The $ is the end-of-line anchor, as documented in perlre. The $ allows a vestigial newline at the end, so both of these can match:

    "a"   =~ /a$/
    "a\n" =~ /a$/
    

    Without the /m regex modifier, the end of the line is the end of the string. But, with that modifier it can match before any newline in the string:

    "a\nb" =~ /a$/m
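
    To see the effect of /m in isolation, this sketch runs the same pattern with and without the modifier against a string with an embedded newline, then counts every position where $ can match. The variable names are made up for the example.

    ```perl
    use strict;
    use warnings;

    my $text = "a\nb\n";

    # Without /m, $ matches only at the very end (or before a final newline),
    # so the "a" in the middle of the string is not followed by end-of-line.
    my $plain = $text =~ /a$/  ? 1 : 0;

    # With /m, $ also matches before the embedded newline after "a".
    my $multi = $text =~ /a$/m ? 1 : 0;

    # Count every position where $ matches, using the "countof" idiom.
    my $count_plain = () = $text =~ /$/g;
    my $count_multi = () = $text =~ /$/mg;

    print "plain=$plain multi=$multi\n";
    print "positions without /m: $count_plain, with /m: $count_multi\n";
    ```

    On this input the plain match fails and the /m match succeeds. Without /m there are two matching positions (before the final newline and at the very end); with /m there are three.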
    

    You might get this behavior even if you don't see it attached to the particular match operator since people can set default match flags:

    use re '/m'; # applies to all in lexical scope
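
    The lexical scoping can be checked with a short sketch, assuming Perl 5.14 or later (where use re accepts default flags). The same pattern text behaves differently inside and outside the block:

    ```perl
    use 5.014;
    use strict;
    use warnings;

    my $text = "a\nb";

    # No default flags here: $ only matches at the end of the string.
    my $outside = $text =~ /a$/ ? 1 : 0;

    my $inside;
    {
        use re '/m';                        # default /m for this block only
        $inside = $text =~ /a$/ ? 1 : 0;    # same pattern text, now multi-line
    }

    print "outside=$outside inside=$inside\n";
    ```

    Only the match inside the block succeeds, even though nothing on the match operator itself hints at /m.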
    

    Over-enthusiastic fans of Perl Best Practices like to make a trio of pattern-changing flags the default (often without auditing every regex they affect):

    use re '/msx';
    

    There's another anchor, the end-of-string anchor \Z, that also allows a trailing newline. If you don't want to allow a newline, you can use the lowercase \z to mean the absolute end of the string. These are not affected by regex flags.
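
    A small comparison of the three anchors against the same string, as a sketch rather than an exhaustive test; the variable names are invented for the example:

    ```perl
    use strict;
    use warnings;

    # Each anchor is tried against the same string with a trailing newline.
    my $s = "a\n";

    my %result = (
        '$'  => ($s =~ /a$/  ? 1 : 0),   # allows the trailing newline
        '\Z' => ($s =~ /a\Z/ ? 1 : 0),   # also allows it
        '\z' => ($s =~ /a\z/ ? 1 : 0),   # absolute end only, so this fails
    );

    # /m does not change \Z: it still refuses to match before an inner newline.
    my $with_m = "a\nb" =~ /a\Z/m ? 1 : 0;

    print "$_ => $result{$_}\n" for sort keys %result;
    print "\\Z with /m on \"a\\nb\" => $with_m\n";
    ```

    If the goal is to reject input that ends in a newline, \z is the anchor that does so.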