regex perl pcre

Regular expression for start and end of string in multiline mode


In a regular expression in multiline mode, ^ and $ match the start and end of each line. How can I match the end of the whole string?

In the string

Hello\nMary\nSmith\nHello\nJim\nDow

the expression

/^Hello(?:$).+?(?:$).+?$/ms

matches Hello\nMary\nSmith.

I wonder whether there is a metacharacter (like \ENDSTRING) that matches the end of the whole string, not just the end of a line, such that

/^Hello(?:$).+?(?:$).+?\ENDSTRING/ms

would match Hello\nJim\nDow. Similarly, is there a metacharacter that matches the start of the whole string, not just the start of a line?


Solution

  • There are indeed assertions (perlre) for that

    \A Match only at beginning of string
    \Z Match only at end of string, or before newline at the end

    ...
    The \A and \Z are just like ^ and $, except that they won't match multiple times when the /m modifier is used, while ^ and $ will match at every internal line boundary. To match the actual end of the string and not ignore an optional trailing newline, use \z.

    Also see Assertions in perlbackslash.

    I am not sure what you're after in the example from the question, so here is another one

    perl -wE'$_ = qq(one\ntwo\nthree); say for /(\w+\n\w+)\Z/m'
    

    prints

    two
    three
    

    while with $ instead of \Z it prints

    one
    two
    

    Note that the above example would also match qq(one\ntwo\nthree\n) (with a trailing newline), which may or may not be suitable. Please compare \Z and \z from the above quote for your actual needs. Thanks to ikegami for a comment.