php, regex, email-headers

REGEX - ignore new line characters


I came across this line of code:

preg_match_all("!boundary=(.*)$!mi", $content, $matches);

but for

Content-Type: multipart/alternative; boundary=f403045e21e067188c05413187fd\r\n

It returns

f403045e21e067188c05413187fd\r

When it should return

f403045e21e067188c05413187fd

(without the \r)

Any ideas how to fix this?

PS: It should also work when \r is not present and the line ends with only \n.
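For reference, the behaviour can be reproduced with a short script. This is a minimal sketch that assumes $content holds only the single header line shown above. With PCRE's usual LF newline setting, $ in multiline mode matches just before \n and . still matches \r, which would explain why the carriage return ends up in the capture group.

```php
<?php
// Assumed sample input: the single header line from the question, ending in CRLF.
$content = "Content-Type: multipart/alternative; boundary=f403045e21e067188c05413187fd\r\n";

// The original pattern: greedy dot up to the end of the line.
preg_match_all("!boundary=(.*)$!mi", $content, $matches);

// json_encode escapes control characters, so a trailing \r becomes visible.
echo json_encode($matches[1][0]), PHP_EOL;
```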


Solution

  • There are two options.

    1. Use lazy dot matching and add an optional \r before $, so that a trailing carriage return is matched outside the capture group:

      preg_match_all("!boundary=(.*?)\r?$!mi", $content, $matches);

    2. Use a negated character class, [^\r\n], which matches any character except \r and \n:

      preg_match_all("!boundary=([^\n\r]*)!mi", $content, $matches);

    Or, a shorter version, using the \V shorthand character class matching any character that is not a vertical whitespace (not a linebreak char):

      preg_match_all("!boundary=(\V*)!mi", $content, $matches);

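    Note that the second approach is more efficient: the negated class (or \V) consumes the whole value in a single pass, whereas the lazy .*? has to try the rest of the pattern (\r?$) after every character it adds.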
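The three patterns can be checked side by side. The following is only a sketch: the helper name extractBoundaries and the two sample inputs (the header from the question with a CRLF ending and with a bare LF ending) are assumptions made for illustration, not part of the original code.

```php
<?php
// Hypothetical helper for illustration: returns every captured boundary value.
function extractBoundaries(string $pattern, string $content): array
{
    preg_match_all($pattern, $content, $matches);
    return $matches[1];
}

// The three patterns from the answer, keyed by a short label.
$patterns = [
    'lazy dot, optional CR' => "!boundary=(.*?)\r?$!mi",
    'negated class'         => "!boundary=([^\n\r]*)!mi",
    'V shorthand'           => "!boundary=(\V*)!mi",
];

// Assumed sample inputs: the same header ending in CRLF and in a bare LF.
$inputs = [
    'CRLF' => "Content-Type: multipart/alternative; boundary=f403045e21e067188c05413187fd\r\n",
    'LF'   => "Content-Type: multipart/alternative; boundary=f403045e21e067188c05413187fd\n",
];

$results = [];
foreach ($patterns as $label => $pattern) {
    foreach ($inputs as $ending => $input) {
        $results[$label][$ending] = extractBoundaries($pattern, $input);
        // json_encode escapes control characters, so a stray \r would show up.
        echo $label, ' / ', $ending, ': ', json_encode($results[$label][$ending]), PHP_EOL;
    }
}
```

With these inputs, each pattern should capture the boundary value without a trailing carriage return, whichever line ending is used.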