php regex unicode modifier multibyte

Why are accented/Unicode characters that are listed in a negated character class still being matched by the regex?


$string1 = preg_replace('/[^A-Za-z0-9äöü!&_=\+-]/', ' ', $string4);

This regex shouldn't replace the characters ä, ö and ü. In Ruby it works as expected, but in PHP it also replaces ä, ö and ü.

Can someone give me a hint how to fix it?


Solution

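  • Set the u pattern modifier, which tells PHP to treat both the pattern and the subject string as UTF-8. Without it, the regex engine works byte by byte, so a multibyte character such as ä is not handled as a single character.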

    '/[^A-Za-z0-9äöü!&_=\+-]/u'
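
A minimal sketch of the fix in context, assuming the PHP source file is saved as UTF-8 and the input string is valid UTF-8. The function name keep_allowed and the sample input are made up for illustration; only preg_replace and the pattern come from the question.

```php
<?php
// Hypothetical helper: replaces every character that is NOT in the
// allowed set with a space. Assumes this file is saved as UTF-8, so the
// literal ä, ö and ü in the pattern are UTF-8 encoded as well.
function keep_allowed(string $input): ?string
{
    // The trailing "u" makes PCRE read pattern and subject as UTF-8,
    // so each umlaut counts as one character instead of two bytes.
    return preg_replace('/[^A-Za-z0-9äöü!&_=\+-]/u', ' ', $input);
}

$result = keep_allowed('Tür & Bär #1');

if ($result === null) {
    // preg_replace() returns null on error, for example when the
    // subject is not valid UTF-8 while the u modifier is set.
    echo 'preg error code: ', preg_last_error(), PHP_EOL;
} else {
    echo $result, PHP_EOL; // umlauts are kept, "#" becomes a space
}
```

If the result is null, the input is probably in another encoding (ISO-8859-1, for instance) and would need converting to UTF-8 before the pattern is applied.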