Let's say we have a random string like this:
$str_test = "faafŠ š čćž đš čšđ ćčš žž fa fssfa afž afžsa f";
and we run preg_replace on it:
preg_replace("/[^\da-z ]/i", "_", $str_test);
And the result I get is:
faaf__ __ ______ ____ ______ ______ ____ fa fssfa af__ af__sa f
So if we compare both, output and input:
faaf__ __ ______ ____ ______ ______ ____ fa fssfa af__ af__sa f
faafŠ š čćž đš čšđ ćčš žž fa fssfa afž afžsa f
we can see that each special character is replaced with two "_" signs. The result should be:
faaf_ _ ___ __ ___ ___ __ fa fssfa af_ af_sa f
faafŠ š čćž đš čšđ ćčš žž fa fssfa afž afžsa f
I have already tried different encodings, but with no success. I also thought about writing a function that does multiple preg_match calls and then replaces "_" with "", but that would be slow on big texts.
Any ideas?
$str = preg_replace("/[^0-9a-zA-Z ]/u", "_", $str_test);
Note the 'u' modifier! Explanation: http://www.php.net/manual/en/reference.pcre.pattern.modifiers.php#107498
If the _subject_ contains utf-8 sequences the 'u' modifier should be set, otherwise a pattern such as /./ could match a utf-8 *sequence as two to four individual ASCII characters*.
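To see the difference on a small input, here is a minimal sketch. The variable names ($subject, $bytewise, $utf8) are only for illustration, and the test string is written with explicit byte escapes so the result does not depend on how the source file itself is saved:

```php
<?php
// "afžsa" spelled out as UTF-8 bytes: ž (U+017E) is the two bytes C5 BE.
$subject = "af\xC5\xBEsa";

// Without 'u' the subject is matched byte by byte, so ž counts as two characters.
$bytewise = preg_replace('/[^0-9a-zA-Z ]/', '_', $subject);

// With 'u' the subject is decoded as UTF-8, so ž counts as one character.
$utf8 = preg_replace('/[^0-9a-zA-Z ]/u', '_', $subject);

var_dump(strlen($subject)); // int(6), the length in bytes
var_dump($bytewise);        // string(6) "af__sa"
var_dump($utf8);            // string(5) "af_sa"
```

One thing to watch for: with 'u' set, preg_replace returns NULL if the subject is not valid UTF-8, so if the input might be in another encoding, check the return value (preg_last_error() reports the reason).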