regex, preg-replace, preg-match, preg-replace-callback

How to convert foreign characters in regex search to proper case?


I have written the function below. It converts lowercase text to uppercase and proper case. I want it to handle foreign characters, e.g. ñ, correctly instead of leaving them in uppercase.

Expected result: Sabiña/Cerca

Actual result: SabiÑA/Cerca

NOTE: If I use mb_convert_case alone, it does not convert any character after the / to proper case.

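$string = 'SABIÑA CERCA';

// Apply the callback to every run of word characters.
echo preg_replace_callback('/\w+/i', function ($m) {
    var_dump($m); // debug output: show each match
    if (strlen($m[0]) > 3) {
        // Longer words: title case via the multibyte-aware function.
        return mb_convert_case($m[0], MB_CASE_TITLE, "UTF-8");
    } else {
        // Short words: only uppercase the first byte.
        return ucfirst($m[0]);
    }
}, $string);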

Solution

  • You just need to use the /u modifier.

    '/\w+/u'
    

    See IDEONE demo

    Note that the /i case insensitive modifier is redundant because \w matches both lower- and uppercase letters.

    See Pattern modifiers, under u (PCRE_UTF8):

    This modifier turns on additional functionality of PCRE that is incompatible with Perl. Pattern and subject strings are treated as UTF-8. This modifier is available from PHP 4.1.0 or greater on Unix and from PHP 4.2.3 on win32.
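
To make the fix concrete, here is a minimal sketch of the question's callback with the /u modifier applied. The function name properCase is hypothetical and only there so the example can be called more than once. One further change is my own assumption rather than part of the answer: the length check uses mb_strlen instead of strlen, so that the 3-character threshold counts characters rather than bytes. The mbstring extension is assumed to be available, as it already is for mb_convert_case.

```php
<?php
// Hypothetical helper: wraps the question's logic so it can be reused.
function properCase(string $text): string
{
    return preg_replace_callback(
        '/\w+/u', // /u: pattern and subject are treated as UTF-8, so \w matches ñ/Ñ
        function (array $m): string {
            // Assumption: count characters, not bytes, for the length threshold.
            if (mb_strlen($m[0], 'UTF-8') > 3) {
                return mb_convert_case($m[0], MB_CASE_TITLE, 'UTF-8');
            }
            // Short words keep the question's original behaviour.
            return ucfirst($m[0]);
        },
        $text
    );
}

echo properCase('SABIÑA CERCA'), "\n"; // prints "Sabiña Cerca"
```

With /u in place, SABIÑA is matched as a single word, so mb_convert_case sees the whole word and lowercases the Ñ along with the other letters.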