I need to convert source data that I can't control into Normalization Form C (NFC). I'm currently doing this by calling an external program (uconv). This is what my code looks like:
$malayalam_books = preg_split("/\n/", shell_exec("uconv -f utf8 -t utf8 -x nfc book-names.txt"));
It works well, but shelling out to an external program is obviously not ideal. I know that PHP supports the ICU libraries, but it's not obvious how to do something this simple with them...
I've since discovered that the normalizer_normalize() function from the intl extension (bundled with PHP >= 5.3.0, and also available through PECL) can handle this natively.
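For anyone landing here, this is roughly what the replacement looks like. It's a minimal sketch, assuming the intl extension is enabled; normalize_lines_nfc() is just a helper name I made up for illustration, and book-names.txt is the same file as in the snippet above.

```php
<?php
// Minimal sketch; assumes the intl extension is enabled.
// normalize_lines_nfc() is a hypothetical helper name, not part of PHP.
function normalize_lines_nfc($text)
{
    // Normalizer::FORM_C selects Normalization Form C (NFC).
    $normalized = normalizer_normalize($text, Normalizer::FORM_C);
    if ($normalized === false) {
        // normalizer_normalize() returns false on failure.
        throw new RuntimeException('Could not normalize input (is it valid UTF-8?)');
    }
    // Same line splitting as the original shell_exec() version.
    return preg_split("/\n/", $normalized);
}

// Only read the file if it is actually there.
if (is_readable('book-names.txt')) {
    $malayalam_books = normalize_lines_nfc(file_get_contents('book-names.txt'));
}
```

The explicit false check is there because a failed normalization would otherwise be passed silently into preg_split().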