php utf-8 ctype setlocale

What does setlocale(LC_CTYPE, 'C'); actually do?


When my PHP script handles UTF-8 text that contains non-ASCII characters, some PHP functions such as strtolower() don't work correctly.

I could use mb_strtolower, but this script can be run on all sorts of different platforms and configurations, and the multibyte string extension might not be available. I could check whether the function exists before use, but I have string functions littered throughout my code and would rather not replace every instance.

Someone suggested using setlocale(LC_CTYPE, 'C'), which he says causes the string functions to work correctly. This sounds fine, but I don't want to introduce that change without understanding exactly what it is doing. I have used setlocale to change the formatting of numbers before, but I have not used the LC_CTYPE category before, and I don't really understand what it does. What does the value 'C' mean?


Solution

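  • C means "use whatever locale is hard coded" (and since most *NIX programs are written in C, it's called C). It is the minimal default locale, and it is usually not a UTF-8 locale. LC_CTYPE is the category that controls character classification and case conversion, so setlocale(LC_CTYPE, 'C') makes locale-aware functions such as strtolower() treat only the ASCII letters as letters and leave every other byte unchanged.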

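    If you are using a multibyte charset such as UTF-8, you cannot rely on the regular string functions, because they work byte by byte and will not convert the case of non-ASCII characters; for that, the mb_ counterparts from the mbstring extension are required. However, almost every PHP installation should have this extension enabled.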
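
To illustrate the points above, here is a minimal sketch that assumes the script file itself is saved as UTF-8. The helper name utf8_safe_lower() is hypothetical (it is not a PHP built-in). It uses mb_strtolower() when mbstring is loaded and otherwise falls back to an ASCII-only conversion that never touches multibyte sequences.

```php
<?php
// Hypothetical helper, not part of PHP: lowercase UTF-8 text when mbstring
// is available, otherwise lowercase only the ASCII letters.
function utf8_safe_lower($text)
{
    if (function_exists('mb_strtolower')) {
        // mbstring is loaded: Unicode-aware case conversion.
        return mb_strtolower($text, 'UTF-8');
    }

    // Fallback: map A-Z to a-z byte by byte and leave every other byte
    // alone, so bytes belonging to multibyte characters are never altered.
    return strtr(
        $text,
        'ABCDEFGHIJKLMNOPQRSTUVWXYZ',
        'abcdefghijklmnopqrstuvwxyz'
    );
}

// Pin character classification and case conversion to the "C" locale.
// setlocale() returns the new locale name on success, or false on failure.
$locale = setlocale(LC_CTYPE, 'C');

// In the "C" locale only ASCII letters are affected by strtolower().
$ascii = strtolower('HELLO');

// Non-ASCII input: the result depends on whether mbstring is installed.
$mixed = utf8_safe_lower('ÀBC');
```

With mbstring present, the À in the last call is lowercased along with the ASCII letters; without it, the À is passed through unchanged and only "BC" is lowercased. Wrapping the check in one helper like this is one way to avoid repeating function_exists() at every call site.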