php utf-8 character-encoding iso-8859-1

PHP: how to get the input encoded in ANSI


When I add a GET parameter such as ?key=é, PHP seems to convert the value to UTF-8 immediately. My script is not UTF-8, and I need $_GET['key'] to give me the data in ISO-8859-1.

strlen() of that value should be 1, not 2.

How can I get that data in a non-UTF-8 encoding without depending on an extension such as mbstring or iconv?


Solution

  • When I add a GET parameter such as ?key=é, PHP seems to convert the value to UTF-8 immediately. My script is not UTF-8, and I need $_GET['key'] to give me the data in ISO-8859-1.

    This statement is incorrect. PHP is not responsible here: the web browser sends the data to the web server encoded as UTF-8, and PHP passes those bytes on to your application unchanged.
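    As a rough illustration of that point, the sketch below decodes two percent-encoded query values. It assumes the browser sent "é" as %C3%A9 (UTF-8), which is typical but depends on the page and the client; %E9 is what an ISO-8859-1 request would look like. rawurldecode() is used here as a stand-in for the decoding PHP applies when it fills $_GET, and the variable names are hypothetical.

    ```php
    <?php
    // Assumption: a browser on a UTF-8 page requests ?key=%C3%A9 for "é".
    $utf8Bytes = rawurldecode('%C3%A9');

    // Assumption: a client using ISO-8859-1 would request ?key=%E9 instead.
    $latin1Bytes = rawurldecode('%E9');

    // Percent-decoding only turns %XX into raw bytes; it does not re-encode anything.
    echo bin2hex($utf8Bytes), "\n";   // c3a9
    echo bin2hex($latin1Bytes), "\n"; // e9
    ```

    In both cases the bytes that arrive are the bytes the client chose to send.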

    Keep in mind that "ANSI" is not a character encoding. It is a term that Windows systems sometimes use (incorrectly!) for the system's native code page, which is usually Windows-1252 (similar to ISO-8859-1) on English systems but varies with the locale.

    strlen() of that value should be 1, not 2.

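    strlen() counts bytes, not characters, and "é" occupies two bytes in UTF-8. To have strlen() return the length in characters, you would need to either activate PHP's multibyte function overloading, which is a feature of the mbstring extension, or convert the string to a single-byte encoding using iconv (which is lossy for any character the target encoding cannot represent). Note that function overloading was deprecated in PHP 7.2 and removed in PHP 8.0; on current versions, call mb_strlen() directly.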
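    A minimal sketch of the difference, assuming the value arrived as UTF-8. The mbstring and iconv calls are guarded with function_exists() because either extension may be missing, and the variable names are hypothetical.

    ```php
    <?php
    $value = "\xC3\xA9"; // "é" encoded as UTF-8, as it would arrive in $_GET['key']

    // strlen() counts bytes.
    $byteLength = strlen($value);
    echo $byteLength, "\n"; // 2

    // With mbstring: count characters instead of bytes.
    if (function_exists('mb_strlen')) {
        echo mb_strlen($value, 'UTF-8'), "\n";
    }

    // With iconv: convert to ISO-8859-1 first, then strlen() counts one byte per character.
    if (function_exists('iconv')) {
        $latin1 = iconv('UTF-8', 'ISO-8859-1//TRANSLIT', $value);
        if ($latin1 !== false) {
            echo strlen($latin1), "\n";
        }
    }
    ```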

    There is no way to perform this conversion without either mbstring, iconv, or something substantially similar.
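    To give an idea of what "something substantially similar" could look like, here is a hand-written sketch that uses only core string functions. The function name utf8_to_latin1_sketch is hypothetical, and the code is an illustration rather than a vetted converter: it maps the two-byte UTF-8 sequences for U+0080 to U+00FF onto single ISO-8859-1 bytes and replaces everything else that is not ASCII with "?".

    ```php
    <?php
    // Hypothetical helper: UTF-8 to ISO-8859-1 without mbstring or iconv.
    // Characters above U+00FF cannot be represented and become '?'.
    function utf8_to_latin1_sketch(string $utf8): string
    {
        $out = '';
        $len = strlen($utf8);
        for ($i = 0; $i < $len; $i++) {
            $byte = ord($utf8[$i]);
            if ($byte < 0x80) {
                // Plain ASCII is identical in both encodings.
                $out .= $utf8[$i];
            } elseif (($byte === 0xC2 || $byte === 0xC3)
                && $i + 1 < $len
                && (ord($utf8[$i + 1]) & 0xC0) === 0x80) {
                // Two-byte sequence encoding U+0080..U+00FF: merge into one byte.
                $out .= chr((($byte & 0x1F) << 6) | (ord($utf8[$i + 1]) & 0x3F));
                $i++;
            } else {
                // Longer or malformed sequence: emit '?' and skip its continuation bytes.
                $out .= '?';
                while ($i + 1 < $len && (ord($utf8[$i + 1]) & 0xC0) === 0x80) {
                    $i++;
                }
            }
        }
        return $out;
    }

    $latin1 = utf8_to_latin1_sketch("\xC3\xA9"); // "é" as UTF-8
    echo strlen($latin1), "\n";  // 1
    echo bin2hex($latin1), "\n"; // e9
    ```

    It would be called as utf8_to_latin1_sketch($_GET['key']). The lossy part is visible in the last branch: any character outside ISO-8859-1, such as "€", collapses to a single "?".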