php string csv unicode ansi

In PHP, how do I convert Unicode number strings into numbers correctly?


I have a CSV file encoded in Unicode. When I read it with either fgetcsv or fgets and try to use the number strings as integers in PHP, only the first character of the string is cast to a number, e.g.:

$str = '2012';          // value as read from the file
$num = $str + 0;        // or: $num = (int)$str;
echo $num;
// results -> 2

How can I convert these Unicode number strings correctly?

I had no success with PHP's functions for converting from Unicode to other charsets.

The only way I know is to open the file in a simple text editor such as Notepad or Notepad++ and convert it to an ANSI CSV.

Thanks for your help.


Solution

  • Convert it to some other encoding, such as UTF-8:

    $str = mb_convert_encoding( $str, "UTF-8", "UTF-16LE");
    

    Your string actually looks like this (manually constructed UTF-16LE):

     $str = "2\x000\x001\x002\x00";
    

    So PHP reads the first 2, then hits a NUL byte, which is not a digit, stops parsing there, and you get 2.

    BTW, the little-endian BOM (\xFF\xFE) isn't handled here, so show your full code and I will take a look.
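
A minimal sketch of how the BOM could be handled together with the conversion, assuming the file really is UTF-16LE and the mbstring extension is available. The helper name utf16le_to_utf8 is hypothetical, not something from the question's code:

```php
<?php
// Hypothetical helper: convert a UTF-16LE string to UTF-8,
// dropping a leading little-endian BOM (\xFF\xFE) if present.
function utf16le_to_utf8(string $raw): string
{
    // The BOM is not part of the data, so strip it before converting.
    if (strncmp($raw, "\xFF\xFE", 2) === 0) {
        $raw = substr($raw, 2);
    }
    // Assumption: the input is UTF-16LE, as in the answer above.
    return mb_convert_encoding($raw, 'UTF-8', 'UTF-16LE');
}

// BOM followed by "2012" encoded as UTF-16LE (manually constructed).
$str = "\xFF\xFE2\x000\x001\x002\x00";

$num = (int) utf16le_to_utf8($str);
echo $num, "\n"; // prints 2012
```

When reading a real file, it is probably safer to load the whole contents (for example with file_get_contents), convert once, and only then split into lines and fields. In UTF-16LE a newline is two bytes, so a byte-oriented fgets call can cut a line in the middle of a character.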