php, utf-8, character-encoding, multibyte, ord

UTF-8 safe equivalent of JavaScript's charCodeAt() in PHP


I need to be able to use ord() to get the same value as JavaScript's charCodeAt() function. The problem is that ord() doesn't support UTF-8: it only looks at the first byte of the string.

How can I get Ą to translate to 260 in PHP? I've tried some uniord functions out there, but they all report 256 instead of 260.


Solution

  • ord() works byte by byte, as do most (if not all) of PHP's standard string functions. You need to do the conversion yourself, for example with the help of the multibyte string extension:

    $utf8Character = 'Ą';
    list(, $ord) = unpack('N', mb_convert_encoding($utf8Character, 'UCS-4BE', 'UTF-8'));
    echo $ord; # 260
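
The conversion above yields the Unicode code point. JavaScript's charCodeAt() actually returns a UTF-16 code unit at a given index, which only differs for characters outside the Basic Multilingual Plane. A minimal sketch of a closer equivalent follows, assuming the mbstring extension is loaded and the source file is saved as UTF-8. The function name charCodeAt and its null return for an out-of-range index are illustrative choices, not part of the original answer:

```php
<?php
// Hypothetical helper: the name and the null-on-out-of-range behaviour are
// assumptions made for illustration (JavaScript itself returns NaN there).
// Returns the UTF-16 code unit at $index, like String.prototype.charCodeAt().
function charCodeAt(string $utf8String, int $index): ?int
{
    // Re-encode as UTF-16BE so that every code unit is exactly two bytes.
    $utf16 = mb_convert_encoding($utf8String, 'UTF-16BE', 'UTF-8');
    $offset = $index * 2;
    if ($index < 0 || $offset + 2 > strlen($utf16)) {
        return null;
    }
    // 'n' reads an unsigned 16-bit big-endian integer; unpack() keys start at 1.
    return unpack('n', substr($utf16, $offset, 2))[1];
}

echo charCodeAt('Ą', 0), "\n"; // 260
```

If only the code point of a single character is needed and PHP 7.2 or later is available, mb_ord('Ą', 'UTF-8') should give the same 260 without a manual unpack().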