php unicode utf-8

Can I get the Unicode value of a character, or vice versa, with PHP?


Is it possible to input a character and get its Unicode value back? For example, I can put &#12103; in HTML to output "⽇". Is it possible to pass that character as an argument to a function and get the number back, without building a Unicode table?

$val = someFunction("⽇"); // returns 12103

Or the reverse?

$val2 = someOtherFunction(12103); // returns "⽇"

I would like to be able to output the actual characters to the page, not the codes, and I would also like to be able to get the code from the character if possible. The closest I have come is php.net/manual/en/function.mb-decode-numericentity.php, but I can't get it working. Is this the function I need, or am I on the wrong track?


Solution

  • function _uniord($c) {
        // Returns the code point of the first UTF-8 character in $c.
        $h = ord($c[0]);                // the lead byte determines the sequence length
        if ($h <= 127)                  // 1 byte (ASCII)
            return $h;
        if ($h >= 192 && $h <= 223)     // 2 bytes
            return ($h-192)*64 + (ord($c[1])-128);
        if ($h >= 224 && $h <= 239)     // 3 bytes
            return ($h-224)*4096 + (ord($c[1])-128)*64 + (ord($c[2])-128);
        if ($h >= 240 && $h <= 247)     // 4 bytes
            return ($h-240)*262144 + (ord($c[1])-128)*4096 + (ord($c[2])-128)*64 + (ord($c[3])-128);
        // 5- and 6-byte forms date from the original UTF-8 design; RFC 3629 no longer allows them.
        if ($h >= 248 && $h <= 251)     // 5 bytes
            return ($h-248)*16777216 + (ord($c[1])-128)*262144 + (ord($c[2])-128)*4096 + (ord($c[3])-128)*64 + (ord($c[4])-128);
        if ($h >= 252 && $h <= 253)     // 6 bytes
            return ($h-252)*1073741824 + (ord($c[1])-128)*16777216 + (ord($c[2])-128)*262144 + (ord($c[3])-128)*4096 + (ord($c[4])-128)*64 + (ord($c[5])-128);
        if ($h >= 254)                  //  error: 0xFE and 0xFF never occur in UTF-8
            return FALSE;
        return 0;                       // 128-191: a continuation byte, not the start of a character
    }   //  function _uniord()
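
As a quick check of the three-byte branch, the arithmetic can be traced by hand for "⽇", whose UTF-8 bytes are E2 BD 87. This is only an illustrative sketch; the variable names are made up for the example and are not part of the function above.

```php
<?php
// UTF-8 bytes of "⽇" (U+2F47), written as escapes so the file encoding does not matter.
$c = "\xE2\xBD\x87";

// Same arithmetic as the 224-239 (three-byte) branch of _uniord().
$lead = (ord($c[0]) - 224) * 4096; // 0xE2 = 226 -> 2 * 4096 = 8192
$mid  = (ord($c[1]) - 128) * 64;   // 0xBD = 189 -> 61 * 64  = 3904
$tail =  ord($c[2]) - 128;         // 0x87 = 135 -> 7

$codepoint = $lead + $mid + $tail;
echo $codepoint, "\n"; // 12103
```

The lead byte contributes the top bits and each continuation byte contributes six more, which is why the multipliers are powers of 64.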
    

    and, for the reverse direction (code point to character):

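    function _unichr($o) {
        if (function_exists('mb_convert_encoding')) {
            // Decode a numeric entity such as "&#12103;" into a UTF-8 string.
            // Note: the 'HTML-ENTITIES' encoding is deprecated as of PHP 8.2.
            return mb_convert_encoding('&#'.intval($o).';', 'UTF-8', 'HTML-ENTITIES');
        } else {
            // Without mbstring: chr() yields a single byte, so this is only correct for 0-127.
            return chr(intval($o));
        }
    }   // function _unichr()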
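
On PHP 7.2 and later, the mbstring extension provides mb_ord() and mb_chr(), which cover both directions without hand-written byte arithmetic. Below is a minimal sketch, assuming mbstring is loaded and the input is UTF-8. The wrapper names charToCode() and codeToChar() are hypothetical, chosen only for this example.

```php
<?php
// Hypothetical wrappers; assumes PHP >= 7.2 with the mbstring extension loaded.

// Character -> code point. Only the first character of $char is examined.
function charToCode(string $char): int
{
    $code = mb_ord($char, 'UTF-8');
    if ($code === false) {
        throw new InvalidArgumentException('Not a valid UTF-8 character');
    }
    return $code;
}

// Code point -> UTF-8 encoded character.
function codeToChar(int $code): string
{
    $char = mb_chr($code, 'UTF-8');
    if ($char === false) {
        throw new InvalidArgumentException('Not a valid code point');
    }
    return $char;
}

$val  = charToCode("\u{2F47}"); // "\u{2F47}" is the character "⽇"
$val2 = codeToChar(12103);

echo $val, "\n";  // 12103
echo $val2, "\n"; // ⽇
```

If the value is already in entity form, html_entity_decode('&#12103;', ENT_QUOTES, 'UTF-8') should also produce the character, so a custom function may not be needed for output alone.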