
PHP: convert emoji to Unicode code points but keep the rest of the text as plain text


I have this function that converts emoji to Unicode code point notation, but it also converts ordinary text to hex.

How can I convert only the emoji and keep the rest of the text as a plain string?

function emoji_to_unicode($emoji) {
   $emoji = mb_convert_encoding($emoji, 'UTF-32', 'UTF-8');
   $unicode = strtoupper(preg_replace("/^[0]{3}/","U+",bin2hex($emoji)));
   return $unicode;
}

$var = ("😀x😀text here");
$out = '';
for ($i = 0; $i < mb_strlen($var); $i++) {
    $out .= emoji_to_unicode(mb_substr($var, $i, 1));
}
echo "$out\n";

So this input:

$var = ("😀x😀text here");

gives me:

U+1F600U+00078U+1F600U+00074U+00065U+00078U+00074U+00020U+00068U+00065U+00072U+00065

But I need it to return this:

U+1F600xU+1F600text here

I need to keep the text as plain text and show only the emoji in Unicode code point format.


Solution

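  • The Intl extension provides the IntlChar class, which works with Unicode code points and blocks, so you can check whether the current character belongs to the Emoticons block. Note that this block covers only U+1F600 to U+1F64F; emoji that live in other blocks need their own block checks.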

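    // Convert a single character to U+XXXX notation via its UTF-32 bytes.
    function emoji_to_unicode($emoji) {
       $emoji = mb_convert_encoding($emoji, 'UTF-32', 'UTF-8');
       // Replace the three leading zeros of the 8-digit hex value with "U+".
       $unicode = strtoupper(preg_replace("/^[0]{3}/","U+",bin2hex($emoji)));
       return $unicode;
    }
    
    $var = ("😀x😀text here");
    $out = '';
    for ($i = 0; $i < mb_strlen($var); $i++) {
        $char = mb_substr($var, $i, 1);
        // Only characters in the Emoticons block are converted; everything else is kept as-is.
        $isEmoji = IntlChar::getBlockCode(IntlChar::ord($char)) === IntlChar::BLOCK_CODE_EMOTICONS;
        $out .= $isEmoji ? emoji_to_unicode($char) : $char;
    }
    
    echo $out;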
    

    The predefined constants of the IntlChar class in the PHP manual list all available block codes.
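
If the Intl extension is not available, or the emoji you expect fall outside the Emoticons block, a regex-based variant may be worth trying. The following is only a sketch, not part of the original answer: the function name emoji_to_unicode_text is hypothetical, and the two code point ranges in the pattern are an assumption that covers many common pictographs but is not a complete definition of "emoji". It relies on preg_replace_callback with the /u modifier and on mb_ord (PHP 7.2+, mbstring).

```php
<?php
// Hypothetical helper (not from the original answer): replaces each code point
// inside the assumed ranges with its U+XXXX notation and leaves all other text alone.
function emoji_to_unicode_text(string $text): string
{
    // Assumed ranges, adjust to the characters you expect:
    //   U+2600..U+27BF   (Miscellaneous Symbols, Dingbats)
    //   U+1F000..U+1FAFF (pictograph blocks, including Emoticons)
    $pattern = '/[\x{2600}-\x{27BF}\x{1F000}-\x{1FAFF}]/u';

    return preg_replace_callback(
        $pattern,
        function (array $m): string {
            // mb_ord() returns the code point of the matched character;
            // %X formats it as uppercase hex without leading zeros.
            return sprintf('U+%X', mb_ord($m[0], 'UTF-8'));
        },
        $text
    );
}

echo emoji_to_unicode_text("😀x😀text here"), "\n"; // U+1F600xU+1F600text here
```

Because the pattern matches one code point at a time, multi-code-point sequences (skin-tone modifiers, ZWJ sequences, variation selectors) are converted piece by piece or left partly untouched, depending on which of their parts fall inside the assumed ranges.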