encoding, utf-16, emoji

Emoji value range


I am trying to strip all emoji characters from a string (like a sanitizer), but I cannot find a complete set of emoji values.

What is the complete set of UTF-16 values for emoji characters?


Solution

  • The Unicode standard's Unicode® Technical Report #51 includes a list of emoji (emoji-data.txt):

    ...
    21A9 ;  text ;  L1 ;    none ;  j   # V1.1 (↩) LEFTWARDS ARROW WITH HOOK
    21AA ;  text ;  L1 ;    none ;  j   # V1.1 (↪) RIGHTWARDS ARROW WITH HOOK
    231A ;  emoji ; L1 ;    none ;  j   # V1.1 (⌚) WATCH
    231B ;  emoji ; L1 ;    none ;  j   # V1.1 (⌛) HOURGLASS
    ...
    

    I believe you would want to remove each character listed in this document that has a Default_Emoji_Style of emoji (the second field of each entry in the excerpt above).

    There is no way to identify the emoji characters in Unicode other than by consulting a definition list like this one. As the reference to the FAQ says, they are spread across different blocks.
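
As a rough illustration of that approach, here is a minimal Python sketch. It assumes the semicolon-separated layout shown in the excerpt (code point first, Default_Emoji_Style second, comment after `#`). Later versions of emoji-data.txt use a different layout, as far as I know (code point ranges and property names), so the parser would need adjusting for them. The names `parse_emoji_style`, `strip_emoji`, and `utf16_units` are made up for this example, and `SAMPLE` is just the four lines quoted above, not the full file.

```python
# Four lines copied from the excerpt above; the real file is much longer.
SAMPLE = """\
21A9 ;  text ;  L1 ;    none ;  j   # V1.1 (↩) LEFTWARDS ARROW WITH HOOK
21AA ;  text ;  L1 ;    none ;  j   # V1.1 (↪) RIGHTWARDS ARROW WITH HOOK
231A ;  emoji ; L1 ;    none ;  j   # V1.1 (⌚) WATCH
231B ;  emoji ; L1 ;    none ;  j   # V1.1 (⌛) HOURGLASS
"""


def parse_emoji_style(text, style="emoji"):
    """Collect single code points whose second field equals `style`."""
    code_points = set()
    for line in text.splitlines():
        data = line.split("#", 1)[0].strip()  # drop the trailing comment
        if not data:
            continue
        fields = [field.strip() for field in data.split(";")]
        if len(fields) < 2 or fields[1] != style:
            continue
        parts = fields[0].split()
        if len(parts) != 1:
            continue  # assumption: multi-code-point entries are ignored here
        try:
            code_points.add(int(parts[0], 16))
        except ValueError:
            continue  # not a hex code point (e.g. an elided "..." line)
    return code_points


def strip_emoji(s, code_points):
    """Remove every character whose code point is in `code_points`."""
    return "".join(ch for ch in s if ord(ch) not in code_points)


def utf16_units(cp):
    """Return the UTF-16 code unit(s) for one code point."""
    if cp < 0x10000:
        return [cp]
    cp -= 0x10000
    # Supplementary code points become a high/low surrogate pair.
    return [0xD800 + (cp >> 10), 0xDC00 + (cp & 0x3FF)]


emoji = parse_emoji_style(SAMPLE)
cleaned = strip_emoji("wait \u231A here \u21A9", emoji)  # WATCH is removed, the arrow stays
```

Working with code points and converting to UTF-16 only at the end avoids having to treat surrogate pairs specially while filtering. In a language whose strings are sequences of UTF-16 code units, the same list would have to be matched against surrogate pairs for any entry above U+FFFF.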