
What is the difference between \p{Emoji_Presentation} and \p{Extended_Pictographic}?


The top answer to the question "How to detect emoji using javascript" uses Extended_Pictographic.

Emoji_Presentation is mentioned in this blog post by David Walsh.


Solution

  • One key difference is that Extended_Pictographic will match a bunch of "pictographic" characters that aren't technically emojis, or aren't shown as colored emojis by default:

    "1😂💯♡⌨︎".match(/\p{Emoji_Presentation}/gu)    // ['😂', '💯']
    "1😂💯♡⌨︎".match(/\p{Extended_Pictographic}/gu) // ['😂', '💯', '♡', '⌨︎']
    

    \p{Emoji_Presentation} only matches emojis that are, by default, shown in their colored emoji form.

    There is also \p{Emoji}, but it's likely best avoided in most real-world code, because it also matches plain characters such as digits:

    "1😂💯♡⌨︎".match(/\p{Emoji}/gu) // ['1', '😂', '💯', '⌨︎']
    

    IIUC, 1 has an emoji/colored representation (1️⃣), and so is matched by \p{Emoji}, but 1 isn't shown in its emoji/colored form by default, so it isn't matched by \p{Emoji_Presentation}. The same goes for ⌨︎. And I guess ♡ isn't classed as an emoji at all, but does fall within the "pictographic" class.

    Note that the above explanation implies that an emoji can be colored and yet not be matched by \p{Emoji_Presentation}. For example, ❄️ is an "old" emoji that's displayed in black and white by default. You see it colored because it is followed by the special "variation selector 16" (\uFE0F), which makes it render in colored form. If you'd like to match all colored emojis in a string, regardless of their "default" presentation, then I think this should work:

    "1😂💯♡⌨︎❄️".match(/(\p{Emoji}\uFE0F|\p{Emoji_Presentation})/gu) // ['😂', '💯', '❄️']
    

    The \p{Emoji}\uFE0F part is what causes the above regex to match the snowflake. Note that in the above code block the snowflake is rendered in black and white, but that's just a CSS-related thing due to it being in a code block.
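
To make the distinctions above easier to check, here is a minimal sketch that reports which of the three properties a single code point has, and wraps the combined regex in a function. The names PROPERTIES, classify, COLORED_EMOJI and matchColoredEmoji are made up for illustration; only the \p{...} property escapes and the regex itself come from the answer. Characters are written as \u escapes so that the variation selectors are visible instead of depending on how the page renders them.

```javascript
// Hypothetical helper names; the property escapes are the ones discussed above.
const PROPERTIES = {
  Emoji: /^\p{Emoji}$/u,
  Emoji_Presentation: /^\p{Emoji_Presentation}$/u,
  Extended_Pictographic: /^\p{Extended_Pictographic}$/u,
};

// Report which of the three properties a single code point has.
function classify(char) {
  return Object.fromEntries(
    Object.entries(PROPERTIES).map(([name, re]) => [name, re.test(char)])
  );
}

// Same pattern as in the answer: an Emoji code point followed by U+FE0F
// (variation selector 16), or any code point with default emoji presentation.
const COLORED_EMOJI = /\p{Emoji}\uFE0F|\p{Emoji_Presentation}/gu;

// Return every match as an array (empty when nothing matches).
function matchColoredEmoji(text) {
  return text.match(COLORED_EMOJI) ?? [];
}

console.log(classify("1"));          // digit one
console.log(classify("\u2328"));     // keyboard
console.log(classify("\u2661"));     // white heart suit
console.log(classify("\u{1F602}"));  // face with tears of joy

// Keyboard with the text selector (U+FE0E) vs. snowflake with the emoji selector (U+FE0F).
console.log(matchColoredEmoji("1\u{1F602}\u2661\u2328\uFE0E\u2744\uFE0F"));
```

One caveat with this pattern: because digits have the Emoji property, a digit followed by \uFE0F (as in the keycap sequence 1️⃣) is also picked up by the \p{Emoji}\uFE0F branch, so it may need extra handling if keycaps matter in your input.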