I found this awesome way to detect emojis using a regex that doesn't use "huge magic ranges" by using a Unicode property escape:
console.log(/\p{Emoji}/u.test('flowers 🌼🌺🌸')) // true
console.log(/\p{Emoji}/u.test('flowers')) // false
But when I shared this in this answer, @Bronzdragon noticed that \p{Emoji} also matches numbers! Why is that? Numbers are not emojis, are they?
console.log(/\p{Emoji}/u.test('flowers 123')) // unexpectedly true
// regex-only workaround by @Bronzdragon
const regex = /(?=\p{Emoji})(?!\p{Number})/u;
console.log(
  regex.test('flowers'), // false, as expected
  regex.test('flowers 123'), // false, as expected
  regex.test('flowers 123 🌼🌺🌸'), // true, as expected
  regex.test('flowers 🌼🌺🌸'), // true, as expected
)
// more readable workaround
const hasEmoji = str => {
  const nbEmojiOrNumber = (str.match(/\p{Emoji}/gu) || []).length;
  const nbNumber = (str.match(/\p{Number}/gu) || []).length;
  return nbEmojiOrNumber > nbNumber;
}
console.log(
  hasEmoji('flowers'), // false, as expected
  hasEmoji('flowers 123'), // false, as expected
  hasEmoji('flowers 123 🌼🌺🌸'), // true, as expected
  hasEmoji('flowers 🌼🌺🌸'), // true, as expected
)
NOTE: To match any emoji in contemporary JavaScript (engines that support the ES2024 v flag), you may use the \p{RGI_Emoji} property of strings:
// EXTRACT:
console.log( 'flowers 🌼🌺🌸'.match(/\p{RGI_Emoji}/vg) ); // => ['🌼', '🌺', '🌸']
// TEST IF PRESENT:
console.log( /\p{RGI_Emoji}/v.test('flowers 🌼🌺🌸') ); // => true
// COUNT:
console.log( 'flowers 🌼🌺🌸'.match(/\p{RGI_Emoji}/vg).length ); // => 3
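Since the v flag is only available in newer engines, detection can fall back to the lookahead workaround from the question where the flag is missing. This is a minimal sketch, assuming that fallback is acceptable for your use case; buildEmojiRegex and containsEmoji are hypothetical helper names, not part of any library:

```javascript
// Hypothetical helper: prefer \p{RGI_Emoji} (needs the v flag) and fall back
// to the digit-excluding lookahead regex when the engine rejects the flag.
function buildEmojiRegex() {
  try {
    // The RegExp constructor throws a SyntaxError on engines without the v flag.
    return new RegExp('\\p{RGI_Emoji}', 'v');
  } catch (err) {
    return /(?=\p{Emoji})(?!\p{Number})/u;
  }
}

const emojiRegex = buildEmojiRegex();

// Hypothetical helper: true if the string contains at least one emoji.
const containsEmoji = (str) => emojiRegex.test(str);

console.log(containsEmoji('flowers'));        // false
console.log(containsEmoji('flowers 123'));    // false
console.log(containsEmoji('flowers 🌼🌺🌸')); // true
```

Note that the fallback branch still reports a bare # or *, since it only excludes \p{Number}.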
The answer to the current question
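According to this post, digits, #, * and some other characters have the Emoji property set to Yes, which means they are considered valid emoji characters. They are also listed as Emoji_Component, together with ZWJ and other characters used as building blocks of emoji sequences: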
0023 ; Emoji_Component # 1.1 [1] (#️) number sign
002A ; Emoji_Component # 1.1 [1] (*️) asterisk
0030..0039 ; Emoji_Component # 1.1 [10] (0️..9️) digit zero..digit nine
200D ; Emoji_Component # 1.1 [1] () zero width joiner
20E3 ; Emoji_Component # 3.0 [1] (⃣) combining enclosing keycap
FE0F ; Emoji_Component # 3.2 [1] () VARIATION SELECTOR-16
1F1E6..1F1FF ; Emoji_Component # 6.0 [26] (🇦..🇿) regional indicator symbol letter a..regional indicator symbol letter z
1F3FB..1F3FF ; Emoji_Component # 8.0 [5] (🏻..🏿) light skin tone..dark skin tone
1F9B0..1F9B3 ; Emoji_Component # 11.0 [4] (🦰..🦳) red-haired..white-haired
E0020..E007F ; Emoji_Component # 3.1 [96] (..) tag space..cancel tag
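The ASCII entries of this listing can be checked directly with property escapes. The sketch below tests #, * and the digits against \p{Emoji}, \p{Emoji_Component} and \p{Number}; emojiIgnoringKeycapBases is a hypothetical name for a stricter pattern that skips those bare characters, and it is only one possible refinement of the workaround, not an established recipe:

```javascript
// The ASCII characters from the listing: '#', '*' and the digits 0-9.
const keycapBases = ['#', '*', ...'0123456789'];

for (const ch of keycapBases) {
  console.log(ch, {
    emoji: /\p{Emoji}/u.test(ch),
    component: /\p{Emoji_Component}/u.test(ch),
    number: /\p{Number}/u.test(ch),
  });
}

// '#' and '*' are Emoji but not Number, so excluding \p{Number} alone
// still reports them:
const numberExcluding = /(?=\p{Emoji})(?!\p{Number})/u;
console.log(numberExcluding.test('#hashtag')); // true

// Hypothetical stricter pattern: an Emoji character that is not a bare
// '#', '*' or ASCII digit.
const emojiIgnoringKeycapBases = /(?![#*0-9])\p{Emoji}/u;
console.log(emojiIgnoringKeycapBases.test('#hashtag 123')); // false
console.log(emojiIgnoringKeycapBases.test('#hashtag 🌼'));  // true
```

Because it skips the base character, this pattern is not meant for detecting keycap sequences such as 1️⃣; \p{RGI_Emoji} with the v flag is the better fit there.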
For example, 1 is a digit, but it becomes the keycap emoji 1️⃣ when followed by the U+FE0F and U+20E3 chars:
console.log("1\uFE0F\u20E3 2\uFE0F\u20E3 3\uFE0F\u20E3 4\uFE0F\u20E3 5\uFE0F\u20E3 6\uFE0F\u20E3 7\uFE0F\u20E3 8\uFE0F\u20E3 9\uFE0F\u20E3 0\uFE0F\u20E3")
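To see what such a keycap sequence is made of, the string can be split into code points. A small sketch (toCodePoints is a hypothetical helper name) showing that the sequence is the plain digit followed by U+FE0F and U+20E3, so the digit itself is still present in the string:

```javascript
// Hypothetical helper: list the code points of a string in U+XXXX notation.
const toCodePoints = (str) =>
  [...str].map((ch) =>
    'U+' + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, '0'));

const keycapOne = '1\uFE0F\u20E3';

console.log(toCodePoints(keycapOne));       // ['U+0031', 'U+FE0F', 'U+20E3']
console.log(keycapOne.length);              // 3
console.log(keycapOne.startsWith('1'));     // true
console.log(/\p{Number}/u.test(keycapOne)); // true, the base digit is still a Number
```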