java regex emoji

Check if letter is emoji


I want to check whether a letter is an emoji. I found some similar questions on SO, along with this regex:

private final String emo_regex = "([\\u20a0-\\u32ff\\ud83c\\udc00-\\ud83d\\udeff\\udbb9\\udce5-\\udbb9\\udcee])";

However, when I apply it to the letters of a sentence like this:

for (int k = 0; k < letters.length; k++) {
    if (letters[k].matches(emo_regex)) {
        emoticon.add(letters[k]);
    }
}

It never adds any emoji to the list. I've also tried with a Matcher and a Pattern, but that didn't work either. Is there something wrong with the regex, or am I missing something obvious in my code?

This is how I get the letter:

sentence = "Jij staat op 10 😂";
String[] letters = sentence.split("");

The final 😂 should be recognized and added to emoticon.


Solution

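  • Emojis such as 😂 (U+1F602) lie outside the Basic Multilingual Plane, so a Java String stores each of them as a surrogate pair, i.e. two char values. With split("") you are splitting between every single char, which tears the pair apart, so none of those letters can be the emoji you are looking for.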

    Instead, you could try splitting between words:

    for (String word : sentence.split(" ")) {
        if (word.matches(emo_regex)) {
            System.out.println(word);
        }
    }
    

    But of course this will miss emojis that are attached directly to a word or to punctuation.

    Alternatively, you could just use a Matcher to find any group in the sentence that matches the regex.

    Matcher matcher = Pattern.compile(emo_regex).matcher(sentence);
    while (matcher.find()) {
        System.out.println(matcher.group());
    }
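
As a minimal, self-contained sketch of the two ideas above, the class below collects the emojis once with a Matcher and once by walking the sentence code point by code point, which keeps each surrogate pair together. The class and method names (EmojiFinder, findWithMatcher, findByCodePoint) are made up for illustration, the regex is the one from the question, and String.codePoints() assumes Java 8 or newer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper class; the names are illustrative and not from the question.
final class EmojiFinder {
    // Same regex as in the question, compiled once and reused.
    private static final Pattern EMOJI = Pattern.compile(
        "([\\u20a0-\\u32ff\\ud83c\\udc00-\\ud83d\\udeff\\udbb9\\udce5-\\udbb9\\udcee])");

    // Scans the whole sentence and collects every match of the regex.
    static List<String> findWithMatcher(String sentence) {
        List<String> found = new ArrayList<>();
        Matcher matcher = EMOJI.matcher(sentence);
        while (matcher.find()) {
            found.add(matcher.group());
        }
        return found;
    }

    // Walks the sentence one code point at a time, so a surrogate pair
    // is never split into two separate "letters".
    static List<String> findByCodePoint(String sentence) {
        return sentence.codePoints()
            .mapToObj(cp -> new String(Character.toChars(cp)))
            .filter(letter -> EMOJI.matcher(letter).matches())
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // The last two escapes are the surrogate pair for the emoji U+1F602.
        String sentence = "Jij staat op 10 \uD83D\uDE02";
        System.out.println(findWithMatcher(sentence));
        System.out.println(findByCodePoint(sentence));
    }
}
```

For the sentence from the question, both methods should return a list containing only the 😂. The code point version is the closest replacement for the original letter-by-letter loop, while the Matcher version is shorter when you only need the matches.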