I have not been able to find a clear answer to this: does the Eclipse IDE support emojis? I have read a lot about surrogate pairs here on Stack Overflow, but I still don't have a definite answer.
I am having to read in a text file character by character and I am using FileInputStream.
Would it be possible to process the emojis using surrogate pairs? I am wanting to use a select few apple emojis. These specifically: ๐ซ ๐ ๐๐ By process them, I mean I would like to identify them as that particular emoji when reading in the file.
If so, could someone show me an example?
InputStreams are for reading bytes; Readers are for reading characters. So you should use a Reader obtained from Files.newBufferedReader, or use a FileReader or InputStreamReader.
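To make the byte/character distinction concrete, here is a minimal sketch. It assumes the data is UTF-8 and uses an in-memory array rather than a real file; the class and method names (`BytesVersusChars`, `countBytes`, `countChars`) are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

class BytesVersusChars {
    // Counts how many values InputStream.read() returns before end of stream.
    static int countBytes(byte[] data) throws IOException {
        try (InputStream in = new ByteArrayInputStream(data)) {
            int n = 0;
            while (in.read() != -1) {
                n++;
            }
            return n;
        }
    }

    // Counts how many char values a UTF-8 Reader returns for the same data.
    static int countChars(byte[] data) throws IOException {
        try (Reader reader = new InputStreamReader(
                new ByteArrayInputStream(data), StandardCharsets.UTF_8)) {
            int n = 0;
            while (reader.read() != -1) {
                n++;
            }
            return n;
        }
    }

    public static void main(String[] args) throws IOException {
        // U+1F600 written as its surrogate pair so the source file stays ASCII.
        byte[] data = "\uD83D\uDE00".getBytes(StandardCharsets.UTF_8);
        System.out.println(countBytes(data)); // prints 4 (UTF-8 bytes)
        System.out.println(countChars(data)); // prints 2 (one surrogate pair)
    }
}
```

The same single emoji shows up as four raw bytes through an InputStream, but as two chars through a Reader that knows the encoding.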
Although Java uses surrogate pairs inside a String to represent emojis and many other types of Unicode characters, you don't need to deal with surrogate pairs directly. Surrogate values only exist because many character values are too large for a Java char type. If you read individual characters as int values (for example, with the CharSequence.codePoints method), you will get whole character values every time, and you will never see or have to deal with a surrogate value.
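As a small illustration of that point, the sketch below compares the char count of a string with its code point count and looks for one particular emoji by its code point. U+1F600 is only an example value, and `CodePointDemo`, `codePointsOf` and `countOccurrences` are hypothetical names.

```java
class CodePointDemo {
    // Returns the whole code points of a string, with surrogate pairs already combined.
    static int[] codePointsOf(String s) {
        return s.codePoints().toArray();
    }

    // Counts how often one specific code point (for example, one emoji) occurs.
    static int countOccurrences(String text, int targetCodePoint) {
        return (int) text.codePoints()
                .filter(cp -> cp == targetCodePoint)
                .count();
    }

    public static void main(String[] args) {
        // 'a' followed by U+1F600, written as a surrogate pair.
        String s = "a\uD83D\uDE00";
        int[] cps = codePointsOf(s);
        System.out.println(s.length());                  // prints 3 (char values)
        System.out.println(cps.length);                  // prints 2 (code points)
        System.out.println(Integer.toHexString(cps[1])); // prints 1f600
        System.out.println(countOccurrences(s, 0x1F600)); // prints 1
    }
}
```

Comparing against a known code point like this is one way to recognize a specific emoji without ever touching the surrogate values.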
As of this writing, emojis are defined by Unicode to be in the Emoticons block (U+1F600 to U+1F64F), part of the Supplemental Symbols and Pictographs block (U+1F900 to U+1F9FF), and three legacy characters in the Miscellaneous Symbols block (U+2639 to U+263B).
Thus, using a BufferedReader and traversing the character data with ints might look like this:
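import java.io.BufferedReader;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.stream.IntStream;

// The default charset is platform-dependent; pass StandardCharsets.UTF_8 if the file is UTF-8.
try (BufferedReader reader =
        Files.newBufferedReader(Paths.get(filename), Charset.defaultCharset())) {
    // codePoints() yields whole code points, so no surrogate values appear here.
    IntStream chars = reader.lines().flatMapToInt(String::codePoints);
    chars.forEachOrdered(c -> {
        if ((c >= 0x2639 && c <= 0x263b) ||     // legacy faces in Miscellaneous Symbols
            (c >= 0x1f600 && c < 0x1f650) ||    // Emoticons block
            (c >= 0x1f910 && c < 0x1f930)) {    // part of Supplemental Symbols and Pictographs
            processEmoji(c);
        }
    });
}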
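As an alternative to hard-coded ranges, the block membership could be checked with Character.UnicodeBlock. This is only a sketch: it matches whole blocks, so it is broader than the ranges above, `isInEmojiBlock` is a hypothetical helper name, and the SUPPLEMENTAL_SYMBOLS_AND_PICTOGRAPHS constant assumes Java 9 or later.

```java
class EmojiBlocks {
    // Returns true if the code point lies in one of the three blocks named above.
    // Note: this accepts every character in those blocks, not only emoji faces.
    static boolean isInEmojiBlock(int codePoint) {
        // UnicodeBlock.of returns null for code points not assigned to any block.
        Character.UnicodeBlock block = Character.UnicodeBlock.of(codePoint);
        return block == Character.UnicodeBlock.EMOTICONS
                || block == Character.UnicodeBlock.SUPPLEMENTAL_SYMBOLS_AND_PICTOGRAPHS
                || block == Character.UnicodeBlock.MISCELLANEOUS_SYMBOLS;
    }

    public static void main(String[] args) {
        System.out.println(isInEmojiBlock(0x1F600)); // Emoticons block
        System.out.println(isInEmojiBlock(0x263A));  // Miscellaneous Symbols block
        System.out.println(isInEmojiBlock('a'));     // Basic Latin, not matched
    }
}
```

If you only care about a handful of specific emojis, comparing the code point against those exact values is simpler and stricter than either the ranges or the block check.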