I have a big .xlsx file containing tweets with emojis. I am working on a personal project where I want to make a network graph from the extracted emojis. For example, if I have this in one of the columns:
Christian✝️, Husband👫, Father👨👩👦👦, Former TV 📺Meteorologist🌪, GOP🐘, LTC 🔫, Dolfan🐬, since ‘75, Yanks Fan⚾️ & UCONN Alum🏀 Go Whalers🐋!
How would I get only this as output?
✝️👫👨👩👦👦📺🌪🐘🔫🐬⚾️🏀🐋
I have searched Stack Overflow and the rest of the internet thoroughly, but I couldn't find anything. I am a beginner in R.
When I read the file normally, I get the Unicode code points (in UTF-8 format), but I don't know how to turn those code points into the emojis. There are dictionaries online, but they only give me the names of some of these emojis, and they are very outdated.
There is a solution that works on Linux, but I am looking for a solution or hint to get this working on Windows.
This works for me, with the caveat that only the cross prints as an emoji in the console; the rest are shown as their Unicode escape sequences.
# install.packages("remotes")
# remotes::install_github("hadley/emo")
emojis <- "Christian✝️, Husband👫, Father👨👩👦👦, Former TV 📺Meteorologist🌪, GOP🐘, LTC 🔫, Dolfan🐬, since ‘75, Yanks Fan⚾️ & UCONN Alum🏀 Go Whalers🐋!"
emojis
only_emojis <- emo::ji_extract_all(emojis)
only_emojis
# emo::ji_extract_all(emojis)
# [[1]]
# [1] "✝️" "\U0001f46b" "\U0001f468" "\U0001f469" "\U0001f466" "\U0001f466" "\U0001f4fa" "\U0001f418" "\U0001f52b" "\U0001f42c" "\u26be" "\U0001f3c0" "\U0001f40b"
# install.packages("utf8")
# utf8_print() renders the characters themselves rather than their escape codes
utf8::utf8_print(only_emojis[[1]])
# [1] "✝️" "👫" "👨" "👩" "👦" "👦" "📺" "🐘" "🔫" "🐬" "⚾" "🏀" "🐋"
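To get the single string the question asks for, the extracted pieces can be pasted back together. This is only a minimal sketch: it assumes the emo package from above is installed, and `tweets` is a hypothetical character vector standing in for the tweet column of the spreadsheet. Emojis that emo does not recognise will be missing from the result, as with the 🌪 in the output above.

```r
# Assumes the emo package used above is installed.
# `tweets` is a hypothetical stand-in for the tweet column read from the .xlsx file.
tweets <- c(
  "Dolfan🐬, Yanks Fan⚾️ & UCONN Alum🏀",
  "no emojis here"
)

# ji_extract_all() returns a list with one character vector per input string.
extracted <- emo::ji_extract_all(tweets)

# Collapse each vector into one string; tweets without emojis become "".
emoji_strings <- vapply(extracted, paste, character(1), collapse = "")
emoji_strings
```

As in the example above, the console may still show escape sequences on Windows; passing the result to utf8::utf8_print() should display the characters.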