!pip install emot
import re
from emot.emo_unicode import EMOTICONS_EMO
def convert_emoticons(text):
    for emot in EMOTICONS_EMO:
        text = re.sub(u'\('+emot+'\)', "_".join(EMOTICONS_EMO[emot].replace(",","").split()), text)
    return text
text = "Hello :-) :-)"
convert_emoticons(text)
I'm trying to run the above code in Google Colab, but it gives the following error: unbalanced parenthesis at position 4
My understanding from the re module documentation is that '\(any_expression\)' is the correct way to write this, but I still get the error. So I tried replacing '\(' + emot + '\)' with:
'(' + emot + ')': it gives the same error
'[' + emot + ']': it gives the following output: Hello Happy_face_or_smiley-Happy_face_or_smiley Happy_face_or_smiley-Happy_face_or_smiley
The correct output should be Hello Happy_face_smiley Happy_face_smiley for text = "Hello :-) :-)"
Can someone help me fix the problem?
This is pretty tricky using regex, as you'd first need to escape the regex metacharacters contained in the emoticons, such as :) and :(. An unescaped ) inside the emoticon closes a group that was never opened, which is why you get the unbalanced parenthesis error. So you'd need to do something like this first:
>>> print(re.sub(r'([()...])', r'%s\1' % '\\\\', ':)'))
:\)
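If you do want to stay with regex, the standard library's re.escape handles this escaping for you. Here is a minimal sketch, assuming a small hand-written dictionary (SAMPLE_EMOTICONS, a hypothetical stand-in for a couple of EMOTICONS_EMO entries) so that it runs without emot installed. The helper names are made up for illustration.

```python
import re

# Hypothetical stand-in for a couple of EMOTICONS_EMO entries (not the real table).
SAMPLE_EMOTICONS = {
    ":-)": "Happy face smiley",
    ":-(": "Frown, sad, angry or pouting",
}


def unescaped_pattern_fails(emot):
    # Mirror the pattern from the question: literal parens around the raw emoticon.
    # The emoticon's own parenthesis is left unescaped, so compilation fails.
    try:
        re.compile(r"\(" + emot + r"\)")
    except re.error:
        return True
    return False


def convert_emoticons_escaped(text, mapping=SAMPLE_EMOTICONS):
    # re.escape makes every character of the emoticon match literally.
    for emot, meaning in mapping.items():
        label = "_".join(meaning.replace(",", "").split())
        text = re.sub(re.escape(emot), label, text)
    return text


print(unescaped_pattern_fails(":-)"))
print(convert_emoticons_escaped("Hello :-) :-)"))
```

Note that this drops the extra \( and \) wrapper from the question, since the goal is to match the emoticon itself rather than an emoticon surrounded by parentheses.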
But I'd suggest just doing a straight replacement, since you already have a mapping that you're iterating through. So we'd have:
from emot.emo_unicode import EMOTICONS_EMO
def convert_emoticons(text):
    for emot in EMOTICONS_EMO:
        text = text.replace(emot, EMOTICONS_EMO[emot].replace(" ","_"))
    return text
text = "Hello :-) :-)"
convert_emoticons(text)
# 'Hello Happy_face_smiley Happy_face_smiley'
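One caveat with replacing the emoticons one at a time: if one emoticon is a prefix of another (say :-) and :-))), the result can depend on the order in which the mapping is iterated. A possible way around that, sketched here under the assumption of a small hypothetical mapping (SAMPLE_EMOTICONS, not the real EMOTICONS_EMO table), is to build one alternation pattern with the longest emoticons first and do the replacement in a single pass:

```python
import re

# Hypothetical stand-in for a few EMOTICONS_EMO entries (not the real table).
SAMPLE_EMOTICONS = {
    ":-)": "Happy face smiley",
    ":-))": "Very happy",
    ":-(": "Frown, sad, angry or pouting",
}


def build_emoticon_pattern(mapping):
    # Longest keys first, so ":-))" is tried before its prefix ":-)".
    keys = sorted(mapping, key=len, reverse=True)
    return re.compile("|".join(re.escape(key) for key in keys))


def convert_emoticons_single_pass(text, mapping=SAMPLE_EMOTICONS):
    pattern = build_emoticon_pattern(mapping)

    def to_label(match):
        # Look up the matched emoticon and turn its description into a token.
        meaning = mapping[match.group(0)]
        return "_".join(meaning.replace(",", "").split())

    return pattern.sub(to_label, text)


print(convert_emoticons_single_pass("Hello :-) :-))"))
```

With the real table you would pass EMOTICONS_EMO as the mapping argument. Whether the extra machinery is worth it depends on whether your text actually contains overlapping emoticons.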