How to convert UTF-8 bytes with embedded emoji into a string

I am scraping data from a WhatsApp chat backup (chat.txt). It looks like this:

7/21/20, 1:31 PM - mark: Can we look google😂😂  
7/21/20, 1:31 PM - elon: No  
7/21/20, 1:31 PM - mark: Can we smile ?  
7/21/20, 1:31 PM - elon: Ya🤩

When I used line-by-line extraction:

with open ('chat.txt','rb') as file:
    for line in file:
        print(str(line.strip()))

I got this:

b'7/21/20, 7:37 AM - mark: Can we look google\xf0\x9f\xa4\xa9\xf0\x9f\x98\x82\xf0\x9f\x98\x82'
b'7/21/20, 7:37 AM - elon: No'
b'7/21/20, 1:31 PM - mark: Can we smile ?'
b'7/21/20, 7:37 AM - elon: Ya\xf0\x9f\x98\x82'

How can I get rid of the b''? (I tried .decode('utf-8'), but it didn't work.)

How can I convert

Can we look google\xf0\x9f\xa4\xa9\xf0\x9f\x98\x82\xf0\x9f\x98\x82

to

Can we look google😂😂?

Solution

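Open the file in text mode with the right encoding instead of binary mode. Each line is then a str rather than a bytes object, so there is no b'' prefix and the emoji are decoded: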

with open('chat.txt', encoding='utf-8') as file:
    for line in file:
        # Each line already ends with '\n', so suppress print's own newline.
        print(line, end='')
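
If the file has to stay open in binary mode for some reason, the likely cause of the b'' output in the question is that str() on a bytes object returns its repr rather than decoding it. A minimal sketch of the difference, using one sample line from the question (the variable names are only illustrative):

```python
# One line as it comes out of a file opened with mode 'rb' (sample from the question).
raw = b'7/21/20, 7:37 AM - elon: Ya\xf0\x9f\x98\x82'

# str() on bytes returns the repr, so the b'' wrapper and \x escapes remain.
as_repr = str(raw)

# decode() interprets the bytes as UTF-8 and returns real text.
as_text = raw.decode('utf-8')

print(as_repr)
# The four emoji bytes collapse into a single character after decoding.
print(len(raw), len(as_text))
```

Note that decode() returns a new str and does not change the bytes object, so its result has to be assigned or printed directly.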

How well the output displays depends on your execution environment. The terminal or IDE and its font must support the emoji code points for print to show them correctly, but that is not a Python issue.
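
One way to see what the environment supports is to look at the encoding Python uses for standard output. The sketch below is an illustration, not part of the original answer: to_printable is a hypothetical helper that falls back to escape sequences when the output encoding cannot represent a character, so print does not raise UnicodeEncodeError.

```python
import sys


def to_printable(text, encoding):
    """Replace characters that `encoding` cannot represent with backslash escapes."""
    return text.encode(encoding, errors="backslashreplace").decode(encoding)


line = "7/21/20, 1:31 PM - elon: Ya\U0001F929"

# sys.stdout.encoding can be None when stdout has been replaced; assume ASCII then.
encoding = getattr(sys.stdout, "encoding", None) or "ascii"

print("stdout encoding:", encoding)
print(to_printable(line, encoding))
```

With a UTF-8 terminal the line passes through unchanged. A missing glyph in the font is a separate matter that no Python code can fix.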