
Python Telegram Bot and Pandas - reply_text shows "\n" instead of a new line


I have a simple Telegram bot built with the python-telegram-bot package, and I use pandas to import a CSV file:

df = pd.read_csv('data.csv')

key text
00 text1 \n text2 \n text3
. some \n text
. other \n text
99 text4 \n text5 \n text6

Then I do a search inside the dataframe from user message:

answer = df[df['text'].str.contains(update.message.text, case=False)]

and the bot sends a reply message to the user:

await update.message.reply_text(answer)

but the output shows the literal "\n" instead of a line break:


text1 \n text2 \n text3

and I want to show the text:

text1
text2
text3

I'm struggling with this problem. Before switching to a DataFrame I used TinyDB and everything worked fine. How can I resolve this?

Thanks

I tried changing the dtype of the column to string, exporting the CSV to a list, and changing the encoding of the file.


Solution

  • I tried to reproduce your case with inline data, and it worked for me.


    import pandas as pd
    
    # For this test I use inline JSON-like records instead of a CSV file
    data = [
        {"key": "01", "text": "text1 \n text2 \n text3"},
        {"key": "01", "text": "some \n text"},
        {"key": "02", "text": "other \n text"},
        {"key": "99", "text": "text4 \n text5 \n text6"},
    ]
    
    df = pd.DataFrame(data)
    
    # ...
    
    answer = df[df["text"].str.contains("some", case=False)]
    # answer = df[df["text"].str.contains(update.message.text, case=False)]
    
    if not answer.empty:
        print(answer.values[0][1].encode())  # check raw text
        # out: b'some \n text'
        print(type(answer.values[0][1]))  # check type
        # out: <class 'str'>
        await update.message.reply_text(answer.values[0][1])
    

    With csv data:

    key,text
    01,text1 \n text2 \n text3
    01,some \n text
    02,other \n text
    99,text4 \n text5 \n text6
    101,this is emoji \n ✅ \n \U0001F600
    
    # ...
    df = pd.read_csv("data.csv")
    
    answer = df[df["text"].str.contains("some", case=False)]
    
    if not answer.empty:
        print(answer.values[0][1].encode())  # check raw text
        # out: b'some \\n text'
        print(type(answer.values[0][1]))  # check type
        # out: <class 'str'>
        print(answer.values[0][1])  # print text
        # out: some \n text
        print(answer.values[0][1].replace("\\n", "\n"))  # replace text
        # out: some 
        #       text
    
        # ...
    

    Output:

    b'some \\n text'
    <class 'str'>
    some \n text
    some 
     text
    

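    When the text is read from a CSV file, the \n in each cell is a literal backslash followed by the letter n (shown as \\n in the bytes output above), not a newline character. Replacing that two-character sequence with a real newline gives the result we need: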

    print(answer.values[0][1].replace("\\n", "\n"))  # replace text
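
If every row needs the same fix, the replacement can probably be done once for the whole column right after loading, instead of per reply. This is only a minimal sketch: the in-memory `csv_text` stands in for `data.csv`, and `dtype=str` is my assumption to keep keys such as `01` as text.

```python
import io

import pandas as pd

# Stand-in for data.csv: each cell holds a literal backslash followed by "n".
csv_text = "key,text\n01,text1 \\n text2 \\n text3\n02,some \\n text\n"

# dtype=str keeps keys like "01" from being parsed as integers.
df = pd.read_csv(io.StringIO(csv_text), dtype=str)

# Replace the two-character sequence backslash + n with a real newline.
# regex=False treats the pattern as a plain string, not a regular expression.
df["text"] = df["text"].str.replace("\\n", "\n", regex=False)

print(df.loc[1, "text"])
```

After this step the strings in `df["text"]` contain real newline characters, so they can be passed to `reply_text` as they are.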
    

    For Unicode escapes such as \U0001F600, the literal escape sequence has to be decoded into the actual character, which the unicode_escape codec can do.

    A simple way is to use a regular expression like this:

    import re
    
    def replace_unicode_escape(text):
        def replace(match):
            return match.group(0).encode().decode("unicode_escape")
    
        text = re.sub(r"\\n", "\n", text)
        return re.sub(r"\\U[0-9a-fA-F]{8}", replace, text)
    
    # ...
    
    if not answer.empty:
        # ...
    
        print(replace_unicode_escape(answer.values[0][1]))  # replace text
        # out: this is emoji
        #       ✅
        #       😀
    
        # ...
    
    

    Output:

    b'this is emoji \\n \xe2\x9c\x85 \\n \\U0001F600'
    <class 'str'>
    this is emoji \n ✅ \n \U0001F600
    this is emoji 
     ✅ 
     😀
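
The helper above can also be tried without Telegram or pandas. The variant below is a sketch, not part of the original answer: the name `unescape_text` is made up, and handling `\t` and `\uXXXX` as well is my own addition on the assumption that such escapes may also appear in the CSV.

```python
import re

# Matches \n, \t, \uXXXX and \UXXXXXXXX when they are written out literally.
_ESCAPE = re.compile(r"\\(?:[nt]|u[0-9a-fA-F]{4}|U[0-9a-fA-F]{8})")


def unescape_text(text: str) -> str:
    """Decode selected literal escape sequences and leave everything else alone."""

    def decode(match: re.Match) -> str:
        # Each match is pure ASCII, so this round trip cannot damage other text.
        return match.group(0).encode("ascii").decode("unicode_escape")

    return _ESCAPE.sub(decode, text)


# Same content as the emoji row in the CSV example.
raw = "this is emoji \\n ✅ \\n \\U0001F600"
print(unescape_text(raw))
```

Decoding only the matched escapes, rather than calling `.encode().decode("unicode_escape")` on the whole string, matters here: the codec reads bytes as Latin-1, so characters that are already non-ASCII, such as ✅, would come out garbled.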