
Python: Cannot find the right encoding to print Tableau result


I want to print the result of a crosstab from a Tableau worksheet. It contains some Traditional Chinese text.

import sys
sys.stdout.reconfigure(encoding='utf-8')
.
.
.
view_data_raw = querying.get_view_data_dataframe(
    conn, view_id=visual_c_id)
print(view_data_raw.to_string()) #A
print(view_data_raw.to_string().encode(encoding='utf-8')) #B
print(view_data_raw.to_string().encode(encoding='cp1252')) #C
print(view_data_raw.to_string().encode(encoding='gbk')) #D

#A

2023å¹´4æ1æ¥                     #it should be 2023年4月1日

#B

2023\xc3\xa5\xc2\xb9\xc2\xb44\xc3\xa6\xc2\x9c\xc2\x881\xc3\xa6\xc2\x97\xc2\xa5      #it should be 2023年4月1日

#C

UnicodeEncodeError: 'charmap' codec can't encode characters in position 8-9: character maps to <undefined>

#D

UnicodeEncodeError: 'gbk' codec can't encode character '\xe4' in position 4: illegal multibyte sequence

I tried decoding and encoding in several ways, but nothing works. Any suggestions?


Solution

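  • Just found the solution. The text appears to be UTF-8 that was decoded as Latin-1 somewhere upstream, so each byte shows up as its own character. Turning those characters back into bytes and decoding them as UTF-8 restores the original: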

    print(view_data_raw.to_string().encode('raw_unicode_escape').decode())
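Here is a minimal, self-contained sketch of why that line works. It assumes the mis-decoded text contains only code points below U+0100, which is what the #B output suggests. The sample string `garbled` is a hypothetical stand-in for the Tableau data, rebuilt from the bytes shown in #B; no Tableau connection is involved.

```python
# The text we expect to see.
expected = "2023年4月1日"

# Hypothetical stand-in for the Tableau output: the UTF-8 bytes of the
# expected text, wrongly decoded as Latin-1 (one character per byte).
garbled = expected.encode("utf-8").decode("latin-1")

# Same string written out code point by code point, matching output #B.
assert garbled == "2023\u00e5\u00b9\u00b44\u00e6\u009c\u00881\u00e6\u0097\u00a5"

# The fix from the answer: map each character back to its byte,
# then decode those bytes as UTF-8.
fixed = garbled.encode("raw_unicode_escape").decode("utf-8")

# For strings whose code points are all below U+0100, Latin-1 produces
# the same bytes as raw_unicode_escape.
fixed_latin1 = garbled.encode("latin-1").decode("utf-8")

print(fixed)
print(fixed == fixed_latin1)
```

The two codecs differ only for characters at or above U+0100: `latin-1` raises `UnicodeEncodeError`, while `raw_unicode_escape` writes them as literal `\uXXXX` escapes, so the result may still need checking if the data mixes correct and mis-decoded text.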