python-3.x configparser

How to fix the Unicode problem in configparser


I use Python 3.7 and configparser 3.7.4.

I have a ranks.ini:

[example]
placeholder : \U0001F882

And I have a main.py file:

import configparser
config = configparser.ConfigParser()
config.read('ranks.ini')

print('🢂')
test = '\U0001F882'
print(type(test))
print(test)
test2 = config.get('example', 'placeholder')
print(type(test2))
print(test2)

The result of the code is:

🢂
<class 'str'>
🢂
<class 'str'>
\U0001F882

Why is the variable test2 not "🢂", and how can I fix it?


Solution

  • It took me a while to figure this one out, since Python 3 treats every str as Unicode.

    If my understanding is correct, the literal '\U0001F882' in main.py is an escape sequence that Python resolves when it compiles the source, so test already holds the single character 🢂 by the time it is printed.

    However, configparser returns the text from the file exactly as written and does not interpret escape sequences. So test2 holds a literal backslash followed by U0001F882, which is '\\U0001F882' when written as a Python literal.

    You can see this difference if you print the repr of test and test2:

    print(repr(test))
    print(repr(test2))
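
To make the difference concrete, here is a minimal sketch that reproduces it without a file on disk. It assumes the ini content from the question and feeds it to configparser through read_string; INI_TEXT is a name chosen for illustration. ascii() is used in place of repr() only so that the output is safe to print on any console.

```python
import configparser

# Same content as ranks.ini in the question, inlined so the sketch is self-contained.
# The doubled backslash puts one literal backslash into the text, as in the file.
INI_TEXT = "[example]\nplaceholder : \\U0001F882\n"

config = configparser.ConfigParser()
config.read_string(INI_TEXT)

test = '\U0001F882'                           # escape resolved by Python when compiling
test2 = config.get('example', 'placeholder')  # raw text, escape left untouched

print(len(test), ascii(test))
print(len(test2), ascii(test2))
```

The first string has length 1 (the arrow itself); the second has length 10 (a backslash, the letter U and eight hex digits).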
    

    To get the output you want, you will have to decode the escape sequence in the string value yourself:

    print(test2.encode('utf8').decode('unicode-escape'))
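
One caveat worth knowing: encoding with 'utf8' before decoding with 'unicode-escape' garbles any real non-ASCII character already present in the value, because 'unicode-escape' reads the bytes as Latin-1. A possible workaround, sketched here, encodes with 'latin-1' and the 'backslashreplace' error handler instead. decode_escapes is a hypothetical helper name, not part of configparser.

```python
def decode_escapes(value: str) -> str:
    """Resolve literal escapes such as \\U0001F882 in a string read from a config file."""
    # Code points 0-255 become single Latin-1 bytes; anything higher is rewritten
    # as a backslash escape, which 'unicode-escape' then turns back into the character.
    return value.encode('latin-1', 'backslashreplace').decode('unicode-escape')


raw = '\\U0001F882'                # what configparser returns in the question
mixed = 'caf\u00e9 \\U0001F882'    # a value that already contains a real non-ASCII character

naive = mixed.encode('utf8').decode('unicode-escape')   # the utf8-based approach
robust = decode_escapes(mixed)

print(ascii(decode_escapes(raw)))
print(ascii(naive))
print(ascii(robust))
```

Either way, every backslash sequence in the value is interpreted (for example \n becomes a newline), so this only suits values where a backslash is always meant as an escape.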
    

    Hope this works for you.