I have a script I'm writing where I need to print the character sequence "Qä" to the terminal. My terminal is using UTF-8 encoding. My file has # -*- coding: utf-8 -*- at the top of it, which I think is not actually necessary for Python 3, but I put it there in case it made any difference. In the code, I have something like
print("...Qä...")
This does not produce Qä. Instead it produces Q▒.
I then tried
qa = "Qä".encode('utf-8')
print(f"...{qa}...")
This also does not produce Qä. It produces b'Q\xc3\xa4', the repr of the bytes object.
I also tried
qa = u"Qä"
print(f"...{qa}...")
This also produces Q▒.
However, I know that Python 3 can open files that contain UTF-8 and use the contents properly, so I created a file called qa.txt, pasted Qä into it, and then used
with open("qa.txt") as qa_file:
    qa = qa_file.read().strip()
print(f"...{qa}...")
This works. However, it's beyond dumb that I have to create this file in order to print this string. How can I put this text into my code as a string literal?
This question is NOT a duplicate of a question asking about Python 2.7; I am not using Python 2.7.
You're using Git Bash on Windows. On Windows, unless stdio is connected to a standard Windows console (which I don't think Git Bash counts as), Python defaults the standard streams to the locale encoding, which is 'cp1252' here. Your terminal is set to expect UTF-8, not CP1252. In CP1252, ä is the single byte 0xE4, which is not valid UTF-8 on its own, so the terminal shows a placeholder character instead. You can reconfigure the standard output stream to UTF-8 with
sys.stdout.reconfigure(encoding='utf-8')
and similarly for stdin and stderr, or you can set the PYTHONIOENCODING environment variable to utf-8 before running Python to change the default stdin/stdout/stderr encodings.
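To make the mismatch concrete, here is a minimal sketch of what is probably happening, assuming the stream really is defaulting to CP1252 as described above. The variable names (text, cp1252_bytes, utf8_bytes, as_seen_by_terminal) are only illustrative, and the hasattr check is a precaution for the case where stdout has been replaced by an object without reconfigure (which exists on text streams since Python 3.7).

```python
import sys

text = "Qä"

# The same string becomes different bytes under the two encodings.
cp1252_bytes = text.encode("cp1252")  # one byte for ä
utf8_bytes = text.encode("utf-8")     # two bytes for ä

# Roughly what a UTF-8 terminal does with the CP1252 bytes: the lone
# 0xE4 byte cannot be decoded, so a replacement character is shown.
as_seen_by_terminal = cp1252_bytes.decode("utf-8", errors="replace")

# Switch stdout to UTF-8 before printing, if the stream supports it.
if hasattr(sys.stdout, "reconfigure"):
    sys.stdout.reconfigure(encoding="utf-8")

print(f"...{text}...")
```

With the environment-variable approach, the script itself stays unchanged; in Git Bash that would look something like PYTHONIOENCODING=utf-8 python script.py, where script.py stands in for your own file name.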