Why does this program:
import sys
print("sys.getdefaultencoding()='%s'" % (sys.getdefaultencoding(), ))
with open("example.txt", "w", encoding="utf-8-sig", errors="replace") as f:
    f.write("test;Ilość sztuk\n")
with open("example.txt", "r", errors="strict") as rf:
    lr = rf.readline()
    print("lr=", lr)
run fine in some environments but fail in others?
Example where it works:
$ python3 ./example.py
sys.getdefaultencoding()='utf-8'
lr= test;Ilość sztuk
Note:
$ python3 --version
Python 3.6.8
Example where it fails:
sys.getdefaultencoding()='utf-8'
Traceback (most recent call last):
  File "./example.py", line 9, in <module>
    lr = rf.readline()
  File "/.../python/lib/python3.6/encodings/ascii.py", line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xef in position 0: ordinal not in range(128)
$
Note:
$ python3 --version
Python 3.6.8
The environments are Ubuntu 19.04, Ubuntu 18.04 and Debian 9, inside and outside a chroot; LANG is "en_US.UTF-8" or "fr_FR.UTF-8", with no impact on success or failure.
In all cases, Python was installed by hand with the same options.
If you need the value of any environment variable, I can provide it.
I am looking for exactly the same behaviour in all cases.
In Python 3, there are different encoding defaults.
The one you found, sys.getdefaultencoding(), tells you the default for the methods str.encode() and bytes.decode().
As far as I know, it's always UTF-8, no matter what build or implementation of Python you use.
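To make that first default concrete, here is a minimal sketch; the sample word is simply taken from the question, and the variable names are only for illustration. It checks that calling str.encode() and bytes.decode() without arguments behaves like passing "utf-8":

```python
import sys

text = "Ilość"  # sample word from the question

# No argument given: str.encode() falls back to UTF-8.
encoded = text.encode()
assert encoded == text.encode("utf-8")
assert encoded == b"Ilo\xc5\x9b\xc4\x87"

# Same for bytes.decode() without an argument.
assert encoded.decode() == text

# Printing bytes only emits ASCII, so this is safe on any terminal.
print(sys.getdefaultencoding(), encoded)
```

Note that none of this involves the locale, which is why it gives the same result in both of your environments.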
However, if you omit the encoding=... parameter in a call to open(), then locale.getpreferredencoding() is used; also for sys.stdin, sys.stdout (print()!), sys.stderr.
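A quick way to compare the two environments is to print these defaults side by side. This is only a diagnostic sketch; the helper name describe_defaults is made up for this example:

```python
import locale
import sys


def describe_defaults():
    """Collect the encoding defaults discussed above (hypothetical helper)."""
    return {
        # Default for str.encode() / bytes.decode().
        "str.encode/bytes.decode": sys.getdefaultencoding(),
        # What open() falls back to when encoding= is omitted.
        "open() without encoding=": locale.getpreferredencoding(False),
        # Encoding of the stream print() writes to (may be None if replaced).
        "sys.stdout": getattr(sys.stdout, "encoding", None),
    }


for name, value in describe_defaults().items():
    print("%s: %s" % (name, value))
```

If the second line differs between the working and the failing machine, for example a UTF-8 name on one and an ASCII name on the other, that would explain the traceback in the question.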
The value of this default depends on the environment in which the Python interpreter is started.
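The details of how this value is determined vary between platforms. For the standard streams you can often achieve the desired behaviour by setting the PYTHONIOENCODING env variable; it does not affect open(), whose default follows the locale (LC_ALL, then LC_CTYPE, then LANG).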
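As a rough illustration of what PYTHONIOENCODING does, the sketch below starts a child interpreter with the variable set and asks it for sys.stdout.encoding. The helper name child_stdout_encoding and the choice of "latin-1" are assumptions for the demo. Keep in mind that this variable only concerns sys.stdin, sys.stdout and sys.stderr; the default used by open() still comes from the locale.

```python
import os
import subprocess
import sys

# One-liner executed by the child interpreter.
CHILD_CODE = "import sys; print(sys.stdout.encoding)"


def child_stdout_encoding(io_encoding):
    """Run a child Python with PYTHONIOENCODING set and report its stdout encoding."""
    env = dict(os.environ, PYTHONIOENCODING=io_encoding)
    result = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        env=env,
        stdout=subprocess.PIPE,
        universal_newlines=True,  # decode the child's output as text
        check=True,
    )
    return result.stdout.strip()


# The exact spelling of the reported name may vary between Python versions.
print(child_stdout_encoding("latin-1"))
```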
As of Python 3.7, you can launch Python with -X utf8 to enable UTF-8 mode.
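Since the question uses Python 3.6, where UTF-8 mode is not available, the portable route is probably to pass encoding= explicitly on both open() calls. Below is a minimal sketch of the round trip from the question; the helper name round_trip and the temporary directory are my additions, so that the example does not leave example.txt behind:

```python
import os
import tempfile


def round_trip(text):
    """Write text with a BOM and read the first line back, independent of the locale."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "example.txt")
        with open(path, "w", encoding="utf-8-sig", errors="replace") as f:
            f.write(text)
        # Explicit encoding on the read side too; utf-8-sig also strips the BOM.
        with open(path, "r", encoding="utf-8-sig", errors="strict") as rf:
            return rf.readline()


lr = round_trip("test;Ilość sztuk\n")
# ascii() keeps the print itself independent of the stdout encoding.
print("lr=", ascii(lr))
```

With the encoding spelled out on the read side, the 0xef byte from the traceback (the first byte of the UTF-8 BOM) is no longer handed to the ASCII codec.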