How can I decode ISO-8859-1 characters when reading a file with the open
function?
filename = open(f'/opt/PATH/{shorter}', 'r', encoding='iso-8859-1')
file_content = filename.read()
filename.close()
This gave me ÿ
(I guess this was meant to be a comma):
[...]
11 Dir(s) 3ÿ016ÿ011ÿ776 bytes free
[...]
It's a case of mojibake: the file was written in one code page (852 in my example) and read back in another.
cmd
>NUL chcp 852
>dir_cp852.txt dir /C
type dir_cp852.txt | find /I "bytes free"
28 Dir(s) 832 467 206 144 bytes free
>NUL chcp 1252
type dir_cp852.txt | find /I "bytes free"
28 Dir(s) 832ÿ467ÿ206ÿ144 bytes free
with open('dir_cp852.txt', 'r', encoding='iso-8859-1') as filename:
    file_content = filename.read()
print(file_content[-52:])
28 Dir(s) 832ÿ467ÿ206ÿ144 bytes free
Solution: open the file with the code page it was actually written in:
with open('dir_cp852.txt', 'r', encoding='cp852') as filename:
    file_content = filename.read()
print(file_content[-52:])
28 Dir(s) 832 467 206 144 bytes free
Note that evaluating file_content[-52:] at the Python prompt gives
' 28 Dir(s) 832\xa0467\xa0206\xa0144 bytes free\n'
which shows the character behind the mojibake: \xa0 (U+00A0, No-Break Space), which has code 0xFF in code page 852 (and in several other MS-DOS code pages). The same byte 0xFF is ÿ in ISO-8859-1.
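The effect can be reproduced without any file. The following is a minimal sketch; the byte string raw is a made-up sample modelled on the dir output above, not data read from the real file. It also shows that text already decoded with the wrong codec can be repaired, because ISO-8859-1 maps every byte to exactly one character and back.

```python
# Hypothetical sample: the bytes that dir writes under code page 852,
# where the thousand separator (No-Break Space) is stored as 0xFF.
raw = b"832\xff467\xff206\xff144 bytes free"

wrong = raw.decode("iso-8859-1")  # 0xFF becomes U+00FF (ÿ)
right = raw.decode("cp852")       # 0xFF becomes U+00A0 (No-Break Space)

print(repr(wrong))
print(repr(right))

# Repair a string that was decoded with the wrong codec:
# go back to the original bytes, then decode them correctly.
repaired = wrong.encode("iso-8859-1").decode("cp852")
print(repaired == right)
```

The round trip works only because ISO-8859-1 is lossless for all 256 byte values; with a codec that rejects or replaces some bytes, the original bytes could not be recovered this way.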
Please note the /C switch in dir /C above (it displays the thousand separator in file sizes). /C is normally the default, but I have overridden that default with a globally defined set "DIRCMD=/-C".
The thousand separator in file sizes is defined in Control Panel\Clock and Region -> Region. You can query the current value with:
reg query "HKCU\Control Panel\International" /v sThousand
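Since the separator depends on the regional settings, code that needs the number itself may be more robust if it ignores the separator entirely. Here is a rough sketch under two assumptions: the helper name free_bytes is hypothetical, and the dir output is in English (the "Dir(s)" and "bytes free" wording is localized on other systems).

```python
import re


def free_bytes(line: str) -> int:
    """Extract the byte count from a 'Dir(s) ... bytes free' line.

    Hypothetical helper: assumes English dir output. Everything that is
    not a digit (comma, space, No-Break Space, or a mojibake character
    such as ÿ) is dropped before the number is converted.
    """
    match = re.search(r"Dir\(s\)\s+(.+?)\s+bytes free", line)
    if match is None:
        raise ValueError("not a 'bytes free' line")
    return int(re.sub(r"\D", "", match.group(1)))


print(free_bytes(" 28 Dir(s) 832\xa0467\xa0206\xa0144 bytes free\n"))
print(free_bytes("11 Dir(s) 3ÿ016ÿ011ÿ776 bytes free"))
```

This works even on the wrongly decoded text, but it only hides the symptom; opening the file with the right code page is still the proper fix for the rest of the content.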