I'm reading a config file in Python, getting its sections, and creating a new config file for each section.
However, I'm getting a decode error because one of the strings contains Español=spain:
self.output_file.write( what.replace( " = ", "=", 1 ) )
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 4: ordinal not in range(128)
How would I adjust my code to allow for encoded characters such as these? I'm very new to this, so please excuse me if this is something simple.
import ConfigParser  # Python 2 module name


class EqualsSpaceRemover:
    # Wraps a file object and rewrites "key = value" as "key=value" on write.
    output_file = None

    def __init__( self, new_output_file ):
        self.output_file = new_output_file

    def write( self, what ):
        self.output_file.write( what.replace( " = ", "=", 1 ) )


def get_sections():
    configFilePath = 'C:\\test.ini'
    config = ConfigParser.ConfigParser()
    config.optionxform = str  # keep option names case-sensitive
    config.read(configFilePath)
    for section in config.sections():
        configdata = {k:v for k,v in config.items(section)}
        # Write each section to its own .ini file.
        confignew = ConfigParser.ConfigParser()
        cfgfile = open("C:\\" + section + ".ini", 'w')
        confignew.add_section(section)
        for x in configdata.items():
            confignew.set(section,x[0],x[1])
        confignew.write( EqualsSpaceRemover( cfgfile ) )
        cfgfile.close()
If you use Python 2 with from __future__ import unicode_literals, then every string literal you write is a unicode literal, as if you had prefixed every literal with u"...", unless you explicitly write b"...".
This explains why you get a UnicodeDecodeError on this line:
    what.replace(" = ", "=", 1)
because what you are actually doing is
    what.replace(u" = ", u"=", 1)
ConfigParser uses plain old str for its items when it reads a file using the parser.read() method, which means what will be a str. If you pass unicode arguments to str.replace(), the string is first converted (decoded) to unicode, the replacement is applied, and the result is returned as unicode. But if what contains bytes that can't be decoded using the default encoding (ASCII), you get a UnicodeDecodeError where you wouldn't expect one.
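The implicit conversion can be reproduced in isolation. The sketch below is an illustration only, not part of the original code: it assumes the .ini file is saved as UTF-8 (which the byte 0xc3 in the traceback suggests), and implicit_decode is a hypothetical helper that does by hand what Python 2 does behind the scenes when a str meets a unicode argument.

```python
# Illustration only: `implicit_decode` is a hypothetical helper, and UTF-8 is
# an assumption about how the .ini file was saved.

def implicit_decode(raw_bytes):
    """Mimic Python 2's implicit str -> unicode coercion, which uses ASCII."""
    return raw_bytes.decode("ascii")


# The bytes a byte-oriented reader would hand over for "Español = spain".
raw = u"Espa\xf1ol = spain".encode("utf-8")

try:
    implicit_decode(raw)
    error = None
except UnicodeDecodeError as exc:
    error = exc

# "Espa" occupies bytes 0-3, so the first byte of the encoded "ñ" (0xc3)
# sits at position 4, the same position reported in the question's traceback.
if error is not None:
    print("decode failed at byte position %d" % error.start)
```

The failure has nothing to do with the replacement itself; it happens during the hidden decode step that precedes it.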
So to make this work you can either use explicit byte-string literals:
    what.replace(b" = ", b"=", 1)
or remove the unicode_literals future import.

Generally you shouldn't mix unicode and str (Python 3 fixes this by making it an error in almost every case). Be aware that from __future__ import unicode_literals changes every non-prefixed literal to unicode; it doesn't automatically make your code work with unicode in all cases. Quite the opposite in many cases.
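Keeping the types consistent can go either way: stay in byte strings throughout, or decode explicitly and work in unicode. A minimal sketch of both follows. The function names are hypothetical, and UTF-8 is again an assumption about the file's encoding, so treat this as an illustration rather than a drop-in fix.

```python
# Hypothetical helpers showing the two consistent ways to do the replacement.

def strip_equals_spaces_bytes(line):
    """Byte-string approach: the data and both literals are all bytes."""
    return line.replace(b" = ", b"=", 1)


def strip_equals_spaces_text(line, encoding="utf-8"):
    """Unicode approach: decode explicitly, replace, then encode again."""
    text = line.decode(encoding)
    return text.replace(u" = ", u"=", 1).encode(encoding)


# UTF-8 bytes for the line "Español = spain".
raw = u"Espa\xf1ol = spain\n".encode("utf-8")

print(repr(strip_equals_spaces_bytes(raw)))
print(repr(strip_equals_spaces_text(raw)))
```

In the question's EqualsSpaceRemover.write, the first variant corresponds to changing only the two literals to b"..." and leaving everything else as it is.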