I'm trying to read some data from an excel file using xlrd
in python. There are some cells containing special characters like sigma, Pi, etc.; but xlrd
gives me UnicodeEncodeError
.
This is my excel file:
This is the code I use:
import xlrd
if __name__ == '__main__':
wb = xlrd.open_workbook('test.xlsx')
s = wb.sheet_by_name('test')
for row in range(1, s.nrows):
values = {}
for column in range(s.ncols):
values.update({s.cell(0, column).value: str(s.cell(row, column).value)})
print values
And this is the output:
{u'formula': 'a + b * 15', u'name': 'test1'}
Traceback (most recent call last):
File ".\testXLRD.py", line 21, in <module>
values.update({s.cell(0, column).value: str(s.cell(row, column).value)})
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2211' in position 0: ordinal not in range(128)
What should I do??
As Mathias said, the str()
cast is failing because it's trying to decode in ascii a unicode string u'\u2211'
whereas it should use for instance 'utf-8'.
Since s.cell(row, column)
returns a float, an int or some unicode string, you'd better to either don't change the format or convert everything in unicode while you're manipulating it:
values.update({s.cell(0, column).value: unicode(s.cell(row, column).value)})
If you really want a string, and not unicode, this should work:
values.update({s.cell(0, column).value: unicode(s.cell(row, column).value).encode('utf-8')})