Search code examples
pythonasciixlrd

Python XLRD module gives ValueError for some ASCII characters


I'm trying to read some data from an excel file using xlrd in python. There are some cells containing special characters like sigma, Pi, etc.; but xlrd gives me UnicodeEncodeError.

This is my excel file:

enter image description here

This is the code I use:

import xlrd
if __name__ == '__main__':
    wb = xlrd.open_workbook('test.xlsx')
    s = wb.sheet_by_name('test')
    for row in range(1, s.nrows):
        values = {}
        for column in range(s.ncols):
            values.update({s.cell(0, column).value: str(s.cell(row, column).value)})    
            print values

And this is the output:

{u'formula': 'a + b * 15', u'name': 'test1'}
Traceback (most recent call last):
  File ".\testXLRD.py", line 21, in <module>
    values.update({s.cell(0, column).value: str(s.cell(row, column).value)})
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2211' in position 0: ordinal not in range(128)

What should I do??


Solution

  • As Mathias said, the str() cast is failing because it's trying to decode in ascii a unicode string u'\u2211' whereas it should use for instance 'utf-8'.

    Since s.cell(row, column) returns a float, an int or some unicode string, you'd better to either don't change the format or convert everything in unicode while you're manipulating it:

    values.update({s.cell(0, column).value: unicode(s.cell(row, column).value)})
    

    If you really want a string, and not unicode, this should work:

    values.update({s.cell(0, column).value: unicode(s.cell(row, column).value).encode('utf-8')})