python excel string unicode unicode-string

Unable to convert unicode to string in Python: getting an error


I'm reading a column from an excel file into a list as follows:

import xlrd
import openpyxl
book = xlrd.open_workbook("English corpus.xlsx")
sheet = book.sheet_by_index(0)


data=[]
for row_index in xrange(1, sheet.nrows): # skip heading row
    timestamp, text, header, transporter, device_type = sheet.row_values(row_index, end_colx=5)
    print (text)
    data.append(text)

But the elements in the data list are of type "unicode". I tried the following to convert them to strings:

[x.encode('UTF8') for x in data]

But then it gives me the following error:

AttributeError: 'int' object has no attribute 'encode'

Then I tried the following:

[str(x).encode('UTF8') for x in data]

That gives me the following error:

UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 176: ordinal not in range(128)

Alternatively, could you tell me how to read the Excel column into the list as normal strings rather than unicode elements? Thanks.


Solution

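  • The last error comes from str(x): in Python 2, calling str() on a unicode object implicitly encodes it with the ASCII codec, which fails on non-ASCII characters such as u'\xe9'. If you use [unicode(x).encode('UTF8') for x in data] instead, you will avoid that error. Non-text values, such as the int that caused the first error, are converted to unicode before encoding, so they no longer raise AttributeError.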
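
As a rough illustration of the conversion described above, here is a minimal sketch. The helper name to_utf8 and the sample list are made up for this example and are not from the question. The text_type alias is an assumption added so the same snippet runs on both Python 2 (where the text type is unicode) and Python 3 (where it is str).

```python
import sys

# Assumption for portability: pick the text type for the running interpreter.
# On Python 2 this is `unicode`; on Python 3 it is `str`.
text_type = unicode if sys.version_info[0] == 2 else str  # noqa: F821


def to_utf8(value):
    """Hypothetical helper: return a cell value as UTF-8 encoded bytes.

    Non-text values (ints, floats) are converted to text first, which
    avoids the AttributeError raised by calling .encode on an int.
    """
    if not isinstance(value, text_type):
        value = text_type(value)
    # Encoding explicitly with UTF-8 avoids the implicit ASCII codec.
    return value.encode("utf-8")


# Made-up sample standing in for the column read from the spreadsheet.
data = [u"caf\xe9", 42, 3.5]
encoded = [to_utf8(x) for x in data]
```

On Python 2 the resulting elements are plain str objects (byte strings). If the column may contain byte strings with non-ASCII content, those would need decoding with a known encoding first, which this sketch does not attempt.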