Tags: python, pandas, csv, anaconda, importerror

Proper Python Pandas read_csv encoding for '\u2116', the 'Numero Sign'


I'm working with a file that has a couple of Numero Signs (№) in it.

Here are the top 3 lines copied and pasted directly from the CSV file:

0   1   2   3   4   5   6   7   8   9   10  11  12  13  14  15
    â„– Summer  01 !    02 !    03 !    Total   â„– Winter  01 !    02 !    03 !    Total   â„– Games   01 !    02 !    03 !    Combined total
Afghanistan (AFG)  13  0   0   2   2   0   0   0   0   0   13  0   0   2   2

When I try to import the file in Anaconda (Python 3.5) with Pandas read_csv, I get the following error:

UnicodeEncodeError: 'charmap' codec can't encode character '\u2116' in position 104: character maps to <undefined>

This happens when I try:

import pandas as pd
df = pd.read_csv('myfile.csv', encoding='utf_8')

I also tried the standard English-language codecs listed at https://docs.python.org/3/library/codecs.html#standard-encodings, with essentially the same error.

What should I try differently?
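
A note on reading the traceback: it is a UnicodeEncodeError, not a UnicodeDecodeError. Encoding happens when text is written out (for instance, when something is printed to a console that uses a 'charmap' codec), whereas the encoding= argument of read_csv controls decoding. So it is possible, though the question does not confirm it, that the file is read correctly and the failure happens on output. The sketch below uses only the standard library and assumes cp1252 as the output codec; the codec actually in use on the asker's machine is not known. It also shows that the "â„–" in the pasted header is what the UTF-8 bytes of the Numero Sign look like when displayed as cp1252.

```python
numero = "\u2116"  # the Numero Sign

# The UTF-8 bytes of the character, and how they look if shown as cp1252.
utf8_bytes = numero.encode("utf-8")
as_cp1252 = utf8_bytes.decode("cp1252")


def describe_encode(text, codec):
    """Return 'ok' if text can be encoded with codec, else the error text."""
    try:
        text.encode(codec)
        return "ok"
    except UnicodeEncodeError as exc:
        return "{}: {}".format(type(exc).__name__, exc)


# cp1252 (an assumption here) has no slot for U+2116, so encoding fails.
cp1252_result = describe_encode(numero, "cp1252")
# UTF-8 can represent every Unicode character, so encoding succeeds.
utf8_result = describe_encode(numero, "utf-8")

# ascii() keeps the output printable on any console.
print(ascii(as_cp1252))
print(cp1252_result)
print(utf8_result)
```

If the error in the question comes from the output side, changing the encoding passed to read_csv would not be expected to make it go away, which would match what was observed.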


Solution

  • I went into the CSV file, deleted the 'Numero Sign' characters, and used the file that way. Hopefully this won't cause problems in future projects.
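
As an alternative to editing the file by hand, the character could be replaced or escaped in code. The following is a minimal sketch, not a tested fix for the original file: it uses only the standard library, the sample text is a hypothetical stand-in for the first rows, the tab delimiter is an assumption, and replace_numero is an illustrative helper name.

```python
import csv
import io

# Hypothetical stand-in for the top of the file (tab-separated is assumed).
sample = "\t\u2116 Summer\t01 !\nAfghanistan (AFG)\t13\t0\n"


def replace_numero(text, replacement="No."):
    """Swap the Numero Sign for an ASCII stand-in."""
    return text.replace("\u2116", replacement)


# Option 1: replace the character before parsing.
rows = list(csv.reader(io.StringIO(replace_numero(sample)), delimiter="\t"))
header = rows[0]

# Option 2: keep the data as-is and escape whatever the output codec
# (cp1252 is assumed here) cannot represent.
safe = "\u2116 Summer".encode("cp1252", errors="backslashreplace").decode("cp1252")

print(header)
print(safe)
```

With pandas, the same replacement could presumably be applied to the column labels after loading, for example with df.columns.str.replace('\u2116', 'No.'), though that has not been tried against the original file.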