
pandas read_excel converting special characters


I am reading data from Excel and writing it to a CSV file.

osht = pd.read_excel(ip_path,header=None,sheet_name=j,encoding='utf-8-sig')
osht.to_csv(file_name,sep=',',index=False,encoding='utf-8-sig')

The Excel file has some lines that contain special characters, like:

'SOCIÉTÉ' , 'HERMÈS'

Pandas changes such words to:

'SOCIéTé' , 'HERMÊS'

I tried changing the encoding to 'utf-8' and 'utf_16_le', but the issue persisted.

Please suggest what needs to be done in this case.


Solution

  • 'SOCIéTé' , 'HERMÊS'

    This suggests the resulting file is not UTF-8 encoded.
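
One way to test that claim is to look at the raw bytes of the written file rather than at how a viewer renders them. The following is a minimal sketch using only the standard library. The helper name `inspect_csv_bytes` and the sample byte strings are hypothetical, introduced here for illustration; they stand in for the contents of the CSV from the question.

```python
import codecs


def inspect_csv_bytes(raw: bytes) -> dict:
    """Report whether raw bytes start with a UTF-8 BOM and decode cleanly as UTF-8."""
    has_bom = raw.startswith(codecs.BOM_UTF8)
    try:
        raw.decode("utf-8")
        valid_utf8 = True
    except UnicodeDecodeError:
        valid_utf8 = False
    return {"has_utf8_bom": has_bom, "valid_utf8": valid_utf8}


# Illustrative byte strings standing in for the contents of the written CSV
# (assumed sample data, not taken from the asker's file).
utf8_sig_bytes = "SOCIÉTÉ,HERMÈS".encode("utf-8-sig")
cp1252_bytes = "SOCIÉTÉ,HERMÈS".encode("cp1252")

print(inspect_csv_bytes(utf8_sig_bytes))
print(inspect_csv_bytes(cp1252_bytes))
```

To check a real file, read it in binary mode and pass the bytes in, for example `inspect_csv_bytes(open(file_name, "rb").read())`. If the bytes turn out to be valid UTF-8, the garbled characters are more likely introduced by whatever program opens the file with a different encoding than by the write step.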