python pandas unidecoder

Pandas apply unidecode to several columns


I am trying to convert all the non-ASCII characters in two columns of a pandas DataFrame to ASCII. Simply applying the function to the relevant columns doesn't work: Python raises an AttributeError stating that the 'Series' object has no attribute 'encode'.

import pandas as pd
import numpy as np
from unidecode import unidecode

try_data = pd.DataFrame({
    'Units': np.array([3, 4, 5, 6, 10], dtype='int32'),
    'Description_PD': pd.Categorical(['VEIJA 5 TRIÂNGULOS 200', 'QUEIJO BOLA', 'QJ BOLA GRD', 'VEIJO A VACA TRIÂNGULOS 100', 'HEITE GORDO TERRA']),
    'Description_Externa': pd.Categorical(['SQP 4 porções', 'Bola', ' SIESTA BOLA', 'SQP 16 porções', 'TERRA NOSTRA']),
})

# This is the call that raises the AttributeError described above
try_data[['Description_PD', 'Description_Externa']].apply(unidecode)

Solution

  • Iterate over the list of columns and call apply on each one inside the loop. Your attempt fails because DataFrame.apply passes each whole column (a Series) to unidecode, which expects a single string:

    In[47]:
    for col in ['Description_PD','Description_Externa']:
        try_data[col] = try_data[col].apply(unidecode)
    try_data
    
    Out[47]: 
      Description_Externa               Description_PD  Units
    0       SQP 4 porcoes       VEIJA 5 TRIANGULOS 200      3
    1                Bola                  QUEIJO BOLA      4
    2         SIESTA BOLA                  QJ BOLA GRD      5
    3      SQP 16 porcoes  VEIJO A VACA TRIANGULOS 100      6
    4        TERRA NOSTRA            HEITE GORDO TERRA     10
    

    Calling apply on a single column works because Series.apply passes each string to unidecode one at a time:

    In[49]:
    try_data['Description_Externa'].apply(unidecode)
    
    Out[49]: 
    0     SQP 4 porcoes
    1              Bola
    2       SIESTA BOLA
    3    SQP 16 porcoes
    4      TERRA NOSTRA
    Name: Description_Externa, dtype: category
    Categories (5, object): [SIESTA BOLA, Bola, SQP 16 porcoes, SQP 4 porcoes, TERRA NOSTRA]
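
If you would rather avoid the explicit loop, the same idea can be written as a nested apply. The following is only a minimal sketch under some assumptions: `apply_to_columns` and `strip_accents` are hypothetical names that are not part of pandas or unidecode, the columns are plain strings rather than categoricals, and `strip_accents` is a standard-library stand-in so that the snippet runs without unidecode installed. It first shows what DataFrame.apply actually hands to the function, then applies the function element by element.

```python
import unicodedata

import pandas as pd


def strip_accents(text):
    # Hypothetical stand-in for unidecode: decompose each character,
    # then drop the combining marks (accents, cedillas, tildes).
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def apply_to_columns(df, cols, func):
    # Hypothetical helper: return a copy of df with func applied to
    # every element of the listed columns.
    out = df.copy()
    # The outer apply receives each column as a Series;
    # the inner apply receives each element of that Series.
    out[cols] = out[cols].apply(lambda column: column.apply(func))
    return out


df = pd.DataFrame({
    "Units": [3, 4],
    "Description_PD": ["VEIJA 5 TRIÂNGULOS 200", "QUEIJO BOLA"],
    "Description_Externa": ["SQP 4 porções", "Bola"],
})
text_cols = ["Description_PD", "Description_Externa"]

# DataFrame.apply passes a whole column to the function, not a string.
received = df[text_cols].apply(lambda x: type(x).__name__)
print(received)

cleaned = apply_to_columns(df, text_cols, strip_accents)
print(cleaned)
```

With unidecode installed, passing `unidecode` as `func` in place of `strip_accents` should give the transliteration from the question. The helper works on a copy, so the original frame is left untouched.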