Tags: python, python-3.x, pandas, header, unaccent

How to programmatically unaccent pandas dataframe headers


I have several pandas dataframes with various accented characters in their column names. I would like to convert the accented characters to their unaccented equivalents, but only in the column names, not in the data. I am looking for something similar to what I regularly use in R: names(DT) = stringi::stri_trans_general(names(DT), 'Latin-ASCII')


Solution

  • The unidecode package (pip install Unidecode) converts accented characters to their closest ASCII equivalents. Apply it to every column name like so:

    import unidecode
    
    df.columns = [unidecode.unidecode(col) for col in df.columns]
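
  • If installing a third-party package is not an option, a rough alternative is the standard-library unicodedata module. The sketch below is not part of the original answer: strip_accents is a hypothetical helper name, and the plain list stands in for df.columns. It decomposes each character (NFKD) and drops the combining marks, which handles most Latin accents but, unlike unidecode, leaves letters with no decomposition (such as ß) untouched.

```python
import unicodedata


def strip_accents(text):
    """Remove combining accent marks from text using only the stdlib.

    Hypothetical helper for illustration; not a drop-in replacement
    for unidecode, which also transliterates characters like 'ß'.
    """
    # NFKD splits 'ñ' into 'n' + a combining tilde, and so on.
    decomposed = unicodedata.normalize("NFKD", str(text))
    # Keep only the characters that are not combining marks.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# A plain list standing in for df.columns (assumed example names).
columns = ["Año", "Città", "Größe"]
cleaned = [strip_accents(col) for col in columns]
print(cleaned)  # ['Ano', 'Citta', 'Große']

# With a real dataframe, the same function can be passed to rename:
# df = df.rename(columns=strip_accents)
```

    Note that 'Größe' becomes 'Große' here, whereas unidecode would produce 'Grosse'; if full ASCII output is required, unidecode is the safer choice.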