Tags: python, pandas, replace, special-characters

Replace special characters in pandas dataframe from a string of special characters


I have created a pandas dataframe called df using this code:

import numpy as np
import pandas as pd

ds = {'col1': ['1', '3/', '4'], 'col2': ['A', '!B', '@C']}

df = pd.DataFrame(data=ds)

The dataframe looks like this:

print(df)

  col1 col2
0    1    A
1   3/   !B
2    4   @C

The columns contain some special characters (/, ! and @) that I need to replace with a blank space.

Now, I have a list of special characters:

listOfSpecialChars = '¬`!"£$£#/,.+*><@|"'

How can I replace every character listed in listOfSpecialChars with a blank space, wherever it occurs in the dataframe and in any column? At the moment I am dealing with a 100K-record dataframe with 560 columns, so I can't write a piece of code for each variable.


Solution

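  • You can use apply with str.replace. The code below deletes the characters; pass ' ' instead of '' as the replacement to substitute a blank space: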

    import re
    chars = ''.join(map(re.escape, listOfSpecialChars))
    
    df2 = df.apply(lambda c: c.str.replace(f'[{chars}]', '', regex=True))
    

    Alternatively, stack/unstack:

    df2 = df.stack().str.replace(f'[{chars}]', '', regex=True).unstack()
    

    output:

      col1 col2
    0    1    A
    1    3    B
    2    4    C
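
Both calls above go through the .str accessor, which fails on columns that do not hold strings. If some of the 560 columns are numeric (an assumption, since the question does not say), one possible sketch is DataFrame.replace with regex=True, which applies the pattern to string values and leaves other values alone. The col3 column and the names pattern, removed and spaced are made up for illustration:

```python
import re
import pandas as pd

listOfSpecialChars = '¬`!"£$£#/,.+*><@|"'

# Same data as the question, plus a hypothetical numeric column (col3)
df = pd.DataFrame({
    'col1': ['1', '3/', '4'],
    'col2': ['A', '!B', '@C'],
    'col3': [10, 20, 30],
})

# Character class matching any single listed character;
# re.escape protects regex metacharacters such as . + * $ |
pattern = f'[{re.escape(listOfSpecialChars)}]'

# Delete the characters; the numeric column passes through unchanged
removed = df.replace(pattern, '', regex=True)

# Same call, substituting a blank space as the question asks
spaced = df.replace(pattern, ' ', regex=True)

print(removed)
print(spaced)
```

Here removed matches the output shown above for col1 and col2, while spaced keeps a blank in place of each special character (for example '3/' becomes '3 ').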