
Replace multiple substrings in a Pandas series with a value


All,

To replace one string in one particular column I have done this and it worked fine:

dataUS['sec_type'].str.strip().str.replace("LOCAL","CORP")

I would now like to replace multiple strings with a single string, e.g. replace ["LOCAL", "FOREIGN", "HELLO"] with "CORP".

How can I make this work? The code below didn't work:

dataUS['sec_type'].str.strip().str.replace(["LOCAL", "FOREIGN", "HELLO"], "CORP")

Solution

  • You can do this by joining the substrings into a single |-separated pattern. This works because pd.Series.str.replace accepts a regex (pass regex=True explicitly; since pandas 2.0 the default is regex=False):

    Replace occurrences of pattern/regex in the Series/Index with some other string. Equivalent to str.replace() or re.sub().

    This avoids the need to create a dictionary.

    import pandas as pd
    
    df = pd.DataFrame({'A': ['LOCAL TEST', 'TEST FOREIGN', 'ANOTHER HELLO', 'NOTHING']})
    
    pattern = '|'.join(['LOCAL', 'FOREIGN', 'HELLO'])
    
    df['A'] = df['A'].str.replace(pattern, 'CORP', regex=True)
    
    #               A
    # 0     CORP TEST
    # 1     TEST CORP
    # 2  ANOTHER CORP
    # 3       NOTHING
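
If the substrings might contain regex metacharacters (such as "+" or "."), joining them directly with "|" can match more than intended. The sketch below is one possible way to guard against that with re.escape, and it also shows Series.replace as an alternative for when whole cell values, rather than substrings, should be replaced. The sample data and the names s, targets, result and exact are made up for illustration and are not from the question.

```python
import re

import pandas as pd

# Hypothetical sample data; "A+B" and "C.D" contain regex metacharacters.
s = pd.Series(["  LOCAL ", "FOREIGN", "A+B TEST", "CXD", "NOTHING"])

targets = ["LOCAL", "FOREIGN", "A+B", "C.D"]

# Escape each substring so it is matched literally, then join with "|".
pattern = "|".join(re.escape(t) for t in targets)

# Substring replacement: any occurrence of a target becomes "CORP".
result = s.str.strip().str.replace(pattern, "CORP", regex=True)

# Whole-value replacement: Series.replace takes the list directly,
# but only cells that equal a target exactly are changed.
exact = s.str.strip().replace(["LOCAL", "FOREIGN"], "CORP")

print(result.tolist())
print(exact.tolist())
```

Without the escaping step, the pattern "C.D" would also match "CXD", because "." matches any character in a regex.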