Tags: python, pandas, dataframe, data-cleaning

Replacing several strings in a dataframe column


I am looking for a simple way to replace several strings in a column and assign the result to a new DataFrame with the updated values.

This is the sample column I am working with, df['Column']:

Column
-----------------
K700E
R957Q
Deletion
L747_T751delinsP
S752_I759del
I491M
D770_P772dup
G719A
G735S
N771_H773dup
K467T
E746_T751insIP
D770_N771insD
G724S
K745_A750del
EGFRvIII
V765A
EGFRvII
L858M

Some entries contain text that I don't need and that has to be cleaned up. Below is my code, which I can't seem to get right.

for i in df['Column']:
    df['Column'].replace('Truncating Mutations', '9999')
    df['Column'].replace('Amplification', '9999')
    print(i)

There are also some entries like this:

EGFR-RAD51 Fusion

I basically want to remove the word 'Fusion' but keep 'EGFR'.

Any advice is very much appreciated from a novice. =)


Solution

  • Alternative answer


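    You can also pass a dictionary whose keys are the strings to find and whose values are their replacements. Assign the result back, because replace returns a new Series instead of modifying the column in place. Without regex=True only cells that match a key exactly are replaced, so 'Fusion' inside 'EGFR-RAD51 Fusion' would be left untouched:

    rdict = {
        "Truncating Mutations": "9999",
        "Amplification": "9999",
        r"\s*Fusion": "",  # regex key: also drops the space before 'Fusion'
    }

    # regex=True makes the keys match as patterns anywhere in the string
    df['Column'] = df['Column'].replace(rdict, regex=True)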
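
For reference, here is a minimal, self-contained sketch of the dictionary approach that separates whole-cell replacement from substring replacement. The sample rows and the names `exact`, `cleaned` and `df_clean` are made up for illustration and are not from the question's real data; the column name `Column` is taken from the question.

```python
import pandas as pd

# Hypothetical sample data mirroring the kinds of entries in the question.
df = pd.DataFrame({
    "Column": ["K700E", "Truncating Mutations", "Amplification",
               "EGFR-RAD51 Fusion", "EGFRvIII"]
})

# Whole-cell replacement: only cells equal to a key are changed.
exact = df["Column"].replace(
    {"Truncating Mutations": "9999", "Amplification": "9999"}
)

# Substring replacement: regex=True treats the key as a pattern, so the
# trailing word 'Fusion' (and the whitespace before it) is stripped.
cleaned = exact.replace({r"\s*Fusion$": ""}, regex=True)

# Build a new DataFrame and leave the original untouched.
df_clean = df.assign(Column=cleaned)
print(df_clean)
```

The loop in the question has no effect for a related reason: `replace` returns a new Series, so calling it without assigning the result discards the change, and no loop over the rows is needed because `replace` already works on the whole column.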