python pandas dataframe dictionary data-cleaning

How to delete cells from one column in a DataFrame based on a condition in Python


In the example DataFrame I have created below, I want to clear any cell in the first column, var1, whose string contains the letter Z, without removing the entire row. How can I go about this? I thought I might need to use .str.replace(), but I do not know where to start. (Disclaimer: this is a tutorial question.)


import pandas as pd

df = pd.DataFrame({"var1": ["AZZBBAA", "CCDDDED", "DZZZZFD", "CDEEEEFG"],
                   "var2": [1, 2, 4, 5]})

Which gives me:

    var1      var2
0   AZZBBAA     1
1   CCDDDED     2
2   DZZZZFD     4
3   CDEEEEFG    5

My desired output is below:

    var1      var2
0               1
1   CCDDDED     2
2               4
3   CDEEEEFG    5

Solution

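  • Series.str.replace() works for this case, as long as the pattern is treated as a regular expression:

    df['var1'] = df['var1'].str.replace('.*Z.*', "", regex=True)

    Passing regex=True matters because pandas 2.0 and later treat the pattern as a literal string by default, in which case nothing would match. With it, the whole value in the var1 column is replaced by an empty string whenever it contains the character Z, while var2 and the row itself are left untouched: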


           var1  var2
    0               1
    1   CCDDDED     2
    2               4
    3  CDEEEEFG     5
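
If a regular expression feels heavier than needed, one possible alternative is to build a boolean mask with Series.str.contains() and overwrite only the matching cells through .loc. This is a minimal sketch rather than the answer's own method; the name has_z is just an illustrative variable, and it assumes var1 holds only strings (no missing values).

```python
import pandas as pd

df = pd.DataFrame({"var1": ["AZZBBAA", "CCDDDED", "DZZZZFD", "CDEEEEFG"],
                   "var2": [1, 2, 4, 5]})

# Boolean mask: True where var1 contains a literal "Z" (plain substring match)
has_z = df["var1"].str.contains("Z", regex=False)

# Overwrite only the matching cells; the rows and var2 stay in place
df.loc[has_z, "var1"] = ""

print(df)
```

If a missing value is preferred over an empty string, df["var1"].mask(has_z) would leave NaN in those cells instead.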