pandas, data-cleaning

pandas: replace multiple unknown values in one column


What is the best way to replace every value in a column ('Status') that is not one of the two values I want to analyse?
As an example, my df is:

Id  Status  Email  Product  Age
1   ok      g@     A        20
5   not ok  l@     J        45
1   A       a@     A        27
2   B       h@     B        25
2   ok      t@     B        33
3   C       b@     E        23
4   not ok  c@     D        30

In the end, I want to have:

Id  Status  Email  Product  Age
1   ok      g@     A        20
5   not ok  l@     J        45
1   other   a@     A        27
2   other   h@     B        25
2   ok      t@     B        33
3   other   b@     E        23
4   not ok  c@     D        30

The main difficulty is that my df is very large, so I do not know all the values other than 'ok' and 'not ok' (the two values that I want to analyse). Thanks in advance!


Solution

  • np.where + isin

    import numpy as np

    # Keep 'ok' and 'not ok'; replace every other Status value with 'other'
    df['Status'] = np.where(df['Status'].isin(['ok', 'not ok']), df['Status'], 'other')
    df
    Out[384]: 
       Id  Status Email Product  Age
    0   1      ok    g@       A   20
    1   5  not ok    l@       J   45
    2   1   other    a@       A   27
    3   2   other    h@       B   25
    4   2      ok    t@       B   33
    5   3   other    b@       E   23
    6   4  not ok    c@       D   30
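
  • Series.where + isin (an alternative sketch, not part of the original answer)

The same idea can probably be written with pandas alone. The sketch below rebuilds the sample frame from the question and uses Series.where, which keeps entries where the condition is True and substitutes the second argument elsewhere. The names keep and unknown are made up for illustration; the unknown step is optional and only lists which values are being collapsed.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "Id": [1, 5, 1, 2, 2, 3, 4],
    "Status": ["ok", "not ok", "A", "B", "ok", "C", "not ok"],
    "Email": ["g@", "l@", "a@", "h@", "t@", "b@", "c@"],
    "Product": ["A", "J", "A", "B", "B", "E", "D"],
    "Age": [20, 45, 27, 25, 33, 23, 30],
})

keep = ["ok", "not ok"]  # hypothetical name: the only values to analyse

# Optional: inspect the values that are about to be collapsed
unknown = df.loc[~df["Status"].isin(keep), "Status"].unique()

# Keep entries whose Status is in `keep`; replace everything else with 'other'
df["Status"] = df["Status"].where(df["Status"].isin(keep), "other")

print(df)
```

Note that isin returns False for missing values, so any NaN in Status would also become 'other' with this approach; filter those out first if they should stay missing.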