
Set values outside defined interval limits to a given value (e.g. NaN) for a column in a pandas data frame


Given interval limits that define the valid values, every value in a pandas data frame column that falls outside them should be set to a given value, e.g. NaN. The limits and the data frame contents can be assumed to be numeric.

Given the following limits and data frame:

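import pandas as pd

min = 2  # note: this name shadows the built-in min()
max = 7  # note: this name shadows the built-in max()
df = pd.DataFrame({'a': [5, 1, 7, 22], 'b': [12, 3, 10, 9]})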

    a   b
0   5  12
1   1   3
2   7  10
3  22   9

Applying the limits to column a should result in:

     a   b
0    5  12
1  NaN   3
2    7  10
3  NaN   9

Solution

  • Using where with between (between includes both limits by default)

    import numpy as np
    df.a = df.a.where(df.a.between(min, max), np.nan)
    df
    Out[146]: 
         a   b
    0  5.0  12
    1  NaN   3
    2  7.0  10
    3  NaN   9
    

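    Or clip, which caps out-of-range values at the limits instead of replacing them. Note that the NaN values in the output below come from the where step above, which already modified df.a; on the original column, clip(min, max) would return 5, 2, 7, 7.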

    df.a.clip(min,max)
    Out[147]: 
    0    5.0
    1    NaN
    2    7.0
    3    NaN
    Name: a, dtype: float64
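
The two approaches above can be compared side by side on a fresh copy of the data. The following is a minimal sketch, not the answerer's code: the helper name mask_outside and its parameters (lower, upper, fill) are hypothetical and chosen only for illustration. It uses lower/upper rather than min/max to avoid shadowing the Python built-ins.

```python
import numpy as np
import pandas as pd


def mask_outside(df, column, lower, upper, fill=np.nan):
    """Return a copy of df where values of `column` outside [lower, upper] are replaced by `fill`."""
    out = df.copy()
    # between() is inclusive on both ends by default
    in_range = out[column].between(lower, upper)
    # where() keeps values where the condition is True and uses `fill` elsewhere
    out[column] = out[column].where(in_range, fill)
    return out


df = pd.DataFrame({'a': [5, 1, 7, 22], 'b': [12, 3, 10, 9]})

masked = mask_outside(df, 'a', lower=2, upper=7)
print(masked)

# For comparison: clip caps out-of-range values at the limits instead of replacing them.
clipped = df['a'].clip(lower=2, upper=7)
print(clipped.tolist())  # [5, 2, 7, 7]
```

Because the helper works on a copy, the original df is left unchanged, so both results start from the same data. Replacing with NaN converts the integer column to float, which is why the output shows 5.0 and 7.0.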