How to subset nan/inf values in Dato SFrames


I'm trying to subset a large data frame that has a couple of nan/inf values in one of its columns.

For example, I have tried something like this:

df = df[df['a'] == 'NaN']

Or

df = df[df['a'] == 'Inf']

How do I reference these types of values within a column?


Solution

  • NaN is a special value. It is not equal to anything, not even itself. Here's one way to filter by NaN:

    import math
    df = df[df['a'].apply(lambda x: math.isnan(x))]
    

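    Inf is a little easier, because it does compare equal to itself. Note that this matches positive infinity only; compare against float('-inf') for the negative case: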

    df = df[df['a'] == float('inf')]
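
    To see why the string comparisons in the question cannot work, here is a minimal sketch of the underlying predicates in plain Python. The list `values` is a hypothetical stand-in for the column `df['a']`, and it assumes the column holds floats with no missing entries:

```python
import math

# Hypothetical sample data standing in for the column df['a'].
values = [1.0, float('nan'), float('inf'), -2.5, float('-inf')]

# NaN never compares equal to anything, including itself.
nan = float('nan')
nan_equals_itself = (nan == nan)

# math.isnan is the reliable test for NaN.
is_nan = [math.isnan(x) for x in values]

# math.isinf is True for both positive and negative infinity,
# while an equality test against float('inf') matches only +inf.
is_inf = [math.isinf(x) for x in values]
is_pos_inf = [x == float('inf') for x in values]

# math.isfinite rejects NaN and both infinities in one check.
finite_only = [x for x in values if math.isfinite(x)]
```

    The same predicates can presumably be passed to `apply` in the way the NaN filter above does, for example `df[df['a'].apply(lambda x: math.isinf(x))]` to keep rows with either infinity.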