
Replace all inf, -inf values with NaN in a pandas dataframe


I have a large dataframe with inf and -inf values in several columns. I want to replace all of them with NaN.

I can do this column by column. For example, this works:

df['column name'] = df['column name'].replace(np.inf, np.nan)

But my attempt to do it in one go across the whole dataframe does not:

df.replace([np.inf, -np.inf], np.nan)

After running this, the inf values in df have not been replaced.


Solution

  • TL;DR

    • df.replace is the fastest of the options shown here for replacing ±inf
    • alternatively, you can avoid replacing altogether by setting mode.use_inf_as_na (deprecated in pandas 2.1.0)

    Replacing inf and -inf

    df = df.replace([np.inf, -np.inf], np.nan)
    

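    Make sure to assign the result back: df.replace returns a new DataFrame and leaves df unchanged, which is why the call in the question appears to do nothing. (Don't use the inplace approach, which is being deprecated under PDEP-8.)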
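To make the difference concrete, here is a minimal sketch. The sample frame and the variable names are made up for illustration and do not come from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, not from the original question.
df = pd.DataFrame({"a": [1.0, np.inf, 3.0], "b": [-np.inf, 5.0, np.nan]})

# Without assignment the returned frame is discarded and df is unchanged.
df.replace([np.inf, -np.inf], np.nan)
still_has_inf = bool(np.isinf(df).to_numpy().any())

# Assigning the result back keeps the cleaned frame.
df = df.replace([np.inf, -np.inf], np.nan)
has_inf_after = bool(np.isinf(df).to_numpy().any())

# Two replaced infs plus the NaN that was already present.
nan_count = int(df.isna().sum().sum())
```

Here still_has_inf is True after the bare call, and has_inf_after is False once the result is assigned back.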

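    There are also element-wise df.applymap options, but df.replace is faster than all of them. (Note that applymap itself was deprecated in pandas 2.1.0 in favor of DataFrame.map.)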

    • df = df.applymap(lambda x: np.nan if x in [np.inf, -np.inf] else x)
    • df = df.applymap(lambda x: np.nan if np.isinf(x) else x)
    • df = df.applymap(lambda x: x if np.isfinite(x) else np.nan)
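As a quick sanity check, the sketch below confirms that the three element-wise variants give the same result as df.replace on a small, made-up frame. It assumes all columns are numeric (np.isinf and np.isfinite raise a TypeError on strings). The elementwise helper name is hypothetical and only picks whichever of DataFrame.map or applymap the installed pandas provides. No timings are claimed here.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, not from the original answer.
df = pd.DataFrame({"a": [1.0, np.inf, 3.0], "b": [-np.inf, 5.0, np.nan]})

# DataFrame.map exists from pandas 2.1; older versions only have applymap.
elementwise = df.map if hasattr(df, "map") else df.applymap

via_replace = df.replace([np.inf, -np.inf], np.nan)
via_membership = elementwise(lambda x: np.nan if x in [np.inf, -np.inf] else x)
via_isinf = elementwise(lambda x: np.nan if np.isinf(x) else x)
via_isfinite = elementwise(lambda x: x if np.isfinite(x) else np.nan)

# DataFrame.equals treats NaNs in the same positions as equal.
all_equal = all(
    via_replace.equals(other)
    for other in (via_membership, via_isinf, via_isfinite)
)
```

To compare speed on your own data, wrap each line in timeit; the element-wise versions call a Python function once per cell, whereas df.replace works on whole columns.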


    Setting mode.use_inf_as_na (deprecated)

    • Deprecated in pandas 2.1.0
    • Will be removed in pandas 3.0

    Note that we don't actually have to modify df at all. Setting mode.use_inf_as_na will simply change the way inf and -inf are interpreted:

    True means treat None, nan, -inf, inf as null
    False means None and nan are null, but inf, -inf are not null (default)

    • Either enable globally

      pd.set_option('mode.use_inf_as_na', True)
      
    • Or locally via context manager

      with pd.option_context('mode.use_inf_as_na', True):
          ...
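A hedged sketch of what the option changes, using a made-up Series. Because the option is deprecated and may be missing entirely in newer pandas, the block silences the FutureWarning and falls back to None if the option cannot be set. The portable_mask line is not part of the original answer; it is one assumed way to get the same mask without the option.

```python
import warnings

import numpy as np
import pandas as pd

# Hypothetical sample data, not from the original answer.
s = pd.Series([1.0, np.inf, -np.inf, np.nan])

# Default behaviour: only NaN counts as missing.
default_mask = s.isna().tolist()

try:
    with warnings.catch_warnings():
        # pandas 2.1+ emits a FutureWarning when this option is set.
        warnings.simplefilter("ignore", FutureWarning)
        with pd.option_context("mode.use_inf_as_na", True):
            option_mask = s.isna().tolist()
except (KeyError, AttributeError):
    # The option is not available in this pandas version.
    option_mask = None

# Assumed option-free equivalent: flag NaN and +/-inf explicitly.
portable_mask = (s.isna() | np.isinf(s)).tolist()
```

Where the option is available, option_mask flags the two inf entries as well as the NaN, and the underlying data in s is left untouched.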