python, pandas, equality, similarity

Identify equality in columns while ignoring NaNs


How can I ignore empty/NaN values when comparing two columns with the pandas equals method?

The comparison should return True if col2 is the same as col1, and also when col2 contains a NaN where col1 has a value:

df['col1'].equals(df['col2'])

Solution

  • You can do this with boolean filtering: compare only the rows where both columns are non-null (this is symmetric, so NA/NaN in either column is ignored):

    mask = df['col1'].notna() & df['col2'].notna()
    df.loc[mask, 'col1'].equals(df.loc[mask, 'col2'])
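
The filtering approach above can be tried on a small made-up DataFrame. This is only a sketch: the sample values and the names plain, masked, and elementwise are assumptions for illustration, and the element-wise variant is an alternative that is not part of the original answer.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: col2 matches col1 except where it is NaN.
df = pd.DataFrame({
    "col1": [1.0, 2.0, 3.0, 4.0],
    "col2": [1.0, np.nan, 3.0, 4.0],
})

# Plain equals() fails here because row 1 holds 2.0 in col1 and NaN in col2.
plain = df["col1"].equals(df["col2"])

# Keep only rows where both columns are present, then compare those rows.
mask = df["col1"].notna() & df["col2"].notna()
masked = df.loc[mask, "col1"].equals(df.loc[mask, "col2"])

# Element-wise alternative (an assumption, not from the original answer):
# a row passes if the values match or if either value is missing.
elementwise = bool(
    ((df["col1"] == df["col2"]) | df["col1"].isna() | df["col2"].isna()).all()
)

print(plain, masked, elementwise)
```

One thing to keep in mind: Series.equals also requires the two columns to have the same dtype, so an integer column compared against a float column reports False even when the values agree. The element-wise form compares values only and may be the safer choice in that situation.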