Tags: python, pandas, dataframe, unit-testing

Unexpected passing test using atol and pandas assert_frame_equal


I am trying to compare two dataframes using the pandas testing module. The values do not need to be exactly equal for the test to pass, so I am using the atol parameter, which specifies the absolute tolerance allowed. However, when the values being compared are large, the test passes even though the difference exceeds the tolerance.

Here are two reproducible examples:

import pandas as pd
import pandas.testing

df1 = pd.DataFrame([42])
df2 = pd.DataFrame([41])
# This test fails as expected (difference of 1 > atol of 0.1)
pd.testing.assert_frame_equal(df1, df2, check_exact=False, atol=0.1)

df1 = pd.DataFrame([2006642])
df2 = pd.DataFrame([2006641])
pd.testing.assert_frame_equal(df1, df2, check_exact=False, atol=0.1)
# The test above passes, but it should not (difference is still 1)

Can anyone explain why this happens? Have I misunderstood how atol works?


Solution

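  • It turns out that atol is not applied on its own but in combination with rtol, which defaults to 1e-05. The allowed difference therefore grows with the magnitude of the values being compared, which is why the larger values made the test pass. Two values a and b are considered equal when: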

    absolute(a - b) <= (atol + rtol * absolute(b))
    

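    With the default rtol, the relative term for b = 2006641 is about 20.07, so a difference of 1 is well within tolerance. To get the expected result, rtol also needs to be set. In my case, to rely exclusively on atol, I need to set rtol to 0.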

    df1 = pd.DataFrame([2006642])
    df2 = pd.DataFrame([2006641])
    # This test now fails as expected
    pd.testing.assert_frame_equal(df1, df2, check_exact=False, atol=0.1, rtol=0)
    

    Credit to the answer to the question "is numpy isclose function returning bad answer?"
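To make the arithmetic concrete, here is a minimal pure-Python sketch of the formula quoted above. The helper name within_tolerance is hypothetical and is not part of pandas or numpy; it only assumes the comparison behaves as absolute(a - b) <= (atol + rtol * absolute(b)), as stated in the answer.

```python
def within_tolerance(actual, expected, atol=0.1, rtol=1e-5):
    """Hypothetical helper: apply |a - b| <= atol + rtol * |b|.

    Returns the verdict together with the allowed difference,
    so the effect of rtol on large values is visible.
    """
    allowed = atol + rtol * abs(expected)
    return abs(actual - expected) <= allowed, allowed


# Small values: the relative term is negligible, so atol dominates.
small_ok, small_allowed = within_tolerance(42, 41)

# Large values: rtol * |b| alone is about 20, which swamps atol=0.1.
large_ok, large_allowed = within_tolerance(2006642, 2006641)

# With rtol=0 only the absolute tolerance remains.
strict_ok, strict_allowed = within_tolerance(2006642, 2006641, rtol=0)

print(small_ok, small_allowed)    # False, allowed is roughly 0.1004
print(large_ok, large_allowed)    # True, allowed is roughly 20.17
print(strict_ok, strict_allowed)  # False, allowed is 0.1
```

The middle case is the surprising one from the question: a difference of 1 is accepted because the allowed difference has grown to roughly 20. Passing rtol=0 brings it back to 0.1.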