Tags: python, function, unit-testing, assert

Create a function to test whether more than two dataframes are different


I have a list of dataframes (in df_lst, shown below) and I need to check which of them are different.

Here are some example dataframes:

import pandas as pd
from pandas.testing import assert_frame_equal
d1 = pd.DataFrame({1: [10], 2: [20]})
d2 = pd.DataFrame({1: [10], 2: [20]})
d3 = pd.DataFrame({1: [11], 2: [20]})
d4 = pd.DataFrame({1: [11], 2: [20]})
d5 = pd.DataFrame({1: [9], 2: [20]})

df_lst=[d1,d2,d3,d4,d5]

I know two dataframes can be compared with, for example, this command:

assert_frame_equal(d1,d3)

Ideally, though, this should be a function (or something similar) that takes the whole list of dataframes as input. The code below does not work because the call to assert_frame_equal is missing its second argument. The function should terminate as soon as it finds dataframes that are not equal.

def check_equality (df_lst):
    for i in df_lst:
        assert_frame_equal(i)

Any comments are welcome. Thanks so much!


Solution

  • If you are comparing a list against a single known correct dataframe, you can do this:

    def check_equality(df_lst, correct_df):
        for i in df_lst:
            assert_frame_equal(correct_df, i)
    

    If you want to compare every pair of dataframes in the list, that is (a, b), (a, c), (a, d), (b, c), and so on, you can try this:

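    from itertools import combinations

    def check_equality(df_lst):
        # combinations yields each unordered pair once: (a, b), (a, c), (b, c), ...
        # product(df_lst, repeat=2) would also compare (a, a) and both (a, b) and (b, a).
        for left, right in combinations(df_lst, 2):
            assert_frame_equal(left, right)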
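
The question also asks which dataframes differ, and an assertion stops at the first mismatch. Below is a minimal sketch of one way to report every mismatching pair. It assumes a hypothetical helper, `find_differing_pairs` (not part of pandas), and swaps `assert_frame_equal` for `DataFrame.equals`, which returns a boolean instead of raising.

```python
from itertools import combinations

import pandas as pd


def find_differing_pairs(df_lst):
    """Return (i, j) position pairs of dataframes in df_lst that are not equal.

    Hypothetical helper for illustration; it is not part of pandas.
    """
    return [
        (i, j)
        # enumerate keeps track of each dataframe's position in the list
        for (i, a), (j, b) in combinations(enumerate(df_lst), 2)
        if not a.equals(b)
    ]


# The dataframes from the question
d1 = pd.DataFrame({1: [10], 2: [20]})
d2 = pd.DataFrame({1: [10], 2: [20]})
d3 = pd.DataFrame({1: [11], 2: [20]})
d4 = pd.DataFrame({1: [11], 2: [20]})
d5 = pd.DataFrame({1: [9], 2: [20]})

differing = find_differing_pairs([d1, d2, d3, d4, d5])
print(differing)
```

An empty list means all dataframes are equal. Note that `DataFrame.equals` requires matching dtypes and offers no tolerance options, so if you need the detailed error message or the tuning parameters of `assert_frame_equal`, wrapping that call in `try`/`except AssertionError` may be a better fit.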