I have a list of dataframes (in df_lst) and I need to check which of them are different.
Here are some example dataframes:
import pandas as pd
from pandas.testing import assert_frame_equal  # public location of the helper
d1 = pd.DataFrame({1: [10], 2: [20]})
d2 = pd.DataFrame({1: [10], 2: [20]})
d3 = pd.DataFrame({1: [11], 2: [20]})
d4 = pd.DataFrame({1: [11], 2: [20]})
d5 = pd.DataFrame({1: [9], 2: [20]})
df_lst = [d1, d2, d3, d4, d5]
I know you can check a single pair with a command like this:
assert_frame_equal(d1,d3)
Ideally, though, this would be a function that takes the list of dataframes as input. The code below does not work because assert_frame_equal is missing its second argument. It should terminate as soon as a comparison fails.
def check_equality(df_lst):
    for i in df_lst:
        assert_frame_equal(i)
Any comments welcome. Thanks so much!
If you are comparing a list against a single known correct dataframe, you can do this:
def check_equality(df_lst, correct_df):
    for i in df_lst:
        assert_frame_equal(correct_df, i)
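Since assert_frame_equal raises an AssertionError on the first mismatch, the loop stops right there, which matches the "terminate if not equal" requirement. If you would rather learn which dataframe failed instead of getting a traceback, something along these lines might work. It is only a sketch: first_mismatch is a made-up helper name, not part of pandas, and it uses d1 as the reference purely for illustration.

```python
import pandas as pd
from pandas.testing import assert_frame_equal


def first_mismatch(df_lst, correct_df):
    """Hypothetical helper: return the position of the first dataframe
    that differs from correct_df, or None if every one matches."""
    for idx, df in enumerate(df_lst):
        try:
            assert_frame_equal(correct_df, df)
        except AssertionError:
            # Stop at the first dataframe that is not equal to the reference.
            return idx
    return None


d1 = pd.DataFrame({1: [10], 2: [20]})
d2 = pd.DataFrame({1: [10], 2: [20]})
d3 = pd.DataFrame({1: [11], 2: [20]})

result = first_mismatch([d1, d2, d3], d1)
print(result)
```

Here result is the list position of d3, the first dataframe whose values differ from d1.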
If you want to do pairwise comparisons of all dataframes in the list, i.e. (a, b), (a, c), (a, d), (b, c), etc.,
you can try this:
from itertools import combinations
def check_equality(df_lst):
    # combinations yields each unordered pair once, skipping (a, a) and the mirrored (b, a)
    for a, b in combinations(df_lst, 2):
        assert_frame_equal(a, b)
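The question also asks which dataframes are different, and an assert-based loop only reports the first failure. A possible non-raising variant is sketched below. It assumes DataFrame.equals is strict enough for your case (it compares shape, values and dtypes, but gives no detailed diff), and differing_pairs is a hypothetical name chosen for this example.

```python
from itertools import combinations

import pandas as pd


def differing_pairs(df_lst):
    """Hypothetical helper: return (i, j) index pairs of dataframes
    in df_lst that are not equal to each other."""
    return [
        (i, j)
        for (i, a), (j, b) in combinations(enumerate(df_lst), 2)
        if not a.equals(b)
    ]


d1 = pd.DataFrame({1: [10], 2: [20]})
d2 = pd.DataFrame({1: [10], 2: [20]})
d3 = pd.DataFrame({1: [11], 2: [20]})
d4 = pd.DataFrame({1: [11], 2: [20]})
d5 = pd.DataFrame({1: [9], 2: [20]})
df_lst = [d1, d2, d3, d4, d5]

pairs = differing_pairs(df_lst)
print(pairs)
```

An empty list means all dataframes are equal. With the sample data, the pairs (0, 1) and (2, 3) are absent from the result because d1 equals d2 and d3 equals d4.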