I have two data frames, df1 and df2. Now, df1 contains 6 records and df2 contains 4 records. I want to get the unmatched records out of it. I tried it but getting an error ValueError: Can only compare identically-labelled DataFrame objects
I guess this is due to the length of df as the df1 has 6 and df2 has 4 but how do I compare them both and get the unmatched rows?
code
df1=
a b c
0 1 2 3
1 4 5 6
2 3 5 5
3 5 6 7
4 6 7 8
5 6 6 6
df2 =
a b c
0 3 5 5
1 5 6 7
2 6 7 8
3 6 6 6
index = (df != df2).any(axis=1)
df3 = df.loc[index]
which gives:
ValueError: Can only compare identically-labelled DataFrame objects
Expected output:
a b c
0 1 2 3
1 4 5 6
I know that the error is due to the length but is there any way where we can compare two data frames and get the unmatched records out of it?
MultiIndex.from_frame
+ isin
We can use MultiIndex.from_frame
on both df1
and df2
to create the corresponding multiindices, then use isin
to test the membership of the index created from df1
in index
created from df2
to create a boolean mask which can be then used to filter the non matching rows.
i1 = pd.MultiIndex.from_frame(df1)
i2 = pd.MultiIndex.from_frame(df2)
df1[~i1.isin(i2)]
a b c
0 1 2 3
1 4 5 6