I am trying to find the columns in a data frame that hold identical values. R has whichAreInDouble for this, and I am trying to implement the same thing in Python.
df =
a b c d e f g h i
1 2 3 4 1 2 3 4 5
2 3 4 5 2 3 4 5 6
3 4 5 6 3 4 5 6 7
It should give me the list of columns with the same values, like:
a, e are equal
b, f are equal
c, g are equal
d, h are equal
Let's try using combinations from itertools, comparing each pair of columns with Series.equals:
from itertools import combinations
[(i, j) for i,j in combinations(df, 2) if df[i].equals(df[j])]
Output:
[('a', 'e'), ('b', 'f'), ('c', 'g'), ('d', 'h')]
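For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuilds the example frame from the question and wraps the same comparison in a helper; the name equal_column_pairs is made up for illustration and is not part of pandas.

```python
from itertools import combinations

import pandas as pd

# Rebuild the example frame from the question.
df = pd.DataFrame({
    "a": [1, 2, 3], "b": [2, 3, 4], "c": [3, 4, 5],
    "d": [4, 5, 6], "e": [1, 2, 3], "f": [2, 3, 4],
    "g": [3, 4, 5], "h": [4, 5, 6], "i": [5, 6, 7],
})


def equal_column_pairs(frame):
    """Hypothetical helper: return name pairs of columns with identical values."""
    # combinations() yields every unordered pair of column labels once,
    # and Series.equals compares the two columns element by element.
    return [
        (left, right)
        for left, right in combinations(frame.columns, 2)
        if frame[left].equals(frame[right])
    ]


pairs = equal_column_pairs(df)
print(pairs)
# [('a', 'e'), ('b', 'f'), ('c', 'g'), ('d', 'h')]
```

Note that Series.equals also requires matching dtypes and treats NaNs in the same positions as equal, so an int column and a float column with the same numbers will not be reported as a pair. The number of comparisons grows quadratically with the number of columns, which is fine for a frame this size but may be slow for very wide ones.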