Compare two dataframes with different labels


Is it possible to compare two dataframes with different labels, with the same results output as in df.compare(df2)?

I tried using df.compare(df2) on the dataframes below, comparing L with R and L with N, but received the following error message:

ValueError: Can only compare identically-labeled (both index and columns) DataFrame objects
```python
import pandas as pd

L = pd.DataFrame(
    {
        'FellowshipID': [1001, 1002, 1003, 1004],
        'FirstName': ['Frodo', 'Samwise', 'Gandalf', 'Pippin'],
        'Skills': ['Hiding', 'Gardening', 'Spells', 'Fireworks']
    }
)

R = pd.DataFrame(
    {
        'FellowshipID': [1001, 1002, 1006, 1007, 1008],
        'FirstName': ['Frodo', 'Samwise', 'Legolas', 'Elrond', 'Barromir'],
        'Age': [50, 39, 2931, 6520, 51]
    }
)

N = pd.DataFrame(
    data={
        'Relation': ['fri 1', 'fri 2', 'fri 3', 'fri 4'],
        'Name': ['John', 'Jane', 'Adam', 'Omar'],
        'Char': ['Hw', 'Amb', 'Adv', 'Sprt']
    }
)
```

Thank you :-)


Solution

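  • compare requires the two inputs to have identical index and column labels.

    One option is to reindex_like, which aligns R to L's index and columns. Rows and columns of R that are absent from L are dropped, and columns of L that are absent from R are filled with NaN: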

    out = L.compare(R.reindex_like(L))
    
    #   FellowshipID         FirstName              Skills      
    #           self   other      self    other       self other
    # 0          NaN     NaN       NaN      NaN     Hiding   NaN
    # 1          NaN     NaN       NaN      NaN  Gardening   NaN
    # 2       1003.0  1006.0   Gandalf  Legolas     Spells   NaN
    # 3       1004.0  1007.0    Pippin   Elrond  Fireworks   NaN
    

    Another is to reindex both inputs on the union of their indices/columns:

    def cust_compare(L, R):
        idx = L.index.union(R.index)
        cols = L.columns.union(R.columns)
        return (L.reindex(index=idx, columns=cols)
                 .compare(R.reindex(index=idx, columns=cols))
               )
    
    out = cust_compare(L, R)
    
    #    Age       FellowshipID         FirstName               Skills      
    #   self other         self   other      self     other       self other
    # 0  NaN    50          NaN     NaN       NaN       NaN     Hiding   NaN
    # 1  NaN    39          NaN     NaN       NaN       NaN  Gardening   NaN
    # 2  NaN  2931       1003.0  1006.0   Gandalf   Legolas     Spells   NaN
    # 3  NaN  6520       1004.0  1007.0    Pippin    Elrond  Fireworks   NaN
    # 4  NaN    51          NaN  1008.0       NaN  Barromir        NaN   NaN
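
Both options above match rows by their positional index (0, 1, 2, ...), which is why Gandalf ends up compared with Legolas. If rows should instead be matched on an identifier, one possible variation is to move that column into the index before reindexing on the union. The sketch below assumes FellowshipID is the intended key and that it is unique in each frame (reindex raises on duplicate labels); `compare_on_key` is a hypothetical helper name, not part of pandas.

```python
import pandas as pd

L = pd.DataFrame({
    'FellowshipID': [1001, 1002, 1003, 1004],
    'FirstName': ['Frodo', 'Samwise', 'Gandalf', 'Pippin'],
    'Skills': ['Hiding', 'Gardening', 'Spells', 'Fireworks'],
})
R = pd.DataFrame({
    'FellowshipID': [1001, 1002, 1006, 1007, 1008],
    'FirstName': ['Frodo', 'Samwise', 'Legolas', 'Elrond', 'Barromir'],
    'Age': [50, 39, 2931, 6520, 51],
})


def compare_on_key(left, right, key):
    """Hypothetical helper: align both frames on `key`, then compare.

    Assumes `key` exists in both frames and holds unique values.
    """
    left = left.set_index(key)
    right = right.set_index(key)
    # Union of row labels (key values) and of the remaining columns
    idx = left.index.union(right.index)
    cols = left.columns.union(right.columns)
    return (left.reindex(index=idx, columns=cols)
                .compare(right.reindex(index=idx, columns=cols)))


out = compare_on_key(L, R, 'FellowshipID')
print(out)
```

With this alignment, a row such as 1003 is reported as present only in L, and 1006 as present only in R, rather than the two being treated as a changed row. Values that are equal on both sides (for example FirstName for 1001) still appear as NaN in the result, as with a plain compare.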