After executing the code below,
df1.merge(df2, on=['Customer_id'], indicator='Status', how='outer')
I get,
Now I want to compare whether all elements in Name_x is equal to Name_y or not, similarly for other columns as well. What i am doing is comparing it individually. Here is my code:
df3.assign(Status1=df3.Name_x == df3.Name_y)
df3.assign(Status2=df3.Age_x == df3.Age_y)
and so on
After doing above i get,
Can i write for loop so that i do not have to write the code
df3.assign()
individually for comparing every two columns?
and most importantly my final status column should look like this,
what i am doing is thinking to write if elif like this:
def f(row):
if row['Status1'] == 'True' and row['Status2'] == 'False': val = 'Matching for column {Name}' elif row['Status1'] == 'False' and row['Status1'] == 'True': val = 'Matching for column {Age} elif row['Status1'] == 'False' and row['Status1'] == 'False': val = 'Not Matching at all' #and so on... return val
This is okay if one want to write if elif only on two columns Status1 and Status2 and becomes lengthy if one want to write if elif on more than 2 columns. Can i get the desired output in few lines of code if i have column Status3 also?
Use pandas.merge
Read both csv into pandas
df1=pd.Dataframe.from_csv('file')
df2=pd.Dataframe.from_csv('file2')
then Merge using an "outer" join
df3=pd.merge(df1,df2,on=[key1,key2 etc], how="outer", indicator=True)
At this point you should be 90% done