I have two dataframes as follows,
import pandas as pd
d1 ={'col1': ['I ate dinner','I ate dinner', 'the play was inetresting','the play was inetresting'],
'col2': ['min', 'max', 'mid','min'],
'col3': ['min', 'max', 'max','max']}
d2 ={'col1': ['I ate dinner',' the glass is shattered', 'the play was inetresting'],
'col2': ['min', 'max', 'max'],
'col3': ['max', 'min', 'mid']}
df1 = pd.DataFrame(d1)
df2 = pd.DataFrame(d2)
I have created a column in df2 called 'exist' and added values (true, false) based on whether the sentences in df2.col1 exist in df1.col1:
common = df1.merge(df2,on=['col1'])
index_list = df2[(~df2.col1.isin(common.col1))].index.to_list()
df2['exist'] = ' '
df2.loc[index_list, 'exist'] = 'false'
df2.loc[df2["exist"] == " ",'exist'] = 'true'
what I would like to do now, is that if the value in the exist column == true, I would like to add that row under the similar row in df1. so the desired output should be:
output:
col1 col2 col3
0 I ate dinner min min
1 I ate dinner max max
2 I ate dinner min max
3 the play was inetresting mid max
4 the play was inetresting min max
5 the play was inetresting max mid
I guess I have to use np.where, but I am not sure how to formulate the append to get the desired output
First idea is filter df2
values by df1.col1
and append to df1
by concat
and then sorting by DataFrame.sort_values
:
df = pd.concat([df1, df2[(df2.col1.isin(df1.col1))]]).sort_values('col1', ignore_index=True)
print (df)
col1 col2 col3
0 I ate dinner min min
1 I ate dinner max max
2 I ate dinner min max
3 the play was inetresting mid max
4 the play was inetresting min max
5 the play was inetresting max mid
If need only common values in both DataFrames is possible filter by numpy.intersect1d
:
common = np.intersect1d(df1['col1'], df2['col1'])
df = (pd.concat([df1[df1.col1.isin(common)],
df2[df2.col1.isin(common)]])
.sort_values('col1', ignore_index=True))
print (df)