I have two dfs, two df's has be to be merged by class and joining dates. Please check the below df's
df1
class teacher age instructor_joining_date
A mark 50 2024-01-20 07:18:29.599
A john 45 2024-05-08 05:31:21.379
df2
class count student_joining_date
A 1 2024-05-17 01:05:58.072
A 50 2024-04-10 10:39:06.608
A 75 2024-04-05 09:49:07.246
Final output df
class count student_joining_date teacher age
A 1 2024-05-17 01:05:58.072 john 45
A 50 2024-04-10 10:39:06.608 mark 50
A 75 2024-04-05 09:49:07.246 mark 50
For df2 we have merge df1 by class and joining dates
Edit: Yes if student_joining_date and instructor_joining_date is different. If student_joining_date is greater than instructor_joining_date then that teacher will be mapped here
You have to use a merge_asof
, then restore the original order with reindex
:
df1['instructor_joining_date'] = pd.to_datetime(df1['instructor_joining_date'])
df2['student_joining_date'] = pd.to_datetime(df2['student_joining_date'])
out = (pd.merge_asof(df2.sort_values(by='student_joining_date').reset_index(),
df1.sort_values(by='instructor_joining_date'),
left_on='student_joining_date',
right_on='instructor_joining_date',
by='class')
.set_index('index').reindex(df2.index)
.drop(columns='instructor_joining_date')
)
Output:
class count student_joining_date teacher age
0 A 1 2024-05-17 01:05:58.072 john 45
1 A 50 2024-04-10 10:39:06.608 mark 50
2 A 75 2024-04-05 09:49:07.246 mark 50
Intermediate before dropping the instructor_joining_date
column:
class count student_joining_date teacher age instructor_joining_date
0 A 1 2024-05-17 01:05:58.072 john 45 2024-05-08 05:31:21.379
1 A 50 2024-04-10 10:39:06.608 mark 50 2024-01-20 07:18:29.599
2 A 75 2024-04-05 09:49:07.246 mark 50 2024-01-20 07:18:29.599