I have one dataframe, I need to filter the dates on the basis of start and end date of the other dataframe
example set is given below. What is the best way in pandas to achieve that?
Considering sample dataframes as below, I have included the expected result set
df1
ID all_date clicks
1 2019-08-21 5
1 2019-08-22 4
2 2019-07-18 5
2 2019-07-21 5
2 2019-07-23 6
df2
ID start_date end_date
1 2019-08-21 2019-08-21
2 2019-07-18 2019-08-21
expected output:
df1
ID all_date clicks
1 2019-08-21 5
2 2019-07-18 5
2 2019-07-21 5
You can melt
and then merge
:
First better to convert dates as datetimes just incase they are strings:
df1.all_date=pd.to_datetime(df1.all_date)
df2[['start_date','end_date']]=df2[['start_date','end_date']].apply(pd.to_datetime)
df1.merge(df2.melt('ID',value_name='all_date').drop('variable',1),on=['ID','all_date'])
ID all_date clicks
0 1 2019-08-21 5
1 1 2019-08-21 5
2 2 2019-07-18 5
Details, after melt
you get:
df2.melt('ID',value_name='all_date').drop('variable',1)
ID all_date
0 1 2019-08-21
1 2 2019-07-18
2 1 2019-08-21
3 2 2019-08-21
Then join on 'ID','all_date'
on both the dataframes