python python-3.x pandas pandasql

How to filter the dates in one pandas dataframe based on another dataframe


I have one dataframe whose rows I need to filter by date, based on the start and end dates in another dataframe.

What is the best way to do this in pandas?

The sample dataframes and the expected result are given below.

df1 

ID  all_date    clicks
1   2019-08-21  5
1   2019-08-22  4
2   2019-07-18  5
2   2019-07-21  5
2   2019-07-23  6

df2

ID  start_date  end_date
1   2019-08-21  2019-08-21
2   2019-07-18  2019-08-21

expected output:

df1

ID  all_date    clicks
1   2019-08-21  5
2   2019-07-18  5
2   2019-07-21  5

Solution

  • You can melt and then merge:

    First, convert the date columns to datetimes in case they are stored as strings:

    import pandas as pd

    df1['all_date'] = pd.to_datetime(df1['all_date'])
    df2[['start_date','end_date']] = df2[['start_date','end_date']].apply(pd.to_datetime)

    Then melt df2 and merge the result with df1:

    df1.merge(df2.melt('ID', value_name='all_date').drop(columns='variable'), on=['ID','all_date'])
    

       ID   all_date  clicks
    0   1 2019-08-21       5
    1   1 2019-08-21       5
    2   2 2019-07-18       5
    

    Details: after the melt, df2 looks like this:

    df2.melt('ID', value_name='all_date').drop(columns='variable')
    
       ID   all_date
    0   1 2019-08-21
    1   2 2019-07-18
    2   1 2019-08-21
    3   2 2019-08-21
    

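    Then merge the two dataframes on 'ID' and 'all_date'. ID 1 appears twice in the result because its start_date and end_date are the same day, so the melted frame contains that (ID, all_date) pair twice.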
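
The melt-and-merge above keeps only rows whose date equals a start_date or an end_date. If the intent is instead to keep every row whose date falls inside the start/end window (inclusive), a minimal sketch could look like the following. It is an alternative to the answer's method, not a restatement of it, and the names `merged`, `in_range` and `filtered` are only illustrative. Note that with df2 exactly as posted, the window for ID 2 runs to 2019-08-21, so this also keeps the 2019-07-23 row, which the expected output in the question leaves out.

```python
import pandas as pd

# Sample data copied from the question.
df1 = pd.DataFrame({
    "ID": [1, 1, 2, 2, 2],
    "all_date": pd.to_datetime(
        ["2019-08-21", "2019-08-22", "2019-07-18", "2019-07-21", "2019-07-23"]
    ),
    "clicks": [5, 4, 5, 5, 6],
})
df2 = pd.DataFrame({
    "ID": [1, 2],
    "start_date": pd.to_datetime(["2019-08-21", "2019-07-18"]),
    "end_date": pd.to_datetime(["2019-08-21", "2019-08-21"]),
})

# Attach each ID's start/end window to its rows in df1.
merged = df1.merge(df2, on="ID", how="inner")

# Boolean mask: True where all_date lies within [start_date, end_date].
in_range = merged["all_date"].between(merged["start_date"], merged["end_date"])

# Keep matching rows and only the original df1 columns.
filtered = merged.loc[in_range, df1.columns].reset_index(drop=True)
print(filtered)
```

This assumes one window per ID in df2; if an ID had several windows, a row matching more than one of them would appear once per match.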