python-3.x pandas dataframe data-analysis data-processing

Filter rows of 1st Dataframe from the 2nd Dataframe having different starting dates

I have two dataframes from which a new dataframe has to be created. The first one is given below.

data = {'ID':['A', 'A', 'A', 'A', 'A', 'B','B','B','B', 'C','C','C','C','C','C', 'D','D','D'],
    'Date':['2021-2-13', '2021-2-14', '2021-2-15', '2021-2-16', '2021-2-17', '2021-2-16', '2021-2-17', '2021-2-18', '2021-2-19',
           '2021-2-12', '2021-2-13', '2021-2-14', '2021-2-15', '2021-2-16','2021-2-17', '2021-2-14', '2021-2-15', '2021-2-16'],
   'Steps': [1000, 1200, 1500, 2000, 1400, 4000,3400, 5000,1000, 3500,4000,5000,5300,2000,3500, 5000,5500,5200 ]}
df1 = pd.DataFrame(data)

df1

The image of this is also attached.

The 2nd dataframe contains the starting date of each participant as given and shown below.

data1 = {'ID':['A', 'B', 'C', 'D'],
    'Date':['2021-2-15', '2021-2-17', '2021-2-16', '2021-2-15']}
df2 = pd.DataFrame(data1)

df2

The snippet of it is given below.

Now, the resulting dataframe have to be such that for each participant in the Dataframe1, the rows have to start from the dates given in the 2nd Dataframe. The rows prior to that starting date have to be deleted.

The final dataframe as in how it should look is given below.

Any help is greatly appreciated. Thanks

Solution

You can use .merge + boolean-indexing:

df1["Date"] = pd.to_datetime(df1["Date"])
df2["Date"] = pd.to_datetime(df2["Date"])

x = df1.merge(df2, on="ID", suffixes=("", "_y"))
print(x.loc[x.Date >= x.Date_y, df1.columns].reset_index(drop=True))

Prints:

  ID       Date  Steps
0  A 2021-02-15   1500
1  A 2021-02-16   2000
2  A 2021-02-17   1400
3  B 2021-02-17   3400
4  B 2021-02-18   5000
5  B 2021-02-19   1000
6  C 2021-02-16   2000
7  C 2021-02-17   3500
8  D 2021-02-15   5500
9  D 2021-02-16   5200

Or: If some ID is missing in df2:

x = df1.merge(df2, on="ID", suffixes=("", "_y"), how="outer").fillna(pd.Timestamp(0))
print(x.loc[x.Date >= x.Date_y, df1.columns].reset_index(drop=True))