I have two pandas
dataframes with economic data. Second dataframe contains data with higher granularity.
import datetime
import pandas as pd
df1 = pd.DataFrame(data=[2.88,
2.70],
index=[datetime.date(2020,11,1),
datetime.date(2020,12,1)])
df2 = pd.DataFrame(data=[0.88,
0.87,
0.90,
0.78,
0.79,
0.83,
0.96,
0.98,
0.89,
0.91,
0.87,
0.88,
0.86,
0.83,
0.86,
0.88,
0.88,
0.84,
0.84],
index=[datetime.date(2020,10,30),
datetime.date(2020,11,2),
datetime.date(2020,11,3),
datetime.date(2020,11,4),
datetime.date(2020,11,5),
datetime.date(2020,11,9),
datetime.date(2020,11,10),
datetime.date(2020,11,12),
datetime.date(2020,11,13),
datetime.date(2020,11,16),
datetime.date(2020,11,17),
datetime.date(2020,11,18),
datetime.date(2020,11,19),
datetime.date(2020,11,20),
datetime.date(2020,11,23),
datetime.date(2020,11,24),
datetime.date(2020,11,25),
datetime.date(2020,11,27),
datetime.date(2020,11,30)])
I want to get df1 - df2
where rows with most similar date indices are subtracted, e.g.:
df1.loc[datetime.date(2020,11,1)][0] - df2.loc[datetime.date(2020,11,2)][0]
df1.loc[datetime.date(2020,12,1)][0] - df2.loc[datetime.date(2020,11,30)][0]
When working with date/time it is recommended to use Pandas' internal datetime type:
df1.index = pd.to_datetime(df1.index)
df2.index = pd.to_datetime(df2.index)
Then you can use merge_asof
:
tmp = pd.merge_asof(df1,df2,left_index=True, right_index=True, direction='nearest')
out = tmp['0_x'] - tmp['0_y']
Output:
2020-11-01 2.01
2020-12-01 1.86
dtype: float64