Search code examples
pandasdataframemergemulti-indexdatetimeoffset

Merge multiindex pandas dataframes on a lagged datetimeindex


I am having difficulty:

  1. Merging two pandas multiindex dataframes
  2. Passing an offset to a datetimeindex in the merge

df1:

date        ID   Attr_1
12/31/2010  13   A
             9   B
             1   C
1/31/2011   13   D
             9   E
             1   F

df2:

date        ID   Attr_2
12/31/2010  13  -0.124409
             9   0.555959
             1  -0.705634
1/31/2011   13   0.471078
             9   0.276006
             1  -0.468463

A one-month lagged Attr_2 should be merged to df1. Desired output:

date        ID   Attr_1  Attr_2
12/31/2010  13   A       nan
             9   B       nan
             1   C       nan
1/31/2011   13   D      -0.124409             
             9   E       0.555959
             1   F      -0.705634

I would like to use:

pd.merge(df1, df2['Attr_2'], left_on = ['date', 'ID'], 
         right_on = [df2.index.date - pd.DateOffset(months = 1), 'ID'],
         how='left')

I cannot join on 'date' as it is an index in both dfs. I also cannot subtract the offset to the df2 right_on as I have done.

I have tried unsuccessfully resetting the index in both dfs, and then applying the offset to the right_on parameter.


Solution

  • This involve reconstruct your index

    df1.index = pd.MultiIndex.from_tuples([(pd.to_datetime(x[0]), x[1]) for x in df1.index])
    df2.index = pd.MultiIndex.from_tuples([(pd.to_datetime(x[0])+ pd.DateOffset(months = 1), x[1]) for x in df2.index])
    
    df1.join(df2)
    Out[413]: 
                  Attr_1    Attr_2
    2010-12-31 13      A       NaN
               9       B       NaN
               1       C       NaN
    2011-01-31 13      D -0.124409
               9       E  0.555959
               1       F -0.705634