Search code examples
pythonpandasdataframematchlookup

Python pandas two table match to find latest date


I want to do some matching in pandas like Vlookup in Excel. According to some conditions in Table1, find the latest date in Table2:

Table 1:

Name  Threshold1   Threshold2
A     9            8
B     14           13

Table 2:

Date   Name   Value   
1/1    A      10
1/2    A      9
1/3    A      9
1/4    A      8
1/5    A      8
1/1    B      15
1/2    B      14
1/3    B      14
1/4    B      13
1/5    B      13

The desired table is like:

Name  Threshold1   Threshold1_Date   Threshold2   Threshold2_Date
A     9            1/3               8            1/5
B     14           1/3               13           1/5

Thanks in advance!


Solution

  • Code

    # assuming dataframe is already sorted on `date`
    # drop the duplicates per Name and Value keeping the max date
    cols = ['Name', 'Value']
    s = df2.drop_duplicates(cols, keep='last').set_index(cols)['Date']
    
    # for each threshold column use multindex.map to substitute 
    # values from df2 based on matching Name and Threshold value
    for c in df1.filter(like='Threshold'):
        df1[c + '_date'] = df1.set_index(['Name', c]).index.map(s)
    

    Result

      Name  Threshold1  Threshold2 Threshold1_date Threshold2_date
    0    A           9           8             1/3             1/5
    1    B          14          13             1/3             1/5