Search code examples
pandasdataframeregressioninterpolation

Interpolate arbitrary value between columns with Pandas


I have records containing a 'DaysFrom' column, and additional columns containing values for a standard set of 'DaysFrom':

'Date'      'DaysFrom'  '90'    '180'   '360'
2023-11-04   135         4       5       12
2023-11-05   360         3       5       20
2023-11-06   270         5       7       15
2023-11-07   180         6       3       10

I want to calculate a linear interpolation ('IntVal') for the value in the 'DaysFrom' column:

'Date'      'DaysFrom'  '90'    '180'   '360'   'IntVal'
2023-11-04   135         4       5       12      4.5
2023-11-05   360         3       5       20      20
2023-11-06   270         5       7       15      11
2023-11-07   180         6       3       10      3

For ease of calculation the above example only uses 'DaysFrom' values that fall on a column or halfway between two columns, but 'DaysFrom' can be any arbitrary value between 90 and 360.


Solution

  • If you want to interpolate using all values (in case they are not all on a straight line), you can loop over the rows with numpy.interp:

    import numpy as np
    
    cols = df.columns[2:]
    tmp = df[cols]
    
    df['IntVal'] = [np.interp(df.loc[i, 'DaysFrom'], cols, tmp.loc[i])
                    for i in df.index]
    

    Output:

             Date  DaysFrom  90  180  360  IntVal
    0  2023-11-04       135   4    5   12     4.5
    1  2023-11-05       360   3    5   20    20.0
    2  2023-11-06       270   5    7   15    11.0
    3  2023-11-07       180   6    3   10     3.0