
Python - Pandas: how can I interpolate between values that grow exponentially?


I have a Pandas Series that contains the price evolution of a product (my country has high inflation), or say, the number of people infected with coronavirus in a certain country. The values in both of these datasets grow exponentially; that means that if you had something like [3, NaN, 27] you'd want to interpolate so that the missing value is filled with 9 in this case. I checked the interpolation method in the Pandas documentation but, unless I missed something, I didn't find anything about this type of interpolation.

I can do it manually: you just take the geometric mean or, when more than one value is missing, get the average growth rate by doing (final value / initial value)^(1 / distance between them) and then multiply accordingly. But there are a lot of values to fill in my Series, so how do I do this automatically? I guess I'm missing something, since this seems to be something very basic.
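To make the manual approach concrete, here is a minimal sketch. The endpoint values and the variable names (`start`, `end`, `steps`) are made up for illustration; they are not taken from a real dataset.

```python
# Known endpoints with two missing values in between: [3, NaN, NaN, 81]
start, end = 3.0, 81.0
steps = 3  # distance (number of steps) between the two known values

# Average growth rate per step: (final value / initial value) ** (1 / distance)
rate = (end / start) ** (1 / steps)

# Multiply the starting value by the rate once per step to fill the gap.
filled = [start * rate ** k for k in range(steps + 1)]
print(filled)  # approximately [3, 9, 27, 81], up to floating-point rounding
```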


Solution

  • You could take the logarithm of your series, interpolate linearly, and then transform it back to your exponential scale.

    import pandas as pd
    import numpy as np
    
    # Build an exponentially growing series and blank out one value.
    arr = pd.Series(np.exp(np.arange(1, 10)))
    arr[3] = np.nan
    print(arr)
    
    # 0       2.718282
    # 1       7.389056
    # 2      20.085537
    # 3            NaN
    # 4     148.413159
    # 5     403.428793
    # 6    1096.633158
    # 7    2980.957987
    # 8    8103.083928
    # dtype: float64
    
    arr = np.log(arr) # Transform according to assumed process.
    arr = arr.interpolate('linear') # Interpolate.
    arr = np.exp(arr) # Invert previous transformation.
    print(arr)
    
    # 0       2.718282
    # 1       7.389056
    # 2      20.085537
    # 3      54.598150
    # 4     148.413159
    # 5     403.428793
    # 6    1096.633158
    # 7    2980.957987
    # 8    8103.083928
    # dtype: float64
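The same log, interpolate, exp steps can be wrapped in a small helper and checked against the [3, NaN, 27] example from the question. This is only a sketch: `interpolate_geometric` is a hypothetical name, not a pandas function, and the positivity check and the `limit_area="inside"` choice (fill only gaps that have known values on both sides) are assumptions about what is wanted here.

```python
import numpy as np
import pandas as pd


def interpolate_geometric(series: pd.Series) -> pd.Series:
    """Fill NaNs assuming a constant growth rate between known values.

    Hypothetical helper, not part of pandas. The logarithm is only
    defined for strictly positive values, so anything else is rejected.
    """
    if (series.dropna() <= 0).any():
        raise ValueError("log-linear interpolation needs strictly positive values")
    logged = np.log(series)
    # limit_area="inside" leaves leading/trailing NaNs untouched (assumed preference).
    filled = logged.interpolate(method="linear", limit_area="inside")
    return np.exp(filled)


prices = pd.Series([3.0, np.nan, 27.0])
filled = interpolate_geometric(prices)
print(filled)  # the middle value comes out as 9, up to floating-point rounding
```

Note that `method="linear"` treats the values as equally spaced and ignores the index. If the Series has a DatetimeIndex with irregular gaps, `method="time"` may be the better fit for the interpolation step.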