
Pandas DataFrame: interpolate inside with a constant value


How can I turn this DataFrame:

[In1]:  import numpy as np
        import pandas as pd

        df = pd.DataFrame({
            'col1': [100, np.nan, np.nan, 100, np.nan, np.nan, np.nan],
            'col2': [np.nan, 100, np.nan, np.nan, np.nan, 100, np.nan]})
        df

[Out1]:       col1    col2
        0      100     NaN
        1      NaN     100
        2      NaN     NaN
        3      100     NaN
        4      NaN     NaN
        5      NaN     100
        6      NaN     NaN

into:

[Out2]:       col1    col2
        0      100     NaN
        1        0     100
        2        0       0
        3      100       0
        4      NaN     NaN
        5      NaN     100
        6      NaN     NaN

So basically I want to fill NaNs with zero only in the inside area (between the first and last valid values of each column), with limit=2. Note that col2 has three consecutive NaNs in the middle, and only the first two are replaced with zero.


Solution

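  • You can build two masks: one that identifies the non-NA values, and one that identifies the inner area (a forward cummax combined with a reversed cummax). Then forward-fill zeros over at most 2 rows and keep them only in the inner area: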

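    # True where a value is present
    m = df.notna()
    # True between the first and last valid value of each column
    m2 = m.cummax() & m[::-1].cummax()
    
    # zeros at the valid positions, forward-filled over at most 2 NaNs, kept only inside
    out = df.fillna(df.mask(m, 0).ffill(limit=2).where(m2))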
    

    Or with interpolate:

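    m = df.notna()
    
    # zeros at the valid positions, NaN elsewhere; limit_area='inside' restricts
    # filling to NaNs surrounded by valid values
    out = df.fillna(df.mask(m, 0).interpolate(limit=2, limit_area='inside'))
    
    # or, if all columns are numeric (value * 0 is 0, NaN * 0 stays NaN)
    out = df.fillna(df.mul(0).interpolate(limit=2, limit_area='inside'))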
    

    Output:

        col1   col2
    0  100.0    NaN
    1    0.0  100.0
    2    0.0    0.0
    3  100.0    0.0
    4    NaN    NaN
    5    NaN  100.0
    6    NaN    NaN
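
For reference, the mask-based approach above can be wrapped into a small self-contained script. This is only a sketch: the helper name `fill_inside` and its `value` and `limit` parameters are hypothetical, not part of pandas, and the reversed cummax is flipped back explicitly with `.iloc[::-1]` rather than relying on index alignment. The script also checks that the mask-based result agrees with the `interpolate(limit_area='inside')` variant on the sample data.

```python
import numpy as np
import pandas as pd


def fill_inside(df, value=0, limit=2):
    """Hypothetical helper: fill at most `limit` consecutive NaNs with
    `value`, only between the first and last valid entry of each column."""
    valid = df.notna()
    # True from the first valid value onward AND up to the last valid value
    inside = valid.cummax() & valid.iloc[::-1].cummax().iloc[::-1]
    # Helper frame: `value` at valid positions, NaN elsewhere; forward-fill it
    # over at most `limit` rows, then discard everything outside the inner area
    filler = df.mask(valid, value).ffill(limit=limit).where(inside)
    # Only the NaNs of the original frame are taken from the helper frame
    return df.fillna(filler)


df = pd.DataFrame({
    'col1': [100, np.nan, np.nan, 100, np.nan, np.nan, np.nan],
    'col2': [np.nan, 100, np.nan, np.nan, np.nan, 100, np.nan],
})

out = fill_inside(df)

# Same result via interpolate, as in the second variant of the answer
out_interp = df.fillna(
    df.mask(df.notna(), 0).interpolate(limit=2, limit_area='inside')
)

print(out)
print(out.equals(out_interp))
```

Because the helper frame only ever contains the constant (or NaN), linear interpolation between two constants yields that same constant, which is why both variants should agree when the fill value is a single number.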