
Python Resample: How do I keep NaN as NaN?


When I resample monthly data to quarterly, I want the last quarter, which contains a NaN, to remain NaN. How should I tweak my code?

Thank you

     HS6P1

Jan1989 69.9
Feb1989 59.3
Mar1989 83.5
Apr1989 100.4
May1989 101.4
Jun1989 100.3
Jul1989 98
Aug1989 91.7
Sep1989 82.4
Oct1989 91.3
Nov1989 72.6
Dec1989 NaN


import pandas as pd

df = pd.read_excel(input_file, sheet_name='Sheet1', usecols='A:D', na_values='ND', index_col=0, header=0)
df.index.names = ['Period']
df.index = pd.to_datetime(df.index)

q0 = pd.Series(df['HS6P1'], index=df.index)

m1 = q0.resample('Q').sum()

Current Output
Period
1989-03-31 212.7
1989-06-30 302.1
1989-09-30 272.1
1989-12-31 163.9

Desired Output
Period
1989-03-31 212.7
1989-06-30 302.1
1989-09-30 272.1
1989-12-31 NaN


Solution

  • You can try it like this. Note that if there are NaNs in other quarters, the sums for those quarters will be NaN as well, because NaN propagates through arithmetic: np.nan + 1 evaluates to nan.

    res = q0.resample('Q').apply(lambda x: np.sum(x.values))
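
To illustrate why the plain `.sum()` in the question returns 163.9 while the `np.sum(x.values)` version returns NaN, here is a minimal sketch using only the fourth-quarter values from the question. The month-end dates in the index are an assumption made for illustration.

```python
import numpy as np
import pandas as pd

# Fourth-quarter values from the question; December is missing.
# The exact dates are assumed for illustration.
q4 = pd.Series(
    [91.3, 72.6, np.nan],
    index=pd.to_datetime(["1989-10-31", "1989-11-30", "1989-12-31"]),
)

skipped = q4.sum()              # pandas skips NaN by default (skipna=True)
propagated = np.sum(q4.values)  # NumPy on the raw array lets NaN propagate
strict = q4.sum(skipna=False)   # pandas can be asked to propagate NaN too

print(skipped, propagated, strict)
```

The first result is the 163.9 seen in the current output (up to floating-point rounding); the other two are NaN.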
    

    Another option, which may or may not fit your case, is the min_count=3 parameter. A quarter has three monthly values, so if any of them is missing, the sum for that quarter is NaN.

    m1 = q0.resample('Q').sum(min_count=3)
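
As a rough check of the `min_count=3` approach, the sketch below rebuilds the series from the question's table rather than reading the Excel file. The month-start dates are an assumption, and `pd.offsets.QuarterEnd(startingMonth=12)` is used in place of the 'Q' alias only because newer pandas versions rename that alias to 'QE'.

```python
import numpy as np
import pandas as pd

# Monthly values copied from the question; December 1989 is missing.
values = [69.9, 59.3, 83.5, 100.4, 101.4, 100.3,
          98.0, 91.7, 82.4, 91.3, 72.6, np.nan]
index = pd.date_range("1989-01-01", periods=12, freq="MS")  # assumed dates
q0 = pd.Series(values, index=index, name="HS6P1")

# Offset object equivalent to the calendar-quarter alias.
quarter_end = pd.offsets.QuarterEnd(startingMonth=12)

# A quarter gets a sum only if all three months are present.
m1 = q0.resample(quarter_end).sum(min_count=3)
print(m1)
```

Keep in mind that this turns every quarter with a missing month into NaN, including a partial quarter at the start or end of the data.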
    

    If you need NaN only in the last quarter, and only when that quarter contains at least one missing value:

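    def my_func(x):
        # Return the quarter's sum and a flag saying whether any month is missing
        return [x.sum(), np.isnan(x).any()]
    
    qqq = q0.resample('Q').apply(my_func)
    
    # Use .iloc for positional access; plain [-1] on a datetime index is ambiguous
    if qqq.iloc[-1][1]:
        qqq.iloc[-1][0] = np.nan
    
    # Keep only the sums
    qqq = pd.Series(qqq.str[0], index=qqq.index)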
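
The same last-quarter-only rule can also be sketched without storing lists inside the Series, by counting missing months per quarter separately. This is an alternative to the snippet above, not the answer's original method. The month-start dates and the missing February value are assumptions added purely to show that earlier quarters keep their partial sums.

```python
import numpy as np
import pandas as pd

# Monthly values copied from the question; December 1989 is missing.
values = [69.9, 59.3, 83.5, 100.4, 101.4, 100.3,
          98.0, 91.7, 82.4, 91.3, 72.6, np.nan]
index = pd.date_range("1989-01-01", periods=12, freq="MS")  # assumed dates
q0 = pd.Series(values, index=index, name="HS6P1")

# Hypothetical extra gap, NOT in the question's data, for demonstration only.
q0.loc[pd.Timestamp("1989-02-01")] = np.nan

quarter_end = pd.offsets.QuarterEnd(startingMonth=12)

sums = q0.resample(quarter_end).sum()               # NaN-skipping sums
nan_counts = q0.isna().resample(quarter_end).sum()  # missing months per quarter

result = sums.copy()
if nan_counts.iloc[-1] > 0:
    # Only the final quarter is blanked out when it is incomplete.
    result.iloc[-1] = np.nan

print(result)
```

Here the first quarter still reports the sum of its two available months, while the last quarter becomes NaN.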