python pandas pandas-resample

How to prevent resample -> aggregate from dropping columns?


Code

import pandas as pd

df = pd.DataFrame(
    data = {'A': [1, 1, 2], 'B': [None, None, None]},
    index = pd.DatetimeIndex([
        '1990-01-01 00:00:00',
        '1990-01-01 12:00:00',
        '1990-01-02 12:00:00'
    ])
)
print(df.resample('1d').aggregate('mean'))

Output

              A
1990-01-01  1.0
1990-01-02  2.0

Desired output

              A     B
1990-01-01  1.0  None 
1990-01-02  2.0  None 

I don't care whether column B of the output contains None, np.nan or pd.NA; the problem is that B is dropped.


Solution

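  • resample drops non-numeric columns when a numeric aggregation such as mean is applied. Here B holds only None, so its dtype is object and it is dropped. You can restore the missing columns by reindexing after the aggregation: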

    df.resample('1d').aggregate('mean').reindex(df.columns, axis=1)
    

    output:

                  A   B
    1990-01-01  1.0 NaN
    1990-01-02  2.0 NaN
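
The reindex approach above can be wrapped in a small helper. This is only a sketch: the function name resample_mean_keep_columns is hypothetical, and selecting the numeric columns before aggregating is an added assumption, not part of the original answer. It is there because, as far as I know, newer pandas releases (2.0 and later) may raise on object columns instead of silently dropping them.

```python
import pandas as pd


def resample_mean_keep_columns(df: pd.DataFrame, rule: str) -> pd.DataFrame:
    """Hypothetical helper: resample and take the mean, keeping every column."""
    # Aggregate only the numeric columns, so object columns can neither be
    # silently dropped nor cause an error during the mean.
    numeric = df.select_dtypes(include="number")
    aggregated = numeric.resample(rule).mean()
    # Restore the original column order; columns that were not aggregated
    # come back filled with NaN.
    return aggregated.reindex(columns=df.columns)


df = pd.DataFrame(
    data={"A": [1, 1, 2], "B": [None, None, None]},
    index=pd.DatetimeIndex([
        "1990-01-01 00:00:00",
        "1990-01-01 12:00:00",
        "1990-01-02 12:00:00",
    ]),
)
result = resample_mean_keep_columns(df, "1D")
print(result)
```

The result has both A and B, with B filled with NaN, which matches the desired output in the question.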