Search code examples
pandasnumpygroup-byapplymulti-index

Calculating temporal and sptial gradients while using groupby in multi-index pandas dataframe


Say I have the following sample pandas dataframe of water content (i.e. "wc") values at specified depths along a column of soil:

import pandas as pd

df = pd.DataFrame([[1, 2,5,3,1], [1, 3, 5,3, 2], [4, 6, 6,3,1], [1, 2,5,3,1], [1, 3, 5,3, 2], [4, 6, 6,3,1]], columns=pd.MultiIndex.from_product([['wc'], [10, 20, 30, 45, 80]]))

df['model'] = [5,5, 5, 6,6,6]

df['time'] = [0, 1, 2,0, 1, 2]

df.set_index(['time', 'model'], inplace=True)

>> df
[Out]:
           wc            
           10 20 30 45 80
time model               
0    5      1  2  5  3  1
1    5      1  3  5  3  2
2    5      4  6  6  3  1
0    6      1  2  5  3  1
1    6      1  3  5  3  2
2    6      4  6  6  3  1

I would like to calulate the spatial (between columns) and temporal (between rows) gradients for each model "group" in the following structure:

           wc             temp_grad      spat_grad
           10 20 30 45 80 10 20 30 45 80 10 20 30 45 
time model               
0    5      1  2  5  3  1
1    5      1  3  5  3  2
2    5      4  6  6  3  1
0    6      1  2  5  3  1
1    6      1  3  5  3  2
2    6      4  6  6  3  1

My attempt involved writing a function first for the temporal gradients and combining this with groupby:

def temp_grad(df):
    temp_grad = np.gradient(df[('wc', 10.0)], df.index.get_level_values(0))
    return pd.Series(temp_grad, index=x.index)

df[('temp_grad', 10.0)] = (df.groupby(level = ['model'], group_keys=False)
                              .apply(temp_grad))

but I am not sure how to automate this to apply for all wc columns as well as navigate the multi-indexing issues.


Solution

  • Assuming the function you write is actually what you want, then for temp_grad, you can do at once all the columns in the apply. use np.gradient the same way you did in your function but specify along the axis=0 (rows). Built a dataframe with index and columns as the original data. For the spat_grad, I think the model does not really matter, so no need of the groupby, do np.gradient directly on df['wc'], and along the axis=1 (columns) this time. Built a dataframe the same way. To get the expected output, concat all three of them like:

    df = pd.concat([
        df['wc'], # original data
        # add the temp_grad
        df['wc'].groupby(level = ['model'], group_keys=False)
          .apply(lambda x: #do all the columns at once, specifying the axis in gradient
                 pd.DataFrame(np.gradient(x, x.index.get_level_values(0), axis=0), 
                              columns=x.columns, index=x.index)), # build a dataframe 
        # for spat, no need of groupby as it is row-wise operation
        # change the axis, and the values for the x
        pd.DataFrame(np.gradient(df['wc'], df['wc'].columns, axis=1),
                     columns=df['wc'].columns, index=df['wc'].index)
        ], 
        keys=['wc','temp_grad','spat_grad'],  # redefine the multiindex columns
        axis=1 # concat along the columns
    )
    

    and you get

    print(df)
               wc             temp_grad                     spat_grad       \
               10 20 30 45 80        10   20   30   45   80        10   20   
    time model                                                               
    0    5      1  2  5  3  1       0.0  1.0  0.0  0.0  1.0       0.1  0.2   
    1    5      1  3  5  3  2       1.5  2.0  0.5  0.0  0.0       0.2  0.2   
    2    5      4  6  6  3  1       3.0  3.0  1.0  0.0 -1.0       0.2  0.1   
    0    6      1  2  5  3  1       0.0  1.0  0.0  0.0  1.0       0.1  0.2   
    1    6      1  3  5  3  2       1.5  2.0  0.5  0.0  0.0       0.2  0.2   
    2    6      4  6  6  3  1       3.0  3.0  1.0  0.0 -1.0       0.2  0.1   
    
                                              
                      30        45        80  
    time model                                
    0    5      0.126667 -0.110476 -0.057143  
    1    5      0.066667 -0.101905 -0.028571  
    2    5     -0.080000 -0.157143 -0.057143  
    0    6      0.126667 -0.110476 -0.057143  
    1    6      0.066667 -0.101905 -0.028571  
    2    6     -0.080000 -0.157143 -0.057143