Pandas Calculate Monthly Variability by Row


I'm trying to create a pandas df of monthly deviations (variability) based on a df with monthly (Jan - Dec) averages for a set of years (1991-1993). I have a df, "data", that is 3 x 13 and looks like this, with a first column "year" followed by the month data, i.e., "01" is Jan, etc.

month  year  01  02  03  04  05  06  07  08  09  10  11  12
year                                                                      
0      1991  2   6   9   11  3   5   8   9   0   10  12  7
1      1992  3   4   1   17  5   8   9   1   2   18  19  3
2      1993  6   6   2   10  5   3   8   3   4   17  20  4

I also have a df with the monthly averages, call it "monthly_averages", that looks like this:

month  01   02   03   04   05   06   07   08   09   10   11   12 
0      3    4    5    13   4    6    8    5    2    13   14   5

My result should be a df with 3 rows and 13 cols showing the deviations for each year (1991, 1992, 1993): the year column plus a column for every month. The best solution I've found is the one below, but it only gives me the deviation for the first row, first column, with NaN for the rest of the values: (data.div(monthly_averages[['01']], axis=0))-1 Thank you in advance for your insight.


Solution

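  • If I understand your description correctly, you may try the line below. Dividing by the Series monthly_averages.loc[0] with axis=1 matches its labels against the columns of data; "year" has no match and becomes NaN, which fillna(data) then restores from the original frame.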

    df_final = (data.div(monthly_averages.loc[0], axis=1) - 1).fillna(data)
    
    Out[234]:
             01   02   03        04    05        06     07   08   09        10  \
    0 -0.333333  0.5  0.8 -0.153846 -0.25 -0.166667  0.000  0.8 -1.0 -0.230769
    1  0.000000  0.0 -0.8  0.307692  0.25  0.333333  0.125 -0.8  0.0  0.384615
    2  1.000000  0.5 -0.6 -0.230769  0.25 -0.500000  0.000 -0.4  1.0  0.307692
    
             11   12    year
    0 -0.142857  0.4  1991.0
    1  0.357143 -0.4  1992.0
    2  0.428571 -0.2  1993.0
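
For anyone who wants to reproduce this, here is a minimal self-contained sketch that rebuilds the two sample frames from the question. It is a variation on the answer rather than the answerer's exact code: it moves "year" into the index before dividing, so the division never touches it and no fillna is needed, and "year" ends up as the first column. The names months, deviations and result are introduced here for illustration only.

```python
import pandas as pd

# Month labels "01".."12", matching the column names in the question.
months = [f"{m:02d}" for m in range(1, 13)]

# Sample data copied from the question: one row per year.
data = pd.DataFrame(
    [
        [1991, 2, 6, 9, 11, 3, 5, 8, 9, 0, 10, 12, 7],
        [1992, 3, 4, 1, 17, 5, 8, 9, 1, 2, 18, 19, 3],
        [1993, 6, 6, 2, 10, 5, 3, 8, 3, 4, 17, 20, 4],
    ],
    columns=["year"] + months,
)

# Single-row frame of monthly averages, also copied from the question.
monthly_averages = pd.DataFrame(
    [[3, 4, 5, 13, 4, 6, 8, 5, 2, 13, 14, 5]], columns=months
)

# Keep "year" out of the arithmetic by making it the index, then divide
# every row by the Series of averages (aligned on column labels).
deviations = data.set_index("year")[months].div(monthly_averages.loc[0], axis=1) - 1

# Bring "year" back as the first column.
result = deviations.reset_index()
print(result)
```

As for the attempt in the question: monthly_averages[['01']] is a one-column DataFrame, and dividing one DataFrame by another aligns on both row and column labels. Only row 0, column "01" exists in both, so every other cell comes back as NaN.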