Weights in Panel Data - Pandas


I have a pandas DataFrame that is indexed by monthly dates and contains 1's and 0's.

For each date (row), I would like to sum the row and then divide each data point by that sum in order to determine weights over time (if the datapoint is zero it should remain zero).

My data looks like this:

                A       B       C       D      
01-2001         1       1       0       1
02-2001         1       0       0       1
03-2001         1       0       0       1
04-2001         1       1       1       1
05-2001         1       1       1       0

The output should look like this:

                A       B       C       D      
01-2001         0.33    0.33    0       0.33
02-2001         0.5     0       0       0.5
03-2001         0.5     0       0       0.5
04-2001         0.25    0.25    0.25    0.25
05-2001         0.33    0.33    0.33    0

Can anyone help?


Solution

  • You can use div with sum, dividing each row by its row total:

    print(df.div(df.sum(axis=1), axis=0))
                    A         B         C         D
    01-2001  0.333333  0.333333  0.000000  0.333333
    02-2001  0.500000  0.000000  0.000000  0.500000
    03-2001  0.500000  0.000000  0.000000  0.500000
    04-2001  0.250000  0.250000  0.250000  0.250000
    05-2001  0.333333  0.333333  0.333333  0.000000
    

    If you need rounded values, chain round:

    print(df.div(df.sum(axis=1), axis=0).round(2))
                A     B     C     D
    01-2001  0.33  0.33  0.00  0.33
    02-2001  0.50  0.00  0.00  0.50
    03-2001  0.50  0.00  0.00  0.50
    04-2001  0.25  0.25  0.25  0.25
    05-2001  0.33  0.33  0.33  0.00
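
For completeness, here is a self-contained sketch of the same approach that rebuilds the question's data and also covers one case the sample does not contain: a date where every column is 0. That extra row (`06-2001`) and the names `row_sums`, `weights`, `df_with_empty` and `weights_filled` are assumptions added for illustration. Dividing 0 by 0 gives NaN in pandas, so `fillna(0)` is one possible way to keep such rows at zero weight.

```python
import pandas as pd

# Sample data from the question; index labels are kept as plain strings.
df = pd.DataFrame(
    {
        "A": [1, 1, 1, 1, 1],
        "B": [1, 0, 0, 1, 1],
        "C": [0, 0, 0, 1, 1],
        "D": [1, 1, 1, 1, 0],
    },
    index=["01-2001", "02-2001", "03-2001", "04-2001", "05-2001"],
)

# Row totals: one value per date.
row_sums = df.sum(axis=1)

# Divide each row by its own total; axis=0 aligns row_sums on the row index.
weights = df.div(row_sums, axis=0)
print(weights.round(2))

# Hypothetical edge case (not in the question's data): a date with no 1s.
df_with_empty = df.copy()
df_with_empty.loc["06-2001"] = 0

# 0 / 0 yields NaN for that row, so replace NaN with 0 afterwards.
weights_filled = (
    df_with_empty.div(df_with_empty.sum(axis=1), axis=0).fillna(0)
)
print(weights_filled.round(2))
```

Whether an all-zero row should become zeros or stay NaN depends on how the weights are used downstream; leaving NaN in place makes the missing allocation easier to spot.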