Tags: python, pandas, dataframe, calculated-columns

Pandas - Create Column on the fly based on condition


Suppose I have a data frame like the one below:

+------------+-------+
|    Date    | Price |
+------------+-------+
| 2021-08-25 |    30 |
| 2021-08-24 |    20 |
| 2021-08-23 |    50 |
| 2021-08-20 |    10 |
| 2021-08-19 |    24 |
| 2021-08-18 |    23 |
| 2021-08-17 |    22 |
| 2021-08-16 |    10 |
+------------+-------+

The data frame above can be generated with the code below:

import pandas as pd

data = {'Date':['2021-08-25', '2021-08-24', '2021-08-23', '2021-08-20',
                '2021-08-19', '2021-08-18', '2021-08-17', '2021-08-16'],
        'Price':[30, 20, 50, 10, 24, 23, 22, 10]}
df = pd.DataFrame(data)

I want to create a column Weight on the fly based on a scalar phi. Suppose phi = 0.95. The weight at the most recent date t is 1 - phi, so at 2021-08-25 the weight is 0.05. For every earlier date, the weight is the weight of the next (more recent) row multiplied by phi, i.e. W_t = W_t+1 * phi. So for 2021-08-24 the weight is 0.05 * 0.95 = 0.0475.

Expected Output

+------------+-------+-------------+
|    Date    | Price |   Weight    |
+------------+-------+-------------+
| 2021-08-25 |    30 |        0.05 |
| 2021-08-24 |    20 |      0.0475 |
| 2021-08-23 |    50 |    0.045125 |
| 2021-08-20 |    10 |  0.04286875 |
| 2021-08-19 |    24 | 0.040725313 |
| 2021-08-18 |    23 | 0.038689047 |
| 2021-08-17 |    22 | 0.036754595 |
| 2021-08-16 |    10 | 0.034916865 |
+------------+-------+-------------+

What would be a vectorized approach to create the Weight column on the fly?


Solution

  • Going by the example output values given:

    import numpy as np

    phi = 0.95
    df['Weight'] = (1 - phi) * phi ** np.arange(len(df))
    
             Date  Price    Weight
    0  2021-08-25     30  0.050000
    1  2021-08-24     20  0.047500
    2  2021-08-23     50  0.045125
    3  2021-08-20     10  0.042869
    4  2021-08-19     24  0.040725
    5  2021-08-18     23  0.038689
    6  2021-08-17     22  0.036755
    7  2021-08-16     10  0.034917
    

    (The output values are shown rounded to six decimal places, which is pandas' default display precision; the stored values are not rounded.)
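
The one-liner works because unrolling the rule from the question (start at 1 - phi, then multiply by phi once per row) gives (1 - phi) * phi**k for the row k steps below the most recent date. The sketch below is only a sanity check, not part of the original answer: it rebuilds the data frame from the question, computes the weights in closed form, and compares them with a plain loop that follows the rule literally. The names loop_weights and weights_match are made up for illustration, and the sketch assumes the rows are already sorted with the most recent date first, as in the question.

```python
import numpy as np
import pandas as pd

phi = 0.95  # decay factor from the question

df = pd.DataFrame({
    'Date': ['2021-08-25', '2021-08-24', '2021-08-23', '2021-08-20',
             '2021-08-19', '2021-08-18', '2021-08-17', '2021-08-16'],
    'Price': [30, 20, 50, 10, 24, 23, 22, 10],
})

# Closed form: row k (k = 0 is the most recent date) gets (1 - phi) * phi**k.
df['Weight'] = (1 - phi) * phi ** np.arange(len(df))

# Reference loop that follows the rule literally:
# the first weight is 1 - phi, each later row is the previous weight times phi.
loop_weights = [1 - phi]
for _ in range(len(df) - 1):
    loop_weights.append(loop_weights[-1] * phi)

# True when both approaches agree up to floating-point tolerance.
weights_match = np.allclose(df['Weight'], loop_weights)
```

If the rows were not already in descending date order, they would need to be sorted first (for example with df.sort_values('Date', ascending=False)), since the closed form depends only on row position.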