Python recalculate cumulative values of each row of multidimensional data set without looping?

I am working on complex calculations of a dataset that requires recalculating based on the sum of the previous rows. I'll try to articulate as best as I can.

I have an n x n dataset, I need to create a new n x n dataset that takes the value from the first record, subtracts the cumulative sum of that row up to that value, and then multiplies by a value from another 1 x n series.

For example: n x n Dataset :

i	0	1	2	3	4
0	0	0	0	0	0
1	0	0	0	0	0
2	200	0	0	0	0
3	200	150	0	0	0
4	200	150	125	0	0
5	200	150	125	100	0
6	200	150	125	100	75

1 x n Series :

i	0
0	.000
1	.001
2	.002
3	.003
4	.004
5	.005
6	.006

Result:

i	0	1	2	3	4
0	0	0	0	0	0
1	0	0	0	0	0
2	.4 [(200 - sum 0) * .002]	0	0	0	0
3	.5988 [(200 - sum .4) * .003]	.45 [(150 - sum 0) * .003]	0	0	0
4	.7960 [(200 - sum (.4, .5988)) * .004]	.5982 [(150 - sum .45) * .004]	.50	0	0
5	.9910 [(200 - sum (.4, .5988, .7960)) * .005]	.7447 [(150 - sum .45, .5982) * .005]	.6225	.50	0
6	1.1832 [(200 - sum (.4, .5988, .7960, .9910)) * .006]	.8892 [(150 - sum .45, .5982, .7447) * .006]	.74326	.5970	.450

I am good at using pandas or numpy to get this data. I am currently looping over each row and column and it is taking a really long time to calculate.

Any help is greatly appreciated.

This is code that works using looping.

import numpy as np
import pandas as pd


df_a =pd.DataFrame([[0,0,0,0,0,0,0],[0,0,0,0,0,0,0],[200,0,0,0,0,0,0],[200,150,0,0,0,0,0],[200,150,125,0,0,0,0],[200,150,125,100,0,0,0],[200,150,125,100,75,0,0]]).astype(float)

df_series = pd.Series([0.0,.001, .002, .003, .004, .005, .006])
df_b = pd.DataFrame(np.arange(49).reshape(7,7)).astype(float)

for i in range(df_a.shape[0]): #iterate over rows
    for j in range(df_a.shape[1]): #iterate over columns
        df_a_val = df_a.iloc[i, j] #source dataset value
        ser_val = df_series.iloc[i] #value of series at row
        df_b_prev = df_b.iloc[:i,j] #get series of previous values
        df_b_sum = df_b_prev.sum() #sum previous values
        
        
        df_b.at[i, j] = (df_a_val-df_b_sum)*(ser_val) #cell based calculation and set value
        print("(df_a_val: {} - df_b_sum: {}) * ser_val : {} = {}".format(df_a_val, df_b_sum, ser_val, df_b.at[i, j]))
print(df_series)
print(df_b)

Solution

You could use shift after broadcasted multiplication

a = (df_a > 0) * df_series.to_numpy()[:,None]
df_a * a * (1 - a).cumprod(0).shift(fill_value=0)

          0         1         2      3     4    5    6
0  0.000000  0.000000  0.000000  0.000  0.00  0.0  0.0
1  0.000000  0.000000  0.000000  0.000  0.00  0.0  0.0
2  0.400000  0.000000  0.000000  0.000  0.00  0.0  0.0
3  0.598800  0.450000  0.000000  0.000  0.00  0.0  0.0
4  0.796005  0.598200  0.500000  0.000  0.00  0.0  0.0
5  0.991026  0.744759  0.622500  0.500  0.00  0.0  0.0
6  1.183285  0.889242  0.743265  0.597  0.45  0.0  0.0

i	0	1	2	3	4
0	0	0	0	0	0
1	0	0	0	0	0
2	200	0	0	0	0
3	200	150	0	0	0
4	200	150	125	0	0
5	200	150	125	100	0
6	200	150	125	100	75

i	0	1	2	3	4
0	0	0	0	0	0
1	0	0	0	0	0
2	200	0	0	0	0
3	200	150	0	0	0
4	200	150	125	0	0
5	200	150	125	100	0
6	200	150	125	100	75

i	0	1	2	3	4
0	0	0	0	0	0
1	0	0	0	0	0
2	200	0	0	0	0
3	200	150	0	0	0
4	200	150	125	0	0
5	200	150	125	100	0
6	200	150	125	100	75