Given a two-dimensional numerical array X
of shape (m,n)
, I would like to compute an array Y
of the same shape, where Y[i,j]
is the cumulative sum of X[i_,j_]
for 0<=i_<=i, 0<=j_<=j
. If X
describes a 2D probability distribution, Y
could be thought of as the 2D cumulative distribution function (CDF).
I can obviously compute all entries of Y
in a double for
loop. However, there is a recursive aspect to this computation, as Y[i,j] = X[i,j] + Y[i-1,j] + Y[i,j-1] - Y[i-1,j-1]
(where negative indexing means 0).
I was looking for "2d Python cumsum", and I've found that NumPy's cumsum
merely flattens the array.
My Questions:
Y
efficiently?Thanks.
A kernel splitting method can be applied here to solve this problem very efficiently with only two np.cumsum
: one vertical and one horizontal (or the other way since this is symmetric).
Here is an example:
x = np.random.randint(0, 10, (4, 5))
print(x)
y = np.cumsum(np.cumsum(x, axis=0), axis=1)
print(y)
Here is the result:
[[1 9 8 1 7]
[0 6 8 2 3]
[1 3 6 4 4]
[0 8 1 2 9]]
[[ 1 10 18 19 26]
[ 1 16 32 35 45]
[ 2 20 42 49 63]
[ 2 28 51 60 83]]