I have a computation in which I need to go through the items of a 3D NumPy array and add the values found at a fixed index along the second dimension to every other entry along that dimension (skipping the entry at the fixed index itself). It is analogous to this minimal reproducible example:
import numpy as np
data = np.array([
    [[1, 1, 1], [10, 10, 10], [1, 1, 1]],
    [[2, 2, 2], [20, 20, 20], [2, 2, 2]],
    [[3, 3, 3], [30, 30, 30], [3, 3, 3]],
])

def process_data(const_idx, data, i, j, k):
    if const_idx != j:
        # PROBLEM: how can I access this value if this function is vectorized?
        value_to_add = data[i][const_idx][k]
        data[i][j][k] += value_to_add

const_idx = 1
for i in range(data.shape[0]):
    for j in range(data.shape[1]):
        for k in range(data.shape[2]):
            process_data(const_idx, data, i, j, k)
Where the expected output in this case would be:
[[[11 11 11]
  [10 10 10]
  [11 11 11]]

 [[22 22 22]
  [20 20 20]
  [22 22 22]]

 [[33 33 33]
  [30 30 30]
  [33 33 33]]]
The code above works but it is very slow for large arrays. I would like to vectorize this function.
My first stab is something like this:
def process_data(val, data, const_idx):
    # PROBLEM: How can I access this value given that I do not have access to the i / j / k coordinates val came from?
    value_to_add = ...
    # PROBLEM: I cannot make this check either since I don't know the j index being processed here
    if const_idx != j:
        return val + value_to_add
    return val

vfunc = np.vectorize(process_data)
result = vfunc(data, data, const_idx)
How can I accomplish this, or is perhaps vectorization not the answer?
The constant index (const_idx in the question, idx below) points to the row along the second axis whose values are added to all the other rows.
You can perform this in-place addition concisely, touching only the rows that need it, with the following approach:
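def add_by_idx(arr, idx):
    r = np.arange(arr.shape[1])  # row indices along the second axis
    # Select every row except idx; arr[:, [idx], :] keeps a length-1 axis so it broadcasts
    arr[:, r[r != idx], :] += arr[:, [idx], :]

add_by_idx(data, 1)
print(data)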
[[[11 11 11]
  [10 10 10]
  [11 11 11]]

 [[22 22 22]
  [20 20 20]
  [22 22 22]]

 [[33 33 33]
  [30 30 30]
  [33 33 33]]]
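As a sanity check, the same idea can also be written with a boolean mask instead of an index array, and compared against the triple loop from the question on random data. This is only a sketch: the names add_by_idx_loop and add_by_idx_mask are hypothetical helpers introduced here for illustration, and the array shape and seed are arbitrary.

```python
import numpy as np

def add_by_idx_loop(arr, idx):
    """Reference version: the triple loop from the question, in place."""
    for i in range(arr.shape[0]):
        for j in range(arr.shape[1]):
            if j == idx:
                continue  # skip the row that supplies the values
            for k in range(arr.shape[2]):
                arr[i, j, k] += arr[i, idx, k]

def add_by_idx_mask(arr, idx):
    """Hypothetical boolean-mask variant of the vectorized approach."""
    mask = np.ones(arr.shape[1], dtype=bool)
    mask[idx] = False  # True for every row except idx
    # arr[:, [idx], :] has shape (n, 1, m) and broadcasts over the masked rows
    arr[:, mask, :] += arr[:, [idx], :]

# Arbitrary random test data (shape and seed are assumptions for this demo)
rng = np.random.default_rng(0)
original = rng.integers(0, 100, size=(4, 5, 3))

looped = original.copy()
masked = original.copy()
add_by_idx_loop(looped, 2)
add_by_idx_mask(masked, 2)

same = np.array_equal(looped, masked)
print(same)
```

Both versions modify the array in place, so each one works on its own copy of the input. Row idx is never written to, which is why reading it on the right-hand side while updating the other rows is safe.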