Suppose we have a 2D numpy array of zeros, arr1 = np.zeros((len(Train), L)), where Train is a (dataset) numpy array of arrays of integers of fixed length.
We also have another 1D numpy array, positions, with the same length as Train.
Now we wish to copy the elements of each row of Train into the corresponding row of arr1, starting at the column given by positions.
One way is to use a for loop over the Train array:
k = len(Train[0])
for i in range(len(Train)):
    start = int(positions[i])
    arr1[i, start:start + k] = Train[i, 0:k]
However, going over the entire Train set with an explicit for loop is slow, and I would like to optimize it.
Here is one way: generate all the indices you want to assign to, then do the assignment in a single step. Setup:
import numpy as np
n = 12 # Number of training samples
l = 8 # Number of columns in the output array
k = 4 # Number of columns in the training samples
arr = np.zeros((n, l), dtype=int)
train = np.random.randint(10, size=(n, k))
positions = np.random.randint(l - k, size=n)
Random example data:
>>> train
array([[3, 4, 3, 2],
       [3, 6, 4, 1],
       [0, 7, 9, 6],
       [4, 0, 4, 8],
       [2, 2, 6, 2],
       [4, 5, 1, 7],
       [5, 4, 4, 4],
       [0, 8, 5, 3],
       [2, 9, 3, 3],
       [3, 3, 7, 9],
       [8, 9, 4, 8],
       [8, 7, 6, 4]])
>>> positions
array([3, 2, 3, 2, 0, 1, 2, 2, 3, 2, 1, 1])
Advanced indexing with broadcasting trickery:
rows = np.arange(n)[:, None] # Shape (n, 1)
cols = np.arange(k) + positions[:, None] # Shape (n, k)
arr[rows, cols] = train
Output:
>>> arr
array([[0, 0, 0, 3, 4, 3, 2, 0],
       [0, 0, 3, 6, 4, 1, 0, 0],
       [0, 0, 0, 0, 7, 9, 6, 0],
       [0, 0, 4, 0, 4, 8, 0, 0],
       [2, 2, 6, 2, 0, 0, 0, 0],
       [0, 4, 5, 1, 7, 0, 0, 0],
       [0, 0, 5, 4, 4, 4, 0, 0],
       [0, 0, 0, 8, 5, 3, 0, 0],
       [0, 0, 0, 2, 9, 3, 3, 0],
       [0, 0, 3, 3, 7, 9, 0, 0],
       [0, 8, 9, 4, 8, 0, 0, 0],
       [0, 8, 7, 6, 4, 0, 0, 0]])
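
As a quick sanity check, here is a minimal sketch that wraps both approaches in functions and confirms they give the same result on random data. The helper names fill_loop and fill_vectorized are hypothetical, introduced only for this illustration, and the sketch assumes every offset lies between 0 and l - k so that each window fits inside its row.

```python
import numpy as np

def fill_loop(train, positions, l):
    """Reference version: one slice assignment per row."""
    n, k = train.shape
    out = np.zeros((n, l), dtype=train.dtype)
    for i in range(n):
        start = int(positions[i])
        out[i, start:start + k] = train[i]
    return out

def fill_vectorized(train, positions, l):
    """Same assignment done in one step with broadcast index arrays."""
    n, k = train.shape
    out = np.zeros((n, l), dtype=train.dtype)
    rows = np.arange(n)[:, None]                           # shape (n, 1)
    starts = np.asarray(positions, dtype=int)[:, None]     # shape (n, 1)
    cols = np.arange(k) + starts                           # shape (n, k)
    out[rows, cols] = train                                # rows broadcasts to (n, k)
    return out

# Assumed test data: offsets are drawn from 0..l-k inclusive,
# so every window of width k stays inside a row of width l.
rng = np.random.default_rng(0)
n, l, k = 1000, 8, 4
train = rng.integers(10, size=(n, k))
positions = rng.integers(l - k + 1, size=n)

assert np.array_equal(fill_loop(train, positions, l),
                      fill_vectorized(train, positions, l))
```

If positions is stored as floats (the int(...) casts in the question suggest it might be), it has to be converted to an integer dtype before it can be used as an index, which is what the np.asarray(..., dtype=int) call does here.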