I have an array of size 5 x 98 x 3. I want to take the transpose of each 98 x 3 block and multiply it by the block itself to find the standard deviation, so the final answer should be of size 5 x 3 x 3. What would be an efficient way of doing this using NumPy?
I can currently do this using the following code:
import numpy as np

# MU.shape[0] is 5 (the number of blocks)
rows = 98
SIGMA = []
for i in np.arange(MU.shape[0]):
    # (3 x 98) times (98 x 3) gives one 3 x 3 block
    SIGMA.append(np.matmul(np.transpose(diff[i]), diff[i]))
SIGMA = np.array(SIGMA)
SIGMA = SIGMA / rows
Here diff is of the size 5 x 98 x 3.
Use np.einsum to sum-reduce the second axis (the one of length 98) of the two operands against each other -
SIGMA = np.einsum('ijk,ijl->ikl',diff,diff)
SIGMA = SIGMA/rows
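To make the subscript string concrete: 'ijk,ijl->ikl' computes SIGMA[i, k, l] as the sum over j of diff[i, j, k] * diff[i, j, l]. The following is a minimal sketch that checks this against an explicit per-block loop. The random `diff` array is an assumption standing in for the real data, and the names `sigma_einsum` and `sigma_loop` are only illustrative.

```python
import numpy as np

# Assumed stand-in data with the same shape as in the question.
rng = np.random.default_rng(0)
diff = rng.standard_normal((5, 98, 3))
rows = diff.shape[1]  # 98

# SIGMA[i, k, l] = sum over j of diff[i, j, k] * diff[i, j, l]
sigma_einsum = np.einsum('ijk,ijl->ikl', diff, diff) / rows

# Reference result: transpose each 98 x 3 block and multiply it by the block.
sigma_loop = np.array([block.T @ block for block in diff]) / rows

print(sigma_einsum.shape)                     # (5, 3, 3)
print(np.allclose(sigma_einsum, sigma_loop))  # True
```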
Set the optimize flag to True in np.einsum to leverage BLAS.
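As a rough sketch of how the flag is passed, the snippet below calls np.einsum with and without optimize=True and confirms that both give the same values. The random `diff` array is an assumed stand-in for the real data. Whether the optimized path is actually faster depends on the array sizes and the NumPy build, so the timings are printed rather than asserted.

```python
import timeit

import numpy as np

# Assumed stand-in data with the same shape as in the question.
rng = np.random.default_rng(1)
diff = rng.standard_normal((5, 98, 3))

plain = np.einsum('ijk,ijl->ikl', diff, diff)
optimized = np.einsum('ijk,ijl->ikl', diff, diff, optimize=True)

# The flag changes how the contraction is evaluated, not its result.
same_values = np.allclose(plain, optimized)
print(same_values)

# Measure both variants on your own data before relying on either.
t_plain = timeit.timeit(
    lambda: np.einsum('ijk,ijl->ikl', diff, diff), number=200)
t_opt = timeit.timeit(
    lambda: np.einsum('ijk,ijl->ikl', diff, diff, optimize=True), number=200)
print(f"plain: {t_plain:.4f}s  optimize=True: {t_opt:.4f}s")
```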
We can also use np.matmul to get those sum-reductions -
SIGMA = np.matmul(diff.swapaxes(1,2),diff)
SIGMA = SIGMA/rows
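The np.matmul version works because matmul treats the leading axis as a stack of matrices: swapping axes 1 and 2 turns each 98 x 3 block into 3 x 98, and the product with the original block is 3 x 3. Here is a small sketch, again on assumed random data, that checks it against the einsum result. The division by `rows` is included to match the normalization used in the question.

```python
import numpy as np

# Assumed stand-in data with the same shape as in the question.
rng = np.random.default_rng(2)
diff = rng.standard_normal((5, 98, 3))
rows = diff.shape[1]  # 98

# swapaxes returns a view of shape (5, 3, 98); no data is copied here.
diff_t = diff.swapaxes(1, 2)

# Batched product: (5, 3, 98) times (5, 98, 3) gives (5, 3, 3).
sigma_matmul = np.matmul(diff_t, diff) / rows

# The @ operator is equivalent to np.matmul for arrays.
sigma_at = (diff_t @ diff) / rows

sigma_einsum = np.einsum('ijk,ijl->ikl', diff, diff) / rows

print(sigma_matmul.shape)                       # (5, 3, 3)
print(np.allclose(sigma_matmul, sigma_einsum))  # True
```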