Let's say we have a
n*n
matrix A
m*m
matrix B
b=[b_1,...,b_m]
of block sizes with b_1 + ... + b_m = n
and b_i >= 1
.Then I define a n*n
block matrix B_block
whose (i,j)
-th block is a b_i*b_j
matrix of constant value B_{ij}
. How can I efficiently compute A+B_block
in numpy?
So far I am doing
A = np.arange(36).reshape(6,6)
B = np.arange(9).reshape(3,3)
blocks_sizes = np.array([3,2,1])
B_block = np.block([[np.ones((a,b))*B[i,j] for j,b in enumerate(block_sizes)] for i,a in enumerate(block_sizes)])
C = A + B_block
print(A)
print(B_block)
print(C)
resulting in
[[ 0 1 2 3 4 5]
[ 6 7 8 9 10 11]
[12 13 14 15 16 17]
[18 19 20 21 22 23]
[24 25 26 27 28 29]
[30 31 32 33 34 35]]
[[0. 0. 0. 1. 1. 2.]
[0. 0. 0. 1. 1. 2.]
[0. 0. 0. 1. 1. 2.]
[3. 3. 3. 4. 4. 5.]
[3. 3. 3. 4. 4. 5.]
[6. 6. 6. 7. 7. 8.]]
[[ 0. 1. 2. 4. 5. 7.]
[ 6. 7. 8. 10. 11. 13.]
[12. 13. 14. 16. 17. 19.]
[21. 22. 23. 25. 26. 28.]
[27. 28. 29. 31. 32. 34.]
[36. 37. 38. 40. 41. 43.]]
which works fine but seems inefficient. Is there a numpy-solution which does not require constructing the block matrix by hand, or python-loops?
If all the blocks were of the same size, then I could use np.kron
...
B_block = B[np.arange(len(blocks_sizes)).repeat(blocks_sizes)][:, np.arange(len(blocks_sizes)).repeat(blocks_sizes)]
C = A + B_block
my performance:
4.36 µs ± 93.9 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
your performance:
83.1 µs ± 34.7 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
If you prefer you can also define slice_ids = np.arange(len(blocks_sizes)).repeat(blocks_sizes)
in order to make the lighter and so:
B_block = B[slice_ids][:, slice_ids]
C = A + B_block