The following part in my Julia code kills all my performance:
for j = 1:size(phi,3)
for i = 1:size(phi,2)
phi[:,i,j] += dt*convolutionmagnitude*
weightMatrix*phi[:,i,j]
end
end
I.e. phi
is a three-tensor and for every i
, j
we want to update the first dimension by a matrix-vector product (times some scalars). weightMatrix
is a matrix of size size(phi,1)
by size(phi,1)
(which might be sparse in the future). Everything happens with floats
.
Julia allocates a lot of memory even though everything should work in place (at least I expect that). I have read through the julia documentation and found view
but could not make use of it. How can I accelerate this computation?
Slices (phi[:,i,j]
) on the r.h.s. of assignments always allocate. As you said, you could use views (which aren't completely allocation free either (yet)), which should speed up things. Below I use the @views
macro which replaces all slices by views.
Your +=
operation allocates as well. a += b
is basically a = a + b
which will allocate an array for a+b
and then assign a to it. It is not in-place. To make it in-place you need to add a dot: a .+= b
.
Once your code is running you can add @inbounds
to turn off bound checks when accessing pieces of arrays.
In total, try the following:
@inbounds @views for j = 1:size(phi,3)
for i = 1:size(phi,2)
phi[:,i,j] .+= dt .* convolutionmagnitude .* weightMatrix * phi[:,i,j]
end
end
Note that this will still allocate, as it creates an intermediate vector for weightMatrix * phi[:,i,j]
. You cannot put a dot here as this would mean element-wise multiplication rather than matrix-vector multiplication. However, you could reuse a preallocated piece of memory using mul!
(assuming Julia >0.7 here):
using LinearAlgebra # get mul!
tmp = similar(phi[:,1,1])
@inbounds @views for j = 1:size(phi,3)
for i = 1:size(phi,2)
mul!(tmp, weightMatrix, phi[:,i,j])
phi[:,i,j] .+= dt .* convolutionmagnitude .* tmp
end
end
Finally, let me give you some nice reads on this:
Disclaimer: I haven't tested any of this but just wrote it down in the text editor here, so it might contain trivial typos or similar. Nonetheless, I hope it demonstrates some issues and helps!