
Slow (repeated) Matrix Multiplication in Julia 1.0


The following part in my Julia code kills all my performance:

        for j = 1:size(phi,3)
            for i = 1:size(phi,2)
                phi[:,i,j] += dt*convolutionmagnitude*weightMatrix*phi[:,i,j]
            end
        end

That is, phi is a three-dimensional array, and for every i, j we want to update the slice along the first dimension by a matrix-vector product (times some scalars). weightMatrix is a square matrix of size size(phi,1) by size(phi,1) (which might be sparse in the future). Everything happens with floats.

Julia allocates a lot of memory, even though everything should work in place (at least that is what I expect). I have read through the Julia documentation and found view, but could not make use of it. How can I accelerate this computation?


Solution

    • Slices (phi[:,i,j]) on the right-hand side of assignments always allocate a copy. As you said, you could use views, which aren't completely allocation-free either (yet) but should still speed things up. Below I use the @views macro, which replaces all slices in the block by views.

    • Your += operation allocates as well: a += b is shorthand for a = a + b, which allocates a new array for a + b and then rebinds a to it. It is not in-place. To make it in-place, add a dot: a .+= b.

    • Once your code is running correctly, you can add @inbounds to turn off bounds checks when indexing into arrays.
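    To make the in-place distinction concrete, here is a minimal sketch (the array values are illustrative); the pointer check shows whether the original buffer is reused:

        a = [1.0, 2.0]
        b = [3.0, 4.0]

        p = pointer(a)   # address of a's underlying buffer
        a .+= b          # in-place: writes into the existing buffer
        @assert pointer(a) == p

        a += b           # out-of-place: allocates a + b, rebinds a to it
        @assert pointer(a) != p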

    In total, try the following:

        @inbounds @views for j = 1:size(phi,3)
            for i = 1:size(phi,2)
                phi[:,i,j] .+= dt .* convolutionmagnitude .* weightMatrix * phi[:,i,j]
            end
        end
    

    Note that this will still allocate, since it creates an intermediate vector for weightMatrix * phi[:,i,j]. You cannot put a dot on this multiplication, as that would mean element-wise multiplication rather than a matrix-vector product. However, you can reuse a preallocated buffer with mul! (assuming Julia ≥ 0.7 here):

        using LinearAlgebra # for mul!

        tmp = similar(phi[:,1,1]) # preallocated buffer of length size(phi,1)
        @inbounds @views for j = 1:size(phi,3)
            for i = 1:size(phi,2)
                mul!(tmp, weightMatrix, phi[:,i,j]) # tmp = weightMatrix * phi[:,i,j], no new allocation
                phi[:,i,j] .+= dt .* convolutionmagnitude .* tmp
            end
        end
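
    Depending on your sizes, you may be able to go further: since the same weightMatrix is applied to every column, the two loops can be collapsed into a single matrix-matrix product by reshaping phi to a matrix. A sketch (not benchmarked; the sizes here are illustrative):

        using LinearAlgebra

        # Illustrative sizes and data
        n1, n2, n3 = 4, 3, 2
        phi = rand(n1, n2, n3)
        weightMatrix = rand(n1, n1)
        dt, convolutionmagnitude = 0.1, 2.0

        # reshape shares memory with phi, so writing to phi2 updates phi
        phi2 = reshape(phi, n1, n2 * n3)
        tmp = similar(phi2)
        mul!(tmp, weightMatrix, phi2)              # one matrix-matrix product
        phi2 .+= dt .* convolutionmagnitude .* tmp

    A single dense matrix-matrix multiply typically uses BLAS more effectively than many small matrix-vector multiplies, but whether it wins in practice depends on the sizes involved, so it is worth benchmarking (e.g. with BenchmarkTools.jl).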
    


    Disclaimer: I haven't tested any of this but just wrote it down in the text editor here, so it might contain trivial typos or similar. Nonetheless, I hope it demonstrates the issues and helps!