I am solving a differential equation whose solution comes up with size (2000,2000000), i.e., size(sol)=(2000,2000000). Now I want to store the time evolution of first 1000 variable to an array. for instance say,
x=sol[1:1000,:];
and from this I need to calculate a value
r1= abs.(mean(exp.(x*1im),dims=1)[1,:])[1:end]
Now storing the solution to x and then calculating r1 takes too much time and even something the system crash. Is there any solution to this ?
I am using Atom to run the code and the RAM of my system is 48 GB.
I am going to assume that the array sol
is a Matrix{Float64}
, though this would be something that is important to include in your question.
Then I will point out some places where you are creating unnecessary allocations:
x = sol[1:1000,:]
r1 = abs.(mean(exp.(x*1im),dims=1)[1,:])[1:end]
Starting from the innermost:
x*1im
creates a full new array of the same size as x
.exp.()
creates another full array of the same size as x
.mean
creates another array, though that is a reasonable thing to expect.[1,:]
creates a new array the same size as one row of x
[1:end]
also creates a copy of the output vector, and this doesn't seem to have any purpose at all.x = sol[1:1000, :]
allocates a large array.We want to get rid of most of these:
1. (and 2.)\ You could write exp.(x .* 1im)
with a dot after x
, but it is more efficient to use the cis
function, which does the same thing as exp(1im * x)
mean(cis, x; dims=1)
instead, that does not create the temporary.vec
function to force a vector from a matrix without copying data.view
to avoid copying.Result:
# original
function absmean(sol)
x = sol[1:1000,:]
r1 = abs.(mean(exp.(x*1im),dims=1)[1,:])[1:end]
return r1
end
# new version
function absmean2(sol)
x = @view sol[1:1000, :]
r1 = abs.(mean(cis, x; dims=1)) |> vec
return r1
end
Benchmark:
# making a smaller version to reduce benchmarking time
julia> sol = randn(2000, 2000);
julia> @btime absmean($sol);
72.293 ms (18 allocations: 76.39 MiB)
julia> @btime absmean2($sol);
29.985 ms (15 allocations: 47.88 KiB)
Not that much faster, but less than 1/1000th of the memory use. There are many more things you can do to improve performance, but these were the 'low-hanging fruit'.
Final remark. It seems like your solution data are oriented along the rows of sol
instead of along the columns. That is a bit strange, since Julia arrays are column-major. It doesn't matter in this particular calculation, but in other cases it could be detrimental to performance.
Edit: I'll just add a version to show how you can get more dramatic speedups. This uses LoopVectorization.jl with multithreading (6 cores, 12 threads). Some re-writing of the code is necessary to get LoopVectorization to accept the code:
using LoopVectorization
function absmean4(sol)
x = @view sol[1:1000, :]
r = similar(x, size(x, 2))
# use @turbo instead of @tturbo for single-threaded
@tturbo for j in axes(x, 2)
s = zero(eltype(x))
c = zero(eltype(x))
for i in axes(x, 1)
(a, b) = sincos(x[i, j])
s += a
c += b
end
r[j] = sqrt(s^2 + c^2) / size(x, 1)
end
return r
end
julia> @btime absmean4($sol);
1.830 ms (1 allocation: 15.75 KiB)
That's a 40x speedup vs the original, and 5000x less memory use.
If you want to go absolutely all out, you can pre-allocate the output vector r
and pass it into your function. Then you will have zero allocations.