Tags: vector, julia, operator-overloading, vectorization, array-broadcasting

Julia vectorized operators


I understand that in Julia most operators can be vectorized by prefixing them with a dot (.). However, I don't understand why some of them work both ways:

julia> a = rand(1_000_000);

julia> @time a*2;
  0.051112 seconds (183.93 k allocations: 17.849 MiB, 7.14% gc time, 89.16% compilation time)

julia> @time a*2;
  0.002070 seconds (2 allocations: 7.629 MiB)

julia> @time a.*2;
  0.026533 seconds (8.87 k allocations: 8.127 MiB, 93.23% compilation time)

julia> @time a.*2;
  0.001575 seconds (4 allocations: 7.630 MiB)

julia> a + 0.1;
ERROR: MethodError: no method matching +(::Vector{Float64}, ::Float64)

Why does array broadcasting work for * but not for +?

What drives the difference in performance/allocations between * and .*?


Solution

  • When you write a * 2, this is not a broadcasting operation but multiplication of a vector by a scalar, which is a valid operation in a vector space; see https://en.wikipedia.org/wiki/Scalar_multiplication.

    When you write a + 1, you ask for the addition of a vector and a scalar, which is not an operation normally defined in a vector space, so you have to broadcast + (a .+ 1) to get the desired result. The eight axioms that a vector space must satisfy are listed at https://en.wikipedia.org/wiki/Vector_space; they cover addition of two vectors and multiplication of a vector by a scalar, but not addition of a scalar and a vector.
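
    As a small illustration of the two cases, here is a minimal sketch on a three-element vector. The variable names (scaled, scaled_dot, shifted, plain_add_fails) are made up for this example:

    ```julia
    a = [1.0, 2.0, 3.0]

    # Vector-times-scalar multiplication is defined for arrays, so no dot is needed.
    scaled = a * 2

    # The broadcasted form produces the same values for this input.
    scaled_dot = a .* 2

    # Adding a scalar to every element has to be broadcast explicitly.
    shifted = a .+ 0.1

    # Without the dot, + has no method for (Vector{Float64}, Float64),
    # so the call throws a MethodError, which is caught here.
    plain_add_fails = try
        a + 0.1
        false
    catch err
        err isa MethodError
    end

    println(scaled == scaled_dot)
    println(shifted)
    println(plain_add_fails)
    ```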

    Regarding performance: on my laptop both operations (broadcasted and non-broadcasted multiplication) have the same performance characteristics:

    julia> using BenchmarkTools
    
    julia> @benchmark $a * 2
    BenchmarkTools.Trial: 2346 samples with 1 evaluation.
     Range (min … max):  1.373 ms … 8.099 ms  ┊ GC (min … max):  0.00% … 61.16%
     Time  (median):     1.513 ms             ┊ GC (median):     0.00%
     Time  (mean ± σ):   2.127 ms ± 1.101 ms  ┊ GC (mean ± σ):  19.42% ± 22.16%
    
      █▆▄▃▂▁▁      ▁▁▁            ▃▂▁
      ███████▇██▇▇█████▇▆▇▇▅▅▆▆▅▅▅██████▆▆▇▅▆▆▄▅▅▅▃▅▄▃▅▅▅▅▄▄▃▃▃ █
      1.37 ms     Histogram: log(frequency) by time     5.99 ms <
    
     Memory estimate: 7.63 MiB, allocs estimate: 2.
    
    julia> @benchmark $a .* 2
    BenchmarkTools.Trial: 2333 samples with 1 evaluation.
     Range (min … max):  1.377 ms … 7.586 ms  ┊ GC (min … max):  0.00% … 62.77%
     Time  (median):     1.528 ms             ┊ GC (median):     0.00%
     Time  (mean ± σ):   2.137 ms ± 1.104 ms  ┊ GC (mean ± σ):  19.25% ± 22.06%
    
      █▆▄▃▃▁▁▁▁   ▁▂▁▁            ▄▂▁▁ ▁                        ▁
      █████████▇█████████▇▇▇▇▇▅▆▄▆██████▆▇▆▆▆▇▇▆▃▄▅▆▇▆▆▆▅▅▁▃▁▅▃ █
      1.38 ms     Histogram: log(frequency) by time     5.98 ms <
    
     Memory estimate: 7.63 MiB, allocs estimate: 2.
    

    Using @time instead of BenchmarkTools.jl is not reliable, because a single run of an operation can vary widely in run time (e.g. your processor might be busy with other tasks, or GC might be triggered). As your own output shows, the first call of each form also includes compilation time.
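
    If you want to see the run-to-run variability without installing anything, a rough sketch using only @elapsed from Base could look like this. The helper min_time is hypothetical (it is not part of Julia or BenchmarkTools.jl), and it is only a crude stand-in for a proper benchmark:

    ```julia
    # Hypothetical helper: call f() n times and keep the fastest run.
    function min_time(f, n)
        best = Inf
        for _ in 1:n
            t = @elapsed f()
            best = min(best, t)
        end
        return best
    end

    a = rand(1_000_000)

    # The first call includes compilation, so it is timed separately.
    first_call = @elapsed a * 2

    # Repeat each form several times and compare the fastest runs.
    t_plain = min_time(() -> a * 2, 20)
    t_dot = min_time(() -> a .* 2, 20)

    println("first call of a * 2: ", first_call, " s")
    println("fastest a * 2:       ", t_plain, " s")
    println("fastest a .* 2:      ", t_dot, " s")
    ```

    Taking the fastest of several runs filters out most of the noise from GC pauses and other processes, but BenchmarkTools.jl remains the better tool for real measurements.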