Search code examples
parallel-processingjuliamulticore

Julia - @spawn computing jobs sequentially instead of parallel


I am trying to run a function in parallel in Julia (ver. 1.1.0) using the @spawn macro.

I have noticed that using @spawn the jobs are actually performed sequentially (albeit from different workers). This is not happening when using the [pmap][1] function which computes the jobs in parallel.

Following is the code for the main.jl program which calls the function (in the module hello_module) that should be executed:

#### MAIN START ####
# deploy the workers
addprocs(4)
# load modules with multi-core functions
@everywhere include(joinpath(dirname(@__FILE__), "hello_module.jl"))

# number of cores
cpus = nworkers()

# print hello world in parallel
hello_module.parallel_hello_world(cpus)

  [1]: https://docs.julialang.org/en/v1/stdlib/Distributed/#Distributed.pmap

...and here is the code for the module:

module hello_module    
using Distributed
using Printf: @printf
using Base

"""Print Hello World on STDOUT"""
function hello_world()
    println("Hello World!")
end

"""Print Hello World in Parallel."""
function parallel_hello_world(threads::Int)

    # create array with as many elements as the threads
    a = [x for x=1:threads]

    #= This would perform the computation in parallel
    wp = WorkerPool(workers())
    c = pmap(hello_world, wp, a, distributed=true)
    =#

    # spawn the jobs
    for t in a
        r = @spawn hello_world()
        # @show r
        s = fetch(r)
    end    
end

end # module end

Solution

  • You need to use green threading to manage your parallelism. In Julia it is achieved by using @sync and @async macros. See the minimal working example below:

    using Distributed
    
    addprocs(3)
    @everywhere using Dates
    @everywhere function f()
        println("starting at $(myid()) time $(now()) ")
        sleep(1)
        println("finishing at $(myid()) time $(now()) ")
        return myid()^3
    end
    
    function test()
        fs = Dict{Int,Future}()
        @sync for w in workers()
            @async fs[w] = @spawnat w f()
        end
        res = Dict{Int,Int}()
        @sync for w in workers()
            @async res[w] = fetch(fs[w])
        end
        res
    end
    

    And here is the output that clearly shows that the functions are being run in parallel:

    julia> test()
          From worker 3:    starting at 3 time 2019-04-02T01:18:48.411
          From worker 2:    starting at 2 time 2019-04-02T01:18:48.411
          From worker 4:    starting at 4 time 2019-04-02T01:18:48.415
          From worker 2:    finishing at 2 time 2019-04-02T01:18:49.414
          From worker 3:    finishing at 3 time 2019-04-02T01:18:49.414
          From worker 4:    finishing at 4 time 2019-04-02T01:18:49.418
    Dict{Int64,Int64} with 3 entries:
      4 => 64
      2 => 8
      3 => 27
    

    EDIT:

    I recommend you managing how your computations are allocated. However, you can also use @spawn. Note that in the scenario below jobs got getting allocated simultaneously on workers.

    function test(N::Int)
        fs = Dict{Int,Future}()
        @sync for task in 1:N
            @async fs[task] = @spawn f()
        end
        res = Dict{Int,Int}()
        @sync for task in 1:N
            @async res[task] = fetch(fs[task])
        end
        res
    end
    

    And here is the output:

    julia> test(6)
          From worker 2:    starting at 2 time 2019-04-02T10:03:07.332
          From worker 2:    starting at 2 time 2019-04-02T10:03:07.34
          From worker 3:    starting at 3 time 2019-04-02T10:03:07.332
          From worker 3:    starting at 3 time 2019-04-02T10:03:07.34
          From worker 4:    starting at 4 time 2019-04-02T10:03:07.332
          From worker 4:    starting at 4 time 2019-04-02T10:03:07.34
          From worker 4:    finishin at 4 time 2019-04-02T10:03:08.348
          From worker 2:    finishin at 2 time 2019-04-02T10:03:08.348
          From worker 3:    finishin at 3 time 2019-04-02T10:03:08.348
          From worker 3:    finishin at 3 time 2019-04-02T10:03:08.348
          From worker 4:    finishin at 4 time 2019-04-02T10:03:08.348
          From worker 2:    finishin at 2 time 2019-04-02T10:03:08.348
    Dict{Int64,Int64} with 6 entries:
      4 => 8
      2 => 27
      3 => 64
      5 => 27
      6 => 64
      1 => 8