optimization, julia, mathematical-optimization, tensor

How to efficiently perform a grid search over an n-dimensional hypercube in Julia?


I'm trying to maximise a complex function with a grid search over a hypercube. I first tried using NumPy's meshgrid to generate all the function arguments and creating indices into it with Python's itertools.product. Unfortunately the result is very (very) slow. I realise that I can't expect much in terms of speed from a grid search, and that an "efficient grid search" for D above 3 or 4 might be a bit of a contradiction in terms, but I thought I might speed the process up a bit by writing it in Julia. This should help at least because my implementation of the function I'm actually trying to maximise with this grid search is significantly faster in Julia. What would be the most efficient way to do this?


Solution

  • Here is the simplest code that does a 3-dimensional grid (a Cartesian product of 3 parameters). The computation is executed over a Julia cluster with 4 processes (you can adjust this to whatever you have on your machine) and the results are collected into a DataFrame.

    
    
    using Distributed
         
    # Adds 4 workers (and avoids adding more, e.g. when rerunning a Jupyter cell)
    addprocs(max(0, (4+1)-nprocs()))
    
    @everywhere using Distributed, Random, DataFrames
    
    @everywhere Random.seed!(myid())
    
    @everywhere function your_computation(i, j, k)
        # do your complicated computation for your grid search here
        3i + 2j + k
    end
    
    data = @distributed (append!) for (i, j, k) = vec(collect(Iterators.product(1:4, 1:3, 1:2)))
        c = your_computation(i, j, k)  # call the function defined above on every worker
        DataFrame(;i,j,k,c,procid = myid())
    end     
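
Since the question asks about an n-dimensional hypercube, here is a minimal single-process sketch of how the same Cartesian product generalises to any number of dimensions by splatting a tuple of axes into Iterators.product. The names `objective`, `grid_argmax`, `D` and `grids`, as well as the bounds and resolution, are assumptions made for illustration and are not part of the answer above.

```julia
# Hypothetical objective: replace with the function you actually want to maximise.
# It receives one grid point as a tuple of D coordinates.
objective(x) = -sum(abs2, x .- 0.3)

# Walk the full Cartesian product of the given axes and keep the best point.
# Iterators.product is lazy, so the grid itself is never stored in memory.
function grid_argmax(f, grids)
    best_x, best_val = nothing, -Inf
    for x in Iterators.product(grids...)
        val = f(x)
        if val > best_val
            best_x, best_val = x, val
        end
    end
    return best_x, best_val
end

# Assumed setup: D dimensions, 11 evenly spaced points on [0, 1] per dimension.
D = 3
grids = ntuple(_ -> range(0.0, 1.0; length = 11), D)

best_x, best_val = grid_argmax(objective, grids)
```

The number of evaluations is the product of the axis lengths (here 11^3), so the cost grows exponentially with D. Using a different range per dimension only requires changing how `grids` is built.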
    

    Notes:

    • To begin with, the most interesting part for you is how to build the Cartesian product of the values of i, j and k.
    • This could also be run multithreaded via Threads.@threads, although multiprocessing is generally more scalable. In a multithreaded version you would need to use locking when appending to the data frame. Should you need a multithreaded version of this code (instead of multiprocessing), let me know.
    • I use addprocs to add processes to the Julia cluster. Combined with ClusterManagers.jl this can be used to add processes on remote machines. The biggest Julia cluster I have run with a similar loop had 100 nodes/servers with 8000 logical threads.
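
As a rough illustration of the multithreaded variant mentioned in the notes, here is a minimal sketch that uses Threads.@threads and a lock around the shared result buffer. It is an assumption-laden sketch rather than the answer's own code: to stay dependency-free it collects rows into a plain vector of named tuples instead of a DataFrame, and the names `rows` and `rows_lock` are made up for this example.

```julia
# Placeholder computation, mirroring the example in the answer; swap in your own.
your_computation(i, j, k) = 3i + 2j + k

# Materialise the grid so that it can be indexed by the threaded loop.
grid = vec(collect(Iterators.product(1:4, 1:3, 1:2)))

# Shared result buffer and the lock that guards every push! into it.
rows = NamedTuple{(:i, :j, :k, :c), NTuple{4, Int}}[]
rows_lock = ReentrantLock()

Threads.@threads for idx in eachindex(grid)
    i, j, k = grid[idx]
    c = your_computation(i, j, k)  # runs outside the lock, so threads overlap here
    lock(rows_lock) do
        push!(rows, (; i, j, k, c))  # only the append is serialised
    end
end
```

Julia has to be started with several threads for this to run in parallel, for example `julia --threads 4`. The order of `rows` is not deterministic; if DataFrames is loaded, `DataFrame(rows)` should turn the vector into a data frame.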