python multiprocessing scipy-optimize

Multiprocessing Scipy optimization in Python


I'm trying to run an optimization problem in parallel. The code works well when run serially, but I'm struggling to add the multiprocessing layer to it. The real problem is a kind of vectorized maximum likelihood estimation, but the much simpler code below reproduces the same error.

from scipy import optimize
import multiprocessing as mp

# function 'func' to be minimized (takes more than two arguments)
def func(x, arg1, arg2, arg3):
    
    x = x*x + arg1*x + arg2*x + arg3*x
    
    return x

# function 'fit' that is called to minimize function 'func'
def fit(func, arguments):
    
    x0, arg1, arg2, arg3 = arguments

    results = optimize.minimize(func, x0,  args=(arg1, arg2, arg3), method='BFGS')
    
    print(f'value of the function at the minimum: {results.fun}')
    print(f'value of the parameter x when the function is at the minimum: {results.x}')
    
    return results

# main code
if __name__ == "__main__":
    
    # Arbitrary values of the parameters
    x0=100
    arg1=1
    arg2=2
    arg3=3
    
    # gather in a tuple
    arguments=(x0, arg1, arg2, arg3)
    
    # if not run with multiprocessing:
    #fit(func, arguments)

    # multiprocessing
    with mp.Pool(mp.cpu_count()) as pool:
        pool.map(fit,arguments)

The error I get is:

Process SpawnPoolWorker-3:
Traceback (most recent call last):
  File "C:\ProgramData\anaconda3\lib\multiprocessing\process.py", line 315, in _bootstrap
    self.run()
  File "C:\ProgramData\anaconda3\lib\multiprocessing\process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "C:\ProgramData\anaconda3\lib\multiprocessing\pool.py", line 114, in worker
    task = get()
  File "C:\ProgramData\anaconda3\lib\multiprocessing\queues.py", line 358, in get
    return _ForkingPickler.loads(res)
AttributeError: Can't get attribute 'fit' on <module '__main__' (built-in)>

I would also like to see the values of results.fun and results.x at each iteration of the minimization, to know where the algorithm is. I understand that this is done through callback functions, but I've seen these used with pool.apply_async, and I'm not sure that would work for a maximum likelihood estimation problem.

For reference, I'm on Windows with Python 3.8.10.

Many thanks for your help!


Solution

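  • Try wrapping the arguments tuple in a list. pool.map iterates over its second argument and calls the function once per element, so passing the bare tuple sends each of the four numbers to a worker as a separate task, while a one-element list sends the whole tuple as a single task. Because pool.map passes exactly one argument per call, fit must also accept the tuple as its only parameter.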

    arguments= [(x0, arg1, arg2, arg3)]
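
A fuller sketch of this fix, under a few assumptions: fit is rewritten here to take the argument tuple as its only parameter, func returns a plain scalar, and the second starting point in tasks is made up for illustration. The AttributeError in the question is typically seen when the code is run from an interactive console on Windows, where spawned workers cannot re-import functions defined in the session, so this is meant to be saved as a .py file and run as a script.

```python
import multiprocessing as mp

from scipy import optimize


def func(x, arg1, arg2, arg3):
    # minimize passes x as a 1-element array; work with the scalar inside it
    x = x[0]
    return x * x + arg1 * x + arg2 * x + arg3 * x


def fit(arguments):
    # Assumption: one tuple per task, because pool.map passes a single item
    x0, arg1, arg2, arg3 = arguments
    results = optimize.minimize(func, x0, args=(arg1, arg2, arg3), method="BFGS")
    # Return plain floats so the result is cheap to send back to the parent
    return float(results.x[0]), float(results.fun)


if __name__ == "__main__":
    # One tuple per optimization run; the second entry is illustrative only
    tasks = [(100, 1, 2, 3), (-50, 1, 2, 3)]

    with mp.Pool(processes=2) as pool:
        for x_min, f_min in pool.map(fit, tasks):
            print(f"x at the minimum: {x_min:.4f}, function value: {f_min:.4f}")
```

If fit has to keep its original two-parameter signature, functools.partial(fit, func) can be passed to pool.map instead, provided fit and func are defined at module level so they can be pickled.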
    

    Take care and stay classy, my dear!
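
On the second part of the question, a minimal sketch of per-iteration reporting, assuming the callback argument of optimize.minimize is what is wanted. With BFGS it is called after each iteration with the current parameter vector, so the function value is recomputed inside the callback. The names fit_with_trace and record are hypothetical. This is a different mechanism from the callback argument of pool.apply_async, which fires once in the parent process when a whole task has finished.

```python
from scipy import optimize


def func(x, arg1, arg2, arg3):
    # minimize passes x as a 1-element array; work with the scalar inside it
    x = x[0]
    return x * x + arg1 * x + arg2 * x + arg3 * x


def fit_with_trace(arguments):
    # Hypothetical helper: runs the minimization and records every iterate
    x0, arg1, arg2, arg3 = arguments
    history = []

    def record(xk):
        # Called by minimize after each iteration with the current x
        history.append((float(xk[0]), float(func(xk, arg1, arg2, arg3))))

    results = optimize.minimize(
        func, x0, args=(arg1, arg2, arg3), method="BFGS", callback=record
    )
    return results, history


if __name__ == "__main__":
    results, history = fit_with_trace((100, 1, 2, 3))
    for i, (x, f) in enumerate(history, start=1):
        print(f"iteration {i}: x = {x:.6f}, f(x) = {f:.6f}")
```

The same helper can be handed to pool.map as the worker function; each worker then keeps its own history, and printing from several workers at once may interleave the output.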