Search code examples
pythonparallel-processingmultiprocessingjoblibpari

Is it possible to parallelize python code using Pari via cypari2?


I'm having trouble trying run a few loops in parallel when employing Pari via cypari2. I'll including a couple of small working examples along with the Tracebacks in case anyone has some insight on this.

Example 1 -- using joblib:

from cypari2 import Pari
from joblib import Parallel, delayed

def AddOne(v):
    return v + pari.one()

pari = Pari()
vec = [pari('x_1'), pari('x_2')]
print(vec)

#works
newVec = Parallel(n_jobs=1)(delayed(AddOne)(i) for i in vec)
print(newVec)

#doesn't work
newVec2 = Parallel(n_jobs=2)(delayed(AddOne)(i) for i in vec)
print(newVec2)

The output:

[x_1, x_2]
[x_1 + 1, x_2 + 1]
joblib.externals.loky.process_executor._RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/joblib/externals/loky/backend/queues.py", line 150, in _feed
    obj_ = dumps(obj, reducers=reducers)
  File "/usr/lib/python3/dist-packages/joblib/externals/loky/backend/reduction.py", line 247, in dumps
    dump(obj, buf, reducers=reducers, protocol=protocol)
  File "/usr/lib/python3/dist-packages/joblib/externals/loky/backend/reduction.py", line 240, in dump
    _LokyPickler(file, reducers=reducers, protocol=protocol).dump(obj)
  File "/usr/lib/python3/dist-packages/joblib/externals/cloudpickle/cloudpickle_fast.py", line 538, in dump
    return Pickler.dump(self, obj)
  File "stringsource", line 2, in cypari2.pari_instance.Pari.__reduce_cython__
TypeError: no default __reduce__ due to non-trivial __cinit__
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "min_jake_joblib.py", line 16, in <module>
    newVec2 = Parallel(n_jobs=2)(delayed(AddOne)(i) for i in vec)
  File "/usr/lib/python3/dist-packages/joblib/parallel.py", line 1016, in __call__
    self.retrieve()
  File "/usr/lib/python3/dist-packages/joblib/parallel.py", line 908, in retrieve
    self._output.extend(job.get(timeout=self.timeout))
  File "/usr/lib/python3/dist-packages/joblib/_parallel_backends.py", line 554, in wrap_future_result
    return future.result(timeout=timeout)
  File "/usr/lib/python3.8/concurrent/futures/_base.py", line 444, in result
    return self.__get_result()
  File "/usr/lib/python3.8/concurrent/futures/_base.py", line 389, in __get_result
    raise self._exception
_pickle.PicklingError: Could not pickle the task to send it to the workers.

Seems to be a problem with pickling the Pari objects, but is there any way around it?

Example 2 -- using multiprocessing:

from cypari2 import Pari
import multiprocessing

def AddOne(v):
    return v + pari.one()

pari = Pari()
vec = [pari('x_1'), pari('x_2')]
print(vec)

#doesn't work
if __name__ == '__main__':
    pool = multiprocessing.Pool(processes = 2) ## doesn't matter how many I use
    newVec = pool.map(AddOne, (i for i in vec))
    print(newVec)

It seg faults, but doesn't completely exit automatically, so I have to use Ctrl^C to kill it. The output:

[x_1, x_2]
Exception in thread Thread-3:
Traceback (most recent call last):
  File "/usr/lib/python3.8/threading.py", line 932, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.8/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/lib/python3.8/multiprocessing/pool.py", line 576, in _handle_results
    task = get()
  File "/usr/lib/python3.8/multiprocessing/connection.py", line 251, in recv
    return _ForkingPickler.loads(buf.getbuffer())
  File "cypari2/gen.pyx", line 4705, in cypari2.gen.objtogen
  File "cypari2/gen.pyx", line 4812, in cypari2.gen.objtogen
  File "cypari2/convert.pyx", line 557, in cypari2.convert.PyObject_AsGEN
cysignals.signals.SignalError: Segmentation fault
^CProcess ForkPoolWorker-1:
Process ForkPoolWorker-2:
Traceback (most recent call last):
  File "min_jake_multiprocessing.py", line 14, in <module>
Traceback (most recent call last):
  File "/usr/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/usr/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/lib/python3.8/multiprocessing/pool.py", line 114, in worker
    task = get()
Traceback (most recent call last):
  File "/usr/lib/python3.8/multiprocessing/queues.py", line 356, in get
    res = self._reader.recv_bytes()
  File "/usr/lib/python3.8/multiprocessing/connection.py", line 216, in recv_bytes
    buf = self._recv_bytes(maxlength)
  File "/usr/lib/python3.8/multiprocessing/connection.py", line 414, in _recv_bytes
    buf = self._recv(4)
  File "/usr/lib/python3.8/multiprocessing/connection.py", line 379, in _recv
    chunk = read(handle, remaining)
  File "src/cysignals/signals.pyx", line 320, in cysignals.signals.python_check_interrupt
  File "/usr/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/usr/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/lib/python3.8/multiprocessing/pool.py", line 114, in worker
    task = get()
  File "/usr/lib/python3.8/multiprocessing/queues.py", line 355, in get
    with self._rlock:
  File "/usr/lib/python3.8/multiprocessing/synchronize.py", line 95, in __enter__
    return self._semlock.__enter__()
  File "src/cysignals/signals.pyx", line 320, in cysignals.signals.python_check_interrupt
KeyboardInterrupt
KeyboardInterrupt
    newVec = pool.map(AddOne, (i for i in vec))
  File "/usr/lib/python3.8/multiprocessing/pool.py", line 364, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "/usr/lib/python3.8/multiprocessing/pool.py", line 765, in get
    self.wait(timeout)
  File "/usr/lib/python3.8/multiprocessing/pool.py", line 762, in wait
    self._event.wait(timeout)
  File "/usr/lib/python3.8/threading.py", line 558, in wait
    signaled = self._cond.wait(timeout)
  File "/usr/lib/python3.8/threading.py", line 302, in wait
    waiter.acquire()
  File "src/cysignals/signals.pyx", line 320, in cysignals.signals.python_check_interrupt
KeyboardInterrupt
Traceback (most recent call last):
  File "/usr/lib/python3.8/multiprocessing/util.py", line 300, in _run_finalizers
    finalizer()
  File "/usr/lib/python3.8/multiprocessing/util.py", line 224, in __call__
    res = self._callback(*self._args, **self._kwargs)
  File "/usr/lib/python3.8/multiprocessing/pool.py", line 695, in _terminate_pool
    raise AssertionError(
AssertionError: Cannot have cache with result_hander not alive

I suppose someone will tell me to use sympy or some other symbolic algebra package instead, but the symbolic algebra I need to do is quite complex and Pari can handle it extremely well. However, in the end I'd like to be able to process a queue of class objects that contain Pari objects in parallel. Any thoughts/suggestions are appreciated.


Solution

  • Well, this isn't a full answer, but it works for me so I wanted to share in case anyone else runs into this issue.

    The first issue appears to be that the versions of libpari-dev and pari-gp on the apt repository were too old. The apt repository contains version 2.11 whereas the version on Pari's git repository is version 2.14. Uninstalling and following the instructions from here to install from source fixed most of my problems.

    Interestingly, I still needed to install libpari-gmp-tls6 from the apt repository to get things to work. But, after that I was able to get the test examples above to run. The example using multiprocessing ran successfully without modification, but the example using joblib required the use of the "threading" backend in order to run.