
failing to pickle objects and import modules with python pathos multiprocessing


Below is a toy example that reproduces the issues I am having with pathos.multiprocessing on Python 3.5. The first issue is that the worker process fails to recognize what TestEnum is, even when TestEnum is not used within test(). The second issue is that np is not recognized in the worker. I've seen some posts address the second issue by saying that I need an import numpy as np inside the test function, but that isn't working for me.

import numpy as np

from enum import Enum
from pathos.multiprocessing import ProcessingPool

class TestEnum(Enum):
    A = 1
    B = 2

def test(x):
    if x >= 0:
        return np.array(TestEnum.A)
    else:
        return np.array(TestEnum.B)

def main():
    inputs = np.arange(100)
    pool = ProcessingPool()
    outputs = pool.map(test, inputs)

The error I am getting is: _pickle.PicklingError: Can't pickle <enum 'TestEnum'>: it's not found as builtins.TestEnum

If I get rid of all occurrences of TestEnum, the next error is that np is not recognized. I saw other posts on this site suggesting that an import numpy as np is required at the top of main(), but this did not work for me. The error I get when I try to import modules inside main() is: ImportError: __import__ not found


Solution

  • Question: If I add an entry point and call main() through it, it works.

    It's mandatory for multiprocessing!

    From the Python 3.6 documentation, section "Safe importing of main module":

    Make sure that the main module can be safely imported by a new Python interpreter without causing unintended side effects (such as starting a new process).

    One should protect the “entry point” of the program by using if __name__ == '__main__':