Pickling issue with python pathos


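import fnmatch
import logging
import os

import pandas
import pathos.multiprocessing as mp
from sqlalchemy import create_engine

import constants  # the asker's own module (not shown); provides anly_dir
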
class Model_Output_File():
    """
    Class to read Model Output files
    """
    def __init__(self, ftype = ''):
        """
        Constructor
        """
        # Create a sqlite database in the analysis directory
        self.db_name = 'sqlite:///' + constants.anly_dir + os.sep + ftype + '_' + '.db'
        self.engine  = create_engine(self.db_name)
        self.ftype   = ftype

    def parse_DGN(self, fl):
        df      = pandas.read_csv(...)
        df.to_sql(self.db_name, self.engine, if_exists='append')

    def collect_epic_output(self, fls):
        pool = mp.ProcessingPool(4)
        if(self.ftype == 'DGN'):
            pool.map(self.parse_DGN, fls)
        else:
            logging.info( 'Wrong file type')

if __name__ == '__main__':
    list_fls = fnmatch.filter(...)
    obj = Model_Output_File(ftype = 'DGN')
    obj.collect_epic_output(list_fls)

In the code above, I am using the pathos multiprocessing library to avoid the pickling issues that Python's standard multiprocessing has with class methods. However, I am getting a pickling error:

  pool.map(self.parse_DGN, fls)
  File "C:\Anaconda64\lib\site-packages\pathos-0.2a1.dev0-py2.7.egg\pathos\multiprocessing.py", line 131, in map
    return _pool.map(star(f), zip(*args)) # chunksize
  File "C:\Anaconda64\lib\multiprocessing\pool.py", line 251, in map
    return self.map_async(func, iterable, chunksize).get()
  File "C:\Anaconda64\lib\multiprocessing\pool.py", line 567, in get
    raise self._value
cPickle.PicklingError: Can't pickle <type 'function'>: attribute lookup __builtin__.function failed

How do I fix this?


Solution

  • I'm the pathos author. You are getting a cPickle.PicklingError, which you should not get with pathos. Make sure you have multiprocess installed and, if you do, that you have a C++ compiler. You can check for pickling errors by importing dill and calling dill.copy(self.parse_DGN) inside your class, or externally using an instance of the class. If that works, then you probably have an installation issue where pathos is finding the Python standard library multiprocessing instead of multiprocess. If so, you probably need to install a compiler, such as Microsoft Visual Studio Community. See: github.com/mmckerns/tuthpc. Make sure to rebuild multiprocess after installing the MS compiler.
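
The checks described in the answer can be sketched roughly as follows. This is only an illustration, assuming Python 3 (the question itself uses Python 2.7, where importlib.util.find_spec is not available). The names is_installed, roundtrips, and Parser are hypothetical helpers that are not part of pathos or dill; Parser is a stripped-down stand-in for Model_Output_File without the database. The sketch first reports whether dill, multiprocess, and pathos can be found, then tries to serialize a bound method with dill, which is the serializer pathos relies on.

```python
import importlib.util
import pickle


def is_installed(name):
    """Return True if a top-level module called `name` can be found."""
    return importlib.util.find_spec(name) is not None


def roundtrips(obj, serializer):
    """Return True if `obj` survives serializer.dumps() followed by loads()."""
    try:
        serializer.loads(serializer.dumps(obj))
        return True
    except Exception as exc:  # the exact error type varies by serializer
        print('%s failed: %r' % (serializer.__name__, exc))
        return False


class Parser(object):
    """Hypothetical stand-in for Model_Output_File, without the database."""

    def __init__(self, ftype=''):
        self.ftype = ftype

    def parse(self, fl):
        return (self.ftype, fl)


if __name__ == '__main__':
    # Step 1: are the packages the answer mentions actually importable?
    for name in ('dill', 'multiprocess', 'pathos'):
        print(name, 'installed:', is_installed(name))

    # Step 2: can dill serialize the bound method that gets passed to map()?
    if is_installed('dill'):
        import dill
        print('dill handles the bound method:',
              roundtrips(Parser('DGN').parse, dill))

    # For comparison: the standard pickle module cannot serialize a lambda.
    print('pickle handles a lambda:', roundtrips(lambda x: x, pickle))
```

If dill serializes the method but the pool still raises a cPickle or pickle error, that points toward the installation problem described in the answer rather than toward the class itself.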