python session python-multiprocessing pool

Problems with session ids on parallel threads

So I am trying to use multiprocessing to iterate concurrently through files in separate folders. I have a function that calls the parallel process:

from multiprocessing.dummy import Pool

lsFolders = ['Folder1', 'Folder2']

pool = Pool( processes = 6 )

iterateThroughFiles = IterateThroughFiles() # instantiated by call to pool.map()

pool.map( iterateThroughFiles.runProcess, lsFolders )

Then I have the implementation of the IterateThroughFiles-class:

import uuid

class IterateThroughFiles( object ):

  def runProcess( self, folder ):
      self.sessionId = uuid.uuid4()
      print( self.sessionId )             # Prints a correct sessionId
      logAtLevel( "INFO", "Session ID of: "
                         + str( self.sessionId )
                         + " has been generated for folder: "
                         + folder
                           )

      print( self.sessionId )             # Prints only the second generated
      #                                   # session id for both threads
      print( folder )                     # Prints the correct folder

When I generate the sessionId and print it directly afterwards, the sessionId is correct. The logAtLevel() wrapper function also logs the correct value of the sessionId.

The next print statement, though, prints only the second session id and apparently the first sessionId is forgotten in the thread.

Does anyone know why this is happening? I thought that when running in parallel, each thread was distinct in terms of the objects it created and its memory. Is this incorrect? Does this have something to do with the uuid generator?

Solution

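The issue is that you create only one instance of IterateThroughFiles, and both threads use it. multiprocessing.dummy.Pool is backed by threads, which share memory, so self.sessionId is a single attribute on that shared instance and the second thread can overwrite it before the first thread reaches its later print. The uuid generator is not at fault. Instead, give each call its own instance, with something like the following: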

def factory(folder):
    return IterateThroughFiles().runProcess(folder)

and pass that factory function into map, i.e. pool.map( factory, lsFolders ). That way each folder gets its own instance, so with two folders you will get two instances.
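
Putting it together, here is a minimal sketch of how the whole thing might look. The run() helper and the return value of runProcess() are my own additions for illustration, not part of your code, and the actual per-folder work is left out:

```python
import uuid
from multiprocessing.dummy import Pool  # thread-backed Pool with the multiprocessing API


class IterateThroughFiles(object):

    def runProcess(self, folder):
        # Stored on the instance; this is only safe because every call
        # below runs on its own instance.
        self.sessionId = uuid.uuid4()
        # ... per-folder work would go here ...
        # Returning the pair is an assumption made for this sketch.
        return folder, self.sessionId


def factory(folder):
    # A fresh instance per folder, so no state is shared between threads.
    return IterateThroughFiles().runProcess(folder)


def run(folders):
    # Hypothetical helper: maps the factory over the folders and
    # closes the pool when done.
    with Pool(processes=6) as pool:
        return pool.map(factory, folders)


if __name__ == "__main__":
    for folder, sessionId in run(['Folder1', 'Folder2']):
        print(folder, sessionId)
```

If the session id is not needed outside runProcess(), keeping it in a local variable instead of on self would also avoid the sharing, even with a single instance.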