I am learning multiprocessing. I want to create an array of 20 objects of a class and use a separate process to save data to each object.
When I print the data afterwards, every list is empty.
I'm struggling to figure out why. When I run the code in the debugger, it reaches the right places and appends the data, but once the processes finish the data is empty.
Here is my code:
import time
from multiprocessing import Process

class AppendInProcess:
    def __init__(self):
        self.list = []

    def appendToObj(self, value):
        self.list.append(value)

def process_function(obj, value):
    obj.appendToObj(value)

if __name__ == '__main__':
    processes = []
    objects = []
    time_start = time.time()
    for i in range(20):
        obj = AppendInProcess()
        objects.append(obj)
        p = Process(target=process_function, args=(obj, i))
        processes.append(p)
        p.start()
    for process in processes:
        process.join()
    time_end = time.time()
    print(f"Time taken: {time_end - time_start} seconds")
    for obj in objects:
        print(obj.list)
I loop over the array of objects and start a process for each one that appends data to that object's list.
The problem is that each process runs in its own memory space, so modifications to obj.list made in a child process do not affect the main process's obj.list. Each child works on its own copy of the object, and that copy is discarded when the child exits. To get the data back, you need something that is shared between processes, for example a Manager().list(). Also note that you usually don't want to start many more processes than you have CPU cores. See the Python documentation on multiprocessing managers for details.
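To make the copy behaviour visible, here is a minimal sketch. It assumes a helper named append_and_report and an attribute named items, both made up for this illustration (they are not in the question), and it uses a multiprocessing Queue only to send the child's view of the list back to the parent:

```python
from multiprocessing import Process, Queue


class AppendInProcess:
    def __init__(self):
        self.items = []

    def append_to_obj(self, value):
        self.items.append(value)


def append_and_report(obj, value, queue):
    # Hypothetical helper: runs in the child, mutates the child's copy of obj,
    # then sends that copy's list back to the parent through the queue.
    obj.append_to_obj(value)
    queue.put(obj.items)


if __name__ == "__main__":
    obj = AppendInProcess()
    queue = Queue()
    p = Process(target=append_and_report, args=(obj, 1, queue))
    p.start()
    child_view = queue.get()  # read before join() so the queue is drained
    p.join()
    print("child saw:", child_view)    # the child's copy contains the value
    print("parent sees:", obj.items)   # the parent's object is unchanged
```

The child reports a list containing the value, while the parent's list stays empty, which is the same effect you saw in the debugger.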
With one shared list per object, the following code prints the correct data for each object:
import time
from multiprocessing import Manager, Process

class AppendInProcess:
    def __init__(self, shared_data):
        # shared_data is a Manager list proxy, so appends made in a child
        # process are visible in the main process
        self.data = shared_data

    def appendToObj(self, value):
        self.data.append(value)

def process_function(obj, value):
    obj.appendToObj(value)

if __name__ == "__main__":
    manager = Manager()
    processes = []
    objects = []
    time_start = time.time()
    for i in range(20):
        shared_data = manager.list()  # Create a shared list for each object
        obj = AppendInProcess(shared_data)
        objects.append(obj)
        p = Process(target=process_function, args=(obj, i))
        processes.append(p)
        p.start()
    for process in processes:
        process.join()
    time_end = time.time()
    print(f"Time taken: {time_end - time_start} seconds")
    for obj in objects:
        print("data", obj.data)
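Regarding the number of processes: if the work can return its result instead of mutating the object, a Pool sized to the CPU count is one alternative to starting 20 processes by hand. This is only a sketch under that assumption; build_data and attach_results are hypothetical names for this example, and the Manager version above is still what you need if the children really must write into a shared object.

```python
import os
from multiprocessing import Pool


class AppendInProcess:
    def __init__(self):
        self.data = []


def build_data(value):
    # Hypothetical worker: returns the data instead of mutating a shared object.
    return [value]


def attach_results(objects, results):
    # Runs in the parent process, so these changes are visible afterwards.
    for obj, result in zip(objects, results):
        obj.data.extend(result)
    return objects


if __name__ == "__main__":
    objects = [AppendInProcess() for _ in range(20)]
    # os.cpu_count() can return None, so fall back to a single worker.
    with Pool(processes=os.cpu_count() or 1) as pool:
        results = pool.map(build_data, range(20))  # results keep input order
    attach_results(objects, results)
    for obj in objects:
        print("data", obj.data)
```

The return values are pickled and sent back to the parent, so no Manager is needed here, and the pool reuses a fixed set of worker processes for all 20 tasks.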