Tags: python, multiprocessing, load-balancing

How to terminate a process at the end of its target function?


I am trying to refine a very large JSON data set. To do that, I split the file into many subparts (with the Unix split command) and assign each part to a process so that it can be fetched and refined independently. Each process has its own input file, which corresponds to a subset of the main data set. Here is what my code looks like:

import multiprocessing as mp

def my_target(input_file, output_file):
    ...
    # some code
    ...
    # Is it possible to end the process here?
# end of the function

worker_count = mp.cpu_count()
processes = [mp.Process(target=my_target, args=(input_file, output_file))
             for _ in range(worker_count)]

for p in processes:
    p.start()

It is very likely that the processes won't terminate at the same time, hence my question: is it possible to terminate a process when it reaches the last line of its target function, my_target()?

I suppose that letting processes sit idle after they have finished their tasks could slow down the progress of the other processes, no?

Any recommendations?


Solution

  • You should check this question, which is related to what you need: how to terminate a process using Python's multiprocessing. You also have to watch out for zombie processes: if a process has ended but is never joined, it lingers as a zombie instead of being cleaned up.
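
    As for the question itself, nothing special is needed to end the process: a child process terminates on its own as soon as its target function returns; the parent only has to join() it afterwards so the finished child is reaped and does not linger as a zombie. Below is a minimal sketch along those lines, assuming the chunk names produced by split (xaa, xab, ...) are gathered into a list; those file names are placeholders, not part of the original code.

    import multiprocessing as mp

    def my_target(input_file, output_file):
        # ... refine the JSON chunk from input_file and write the result to output_file ...
        pass  # the child process exits by itself as soon as this function returns

    if __name__ == "__main__":
        # Placeholder chunk names produced by the Unix split command (assumption).
        input_files = ["xaa", "xab", "xac", "xad"]
        output_files = [name + ".refined" for name in input_files]

        processes = [mp.Process(target=my_target, args=(inp, out))
                     for inp, out in zip(input_files, output_files)]

        for p in processes:
            p.start()

        # join() blocks until each child has finished and reaps it,
        # so no zombie processes are left behind.
        for p in processes:
            p.join()

    If there are many more chunks than CPU cores, a multiprocessing.Pool handles the load balancing for you: pool.starmap(my_target, zip(input_files, output_files)) queues the chunks and keeps exactly cpu_count() workers busy, and a worker that finishes early simply picks up the next chunk rather than slowing the others down.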