Tags: python, linux, multithreading, multiprocessing, fork

Why is the thread inherited by the child process in the stopped status, while in the parent process it is in the started status?


Safely forking a multithreaded process is problematic

I'm learning to use the multiprocessing module. Now I'm focusing on the following sentence:

Note that safely forking a multithreaded process is problematic.

which appears in the official documentation, in the section on contexts and start methods.

On this topic I have read this answer, which contains the statement:

"Note that safely forking a multithreaded process is problematic": here problematic is quite an euphemism for "impossible".

Test code

I have written this test code (for the Linux platform):

import multiprocessing
import threading
import time

a = 1
t1 = None

def thread_function_1():
    global a
    while True:
        time.sleep(2)
        a += 1

def p1():
    global a
    global t1
    while True:
        time.sleep(1)
        print(f"Process p1 ---> a = {a}, t1 = {t1}, t1 is_alive = {t1.is_alive()}")

if __name__ == "__main__":
    multiprocessing.set_start_method('fork')
    t1 = threading.Thread(target=thread_function_1)
    t1.start()
    p = multiprocessing.Process(target=p1)
    p.start()
    while True:
        a += 1
        time.sleep(1)
        print(f"Thread Main ---> a = {a}, t1 = {t1}, t1 is_alive = {t1.is_alive()}")

I'm running the code on Linux, so the instruction:

multiprocessing.set_start_method('fork')

selects the start method that is already the default on Linux. This is because I'm using Python 3.6 for the test (from the documentation: "The default start method will change away from fork in Python 3.14.").

The output of its execution is:

Thread Main ---> a = 2, t1 = <Thread(Thread-1, started 140618847368960)>, t1 is_alive = True
Process p1 ---> a = 1, t1 = <Thread(Thread-1, stopped 140618847368960)>, t1 is_alive = False
Thread Main ---> a = 4, t1 = <Thread(Thread-1, started 140618847368960)>, t1 is_alive = True
Process p1 ---> a = 1, t1 = <Thread(Thread-1, stopped 140618847368960)>, t1 is_alive = False
Thread Main ---> a = 5, t1 = <Thread(Thread-1, started 140618847368960)>, t1 is_alive = True
Process p1 ---> a = 1, t1 = <Thread(Thread-1, stopped 140618847368960)>, t1 is_alive = False
Thread Main ---> a = 7, t1 = <Thread(Thread-1, started 140618847368960)>, t1 is_alive = True
Process p1 ---> a = 1, t1 = <Thread(Thread-1, stopped 140618847368960)>, t1 is_alive = False

The output shows that:

  • the process p1 has a reference to the global variable t1, which points to an instance of the class Thread; but in the Main process the thread is in the started status, while in the process p1 it is in the stopped status
  • the method is_alive() also returns different values (True in Main, False in p1)
  • these observations are confirmed by the fact that in the process p1 the value of the variable a is always 1, while in the Main process it is increased by the thread t1.

Question

Why is the thread inherited by the child process in the stopped status, while in the parent process it is in the started status?
Is this behavior linked to the sentence "Note that safely forking a multithreaded process is problematic"?


Solution

  • Why is the thread inherited by the child process in the stopped status, while in the parent process it is in the started status?

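    I suppose it would be better to have some status that indicates that the thread never really existed at all. But the thread is stopped in the child because that thread never existed in the context of the child process: fork() duplicates only the thread that calls it, so the child gets a copy of the Thread object but no running thread behind it.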

    If you're asking why the thread isn't running in the child, it's because that would lead to absolute chaos. Suppose that thread was in a library function that interacted with some service. Duplicating the thread could result in an invalid sequence of operations on that service.
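To see this without multiprocessing in the way, here is a minimal sketch that calls os.fork() directly while a second thread is running. The function name inspect_after_fork and the pipe used to report back from the child are my own choices for illustration, not anything from the question's code, and it assumes a POSIX system. Newer Python versions may also print a DeprecationWarning about forking a multithreaded process.

```python
import os
import threading


def inspect_after_fork():
    """Start a thread, fork, and report what parent and child each see."""
    stop = threading.Event()
    t = threading.Thread(target=stop.wait)  # idles until stop is set
    t.start()

    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: only the thread that called fork() was copied.
        os.close(read_fd)
        report = f"{threading.active_count()} {t.is_alive()}"
        os.write(write_fd, report.encode())
        os._exit(0)  # leave immediately, skipping interpreter cleanup

    # Parent: the worker thread is still running here.
    os.close(write_fd)
    with os.fdopen(read_fd) as f:
        child_count, child_alive = f.read().split()
    os.waitpid(pid, 0)
    parent_alive = t.is_alive()
    stop.set()
    t.join()
    return parent_alive, int(child_count), child_alive == "True"


if __name__ == "__main__":
    parent_alive, child_count, child_alive = inspect_after_fork()
    print(f"parent: thread alive = {parent_alive}")
    print(f"child:  threads = {child_count}, thread alive = {child_alive}")
```

On Linux I would expect the child to report a single thread and is_alive() as False, which matches the "stopped" repr in the question's output.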

    Is this behavior linked to the sentence "Note that safely forking a multithreaded process is problematic"?

    I suppose so. For example, suppose one of the other threads held a lock when you called fork. That lock will never be released in the child process because the thread that would unlock it doesn't exist.
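The lock scenario can be reproduced with a short sketch. This is only an illustration under my own assumptions (a POSIX system, os.fork() called directly, and hypothetical names such as child_can_acquire_lock and holder); it uses a non-blocking acquire in the child so the demo does not hang the way a real program would.

```python
import os
import threading


def child_can_acquire_lock():
    """Fork while another thread holds a lock; return True if the child got it."""
    lock = threading.Lock()
    held = threading.Event()
    release = threading.Event()

    def holder():
        with lock:
            held.set()      # tell the main thread the lock is now held
            release.wait()  # keep holding it until told to let go

    t = threading.Thread(target=holder)
    t.start()
    held.wait()  # fork only once the lock is definitely held

    pid = os.fork()
    if pid == 0:
        # Child: the holder thread was not copied, but the lock's state was.
        # A plain lock.acquire() here would block forever.
        acquired = lock.acquire(blocking=False)
        os._exit(0 if acquired else 1)

    _, status = os.waitpid(pid, 0)
    release.set()  # let the holder thread in the parent finish
    t.join()
    return os.WEXITSTATUS(status) == 0


if __name__ == "__main__":
    print("child acquired the lock:", child_can_acquire_lock())
```

The child inherits the lock in its locked state with nobody left to unlock it, so the non-blocking acquire should fail there, while the parent carries on normally once its holder thread releases the lock.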