Tags: python, multiprocessing, python-multiprocessing

Why are distinct multiprocessing.Value objects pointing to the same value?


In the following program, two processes are created. Each receives its own multiprocessing.Value object. Somehow, the two processes still influence each other. Why?

I would expect the two processes to deadlock: the first process can increment its counter once (it is initially 0, hence even) but should then wait forever for it to become even again. The second process can never increment its counter, as it is initially 0 and never becomes odd. Somehow, this deadlock does not happen; both processes keep incrementing.

import multiprocessing
import time


def inc_even(counter):
    print(f"inc_even started. {hash(counter)=}")
    while True:
        if counter.value % 2 == 0:
            counter.value += 1
            print("inc_even incremented counter")
        else:
            time.sleep(1)


def inc_odd(counter):
    print(f"inc_odd started. {hash(counter)=}")
    while True:
        if counter.value % 2 == 1:
            counter.value += 1
            print("inc_odd incremented counter")
        else:
            time.sleep(1)


multiprocessing.Process(
    target=inc_even,
    kwargs={
        "counter": multiprocessing.Value("i", 0),
    },
).start()
multiprocessing.Process(
    target=inc_odd,
    kwargs={
        "counter": multiprocessing.Value("i", 0),
    },
).start()

Output:

inc_even started. hash(counter)=8786024404161
inc_even incremented counter
inc_odd started. hash(counter)=8786024404157
inc_odd incremented counter
inc_even incremented counter
inc_odd incremented counter
...
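That the two wrappers are distinct objects yet end up sharing memory can be checked directly (a minimal sketch, not part of the original question): `hash()` only reflects the Python-level proxy object, while `ctypes.addressof` on the underlying ctypes object reveals the shared buffer itself.

```python
import ctypes
import multiprocessing

v1 = multiprocessing.Value("i", 0)
addr1 = ctypes.addressof(v1.get_obj())  # address of the shared c_int buffer
del v1  # drop the only reference; CPython can reclaim the shared block

v2 = multiprocessing.Value("i", 0)
addr2 = ctypes.addressof(v2.get_obj())

# On CPython the freed block is typically handed out again, so the two
# "distinct" Values can end up backed by the very same memory:
print(addr1, addr2, addr1 == addr2)
```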

Interestingly, if I change this to first create two variables that hold the two counters, the deadlock occurs:

counterA = multiprocessing.Value("i", 0)
counterB = multiprocessing.Value("i", 0)
multiprocessing.Process(
    target=inc_even,
    kwargs={
        "counter": counterA,
    },
).start()
multiprocessing.Process(
    target=inc_odd,
    kwargs={
        "counter": counterB,
    },
).start()

Output:

inc_even started. hash(counter)=8765172881357
inc_even incremented counter
inc_odd started. hash(counter)=8765172756097

Edit: If I replace the multiprocessing.Value object with some custom class, this does not happen:

class CustomCounter:
    def __init__(self) -> None:
        self.value = 0
multiprocessing.Process(
    target=inc_even,
    kwargs={
        "counter": CustomCounter(),
    },
).start()
multiprocessing.Process(
    target=inc_odd,
    kwargs={
        "counter": CustomCounter(),
    },
).start()

This deadlocks as expected. So the behavior must be caused by multiprocessing.Value itself, not by multiprocessing in general.


Solution

  • The first Value is garbage-collected in the parent process, so the second Value gets allocated in the same block of shared memory.


    According to the docs, your code should work. Under "Explicitly pass resources to child processes", the docs say:

    Apart from making the code (potentially) compatible with Windows and the other start methods this also ensures that as long as the child process is still alive the object will not be garbage collected in the parent process. This might be important if some resource is freed when the object is garbage collected in the parent process.

    Unfortunately, the docs don't match the implementation. The actual implementation explicitly deletes the references that were supposed to prevent your Values from being reclaimed, inside BaseProcess.start:

    del self._target, self._args, self._kwargs
    

    This means that your code only works when you save references to the Value instances yourself, as you did in the version with the counterA and counterB variables.
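    The reclamation can be observed with a weakref (a sketch, assuming CPython's reference-counting collector; `worker` is a stand-in target):

    ```python
    import multiprocessing
    import weakref


    def worker(counter):
        print(counter.value)


    if __name__ == "__main__":
        v = multiprocessing.Value("i", 0)
        ref = weakref.ref(v)

        p = multiprocessing.Process(target=worker, args=(v,))
        del v  # the Process object still holds a reference via self._args
        print(ref() is not None)  # True: the Value is still alive here

        p.start()  # runs `del self._target, self._args, self._kwargs`
        print(ref() is None)  # typically True on CPython: nothing references the Value now
        p.join()
    ```

    As soon as `start()` returns, the parent holds no reference to the Value, and its shared-memory block becomes available for reuse by the next allocation.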

    The mismatch between docs and implementation should probably be reported on the CPython issue tracker.