I am trying to build a Process subclass to utilize multiple GPUs in my desktop.
import multiprocessing as mp
from typing import Any, Callable


class GPUProcess(mp.Process):
    used_ids: list[int] = []
    next_id: int = 0

    def __init__(self, *, target: Callable[[Any], Any], kwargs: Any):
        gpu_id = GPUProcess.next_id
        if gpu_id in GPUProcess.used_ids:
            raise RuntimeError(
                f"Attempt to reserve reserved processor {gpu_id} {self.used_ids=}"
            )
        GPUProcess.next_id += 1
        GPUProcess.used_ids.append(gpu_id)
        self._gpu_id = gpu_id

        # Define target process func with constant gpu_id
        def _target(**_target_kwargs):
            target(
                **_target_kwargs,
                gpu_id=self.gpu_id,
            )

        super(GPUProcess, self).__init__(target=_target, kwargs=kwargs)

    @property
    def gpu_id(self):
        return self._gpu_id

    def __del__(self):
        GPUProcess.used_ids.remove(self.gpu_id)

    def __repr__(self) -> str:
        return f"<{type(self)} gpu_id={self.gpu_id} hash={hash(self)}>"
# Test creation
def test_process_creation():
    # Expect two gpus
    def dummy_func(*args):
        return args

    processes = []
    for _ in range(2):
        p = GPUProcess(
            target=dummy_func,
            kwargs=dict(a=("a", "b", "c")),
        )
        processes.append(p)
    for p in processes:
        p.start()
    for p in processes:
        p.join()
    del processes
    assert GPUProcess.used_ids == [], f"{GPUProcess.used_ids=}!=[]"


if __name__ == "__main__":
    test_process_creation()
__del__ is not called for the second process:
AssertionError: GPUProcess.used_ids=[1]!=[]
Why is the second __del__ not called?
Later, I'd use this class with mp.Pool to run a large set of payloads, using one GPUProcess per GPU and a function that uses the gpu_id keyword to decide which device to use. Is this even a sensible approach in Python?
The short answer is that __del__ is not being called because the variable p from the previous for loop is still referencing the second process object.
As per the official documentation, object.__del__ is not guaranteed to be called when del object is executed. object.__del__ is called only when the reference count of the object reaches 0, and the del keyword merely reduces that reference count by 1.
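To make the difference between del and __del__ concrete, here is a minimal sketch. The Tracked class and its finalized list are made up for illustration, and the immediate finalization shown relies on CPython's reference counting; other interpreters may finalize later.

```python
class Tracked:
    """Hypothetical helper that records when its finalizer runs."""

    finalized = []  # names of instances whose __del__ has run

    def __init__(self, name):
        self.name = name

    def __del__(self):
        Tracked.finalized.append(self.name)


a = Tracked("first")
b = a  # second reference to the same object

del a  # removes the name `a` only; the object is still reachable via `b`
after_first_del = list(Tracked.finalized)

del b  # last reference gone, so CPython runs __del__ here
after_second_del = list(Tracked.finalized)

print(after_first_del, after_second_del)
```

Under CPython the first snapshot is empty and the second contains "first", which is the same situation as the list and the loop variable in your test.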
So you could resolve this by setting p = None or running del p before the assertion to remove that reference, or by moving the assertion to after test_process_creation() has returned, since returning from the function releases every reference held by its local variables.
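Here is a small sketch of the lingering loop variable, stripped of the multiprocessing parts. Slot and leftover_ids are hypothetical names used only for this illustration, and the result again assumes CPython's immediate reference counting.

```python
class Slot:
    """Hypothetical stand-in for GPUProcess: registers an id, frees it in __del__."""

    live = []

    def __init__(self, slot_id):
        self.slot_id = slot_id
        Slot.live.append(slot_id)

    def __del__(self):
        Slot.live.remove(self.slot_id)


def leftover_ids(drop_loop_variable):
    slots = [Slot(i) for i in range(2)]
    for s in slots:
        pass  # after the loop, `s` still refers to the last Slot
    del slots  # frees Slot 0; Slot 1 is kept alive by `s`
    if drop_loop_variable:
        del s  # releases the last reference to Slot 1
    return list(Slot.live)


print(leftover_ids(False))  # loop variable still holds the last object
print(leftover_ids(True))   # loop variable dropped before the check
```

Without del s the function reports id 1 as still registered, just like your assertion message; with it, the list is empty. Once the function returns, the remaining Slot is released either way.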
I found this video from the mCoding channel to be very informative about __del__ and how NOT to use it.
As an aside, I will note that while testing your code with those changes I found a couple of issues you will need to resolve:

Your __del__() method can fail with ValueError: list.remove(x): x not in list. It should either use a try/except or check that self.gpu_id is in the list before removing it.

In test_process_creation() you defined the dummy function as dummy_func(*args), but the test processes you created specifically have keyword arguments defined with kwargs=dict(a=("a", "b", "c")). This will cause the exception TypeError: test_process_creation.<locals>.dummy_func() got an unexpected keyword argument 'a'. The simplest solution would be to change the definition to dummy_func(**kwargs).
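Both fixes might look roughly like the sketch below. GuardedGPUProcess is a hypothetical, simplified variant of your class (it takes the gpu_id directly instead of using next_id), so treat it as an illustration of the guard rather than a drop-in replacement. Note that exceptions raised inside __del__ are not propagated; Python only prints an "Exception ignored in" message, so they are easy to miss.

```python
import multiprocessing as mp


class GuardedGPUProcess(mp.Process):
    """Hypothetical variant of GPUProcess with a defensive __del__."""

    used_ids: list = []

    def __init__(self, gpu_id: int, **process_kwargs):
        super().__init__(**process_kwargs)
        self._gpu_id = gpu_id
        GuardedGPUProcess.used_ids.append(gpu_id)

    def __del__(self):
        # getattr covers the case where __init__ failed before _gpu_id was set
        gpu_id = getattr(self, "_gpu_id", None)
        # Only remove the id if it is still registered
        if gpu_id in GuardedGPUProcess.used_ids:
            GuardedGPUProcess.used_ids.remove(gpu_id)


def dummy_func(**kwargs):
    # Accepts any keyword arguments, including the injected gpu_id
    return kwargs
```

With the membership check, deleting an instance whose id was already removed is a no-op instead of an ignored ValueError.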