How can I run one process on the CPU and another process on the GPU?


My script has 5 different processes that should run in parallel. For now I am using the multiprocessing Pool to run them, and it works very well. The problem is that I want to use this script on a platform (Nvidia Jetson Nano) that has only 4 CPU cores but also has a GPU. So I want to run 4 processes on the CPU cores and the remaining process on the GPU. Let me explain with some code:

import ctypes
import functools
from multiprocessing import Manager, Pool
# ...other imports...

manager_1 = Manager()
variable = manager_1.Value(ctypes.Array, [])
counter_lock_1 = manager_1.Lock()

manager_2 = Manager()
variable_2 = manager_2.Value(ctypes.Array, [])
counter_lock_2 = manager_2.Lock()

manager_3 = Manager()
variable_3 = manager_3.Value(ctypes.Array, [])
counter_lock_3 = manager_3.Lock() 

def process1(variable,variable_2,..):
    while True:
        ---Do something---
        variable.value = something

def process2(variable,..):
    while True:
        ---Do something---

def process3(variable,variable_2,..):
    while True:
        ---Do something---

def process4(variable,variable_2,variable_3,..):
    while True:
        ---Do something---

def process5(variable,variable_2,..):
    while True:
        ---Do something---

def main():
    f_1 = functools.partial(process1,variable,variable_2,...)
    f_2 = functools.partial(process2,variable,...)
    f_3 = functools.partial(process3,variable,variable_2,...)
    f_4 = functools.partial(process4,variable,variable_2,variable_3)
    f_5 = functools.partial(process5,variable,variable_2,...)

    with Pool() as pool:
        res = pool.map(smap, [f_1, f_2, f_3, f_4, f_5])

main()

My script template is something like this. If I run it on a platform with 4 CPU cores, what happens to f_5? How can I run it on the GPU?

Note: f_5 actually already works with the GPU, because it is an object detection function and I can choose the GPU as the device for detection. But I have to include this function in the pool so that it gets the shared variables. I guess it starts on a CPU and then uses the GPU to detect objects. How can I run it directly on the GPU? Also, do you have any suggestions about using the pool, or some alternative, that could affect performance? Thank you.


Solution

  • GPUs cannot run Python. The actual object detection will be done in a non-Python function, most likely a CUDA function (the native language of NVIDIA GPUs). Your Python function always runs on a CPU core and only hands the detection work to the GPU.

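    Since you have 4 CPU cores, the 5 processes will not all run simultaneously. By default, Pool() creates as many worker processes as there are CPU cores, and because each of your functions loops forever, the first four tasks would occupy all four workers and f_5 would never start. The "lock" code is also very suspect. In general, a lock should be taken inside the function that uses the variable, for as long as it reads or writes it; this is what prevents conflicts between multiple functions using the same variable. It requires that each shared variable has its own lock and that every function touching the variable takes that same lock. counter_lock_1, sitting outside all 5 process functions, is likely in the wrong place.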
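
    The two points above can be sketched roughly as follows. This is only a minimal illustration under stated assumptions, not the asker's actual script: cpu_task, detection_task and run_demo are hypothetical names, the real work is replaced by a bounded loop that increments a shared counter, and no GPU call is shown (in a real script, detection_task would call into the detection library). It starts one Process per task instead of a Pool, so the fifth task does not wait for a free worker, and each function takes the lock itself while it touches the shared value.

```python
import multiprocessing as mp

# Assumption: "fork" is requested explicitly so the sketch behaves the same
# on any Linux Python version; this start method is not available on Windows.
ctx = mp.get_context("fork")


def cpu_task(counter, lock, n_updates):
    """Hypothetical CPU-bound stage: updates the shared counter n_updates times."""
    for _ in range(n_updates):
        # The function that uses the shared value takes the lock itself.
        with lock:
            counter.value += 1


def detection_task(counter, lock, n_updates):
    """Hypothetical stand-in for the object-detection stage.

    In a real script this Python function would call the detection
    library here; that library, not Python, would launch the GPU work.
    """
    for _ in range(n_updates):
        with lock:
            counter.value += 1


def run_demo(n_updates=100):
    """Run four CPU tasks and one detection task, return the final count."""
    with ctx.Manager() as manager:
        counter = manager.Value("i", 0)
        lock = manager.Lock()  # one lock for this one shared variable

        # One Process per task, so no task is left waiting for a pool worker.
        targets = [cpu_task] * 4 + [detection_task]
        procs = [
            ctx.Process(target=t, args=(counter, lock, n_updates))
            for t in targets
        ]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        return counter.value


if __name__ == "__main__":
    print(run_demo())  # 5 tasks x 100 locked updates
```

    With five processes on four cores, the operating system time-slices them. That is usually acceptable when one of them spends most of its time waiting for the GPU, but it is worth measuring on the target board.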