Python multiprocessing: context switching between CPUs


I wrote this simple script to test multiprocessing workers reading from a global dictionary object:

import numpy as np
import multiprocessing as mp
import psutil

from itertools import repeat

def computations_x( max_int ):
    
    #random selection
    
    mask_1   = np.random.randint( low=0, high=max_int, size=1000  )
    mask_2   = np.random.randint( low=0, high=max_int, size=1000  )
    
    exponent_1 = np.sqrt( np.pi )
    vector_1   = np.array( [ read_obj[ k ]**( exponent_1 ) for k in mask_1  ]  )
    vector_2   = np.array( [ read_obj[ k ]**np.pi for k in mask_2  ]  )
    
    result = []
    
    for j in range(100):
        res_col = []
        for i in range(100):
            
            c = np.multiply( vector_1, vector_2 ).sum( axis=0 )
            res_col.append(c)
        
        res_col = np.array( res_col )
        
        result.append( res_col )
        
    result = np.array( result )
    
    return result
            

# read_obj is assigned at module level below, so no `global` statement is needed here.

total_items = 40000
max_int     = 1000
keys        = np.arange(0, max_int)

number_processors      = psutil.cpu_count( logical=False )
#number_used_processors = 1
number_used_processors = number_processors - 1
     
number_tasks           = number_used_processors        

read_obj = { k: np.random.rand( 1000 ) for k in keys   }

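if __name__ == "__main__":
    # Guard pool creation so that importing this module does not start workers.
    pool        = mp.Pool( processes = number_used_processors )

    args        = list( repeat( max_int, number_tasks ) )
    results     = pool.map( computations_x, args )

    pool.close()
    pool.join()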

However, when I watch CPU usage, I see the OS switching the computations between CPUs while they run. I am running Ubuntu 18.04; is this normal behaviour when using Python's multiprocessing module? Here is what I observe in the system monitor when debugging the code (I am using Eclipse 2019 for debugging):

[screenshot of the system monitor]

Any help is appreciated. In my main project I need to share a global "read-only" object across processes in the same spirit as here, and I want to be sure this does not hurt performance badly. I also want to make sure that all tasks run concurrently within the Pool. Thanks.


Solution

  • I'd say that is normal behaviour: the OS scheduler is free to move a process from one core to another, and it has to make sure that other processes are not starved of CPU time.

    Here's a nice article on OS scheduler basics: https://www.ardanlabs.com/blog/2018/08/scheduling-in-go-part1.html

    It focuses on Go, but the first part is fairly general.
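
    If you want to check whether scheduling matters for your workload, one option is to count how often the process is preempted and, on Linux, to pin it to a single core. The sketch below is only an illustration, not part of the original answer: the helper names (context_switches, pin_to_one_cpu, busy_work) are made up, busy_work is a stand-in for the real computation, and os.sched_setaffinity is available on Linux only. Note that the counters report context switches (preemptions and waits), not migrations between cores; pinning is what stops the migration.

```python
import os
import resource


def context_switches():
    """Return (voluntary, involuntary) context-switch counts for this process."""
    usage = resource.getrusage(resource.RUSAGE_SELF)
    return usage.ru_nvcsw, usage.ru_nivcsw


def pin_to_one_cpu():
    """Restrict the calling process to a single CPU; return that CPU, or None.

    os.sched_setaffinity exists only on Linux, so other platforms are skipped.
    """
    if not hasattr(os, "sched_setaffinity"):
        return None
    cpu = min(os.sched_getaffinity(0))  # lowest-numbered CPU we may run on
    os.sched_setaffinity(0, {cpu})      # pid 0 means "the calling process"
    return cpu


def busy_work(n=200_000):
    """Hypothetical stand-in for the CPU-bound task in the question."""
    return sum(i * i for i in range(n))


if __name__ == "__main__":
    cpu = pin_to_one_cpu()
    before = context_switches()
    busy_work()
    after = context_switches()
    print("pinned to CPU:", cpu)
    print("voluntary switches:", after[0] - before[0])
    print("involuntary switches:", after[1] - before[1])
```

    In a Pool, a similar pinning call could in principle run inside each worker, with each worker choosing a different core, but whether that helps depends on the workload. In many cases the default scheduler behaviour is fine, and comparing timings with and without pinning is the only reliable check.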