
Getting a number of context switches for a process / thread


Out of curiosity, I want to know how many times my program has been context-switched by the OS: that is, how many times all its registers were saved and control passed to another process or thread, and then, some time later, everything was restored and execution continued as if nothing had happened.

Does the system maintain such a counter somewhere, or is there some hack to obtain it?

I am on Linux in particular, but I am interested in other systems as well.


Solution

  • Let's examine the case. A Linux-type O/S keeps these counters systematically, and Python makes it comfortable both to inspect them and to build a simple monitoring system that reports any excessive circumstances ( the former matches the just-out-of-curiosity case, the latter is handy to re-work / re-use for systematic work ) :


    A "Monitor" example for both { voluntary | involuntary }-Ctx Switching :

    Python serves here both for illustration and for the ease of extending the scope of functionality later:

    Once the handler is registered with signal.signal( signal.SIGALRM, SIG_ALRM_handler_A ) and the timer is set, the monitor is ready to report both voluntary and involuntary ( enforced ) context switches. To test it, a "FAT"-blocking piece of computing was used. For historical reasons it relies on Numpy/C/FORTRAN-style compiled code that runs outside the GIL-stepped interpreter loop, so while it runs the voluntary counter stands still and only the involuntary counter grows, as the output recorded in the handler's comments below shows:

    len(str([math.factorial(2**f) for f in range(20)][-1]))             # math.factorial: np.math was a plain alias of the stdlib math module, removed in NumPy 2.0

    In principle, any other PID can be used instead, so this trivial monitoring mechanism can serve other purposes as well:

    ########################################################################
    ### SIGALRM_handler_          
    ###
    
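    import psutil, resource, os, signal, math, time                         # signal is needed for the .ASSOC / .SET steps at the end, math for the test workload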
            
    SIG_ALRM_last_ctx_switch_VOLUNTARY = -1
    SIG_ALRM_last_ctx_switch_FORCED    = -1
    
    def SIG_ALRM_handler_A( aSigNUM, aFrame ):                              # SIGALRM handler, fired at even intervals, also while the "FAT"-blocking C-based factorial computation runs
        # onEntry_ROTATE_SigHandlers() -- MAY set another sub-sampled SIG_ALRM_handler_B() ... { last: 0, 0: handler_A, 1: handler_B, 2: handler_C }
        #
        # onEntry_SEQ of calls of regular, hierarchically timed MONITORS ( just the SNAPSHOT-DATA ACQUISITION Code-SPRINTs, handle later due to possible TimeDOMAIN overlaps )
        # 
        #
        # print( time.ctime() )
        # print( formatExtMemoryUsed( getExtMemoryUsed() ) )
        # print( 60 * "=", psutil.Process( os.getpid() ).num_ctx_switches(), "~~~", aProcess.cpu_percent( interval = 0 ) )
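        #                                        NOTE on CPU 0.0% :         # cpu_percent( interval = 0 ) returns a meaningless 0.0 on its first call per Process object,
        #                                                                   # so the object is cached across ticks instead of being re-created on every tick
        aProcess         =   globals().setdefault( "SIG_ALRM_aProcess", psutil.Process( os.getpid() ) )
        aProcessCpuPCT   =         aProcess.cpu_percent( interval = 0 )     # EVENLY-TIME-STEPPED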
        aCtxSwitchNUMs   =         aProcess.num_ctx_switches()              # THIS PROCESS ( may inspect other per-incident later ... on anomaly )
        
        aVolCtxSwitchCNT = aCtxSwitchNUMs.voluntary
        aForcedSwitchCNT = aCtxSwitchNUMs.involuntary
        
        global SIG_ALRM_last_ctx_switch_VOLUNTARY
        global SIG_ALRM_last_ctx_switch_FORCED
        
        if (     SIG_ALRM_last_ctx_switch_VOLUNTARY != -1 ):                # .INIT VALUE ALREADY REPLACED: a previous sample exists
            #----------
            # .ON_TICK: must process delta(s)
            if ( SIG_ALRM_last_ctx_switch_VOLUNTARY == aVolCtxSwitchCNT ):
                #
                # AN INDIRECT INDICATION OF A LONG-RUNNING WORKLOAD OUTSIDE GIL-STEPPING ( regex / C-lib / FORTRAN / numpy-block et al )
                #                                                                                 |||||              vvv
                # SIG_:  Wed Oct 19 12:24:32 2016 ------------------------------ pctxsw(voluntary=48714, involuntary=315)  ~~~  0.0
                # SIG_:  Wed Oct 19 12:24:37 2016 ------------------------------ pctxsw(voluntary=48714, involuntary=323)  ~~~  0.0
                # SIG_:  Wed Oct 19 12:24:42 2016 ------------------------------ pctxsw(voluntary=48714, involuntary=331)  ~~~  0.0
                # SIG_:  Wed Oct 19 12:24:47 2016 ------------------------------ pctxsw(voluntary=48714, involuntary=338)  ~~~  0.0
                # SIG_:  Wed Oct 19 12:24:52 2016 ------------------------------ pctxsw(voluntary=48714, involuntary=346)  ~~~  0.0
                # SIG_:  Wed Oct 19 12:24:57 2016 ------------------------------ pctxsw(voluntary=48714, involuntary=353)  ~~~  0.0
                # ...                                                                             |||||              ^^^
                # ...0000000000]   ( tail of the printed result of the blocking computation: a long run of trailing zeros, truncated here )
                # >>>                                                                             |||||              |||
                #                                                                                 vvvvv              |||
                # SIG_:  Wed Oct 19 12:26:17 2016 ------------------------------ pctxsw(voluntary=49983, involuntary=502)  ~~~  0.0
                # SIG_:  Wed Oct 19 12:26:22 2016 ------------------------------ pctxsw(voluntary=49984, involuntary=502)  ~~~  0.0
                # SIG_:  Wed Oct 19 12:26:27 2016 ------------------------------ pctxsw(voluntary=49985, involuntary=502)  ~~~  0.0
                # SIG_:  Wed Oct 19 12:26:32 2016 ------------------------------ pctxsw(voluntary=49986, involuntary=502)  ~~~  0.0
                # SIG_:  Wed Oct 19 12:26:37 2016 ------------------------------ pctxsw(voluntary=49987, involuntary=502)  ~~~  0.0
                # SIG_:  Wed Oct 19 12:26:42 2016 ------------------------------ pctxsw(voluntary=49988, involuntary=502)  ~~~  0.0
                
                #rint(   "SIG_ALRM_handler_A(): A SUSPECT CPU-LOAD:: ", time.ctime(), 10 * "-",  aProcess.num_ctx_switches(), "{0: > 8.2f} CPU_CORE_LOAD [%]".format( aProcessCpuPCT ), " INSPECT processes ... ev. add a Stateful-self-Introspection" )
                print(   "SIG_ALRM_handler_A(): A SUSPECT CPU-LOAD:: ", time.ctime(), 10 * "-",  aProcess.num_ctx_switches(), "{0:_>60s}".format( str( aProcess.threads() ) ), " INSPECT processes ... ev. add a Stateful-self-Introspection" )
                #rint(   "SIG_ALRM_handler_A(): A SUSPECT CPU-LOAD:: ", str( resource.getrusage( resource.RUSAGE_SELF ) )[22:] )
        else:
            #----------
            # .ON_INIT: may report .INIT()
            #rint(   "SIG_ALRM_handler_A(): A SUSPECT CPU-LOAD:: ", time.ctime(), ...
            print(   "SIG_ALRM_handler_A(): activated            ", time.ctime(), 30 * "-",  aProcess.num_ctx_switches() )
        
        ##########
        # FINALLY:
        
        SIG_ALRM_last_ctx_switch_VOLUNTARY = aVolCtxSwitchCNT               # .STO ACTUALs
        SIG_ALRM_last_ctx_switch_FORCED    = aForcedSwitchCNT               # .STO ACTUALs
        
        #rint(   "SIG_: ", time.ctime(), 30 * "-",  aProcess.num_ctx_switches(), " ~~~ ", aProcess.cpu_percent( interval = 0 ), " % -?- ", aProcess.threads() )
    
    #____________________________________________________________________
    # SIG_ALRM_handler_A( aSigNUM, aFrame ):                      DEFINED
    #####################################################################
    

    ##########
    # FINALLY:
    # 
    # > signal.signal(    signal.SIGALRM, SIG_ALRM_handler_A )          # .ASSOC { SIGALRM: thisHandler }
    # > signal.setitimer( signal.ITIMER_REAL, 10, 5 )                   # .SET   @5 [sec] interval, after first run, starting after 10[sec] initial-delay
    # > signal.setitimer( signal.ITIMER_REAL,  0, 5 )                   # .UNSET
    # > SIG_ALRM_last_ctx_switch_VOLUNTARY = -1                         # .RESET .INIT() the global { signalling | state }-variable
    # > len(str([math.factorial(2**f) for f in range(20)][-1]))         # .RUN   A "FAT"-BLOCKING CHUNK OF A regex/numpy/C/FORTRAN-calculus ( np.math was removed in NumPy 2.0 )
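
    For a variant that needs no third-party package, the same sampling idea can be sketched with the standard library alone. This is a minimal sketch, not the code above: it swaps psutil for resource.getrusage(), whose ru_nvcsw and ru_nivcsw fields hold the voluntary and involuntary context-switch counts of the calling process. The names on_alarm, run_monitor and samples, as well as the timings, are illustrative assumptions.

```python
import resource
import signal
import time

samples = []  # (voluntary, involuntary) pairs collected by the handler


def on_alarm(signum, frame):
    """Record the calling process's context-switch counters on every tick."""
    usage = resource.getrusage(resource.RUSAGE_SELF)
    samples.append((usage.ru_nvcsw, usage.ru_nivcsw))


def run_monitor(busy_seconds=0.3, tick_seconds=0.05):
    """Sample the counters every tick_seconds while a pure-Python loop keeps the CPU busy."""
    samples.clear()
    previous = signal.signal(signal.SIGALRM, on_alarm)
    signal.setitimer(signal.ITIMER_REAL, tick_seconds, tick_seconds)
    try:
        deadline = time.monotonic() + busy_seconds
        while time.monotonic() < deadline:
            pass  # stand-in workload (hypothetical); replace with real work
    finally:
        signal.setitimer(signal.ITIMER_REAL, 0)  # seconds == 0 clears the timer
        # signal.signal() returns None if the old handler was not installed from Python
        signal.signal(signal.SIGALRM, signal.SIG_DFL if previous is None else previous)
    return list(samples)


if __name__ == "__main__":
    for voluntary, involuntary in run_monitor():
        print(f"voluntary={voluntary} involuntary={involuntary}")
```

    Because getrusage() is a POSIX call, this sketch should also run on macOS and the BSDs, although which fields a platform actually maintains may vary. Python runs signal handlers only in the main thread, so run_monitor() has to be called from there.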
        
    

    Also the Thread-level CtxSwitch details

    While this was not elaborated to a similar depth, the same approach applies at the thread level:

    >>> psutil.Process( 18263 ).cpu_percent()                           0.0
    >>> psutil.Process( 18263 ).ppid()                                  18054
    
    >>> psutil.Process( 18054 ).cpu_percent()                           0.0
    === ( 18054 ).threads(): [ 17679, 17680, 17681, 18054, 18265, 18266, 18267, ]
                                                                                                    ==4 -------------vvv-------------------=4--------------vvvv-------------------=4--------------vvv
    >>> [ psutil.Process( p ).num_ctx_switches() for p in ( 18259, 18260, 18261 ) ] [pctxsw(voluntary=4, involuntary=267), pctxsw(voluntary=4, involuntary=1909), pctxsw(voluntary=4, involuntary=444)]
    >>> [ psutil.Process( p ).num_ctx_switches() for p in ( 18259, 18260, 18261 ) ] [pctxsw(voluntary=4, involuntary=273), pctxsw(voluntary=4, involuntary=1915), pctxsw(voluntary=4, involuntary=445)]
    >>> [ psutil.Process( p ).num_ctx_switches() for p in ( 18259, 18260, 18261 ) ] [pctxsw(voluntary=4, involuntary=275), pctxsw(voluntary=4, involuntary=1917), pctxsw(voluntary=4, involuntary=445)]
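
    On Linux the per-thread counters shown above can also be read without psutil, straight from the /proc filesystem, where each status file carries the lines voluntary_ctxt_switches and nonvoluntary_ctxt_switches. The following is a minimal sketch under that assumption; the function names parse_ctx_switches and per_thread_ctx_switches are hypothetical, and the code is Linux-only.

```python
import os


def parse_ctx_switches(status_text):
    """Extract the two context-switch counters from the text of a /proc status file."""
    counters = {}
    for line in status_text.splitlines():
        key, _, value = line.partition(":")
        if key in ("voluntary_ctxt_switches", "nonvoluntary_ctxt_switches"):
            counters[key] = int(value.strip())
    return counters


def per_thread_ctx_switches(pid="self"):
    """Map each thread id of a process to its counters (Linux only)."""
    task_dir = f"/proc/{pid}/task"
    result = {}
    for tid in os.listdir(task_dir):
        try:
            with open(f"{task_dir}/{tid}/status") as handle:
                result[int(tid)] = parse_ctx_switches(handle.read())
        except (FileNotFoundError, ProcessLookupError):
            continue  # the thread exited between listdir() and open()
    return result


if __name__ == "__main__" and os.path.isdir("/proc/self/task"):
    for tid, counters in sorted(per_thread_ctx_switches().items()):
        print(tid, counters)
```

    Passing a numeric PID instead of the default "self" inspects another process, provided its /proc entries are readable by the caller.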