Tags: python, multithreading, subprocess, terminate, long-running-processes

Inconsistent behavior when attempting to terminate python subprocess running on a thread


I am running into problems when I attempt to terminate a long-running process that is being read on a separate thread.

The program is below. WorkOne creates a subprocess that runs a long-running command, "adb logcat", which generates log lines. I start WorkOne in main(), wait for 5 seconds, and then attempt to stop it. Different runs give different outputs.

import threading
import time
import subprocess
import sys

class WorkOne(threading.Thread):

    def __init__(self):
        threading.Thread.__init__(self)
        self.event = threading.Event()  
        self.process = subprocess.Popen(['adb','logcat'], stdout=subprocess.PIPE, stderr=sys.stdout.fileno())      

    def run(self):   
        for line in iter(self.process.stdout.readline,''):            
            #print line
            if self.event.is_set():
                self.process.terminate()
                self.process.kill()
                break;
        print 'exited For'

    def stop(self):
        self.event.set()

def main():

    print 'starting worker1'
    worker1 = WorkOne()
    worker1.start()
    print 'number of threads: ' + str(threading.active_count())
    time.sleep(5)
    worker1.stop()
    worker1.join(5)
    print 'number of threads: ' + str(threading.active_count())

if __name__ == '__main__':
    main()

Sometimes I get [A]:

starting worker1
number of threads: 2
number of threads: 2
exited For

Sometimes I get [B]:

starting worker1
number of threads: 2
number of threads: 1
exited For

Sometimes I get [C]:

starting worker1
number of threads: 2
number of threads: 2

I think I should expect to get [B] all the time. What is going wrong here?


Solution

  • I think [B] is only possible if the worker finishes in less than 10 seconds: the main thread sleeps for 5 seconds, and the worker must then exit within the 5-second timeout of join().

    If it takes 10 seconds or more, the worker can still be alive after join() returns, because join() with a timeout gives up once the timeout expires, whether or not the thread has finished. The worker only checks the event after readline() returns a line, so how quickly it exits depends on when the subprocess next produces output. You can then get [A] (the worker exits a few seconds later) or [C] (the worker exits much later).

    To always get [B], remove the timeout argument of join() so that the main thread waits until the worker finishes (or make sure the process is killed within 10 seconds by moving the kill call outside the loop).
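The two suggestions above can be combined as in the following minimal sketch. It is not the asker's exact program: it is written for Python 3, it assumes a stand-in child process (a small Python loop named DEFAULT_CMD here, which is hypothetical) in place of "adb logcat" so that it runs without adb installed, and it merges stderr into the pipe for simplicity. The idea is that stop() terminates the subprocess directly from the calling thread, which makes the blocked readline() hit end-of-file, and join() is called without a timeout.

```python
import subprocess
import sys
import threading
import time

# Hypothetical stand-in for ['adb', 'logcat']: a child that prints forever.
# This is an assumption of the sketch, chosen so it runs without adb.
DEFAULT_CMD = [sys.executable, '-u', '-c',
               'import time\n'
               'while True:\n'
               '    print("log line")\n'
               '    time.sleep(0.1)\n']


class WorkOne(threading.Thread):

    def __init__(self, cmd=None):
        threading.Thread.__init__(self)
        self.lines_read = 0
        self.process = subprocess.Popen(cmd or DEFAULT_CMD,
                                        stdout=subprocess.PIPE,
                                        stderr=subprocess.STDOUT)

    def run(self):
        # readline() returns b'' at end-of-file, which is reached once the
        # child has been terminated and its end of the pipe is closed.
        for line in iter(self.process.stdout.readline, b''):
            self.lines_read += 1
        self.process.stdout.close()
        self.process.wait()  # reap the child so it does not linger

    def stop(self):
        # Terminate outside the read loop, so stopping does not depend on
        # the child producing another line of output first.
        self.process.terminate()


def main():
    worker1 = WorkOne()
    worker1.start()
    before = threading.active_count()
    time.sleep(1)
    worker1.stop()
    worker1.join()  # no timeout: wait until the worker has really finished
    after = threading.active_count()
    return before, after


if __name__ == '__main__':
    before, after = main()
    print('number of threads: ' + str(before))
    print('number of threads: ' + str(after))
```

If a timeout on join() is still wanted, calling worker1.is_alive() afterwards shows whether the thread actually finished, since join() itself returns None in both cases.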