Tags: python, multithreading, restart, python-multithreading

How do I detect if a thread died, and then restart it?


I have an application that fires up a series of threads. Occasionally, one of these threads dies (usually due to a network problem). How can I properly detect a thread crash and restart just that thread? Here is example code:

import random
import threading
import time

class MyThread(threading.Thread):
    def __init__(self, pass_value):
        super(MyThread, self).__init__()
        self.running = False
        self.value = pass_value

    def run(self):
        self.running = True

        while self.running:
            time.sleep(0.25)

            rand = random.randint(0,10)
            print('%s %d %s' % (threading.current_thread().name, rand, self.value))
            if rand == 4:
                raise ValueError('Returned 4!')


if __name__ == '__main__':
    group1 = []
    group2 = []
    for g in range(4):
        group1.append(MyThread(g))
        group2.append(MyThread(g+20))


    for m in group1:
        m.start()

    print("Now start second wave...")

    for p in group2:
        p.start()

In this example, I start 4 threads, then 4 more. Each thread repeatedly generates a random int between 0 and 10; if that int is 4, it raises an exception. Notice that I don't join the threads: I want both the group1 and group2 lists of threads to be running, and I found that joining a thread waits until it terminates, so the next set of threads never begins. My thread is supposed to be a daemon process, so it should run constantly and rarely (if ever) hit the ValueError exception that this example code raises.

How can I detect that a specific thread died and restart just that one thread?

I have attempted the following loop right after my for p in group2 loop.

while True:
    # Create a copy of our groups to iterate over, 
    # so that we can delete dead threads if needed
    for m in group1[:]:
        if not m.isAlive():
            group1.remove(m)
            group1.append(MyThread(1))

    for m in group2[:]:
        if not m.isAlive():
            group2.remove(m)
            group2.append(MyThread(500))

    time.sleep(5.0)

I took this method from this question.

The problem is that isAlive() seems to always return True, because the threads never restart.

Edit

Would it be more appropriate in this situation to use multiprocessing? I found this tutorial. Is it more appropriate to have separate processes if I am going to need to restart the process? It seems that restarting a thread is difficult.

It was mentioned in the comments that I should call is_active() on the thread. I don't see this mentioned in the documentation, but I do see the isAlive that I am currently using. As I mentioned above, though, this returns True, so I'm never able to see that a thread has died.
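
As a quick check of the behaviour discussed here, the sketch below may help. It uses Python 3 syntax, where the method is spelled is_alive(), and the names crash and demo are made up for illustration. It shows that is_alive() reports False once run() has ended with an exception, and that calling start() a second time on the same object raises RuntimeError, so "restarting" a thread in practice means creating a new Thread object.

```python
import threading


def crash():
    # Stand-in for a worker that dies, e.g. on a network error
    raise ValueError('Returned 4!')


def demo():
    t = threading.Thread(target=crash)
    t.start()
    t.join()  # wait until the thread has terminated (traceback goes to stderr)

    # The thread ended with an exception, so it is no longer alive
    alive_after_crash = t.is_alive()

    # A Thread object can be started at most once
    try:
        t.start()
        restartable = True
    except RuntimeError:
        restartable = False

    return alive_after_crash, restartable


if __name__ == '__main__':
    print(demo())
```

Note that a Thread that was created but never started also reports is_alive() as False, which matters for any loop that builds replacement threads.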


Solution

  • You could put a try/except around the code where you expect it to crash (if it can crash anywhere, wrap the whole run function) and keep an indicator variable that records the thread's status.

    So something like the following:

    class MyThread(threading.Thread):
        # Status codes shared by all instances
        RUNNING = 0
        FINISHED_OK = 1
        STOPPED = 2
        CRASHED = 3

        def __init__(self, pass_value):
            super(MyThread, self).__init__()
            self.running = False
            self.value = pass_value
            self.status = self.STOPPED

        def run(self):
            self.running = True
            self.status = self.RUNNING

            try:
                while self.running:
                    time.sleep(0.25)

                    rand = random.randint(0, 10)
                    print('%s %d %s' % (threading.current_thread().name, rand, self.value))

                    if rand == 4:
                        raise ValueError('Returned 4!')
            except Exception:
                # Record the crash; leaving run() ends the thread
                self.status = self.CRASHED
            else:
                # The loop ended because self.running was cleared
                self.status = self.FINISHED_OK
    

    Then you can use your loop:

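    while True:
        # Iterate over copies of the groups so that crashed threads
        # can be replaced while looping
        for m in group1[:]:
            if m.status == m.CRASHED:
                value = m.value
                group1.remove(m)
                replacement = MyThread(value)
                replacement.start()  # a new thread does nothing until it is started
                group1.append(replacement)

        for m in group2[:]:
            if m.status == m.CRASHED:
                value = m.value
                group2.remove(m)
                replacement = MyThread(value)
                replacement.start()
                group2.append(replacement)

        # Sleep inside the loop so the check runs every 5 seconds
        time.sleep(5.0)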
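
The status-flag idea above can be sketched end to end as follows. This is only a minimal illustration in Python 3 syntax, not a definitive implementation: the names Worker, work, restart_crashed and demo, and the fail parameter that forces a simulated crash, are all hypothetical and exist only to make the example deterministic.

```python
import threading
import time


class Worker(threading.Thread):
    """Hypothetical worker that records how its run() ended."""

    RUNNING, FINISHED_OK, STOPPED, CRASHED = range(4)

    def __init__(self, value, fail=False):
        super().__init__(daemon=True)
        self.value = value
        self.fail = fail          # assumption: forces a crash for the demo
        self.status = self.STOPPED
        self.error = None

    def run(self):
        self.status = self.RUNNING
        try:
            self.work()
        except Exception as exc:
            # Keep the exception so the supervisor can inspect or log it
            self.error = exc
            self.status = self.CRASHED
        else:
            self.status = self.FINISHED_OK

    def work(self):
        time.sleep(0.01)
        if self.fail:
            raise ValueError('simulated network failure')


def restart_crashed(group):
    """Replace each crashed worker in place with a new, started one."""
    restarted = 0
    for index, worker in enumerate(group):
        if worker.status == Worker.CRASHED:
            replacement = Worker(worker.value)  # same value, no forced crash
            replacement.start()                 # a new thread must be started
            group[index] = replacement
            restarted += 1
    return restarted


def demo():
    group = [Worker(0), Worker(1, fail=True), Worker(2)]
    for w in group:
        w.start()
    for w in group:
        w.join()
    first_pass = [w.status for w in group]

    restarted = restart_crashed(group)
    for w in group:
        w.join()
    second_pass = [w.status for w in group]
    return first_pass, restarted, second_pass


if __name__ == '__main__':
    print(demo())
```

In a long-running application, restart_crashed would be called periodically (for example inside a while True loop with a time.sleep(5.0) between passes) instead of joining the workers as the demo does.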