c++ multithreading memory thread-safety threadpool

Handling a thread crash


I have searched online for the following question: Can a process continue execution if one of its spawned threads crashes in cpp?

All answers seem to suggest that it's either not possible or can lead to undefined behaviour (even after writing a SIGSEGV handler), since threads share memory; or that only some specific platforms let you handle a thread crash, which makes your code platform-specific.

Edit: I want to know -

  • if I can handle unhandled exceptions thrown by a thread
  • if I can handle a general crash (the most probable answer for this being no)

So my questions are:

Say I have a main process which spawns 'n' threads. These threads don't need to communicate with each other; however, each of them needs to communicate with the main process.

  1. If I make the data structures shared between the threads and my main process read-only, can I handle a thread crash safely in this case without any potential UB?

  2. What I am worried about is that, even if those threads don't need to communicate with each other, there is still a chance that a thread may try to access the memory of some other thread (say, due to a bug in the code), since threads share the same address space. How should I go about handling this case?

  3. I am fine with making my code platform-specific (I am restricting myself to cpp though). Which platforms support handling a thread crash?

  4. Is memory corruption due to a thread crash the only thing I should worry about while handling a thread crash? Say I guarantee that a thread will always access its own memory and will never access the memory of some other thread. Can I safely handle a crash for this thread? Is there any documentation, or are there any guarantees given by cpp, on when I can safely handle a thread crash?

Note: I know that I wouldn't need to worry about invalid memory access had I been using processes; but I don't want to use processes.

Thanks in advance!


Solution

  • No. A lot of system-level state is shared between threads, such as file handles and the memory allocator, and it can cause crashes if it is not handled properly.

    If you want to keep going, you need to handle the cause of the crash. Exceptions are one example: catch the exception and roll back the state until you get to a known good one.

    You can't always do this. For example, if you are running a library you don't control, you don't know what it does internally. For that case you can create a new process. Sending data back and forth is more cumbersome, but it gives you enough isolation that one process can crash without breaking the other.
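To illustrate the exception case mentioned above, here is a minimal sketch of catching an exception inside the thread and handing it back to the spawning thread. The helper name `run_guarded` is hypothetical and not part of any library. It only covers C++ exceptions; it does nothing for segfaults or other hard crashes.

```cpp
#include <exception>
#include <functional>
#include <stdexcept>
#include <string>
#include <thread>

// Hypothetical helper: runs `task` on a new thread and waits for it.
// Returns an empty string on success, or the exception message if the task threw.
std::string run_guarded(const std::function<void()>& task) {
    std::exception_ptr error;  // written by the worker, read only after join()

    std::thread worker([&] {
        try {
            task();
        } catch (...) {
            // If an exception escaped the thread function, std::terminate
            // would be called and the whole process would go down.
            error = std::current_exception();
        }
    });
    worker.join();  // join() synchronizes, so reading `error` below is safe

    if (!error) {
        return "";
    }
    try {
        std::rethrow_exception(error);
    } catch (const std::exception& e) {
        return e.what();
    } catch (...) {
        return "unknown exception";
    }
}
```

Calling `run_guarded([] { throw std::runtime_error("boom"); })` should return `"boom"` while the calling thread keeps running. Rolling shared state back to a known good point is still up to the caller.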
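The process-isolation idea could look roughly like the following sketch. It assumes a POSIX platform (`fork` and `waitpid`); the helper name `run_isolated` and the "128 + signal number" return convention are assumptions made for illustration. Note that forking from a process that already has several threads copies only the calling thread into the child, so this is best done before other threads are started.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#include <functional>

// Hypothetical helper, POSIX only: runs `task` in a child process.
// Returns the child's exit code, 128 + signal number if the child was
// killed by a signal (for example SIGSEGV), or -1 if fork/waitpid failed.
int run_isolated(const std::function<int()>& task) {
    pid_t pid = fork();
    if (pid < 0) {
        return -1;  // could not create the child process
    }
    if (pid == 0) {
        // Child: run the risky code, then leave without running the
        // parent's atexit handlers or flushing its inherited stdio buffers.
        try {
            _exit(task());
        } catch (...) {
            _exit(1);
        }
    }
    // Parent: a crash in the child does not touch our address space.
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) {
        return -1;
    }
    if (WIFEXITED(status)) {
        return WEXITSTATUS(status);
    }
    if (WIFSIGNALED(status)) {
        return 128 + WTERMSIG(status);
    }
    return -1;
}
```

The exit status is the only thing passed back here; real results would have to travel through a pipe, socket, or shared memory, which is the "more cumbersome" part of this approach.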