Tags: erlang, shutdown, erlang-otp, erlang-supervisor

Erlang process termination: Where/When does it happen?


Consider processes all linked in a tree, either a formal supervision tree, or some ad-hoc structure.

Now, consider some child or worker down in this tree, with a parent or supervisor above it. I have two questions.

  1. We would like to exit this process "gracefully" if it needs to be killed or shut down, because it could be halfway through updating some account balance. Assume we have properly coded a terminate function and connected this process to the others with the proper plumbing. Now assume this process is in its main loop doing work when the signal to terminate comes in. Where exactly (or perhaps the question should be WHEN exactly) does termination happen? In other words, when will terminate be called? Will the process be preempted right in the middle of the loop it is running and call terminate? Will it wait until the end of the current iteration, before starting the loop again? Will it only happen while the process is in receive? Etc.

  2. The same question, but without a terminate function having been coded. Assume the parent process is a supervisor and this child follows normal OTP conventions. The parent tells the child to shut down, or the parent crashes, or whatever. The child is in its main loop. When/where/how does shutdown occur? In the middle of the main loop? After it? Etc.


Solution

  • It is quite nicely explained in the docs (sections 12.4, 12.5, 12.6, 12.7).

    There are two cases:

    1. Your process terminates because of some bad logic.

    It raises an error, so it can die in the middle of its work, and that could be bad. If you want to guard against that, you can define a mechanism that involves two processes. The first one begins the transaction, the second one does the actual work, and after that the first one commits the changes. If something bad happens to the second process (it dies because of an error), the first one simply does not commit the changes.

    2. Something tries to kill the process from outside, for example when its supervisor restarts or a linked process dies.

    In this case, too, you can be in the middle of something, but Erlang gives you the trap_exit flag. With it set, an incoming exit signal does not kill the process; it is converted into an ordinary {'EXIT', From, Reason} message that you can handle. That in turn means the terminate function is only called once you get back to the receive block. So the process finishes its current chunk of work and, when it is ready for the next one, it calls terminate and then dies.
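The two-process arrangement from the first case might look roughly like the sketch below. It is only an illustration under assumptions: the module name `txn_sketch`, the function `run/2`, and the `WorkFun`/`CommitFun` arguments are hypothetical names chosen here, and a real system would also need to decide what "commit" means for its storage.

```erlang
-module(txn_sketch).
-export([run/2]).

%% Hypothetical coordinator: runs WorkFun in a separate, monitored process
%% and calls CommitFun only if that worker delivers a result.
run(WorkFun, CommitFun) ->
    Coordinator = self(),
    Tag = make_ref(),
    {Pid, MRef} = spawn_monitor(fun() ->
        Coordinator ! {Tag, WorkFun()}
    end),
    receive
        {Tag, Result} ->
            %% The worker finished: drop its monitor and commit.
            erlang:demonitor(MRef, [flush]),
            {committed, CommitFun(Result)};
        {'DOWN', MRef, process, Pid, Reason} ->
            %% The worker died before producing a result: nothing is committed.
            {aborted, Reason}
    end.
```

Calling `txn_sketch:run(fun() -> 41 end, fun(X) -> X + 1 end)` commits, while a `WorkFun` that crashes yields an `{aborted, Reason}` tuple and `CommitFun` is never invoked.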

    So you can avoid being terminated on the spot by using trap_exit. The trap_exit flag can itself be bypassed by sending exit(Pid, kill), which terminates the process even if it traps exits.
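The trap_exit behaviour described above can be tried out with a small gen_server. This is a minimal sketch, not a recommended design: the module name `graceful_worker`, the `work/2` and `demo/0` functions, and the `Observer` argument are hypothetical, and `demo/0` uses a throwaway process as a stand-in for a supervisor. Only `gen_server`, `process_flag/2`, `exit/2` and `timer:sleep/1` are real OTP calls.

```erlang
-module(graceful_worker).
-behaviour(gen_server).

-export([start_link/1, work/2, demo/0]).
-export([init/1, handle_call/3, handle_cast/2, terminate/2]).

start_link(Observer) ->
    gen_server:start_link(?MODULE, Observer, []).

%% Hypothetical unit of work: keeps the server busy for Ms milliseconds.
work(Pid, Ms) ->
    gen_server:cast(Pid, {work, Ms}).

init(Observer) ->
    %% Turn exit signals into {'EXIT', From, Reason} messages.
    process_flag(trap_exit, true),
    {ok, {Observer, idle}}.

handle_call(_Request, _From, State) ->
    {reply, ok, State}.

handle_cast({work, Ms}, {Observer, _Progress}) ->
    %% An exit signal arriving during this sleep is only queued as a message.
    timer:sleep(Ms),
    {noreply, {Observer, finished}}.

terminate(Reason, {Observer, Progress}) ->
    %% Report when terminate/2 ran and how far the work had got.
    Observer ! {terminated, Reason, Progress},
    ok.

demo() ->
    Observer = self(),
    spawn(fun() ->
        process_flag(trap_exit, true),   % keep this stand-in parent alive
        {ok, Pid} = start_link(Observer),
        work(Pid, 200),
        timer:sleep(50),                 % the worker is now mid-callback
        exit(Pid, shutdown),
        receive {'EXIT', Pid, _} -> ok end
    end),
    receive
        {terminated, Reason, Progress} -> {Reason, Progress}
    after 2000 ->
        timeout
    end.
```

In this sketch `graceful_worker:demo()` should return `{shutdown, finished}`: the shutdown request arrives while `handle_cast/2` is still running, yet `terminate/2` only runs after that callback has returned.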

    There is no way to bypass exit(Pid, kill): the process dies on the spot and gets no chance to clean up, so be careful when using it.
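The difference between a trappable exit signal and kill can be seen with a bare process. This is a small sketch under assumptions: the module name `kill_sketch` and its functions are hypothetical, and the process deliberately ignores exit requests, which real code would rarely want.

```erlang
-module(kill_sketch).
-export([demo/0]).

%% A bare process that traps exits and refuses to stop when asked.
stubborn(Owner) ->
    process_flag(trap_exit, true),
    Owner ! {ready, self()},   % tell the owner that the flag is now set
    loop().

loop() ->
    receive
        {'EXIT', _From, _Reason} -> loop()   % ignore the request and carry on
    end.

demo() ->
    Owner = self(),
    {Pid, MRef} = spawn_monitor(fun() -> stubborn(Owner) end),
    receive {ready, Pid} -> ok end,
    exit(Pid, shutdown),                     % trapped: arrives as a message
    timer:sleep(100),
    StillAlive = is_process_alive(Pid),
    exit(Pid, kill),                         % cannot be trapped
    receive
        {'DOWN', MRef, process, Pid, Reason} -> {StillAlive, Reason}
    after 1000 ->
        timeout
    end.
```

Here `kill_sketch:demo()` should return `{true, killed}`: the process survives the `shutdown` signal because it traps exits, but `kill` ends it regardless, and the monitor reports the exit reason as `killed`.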