I have an Erlang application in which the supervisor starts a gen_server. The gen_server has logic in its init/1 to attach a new process to the supervisor. When I do this with just
supervisor:start_child(supervisor_name, Child_spec),
inside init/1, the application hangs. But if I use
rpc:cast(node(), supervisor, start_child, [supervisor_name, Child_spec]),
then the application runs smoothly. Can anyone give me some ideas for debugging this situation? Any insight is very much appreciated.
This happens because the supervisor starts its child processes one after another, waiting for each to finish initialisation before spawning the next one.
That is, the supervisor is given the start function of your gen_server module, something like {my_module, start_link, []}. It's going to wait until that function returns, and not handle any other requests meanwhile. my_module:start_link/0 calls gen_server:start_link/4, which will return only once the callback function my_module:init/1 returns.
However, my_module:init/1 makes a blocking call to the supervisor, which the supervisor is not expecting at this point, since it's waiting for my_module:init/1 to return, and you have a deadlock.
The reason it works with rpc:cast is that rpc:cast doesn't wait for the function to return: the start_child call is made from a separate process, so init/1 can return right away, and thus there is no deadlock.
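If the child really does have to be added from inside the gen_server, another way to get the same non-blocking effect is to defer the call until after init/1 has returned. Below is a minimal sketch, assuming OTP 21 or later (for handle_continue/2) and a supervisor registered locally as supervisor_name; other_worker is a hypothetical module used only for illustration.

```erlang
-module(my_module).
-behaviour(gen_server).

-export([start_link/0]).
-export([init/1, handle_continue/2, handle_call/3, handle_cast/2]).

start_link() ->
    gen_server:start_link({local, ?MODULE}, ?MODULE, [], []).

init([]) ->
    %% Return immediately so the supervisor is no longer waiting on us;
    %% the continue term is handled right after init/1 has returned.
    {ok, #{}, {continue, add_sibling}}.

handle_continue(add_sibling, State) ->
    %% other_worker is a hypothetical child module (assumption).
    ChildSpec = #{id => other_worker,
                  start => {other_worker, start_link, []}},
    %% Still a synchronous call, but the supervisor can now answer it.
    Result = supervisor:start_child(supervisor_name, ChildSpec),
    {noreply, State#{sibling => Result}}.

handle_call(_Request, _From, State) ->
    {reply, ok, State}.

handle_cast(_Msg, State) ->
    {noreply, State}.
```

The call to supervisor:start_child/2 is still synchronous here, but by the time it runs, my_module:start_link/0 has already returned to the supervisor, so the supervisor can answer it once it has finished starting its own children.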
Do you need to add the new child spec in your gen_server init callback function? You could just add both child specs in your supervisor init function, and they would be started one after another.
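As a rough sketch of that suggestion, the supervisor's init/1 could list both children itself. The module name my_sup, the child module other_worker, and the restart settings are assumptions made for illustration; only supervisor_name and my_module come from the discussion above.

```erlang
-module(my_sup).
-behaviour(supervisor).

-export([start_link/0, init/1]).

start_link() ->
    supervisor:start_link({local, supervisor_name}, ?MODULE, []).

init([]) ->
    %% Restart strategy and limits are placeholder values (assumption).
    SupFlags = #{strategy => one_for_one, intensity => 1, period => 5},
    %% Children are started in list order, each one only after the
    %% previous child's start function has returned.
    Children = [
        #{id => my_module,
          start => {my_module, start_link, []}},
        %% other_worker is a hypothetical second child (assumption).
        #{id => other_worker,
          start => {other_worker, start_link, []}}
    ],
    {ok, {SupFlags, Children}}.
```

With this layout, my_module:init/1 never needs to call the supervisor, so the deadlock cannot occur.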