I am trying to run a simple_one_for_one supervisor
where the supervisor
and worker
are placed in separate modules and i keep getting the below error when using supervisor:start_child
:
>A=sup:start_link().
>B=supervisor:start_child(A,[]).
{error,{'EXIT',{undef,[{worker,start_link,[],[]},
{supervisor,do_start_child_i,3,
[{file,"supervisor.erl"},{line,379}]},
{supervisor,handle_call,3,
[{file,"supervisor.erl"},{line,404}]},
{gen_server,try_handle_call,4,
[{file,"gen_server.erl"},{line,661}]},
{gen_server,handle_msg,6,
[{file,"gen_server.erl"},{line,690}]},
{proc_lib,init_p_do_apply,3,
[{file,"proc_lib.erl"},{line,249}]}]}}}
Supervisor
-module(sup).
-behaviour(supervisor).
-compile([export_all]).
start_link()->
{ok,Pid}=supervisor:start_link(?MODULE,[]),
io:format("sugi pl"),
Pid.
init(_Args) ->
RestartStrategy = {simple_one_for_one, 10, 60},
ChildSpec = {
worker,
{worker, start_link, []}, //tried adding here a parameter in the A
permanent,
brutal_kill,
worker,
[sup]
},
{ok, {RestartStrategy,[ChildSpec]}}.
Worker
-module(worker).
-compile(export_all).
start_link([Arg])-> //tried both [Arg] and Arg
{ok,Pid}=spawn_link(?MODULE,init,[]),
Pid.
init([Time])->
receive->
{From,Msg}->From !{Time,Msg},
init(Time)
end.
Command
>c("[somepath]/sup.erl"),A=sup:start_link(),B=supervisor:start_child(A,[]).
I can see clearly that the problem is when trying to add a child.Somehow the init
function is not called properly but i do not see why.(Badmatch)
I have tried adding a parameter in the A
of the MFA
of the ChildSpec
to no avail
There are a number of problems with your code.
sup:start_link/0
is wrong; you're returning Pid instead of {ok, Pid}
.While it's not really incorrect, you're using supervisor:start_link/2
, which doesn't register the supervisor name. Having the name is handy, so it's better to use supervisor:start_link/3
:
supervisor:start_link({local, ?MODULE}, ?MODULE, []),
This associates the module name with its process id, allowing you to use the process name in your shell commands rather than using pid variables.
io:format/2
call in sup:start_link/0
, presumably for debugging. A much better way of debugging is to call sys:trace(sup, true)
from your shell once you've started the sup
supervisor. You can turn it off from the shell as well, by specifying false
instead of true
as the second argument.Fixing the above issues leaves use with the following definition of sup:start_link/0
:
start_link() ->
supervisor:start_link({local, ?MODULE}, ?MODULE, []).
Let's recompile, start the supervisor, then compile worker
(after fixing its syntax errors), and then trace the supervisor as we try to start a child:
1> c(sup).
sup.erl:3: Warning: export_all flag enabled - all functions will be exported
{ok,sup}
2> sup:start_link().
{ok,<0.94.0>}
3> sys:trace(sup, true).
ok
4> c(worker).
worker.erl:2: Warning: export_all flag enabled - all functions will be exported
worker.erl:5: Warning: variable 'Arg' is unused
{ok,worker}
5> supervisor:start_child(sup, []).
*DBG* sup got call {start_child,[]} from <0.81.0>
*DBG* sup sent {error,
{'EXIT',
{undef,
[{worker,start_link,[],[]},
{supervisor,do_start_child_i,3,
[{file,"supervisor.erl"},{line,379}]},
...
This abbreviated trace output shows that sup
died when it tried to start a worker
by calling worker:start_link/0
(the []
indicates there are zero arguments). The child specification tells sup
to start it this way, since it contains
{worker, start_link, []}
and we started the child via supervisor:start_child(sup, [])
. For a simple_one_for_one
child, the arguments sent to its start function are composed of the argument list from the child specification combined with the arguments specified in the call to supervisor:start_child/2
; in this case, that's the equivalent of [] ++ []
which is the same as []
, indicating no arguments. Let's change the worker:start_link/1
function to worker:start_link/0
instead, recompile it, and try again:
6> c(worker).
worker.erl:2: Warning: export_all flag enabled - all functions will be exported
{ok,worker}
7> supervisor:start_child(sup, []).
*DBG* sup got call {start_child,[]} from <0.81.0>
*DBG* sup sent {error,
{'EXIT',
{{badmatch,<0.94.0>},
[{worker,start_link,0,[{file,"worker.erl"},{line,6}]},
...
This time, the abbreviated output shows a badmatch
. This is because spawn_link/3
returns a pid, but worker:start_link/0
is expecting it to return {ok, Pid}
. Let's fix that, and also fix the return value to be {ok, Pid}
instead of just Pid
:
start_link()->
Pid = spawn_link(?MODULE,init,[]),
{ok, Pid}.
Then let's recompile and try again:
8> c(worker).
worker.erl:2: Warning: export_all flag enabled - all functions will be exported
{ok,worker}
9> supervisor:start_child(sup, []).
*DBG* sup got call {start_child,[]} from <0.81.0>
*DBG* sup sent {ok,<0.106.0>} to <0.81.0>, new state {state,
{local,sup},
simple_one_for_one,
{[worker],
#{worker =>
{child,undefined,
worker,
{worker,
start_link,[]},
permanent,
brutal_kill,worker,
[sup]}}},
{maps,
#{<0.106.0> => []}},
10,60,[],0,sup,[]}
*DBG* sup got {'EXIT',<0.106.0>,{undef,[{worker,init,[],[]}]}}
OK, this time the supervisor actually started the child, but it died immediately because it's trying to call worker:init/0
but only worker:init/1
is defined. Because the child dies immediately, the supervisor tries repeatedly to start it, based on its restart strategy:
RestartStrategy = {simple_one_for_one, 10, 60},
Because this is a hard error, it fails instantly every time, and the supervisor dies after 10 failed restarts in 60 seconds or less, just like it's supposed to:
=SUPERVISOR REPORT==== 20-Apr-2020::10:43:43.557307 ===
supervisor: {local,sup}
errorContext: shutdown
reason: reached_max_restart_intensity
offender: [{pid,<0.117.0>},
{id,worker},
{mfargs,{worker,start_link,[]}},
{restart_type,permanent},
{shutdown,brutal_kill},
{child_type,worker}]
** exception error: shutdown
It appears from your code that you're attempting to pass some sort of Time
argument to worker:init/1
, so let's change start_link/0
to pass a timestamp:
start_link()->
Pid = spawn_link(?MODULE,init,[os:timestamp()]),
{ok, Pid}.
Let's also fix init/1
to take the argument directly, rather than in a list:
init(Time) ->
receive
{From,Msg} ->
From ! {Time,Msg},
init(Time)
end.
Let's restart the supervisor, recompile worker
, and try again:
10> sup:start_link().
{ok,<0.119.0>}
11> sys:trace(sup, true).
ok
12> c(worker).
worker.erl:2: Warning: export_all flag enabled - all functions will be exported
{ok,worker}
13> {ok, Child} = supervisor:start_child(sup, []).
*DBG* sup got call {start_child,[]} from <0.118.0>
*DBG* sup sent {ok,<0.127.0>} to <0.118.0>, new state {state,
{local,sup},
simple_one_for_one,
{[worker],
#{worker =>
{child,undefined,
worker,
{worker,
start_link,[]},
permanent,
brutal_kill,
worker,
[sup]}}},
{maps,
#{<0.127.0> => []}},
10,60,[],0,sup,[]}
{ok,<0.127.0>}
It looks like it worked. Let's see if the supervisor agrees, by asking it how many children it has:
14> supervisor:count_children(sup).
...
[{specs,1},{active,1},{supervisors,0},{workers,1}]
It has one worker, just as we expected. Finally, let's send a message to the worker and see if it responds as expected:
15> Child ! {self(), "are you there?"}.
{<0.118.0>,"are you there?"}
16> flush().
Shell got {{1587,394860,258120},"are you there?"}
ok
Now it all seems to work.
One final fix is to change the modules in your child specification; rather than [sup]
it should be the module itself, [worker]
. With that change, your revised working modules are below. You might also want to reconsider whether you want to use permanent
for the children, since this is a simple_one_for_one
supervisor; transient
is likely a better choice, but I've left it as originally written. Consider reviewing the Supervisor Behavior documentation for more information.
Supervisor
-module(sup).
-behaviour(supervisor).
-compile([export_all]).
start_link()->
supervisor:start_link({local, ?MODULE}, ?MODULE, []).
init(_Args) ->
RestartStrategy = {simple_one_for_one, 10, 60},
ChildSpec = {
worker,
{worker, start_link, []},
permanent,
brutal_kill,
worker,
[worker]
},
{ok, {RestartStrategy,[ChildSpec]}}.
Worker
-module(worker).
-compile(export_all).
start_link()->
Pid = spawn_link(?MODULE,init,[os:timestamp()]),
{ok, Pid}.
init(Time) ->
receive
{From,Msg} ->
From ! {Time,Msg},
init(Time)
end.