sockets erlang gen-tcp

Unable to accept connections on socket, when creating sockets on remote node via RPC in Erlang


I am struggling to identify the reason for gen_tcp:accept always returning an {error, closed} response.

Essentially, I have a supervisor that creates a listening socket:

gen_tcp:listen(8081, [binary, {packet, 0}, {active, false}, {reuseaddr, true}]),

This socket is then passed to a child, which is an implementation of the gen_server behaviour. The child then accepts connections on the socket.

accept(ListeningSocket, {ok, Socket}) ->
    spawn(fun() -> loop(Socket) end),
    accept(ListeningSocket);
accept(_ListeningSocket, {error, Error}) ->
    io:format("Unable to listen on socket: ~p.~n", [Error]),
    gen_server:call(self(), stop).

accept(ListeningSocket) ->
    accept(ListeningSocket, gen_tcp:accept(ListeningSocket)).

loop(Socket) ->
    case gen_tcp:recv(Socket, 0) of
        {ok, Data} ->
            io:format("~p~n", [Data]),
            process_request(Data),
            gen_tcp:send(Socket, Data),
            loop(Socket);
        {error, closed} -> ok
    end.

I load the supervisor and gen_server BEAM binaries locally, and load them on another node (which runs on the same machine) via an RPC call to code:load_binary. Next, I start the supervisor via an RPC call, which in turn starts the server. In this scenario, gen_tcp:accept always returns {error, closed}.

If I instead start the supervisor and server while logged in to a node shell, the server accepts connections without issue. This includes using 'remsh' to attach to the same remote node that failed to accept connections when I had started the server on it via RPC.

I seem to be able to replicate the issue by using the shell alone:

[Terminal 1]: erl -sname node -setcookie abc -distributed -noshell

[Terminal 2]: erl -sname rpc -setcookie abc:

              net_adm:ping('node@verne').
              {ok, ListeningSocket} = rpc:call('node@verne', gen_tcp, listen, [8081, [binary, {packet, 0}, {active, true}, {reuseaddr, true}]]).
              rpc:call('node@verne', gen_tcp, accept, [ListeningSocket]).

The response to the final RPC is {error, closed}.

Could this be something to do with socket/port ownership?

In case it is of help, there are no clients waiting to connect, and I don't set timeouts anywhere.


Solution

  • Each rpc:call starts a new process on the target node to handle the request. In your final example, your first call creates a listen socket within such a process, and when that process dies at the end of the rpc call, the socket is closed. Your second rpc call to attempt an accept therefore fails due to the already-closed listen socket.

    Your design seems unusual in several ways. For example, it's not normal to have supervisors opening sockets. You also say the child is a gen_server yet you show a manual recv loop, which if run within a gen_server would block it. You might instead explain what you're trying to accomplish and request help on coming up with a design to meet your goals.
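
    The ownership point above can be tried on a single node, without distribution. The following is only a sketch under assumptions: the module name listen_owner_demo, the functions broken/0 and fixed/0, the use of port 0 (any free port) and the 1000 ms accept timeout are all made up for illustration and are not part of your code. broken/0 opens the listen socket in a short-lived process, which stands in for the process that handles an rpc:call, and only calls accept after that process has exited. fixed/0 keeps the owning process alive for as long as the socket is needed.

```erlang
-module(listen_owner_demo).
-export([broken/0, fixed/0]).

%% Opens the listen socket in a short-lived process (standing in for the
%% process that serves an rpc:call) and uses it after that process has exited.
broken() ->
    Parent = self(),
    {Pid, Ref} = spawn_monitor(fun() ->
        %% Port 0 asks the OS for any free port (an assumption for this demo).
        {ok, Listen} = gen_tcp:listen(0, [binary, {active, false}]),
        Parent ! {listening, Listen}
        %% The fun returns here, so the socket's owner exits.
    end),
    LSock = receive {listening, S} -> S end,
    %% Wait until the owner has exited before trying to accept.
    receive {'DOWN', Ref, process, Pid, _Reason} -> ok end,
    gen_tcp:accept(LSock, 1000).

%% One possible fix: a long-lived process opens the socket and stays alive,
%% so the socket remains open while other processes accept on it.
fixed() ->
    Parent = self(),
    Owner = spawn(fun() ->
        {ok, Listen} = gen_tcp:listen(0, [binary, {active, false}]),
        Parent ! {listening, self(), Listen},
        %% Keep owning the socket until told to stop.
        receive stop -> gen_tcp:close(Listen) end
    end),
    LSock = receive {listening, Owner, S} -> S end,
    %% Connect a local client so that accept has something to return.
    {ok, Port} = inet:port(LSock),
    {ok, Client} = gen_tcp:connect({127,0,0,1}, Port, [binary, {active, false}]),
    Result = case gen_tcp:accept(LSock, 1000) of
        {ok, Sock} -> gen_tcp:close(Sock), accepted;
        {error, Reason} -> {error, Reason}
    end,
    gen_tcp:close(Client),
    Owner ! stop,
    Result.
```

    With the default inet backend, broken/0 should give the same {error, closed} you see over RPC, while fixed/0 should return accepted. In the RPC setting, the equivalent would be to have the call start a long-lived process on the remote node that opens the socket itself (for example a server that calls gen_tcp:listen when it starts), rather than calling gen_tcp:listen directly through rpc:call.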