Is there a way to avoid HUP once you used O_NONBLOCK on a socket?

When I use sockets in blocking mode, I can have a simple system that does something like this:

client            server

A -------------------> B
      register

A <------------------> B
   (various messages)

A -------------------> B
      unregister

Just after the unregister message is sent, the process A can quit immediately and yet B receives the message as expected.

If I turn on non-blocking mode on A's socket, B never receives unregister if A sends that message and then quits immediately. (I tested by adding a sleep(1) after sending unregister; in that case it works as expected.) So, more or less, my client cannot cleanly unregister itself.

Note: when B calls poll() on A's socket, I get a hang-up event (POLLHUP) instead of the last unregister message followed by the hang-up.

I tried to add a call to turn blocking mode back on, and somehow it makes no difference. I use the following code to change the blocking mode:

int optval = 1;  // 1 turns non-blocking mode on, 0 turns it off
ioctl(get_socket(), FIONBIO, &optval);

Just in case, I tried with fcntl() too, although I'm sure that tweaks the same flag as far as the kernel is concerned.

int flags(fcntl(get_socket(), F_GETFL));
flags |= O_NONBLOCK;   // use this line to turn ON
flags &= ~O_NONBLOCK;  // use this line to turn OFF
fcntl(get_socket(), F_SETFL, flags);

As a side note, I send and receive my messages using the read() and write() functions.

Update:

For those interested, the test is now in our git:

Server: https://sourceforge.net/p/snapcpp/code/ci/master/tree/snapwebsites/tests/test_shutdown_server.cpp
Client: https://sourceforge.net/p/snapcpp/code/ci/master/tree/snapwebsites/tests/test_shutdown_client.cpp

These use the snap library, mainly the snap_communicator which depends on tcp client/server:

tcp: https://sourceforge.net/p/snapcpp/code/ci/master/tree/snapwebsites/lib/tcp_client_server.cpp
communicator: https://sourceforge.net/p/snapcpp/code/ci/master/tree/snapwebsites/lib/snap_communicator.cpp

Solution

As you are discovering, send on a socket only queues the data to be sent. It doesn't actually mean the server got it. This is true for blocking and non-blocking sockets.
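To illustrate the point, here is a minimal sketch that uses a local socketpair() as a stand-in for the real TCP connection (an assumption made only to keep the example self-contained; the function name is hypothetical). write() reports success as soon as the kernel has buffered the bytes, before the peer has read anything:

```cpp
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical demo: returns true if write() succeeded before the peer
// had read a single byte, and the peer could still read the data afterwards.
bool write_returns_before_peer_reads()
{
    int sv[2];
    // A local socket pair stands in for the client/server connection.
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0) {
        return false;
    }

    const char msg[] = "unregister";
    const ssize_t len = sizeof(msg) - 1;

    // Returns as soon as the bytes are copied into the kernel buffer.
    ssize_t sent = write(sv[0], msg, len);
    bool queued = (sent == len);

    // Only now does the "server" side pick the message up.
    char buf[32];
    ssize_t got = read(sv[1], buf, sizeof(buf));

    close(sv[0]);
    close(sv[1]);
    return queued && got == len;
}
```

A successful return value from write() or send() therefore says nothing about whether the other side has processed the message.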

Several possibilities:

1. Make sure you call close on the socket before your client program exits. You didn't say in your question whether this was happening, but it's probably a good idea.
2. If #1 doesn't work, use the SO_LINGER option on the socket. Set an appropriate timeout interval.

Something like the following

  struct linger ling;
  ling.l_onoff = 1;
  ling.l_linger = 3; // wait up to 3 seconds for queued data to be sent
  setsockopt(s, SOL_SOCKET, SO_LINGER, &ling, sizeof(ling));

3. An alternative to #2 is to modify your protocol so that the client gets some sort of acknowledgement message from the server before closing the socket and exiting. Or, for simplicity, the client waits for the server to close the socket before exiting (recv will return 0 when the server closes the socket).
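The "wait for the server to close" variant could look roughly like the sketch below. It is not taken from the snap library: the helper name close_after_peer and the timeout parameter are made up for illustration, and it assumes the server closes its end once it has processed the unregister message. poll() is used so the helper behaves the same whether or not O_NONBLOCK is set on the socket.

```cpp
#include <cerrno>
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: half-close the socket, then wait until the peer
// closes its side (read() returns 0). Returns true if that close was seen.
// Note: timeout_ms applies to each poll() call, not to the whole wait.
bool close_after_peer(int fd, int timeout_ms)
{
    // Signal that we are done sending; already queued data is still delivered.
    shutdown(fd, SHUT_WR);

    bool peer_closed = false;
    char buf[256];
    struct pollfd pfd = { fd, POLLIN, 0 };

    while (poll(&pfd, 1, timeout_ms) > 0) {
        ssize_t n = read(fd, buf, sizeof(buf));
        if (n == 0) {               // orderly close from the peer
            peer_closed = true;
            break;
        }
        if (n < 0 && errno != EAGAIN && errno != EINTR) {
            break;                  // real error, give up
        }
        // n > 0: late data from the peer, discarded here
    }

    close(fd);
    return peer_closed;
}
```

The client would call this in place of a bare close() right after writing unregister, for example close_after_peer(get_socket(), 3000), and exit once it returns.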

My recommendation is to make sure you have implemented #1. If that doesn't do it for you, evaluate #3. Use #2 if nothing else works.