Search code examples
pythonsocketsmessage-queuezeromqpyzmq

What happens when Python ZMQ PULL socket is receiving messages at a different speed than the PUSH socket?


Lets say that the PUSH socket is sending messages every 1 second, and the PULL socket is receiving messages every 10 seconds.

So, in 100 seconds, the PUSH socket has sent 100 messages, while the PULL socket has only received 10.

Now, what happens if the PUSH socket dies, and the PULL socket keeps running? Will it still receive messages?

Also, is there a limit to the messages that the PUSH socket with hold with nobody receiving it?


Solution

  • You'll find it all in the documentation, but I think the answers are as follows:

    To understand what will happen when the PUSH socket dies, it helps to understand a bit about how ZMQ works. When you write a message to the PUSH socket, you're actually asking a ZMQ management thread to buffer and transfer that message. That talks (using the ZMTP protocol) to the ZMQ management thread in the client which buffers it. The client's management thread will then inform the client application that a message has been received. The client ZMQ management thread keeps the message in its buffer until the application has read the message.

    The size of the buffers are not infinite. For a constant stream of messages, at some point the client management thread's message buffer can fill up, meaning that it refuses to take messages from the sender, meaning that the sender's management thread's buffers start to fill up. Eventually, it will refuse to take messages from the sending application, and a zmq_send() blocks.

    You can alter the size of these buffers by setting high water marks on sockets, but by default they grow on demand (I think). That means that messages can be accumulated until memory is exhausted, but regardless if all the buffers are full the PUSH socket write blocks.

    So, as long as the management threads are cooperating to keep the client end buffer containing at least something, messages are flowing at the peak rate of the combined applications (1 every 10 seconds in your example). The issue is what do these threads decide to do if the client application isn't reading at the rate the sending application is writing?

    I believe that the policy changed between ZMQ version 3 and 4. I think that in v3, they were biased to accumulate messages in the sending end. But in v4 I think they switched, and messages would be accumulated in the client end buffers.

    This means that, so long as the management thread buffers haven't filled up, for version 4 if the PUSH end dies then all of the messages sent but not yet read have been transferred across the network and are waiting in the PULL end's management thread buffers, and can be read. Whereas in version 3, there'd be more messages kept in the PUSH end management thread buffers, and they've not been sent when the PUSH end dies.

    I may have got that version 3, 4 thing the wrong way round.