Parallel Req/Rep via Pub/Sub

I have multiple servers. At any point, one and only one of them is the leader, which can respond to a request; all the others just drop the request. The issue is that the client does not know which server is the leader.

I have tried using a pub socket on the client to send the request out in parallel, but I can't work out the right semantics for the response, i.e. how to get the server to respond to that specific client.

A hacky solution which I have tried is to have a sub socket on the client to pub sockets on all the servers, with the leader responding by publishing a message with a filter such that it only goes to the client.

However, I am unable to receive any responses this way: the server believes that it sent the message and the client believes it subscribed to "", but then the client doesn't receive anything...

So I am wondering whether there is a more proper way of doing this. I have thought that a dealer/router pair, with the router sending to a specific client, might work, but I am unsure how to do that.

Essentially I am trying to do a standard Req/Rep, but sending the request in parallel to all the nodes rather than round-robin.

UPDATE: By sending the routing id of the dealer in the pub request, making the remote call idempotent (just returning pre-computed results on repeated attempts), and then sending the result back via a router, with message filtering on the receiving side, it now works.
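For readers who want to see the shape of that fix, here is a minimal, transport-free sketch of the logic described in the update. It is not the original code: the sockets are replaced by plain function calls, and the names Server, Client, broadcast and the per-request request_id are assumptions made for illustration (the update only mentions the dealer's routing id, idempotent calls and filtering on the receiving side). In a real setup the request tuple would travel over PUB/SUB and the reply over ROUTER/DEALER.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Server:
    """Hypothetical server: only the leader answers, and answers are cached."""
    name: str
    is_leader: bool = False
    results: dict = field(default_factory=dict)  # request_id -> cached result
    calls: int = 0  # how many times the real work was actually done

    def on_request(self, routing_id, request_id, payload):
        if not self.is_leader:
            return None  # followers silently drop the broadcast request
        if request_id not in self.results:
            # Stand-in for the real remote call; runs once per request_id.
            self.calls += 1
            self.results[request_id] = payload.upper()
        # The reply is addressed with the routing id carried in the request.
        return (routing_id, request_id, self.results[request_id])


class Client:
    """Hypothetical client: tags requests and filters incoming replies."""

    def __init__(self, routing_id):
        self.routing_id = routing_id
        self.pending = set()

    def make_request(self, payload):
        request_id = uuid.uuid4().hex  # assumption: one id per logical request
        self.pending.add(request_id)
        return (self.routing_id, request_id, payload)

    def on_reply(self, reply):
        routing_id, request_id, result = reply
        # Receiving-side filter: ignore foreign or already-answered replies.
        if routing_id != self.routing_id or request_id not in self.pending:
            return None
        self.pending.discard(request_id)
        return result


def broadcast(servers, request):
    """Stand-in for the PUB fan-out: every server sees the same request."""
    replies = []
    for server in servers:
        reply = server.on_request(*request)
        if reply is not None:
            replies.append(reply)
    return replies


servers = [Server("a"), Server("b", is_leader=True), Server("c")]
client = Client(b"client-1")
request = client.make_request("ping")
replies = broadcast(servers, request)
replies += broadcast(servers, request)  # a retry of the same request
results = [client.on_reply(r) for r in replies]
print(results)  # ['PING', None]
```

The retry produces a second reply from the leader, but the leader does the work only once and the client drops the duplicate, which is why the idempotent call and the receiving-side filter go together.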

Solution

is (there) a more proper way of doing this?

Yes.

Start by applying Maslow's Hammer rule:

"When the only tool you have is a hammer, every problem begins to resemble a nail."

In other words, do not try to use one hammer for solving every problem. The PUB/SUB archetype was designed to serve those, and only those, multi-party Formal-Communications-Pattern use-cases where many SUB-scribers .recv() the messages that some PUB-lisher(s) .send() as a broadcast, and nothing else.

Similarly, the REQ/REP archetype was defined and implemented so as to serve one, and only one, multi-party distributed Formal-Communications-Pattern ( and will obviously not meet any use-case that has even a single additional or slightly different requirement ).

Users often require some special, non-trivial features, that obviously were not a part of the said trivial Formal-Communications-Pattern archetype primitives ( those ready-made blocks, made available in the ZeroMQ toolbox ).

It is the architects' / designers' role to define and analyse any more complex, user-specific distributed-behaviour definition ( a protocol ) and to implement it, most often using a layered combination of the ready-made ZeroMQ primitives.

If in doubt, take a sheet of paper and a pencil, draw a small crowd of kids on a playground and sketch their "shouts", their "listening", their "silence", "waiting" and "doubts", their many or few "replies", their "voting" and their "anger" at not being voted for by friends, their fight for a place in the sun and their "persistence" in not letting others take their turn on the "swing" once they have enjoyed swinging on it themselves.

All this is the part of finding the right mix of ( protocol-orchestrated ) levels of control and levels of freedom to act.

That is how we get a new distributed behaviour, tailor-made for your specific use-case.

The probability of finding a ready-made primitive tool that matches and fulfils any user-specific use-case is limitlessly close to Zero ( sure, unless one's own use-case requirements match all those of the primitive archetype, but then it is not a user-specific use-case, just a re-use of an already implemented archetype for the very same situation that was foreseen by the ZeroMQ fathers, isn't it? )

Again, welcome to the art of Zen-of-Zero.

Maybe you'd like to read this, and this, and this.