I have a one-client, one-server model. Say the client emits 100 req/s, but the server can only handle 50 req/s.
The client side is very tolerant: it doesn't care if some of its requests are dropped, as long as the server processes the latest one. What pattern should be applied in this situation?
Timeline ('<' denotes the server is processing the latest requests):
client: 1 -- 2 -- 3 -- 4 -- 5 -- 6 -- 7 -- 8 -- 9 -- 10
server: <<<<<<<<<< 1 <<<<<<< 3 <<<<<<<<<<<< 5 <<<<<<< (processing 8)
You probably want something like PUB/SUB with a low high-water mark for client-to-server requests, something like PUSH/PULL for server-to-client replies, and an identifier (such as a GUID or serial number) in each request that the server copies into its response, so that the client can work out which request a reply pertains to.
PUB/SUB is useful because the PUBlisher drops messages if there are no SUBscribers, and also when a subscriber's queue has reached its high-water mark. Setting the high-water mark to a low value (1) means that it won't stash a load of old messages until the server can absorb them. PUSH/PULL for replies from the server to the client blocks rather than drops when its queue fills up, so the replies should get through.
This way the server can take its own sweet time receiving requests on its SUB socket, and send each reply back on its PUSH socket.
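A minimal sketch of that socket layout, assuming ZeroMQ through the pyzmq bindings. The endpoints, the function names (`make_client`, `make_server`, `serve_one`), the JSON framing and the `id`/`payload`/`result` fields are all hypothetical, chosen only for illustration. Which side binds and which connects is also an assumption.

```python
import zmq

REQ_ADDR = "tcp://127.0.0.1:5555"  # hypothetical endpoint for requests
REP_ADDR = "tcp://127.0.0.1:5556"  # hypothetical endpoint for replies


def make_client(ctx, req_addr=REQ_ADDR, rep_addr=REP_ADDR):
    """Client side: PUB for requests (may drop), PULL for replies."""
    pub = ctx.socket(zmq.PUB)
    pub.setsockopt(zmq.SNDHWM, 1)  # set before bind: queue at most one request
    pub.bind(req_addr)
    pull = ctx.socket(zmq.PULL)
    pull.bind(rep_addr)
    return pub, pull


def make_server(ctx, req_addr=REQ_ADDR, rep_addr=REP_ADDR):
    """Server side: SUB for requests, PUSH for replies."""
    sub = ctx.socket(zmq.SUB)
    sub.setsockopt(zmq.RCVHWM, 1)       # set before connect
    sub.setsockopt(zmq.SUBSCRIBE, b"")  # empty prefix: receive everything
    sub.connect(req_addr)
    push = ctx.socket(zmq.PUSH)
    push.connect(rep_addr)
    return sub, push


def serve_one(sub, push, handler):
    """Receive one request and echo its id back alongside the result."""
    request = sub.recv_json()
    reply = {"id": request["id"], "result": handler(request["payload"])}
    push.send_json(reply)
```

The client would call `pub.send_json({"id": n, "payload": ...})` for each request and match `id` on whatever arrives on `pull`. Over TCP, the operating system's socket buffers can still hold a few messages beyond the high-water mark, so treat the limit of 1 as approximate.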
Note that we're not really using the PUB/SUB or PUSH/PULL patterns here - we're just using them for point-to-point connections with differing policies on what to do if things start bottling up. PUB/SUB drops messages, PUSH/PULL does not.
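To make the two policies concrete, here is a plain-Python model that uses a bounded `queue.Queue` in place of a real socket. It is not ZeroMQ itself, only an illustration of "drop when full" versus "block when full". The names `send_lossy`, `send_lossless` and `simulate` are made up for this sketch, as is the assumption that the server frees up once every two client ticks.

```python
import queue


def send_lossy(q, msg):
    """PUB-like policy: if the queue is at its limit, drop the message."""
    try:
        q.put_nowait(msg)
        return True
    except queue.Full:
        return False


def send_lossless(q, msg, timeout=None):
    """PUSH-like policy: wait for room instead of dropping."""
    q.put(msg, timeout=timeout)


def simulate(n_requests=10, server_every=2):
    """Client sends one request per tick; server takes one every few ticks."""
    q = queue.Queue(maxsize=1)  # stands in for a high-water mark of 1
    processed = []
    for seq in range(1, n_requests + 1):
        send_lossy(q, seq)
        if seq % server_every == 0:  # server is free on this tick
            processed.append(q.get_nowait())
    return processed


print(simulate())  # [1, 3, 5, 7, 9]
```

With a limit of 1, the server never sees a backlog: each request it picks up is the first one sent since it last became free, and everything sent while the slot was occupied is dropped.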
That may be what you're looking for!