Redirect NServiceBus message based on Endpoint availability

I'm new to NServiceBus, but currently using it with SQL Server Transport to send messages between three machines: one belongs to an endpoint called Server, and two belong to an endpoint called Agent. This is working as expected, with messages sent to the Agent endpoint distributed to one of the two machines via the default round-robin.

I now want to add a new endpoint called PriorityAgent with a different queue and two additional machines. While all endpoints use the same message type, I know where each message should be handled prior to sending it, so normally I can just choose the correct destination endpoint and the message will be processed accordingly.

However, I need to build in a special case: if all machines on the PriorityAgent endpoint are currently down, messages that ordinarily should be sent there should be sent to the Agent endpoint instead, so they can be processed without delay. On the other hand, if all machines on the Agent endpoint are currently down, any Agent messages should not be sent to PriorityAgent, they can simply wait for an Agent machine to return.

I've been researching the proper way to implement this, and haven't seen many results. I imagine this isn't an unheard-of scenario, so my assumption is that I'm searching for the wrong things or thinking about this problem in the wrong way. Still, I came up with a couple potential solutions:

Separately track heartbeats of PriorityAgent machines, and add a mutator or behavior to change the destination of outgoing PriorityAgent messages to the Agent endpoint if those heartbeats stop.
Give PriorityAgent messages a short expiration, and somehow handle the expiration to redirect messages to the Agent endpoint. I'm not sure if this is actually possible.

Is one of these solutions on the right track, or am I off-base entirely?

Solution

You have not seen many do this because it's considered an antipattern. Or rather one of two antipatterns.

1) Either you are sending a command, in which case the RECEIVER of the command defines the contract. Why are you sending a command defined by PriorityAgent to Agent? There should be no coupling there. A command belongs to ONE logical endpoint/queue.

2) Or you are publishing an event defined by whoever publishes, with both PriorityAgent and Agent as subscribers. The two subscribers should be 100% autonomous and share nothing. Checking heartbeats/sharing info between these two logical separate entities is a bad thing. Why have them separately in the first place then? If they know about each other "dirty secrets," they should be the same thing.

If your primary concern is that the PriorityAgent messages will not be handled if the machines hosting it are down, and want to use the machines hosting Agent as a backup, simply deploy PriorityAgent there as well. One machine can run more than one endpoint just fine.

That way you can leverage the additional machines, but don't have to get dirty with sending the same command to a different logical endpoint or coupling two different logical endpoints together through some back channel.