Tags: signalr, load-balancing, asp.net-web-api2, long-running-processes

Best practice for load balanced Web API that uses shared long running processes?


I am building a platform that will support logging data from IP-connected devices. The logger uses a proprietary API to communicate with the connected devices and dump the data to the database. I'm using ASP.NET Web API to provide start/stop functionality for each logger.

In a standalone server environment, I'd simply create a global variable that contains a list of live loggers. But that won't work in a load-balanced environment. For instance, a POST request by User A comes into Web 2, which creates a new logger on Web 2 (recording in the database that Web 2 has an active logger with ID X). Then a DELETE request by User B comes into Web 5. That request would need to talk to Web 2 to actually shut down the logger.
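
For illustration, the kind of per-server state I mean is roughly this (DeviceLogger is a placeholder for my wrapper around the proprietary logger API):

    // Illustrative only: per-process state like this works on a single server
    // but breaks behind a load balancer, because every web server would hold
    // its own private copy of the dictionary.
    using System.Collections.Concurrent;

    public static class ActiveLoggers
    {
        // Keyed by device id; values are the live logger instances on THIS server.
        public static readonly ConcurrentDictionary<int, DeviceLogger> Items =
            new ConcurrentDictionary<int, DeviceLogger>();
    }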

Is there a common practice for keeping track of long-running processes and which IIS instance each process is running on? Is there a best practice for communicating between instances in a load-balanced environment? I'm planning to use SignalR to communicate status to connected users.

If you have links to actual code samples, that would be great!

Edit/Clarification

I have multiple devices on a local LAN that are controlled through a third-party DLL. To control a specific device, an instance of a class defined in the third-party DLL is created using the device's local LAN IP address. With that class instance, the device can be turned on, controlled, and managed. Additionally, the device sends messages via TCP/IP that can be received via a callback method.
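
Roughly, the per-device wrapper looks like this; all of the vendor type, event, and method names below are placeholders, since the DLL is proprietary:

    // Placeholder names for the proprietary vendor API -- the real class,
    // event, and method names differ, but the usage pattern is the same.
    public class DeviceLogger
    {
        private readonly VendorDeviceClient _client;   // hypothetical vendor class

        public DeviceLogger(string lanIpAddress)
        {
            _client = new VendorDeviceClient(lanIpAddress);   // one instance per device IP
            _client.MessageReceived += OnMessage;             // TCP/IP messages arrive via callback
            _client.Connect();
        }

        private void OnMessage(object sender, DeviceMessageEventArgs e)
        {
            // e.Payload holds the raw message pushed by the device; this is
            // where the SignalR broadcast and database write would go.
        }

        public void Stop() => _client.Disconnect();
    }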

I want to expose control of these devices, and the messages from them, through a website that will be load balanced. A user (say Greg) can request through the website to start logging for any device in the device list (say the device at 192.168.1.51). My code will receive that request through a Web API POST (say devices/51). Once received, I will spin up a new instance of the Device Logger API class and register the callback function. Incoming messages from the device via the callback will be pushed via SignalR to connected clients and written to the database for historical purposes. Another user (say Tim) can attach to the SignalR hub to view messages from devices currently being logged. Tim can also stop live logging of the device by submitting a DELETE request to the Web API (say devices/51).
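
Concretely, the frontend endpoints I have in mind look roughly like this (route shapes, the id-to-IP mapping, and the hub name are just placeholders):

    // Sketch only: ASP.NET Web API 2 attribute routing plus a SignalR 2 hub.
    using System.Web.Http;
    using Microsoft.AspNet.SignalR;

    public class DeviceHub : Hub { }

    [RoutePrefix("api/devices")]
    public class DevicesController : ApiController
    {
        [HttpPost, Route("{id:int}")]
        public IHttpActionResult StartLogging(int id)
        {
            var logger = new DeviceLogger("192.168.1." + id);   // map device id to LAN IP however fits
            ActiveLoggers.Items[id] = logger;                    // the per-server state that breaks in a farm
            return Ok();
        }

        [HttpDelete, Route("{id:int}")]
        public IHttpActionResult StopLogging(int id)
        {
            if (ActiveLoggers.Items.TryRemove(id, out var logger))
            {
                logger.Stop();
                return Ok();
            }
            return NotFound();   // the logger may live on a different server in the farm
        }
    }

Inside the vendor callback, each message would then be broadcast with GlobalHost.ConnectionManager.GetHubContext<DeviceHub>().Clients.All.deviceMessage(...) and also written to the database.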

I am looking for an example of server-to-server communication within the web farm. Since the website is in a server farm and the Device Logger API instance for device 51 lives on a specific web server, my Web API DELETE for devices will need to communicate between web servers to shut down the logging.

Maybe this is as easy as setting up a secondary SignalR hub for server-to-server communication: have all web servers register with the hub upon startup, then broadcast commands to the registered web servers. If the receiving web server has the class instance for that device, it executes the command; if not, it ignores it.
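
A rough sketch of that idea, assuming the control hub is hosted at an internal URL and that all hub, method, and URL names are placeholders:

    // Every web server connects to one shared hub as a SignalR .NET client and
    // acts only on commands for devices it currently owns.
    using System.Threading.Tasks;
    using Microsoft.AspNet.SignalR;
    using Microsoft.AspNet.SignalR.Client;

    // Hosted once, somewhere reachable by all web servers:
    public class ControlHub : Hub
    {
        public void StopLogging(int deviceId) => Clients.All.stopLogging(deviceId);
    }

    // Running on every web server in the farm:
    public class ControlHubListener
    {
        public async Task StartAsync()
        {
            var connection = new HubConnection("http://control-hub.internal/");  // assumed internal endpoint
            var hub = connection.CreateHubProxy("ControlHub");

            hub.On<int>("stopLogging", deviceId =>
            {
                // Only the server holding the live logger instance does anything.
                if (ActiveLoggers.Items.TryRemove(deviceId, out var logger))
                    logger.Stop();
            });

            await connection.Start();
        }
    }

The web server that receives the DELETE would then call hub.Invoke("StopLogging", id) to broadcast the command to the farm.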

Thoughts? Is there another way to communicate between web servers?


Solution

  • In my experience, it's best to keep an automatically duplicated, load-balanced service stateless. In your example, the logger on Web 2 has state that Web 5 does not have, so your service as described is not stateless. To remove the logger, a request must go to one particular instance in the pool of automatically duplicated service instances.

    What you could do instead is have an additional backend (non-public facing) service for logging. Each time one of the N duplicates of your frontend service needs to create, interact with, or destroy a logger, it can do so through an API on the single backend logging service.

    Think:

    • POST /loggers >> 201 Created, Location: /loggers/48913
    • PUT /loggers/daily-statistics
    • POST /loggers/48913/messages
    • DELETE /loggers/48913
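
    As a rough illustration of what the frontend's calls to that backend service might look like (the internal host name and this helper class are assumptions, not an existing API):

        // Sketch of the frontend side of the contract, using HttpClient.
        using System;
        using System.Net.Http;
        using System.Threading.Tasks;

        public class LoggerServiceClient
        {
            private static readonly HttpClient Http = new HttpClient
            {
                BaseAddress = new Uri("http://logger-service.internal/")   // non-public backend service
            };

            public Task<HttpResponseMessage> CreateLoggerAsync() =>
                Http.PostAsync("loggers", new StringContent(string.Empty));

            public Task<HttpResponseMessage> DeleteLoggerAsync(int loggerId) =>
                Http.DeleteAsync("loggers/" + loggerId);
        }

    Because the backend service is the single owner of all logger state, any of the N frontend duplicates can serve any request.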

    Response to Edit/Clarification:

    Right now you are thinking of your system as in the following picture:

    A     B     C     D      Web API instances
    |\         /|\    |
    U V   Q   W X Y   Z      Devices
    

    A, B, C, and D are the load-balanced Web API, and U, V, W, X, Y, and Z are the devices you are monitoring. Connections between Web API instances and devices are denoted by lines.

    What happens when a request comes to B to stop listening to device Y? It must be rerouted to C, because C is the instance connected to Y.

    What happens when a request comes to C to start listening to Q? Does it pick up a fourth device while B remains idle?

    My suggestion is to start thinking of your system in this way instead:

    A     B     C     D      Web API instances
    
       1     2     3         Device listeners
      /|\   / \   / \
     U V Q W   X Y   Z       Devices
    

    The Web API instances are 100% stateless. To a client of a Web API, which of A, B, C, or D is processing the request is hidden and irrelevant.

    The device listeners, on the other hand, are explicitly addressed. They can communicate among each other and are aware of which device listener is connected to which device. When a request comes to a Web API instance regarding logging device X, an appropriate request is forwarded randomly to one of the device listeners.

    • If the request is to start logging device X, the device listener that receives the request checks the other device listeners to see if any has a lesser workload at the time. If so, it responds with a redirect to that device listener. If not, it creates the logger for that device.

    • If the request is to interact with a logger for or stop listening to a device, the device listener that receives the request responds with a redirect to the appropriate device listener, or does the work itself if it is connected to the specified device.

    Splitting the Web API instances and device listeners into two distinct groups lets you keep the frontend true to the ideal of stateless web services.
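
    A minimal sketch of that redirect behavior, assuming the device listeners share a registry (a database table, say) that maps each device to the listener that owns it; the registry interface, route shapes, and wiring are illustrative:

        // Each device listener answers requests for devices it owns and
        // redirects everything else to the owning listener. ActiveLoggers is
        // the listener's own in-process map of live device loggers.
        using System;
        using System.Net;
        using System.Net.Http;
        using System.Web.Http;

        public interface IListenerRegistry
        {
            string FindOwnerUrl(int deviceId);   // base URL of the listener connected to this device
        }

        [RoutePrefix("listeners/devices")]
        public class ListenerDevicesController : ApiController
        {
            private readonly IListenerRegistry _registry;
            private readonly string _selfUrl;    // this listener's own base URL (wired up via your DI container)

            public ListenerDevicesController(IListenerRegistry registry, string selfUrl)
            {
                _registry = registry;
                _selfUrl = selfUrl;
            }

            [HttpDelete, Route("{deviceId:int}")]
            public IHttpActionResult StopLogging(int deviceId)
            {
                var ownerUrl = _registry.FindOwnerUrl(deviceId);
                if (!string.Equals(ownerUrl, _selfUrl, StringComparison.OrdinalIgnoreCase))
                {
                    // Another listener holds the live device connection: redirect there.
                    var redirect = Request.CreateResponse(HttpStatusCode.TemporaryRedirect);
                    redirect.Headers.Location = new Uri(ownerUrl + "/listeners/devices/" + deviceId);
                    return ResponseMessage(redirect);
                }

                // This listener owns the device, so it does the work itself.
                if (ActiveLoggers.Items.TryRemove(deviceId, out var logger))
                    logger.Stop();
                return Ok();
            }
        }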