.net web-services long-running-processes

Recommendations for designing a long-running, resource-intensive web service

I have a .NET function that does some complex calculation. Depending on the parameters that are passed in, the function:

- Takes anywhere from several minutes to several hours to run
- Uses 100% of a single core during the computation
- Requires anywhere from hundreds of MB to several GB of memory
- Writes anywhere from several MB to several GB of data to disk
- May throw an exception, including an OutOfMemoryException

The amount of data to be written to disk can be accurately predicted from the function parameterisation. There is no easy way to predict the other resource requirements from the function parameterisation.

I need to expose this function via a web service. This service needs to be:

- Resilient, gracefully reporting any problems that occur during the calculation
- Capable of handling concurrent requests as long as there are sufficient resources to do so without significant performance degradation, and of gracefully denying requests otherwise
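One way to read the second requirement is as an admission check made before a calculation is accepted. The sketch below is only an illustration, written in Python for brevity rather than .NET. The class name `AdmissionController`, its parameters and the rejection messages are all hypothetical. It assumes one slot per calculation (each calculation saturates one core) and uses only the disk prediction, since memory use cannot be predicted up front.

```python
import shutil
import threading


class AdmissionController:
    """Admit a job only if a calculation slot and enough disk space are free.

    Hypothetical helper: names and policy are assumptions, not part of the question.
    """

    def __init__(self, max_concurrent, output_dir, headroom_bytes=0, free_bytes=None):
        self._max_concurrent = max_concurrent
        self._headroom = headroom_bytes
        # free_bytes is injectable so the policy can be exercised without a real disk.
        self._free_bytes = free_bytes or (lambda: shutil.disk_usage(output_dir).free)
        self._lock = threading.Lock()
        self._running = 0
        # Bytes promised to admitted jobs. Running jobs also consume real disk
        # space as they write, so this estimate errs on the conservative side.
        self._reserved = 0

    def try_admit(self, predicted_output_bytes):
        """Return (accepted, reason) and reserve resources when accepted."""
        with self._lock:
            if self._running >= self._max_concurrent:
                return False, "all calculation slots are busy"
            available = self._free_bytes() - self._reserved - self._headroom
            if predicted_output_bytes > available:
                return False, "not enough free disk space for the predicted output"
            self._running += 1
            self._reserved += predicted_output_bytes
            return True, "accepted"

    def release(self, predicted_output_bytes):
        """Give back the slot and the disk reservation when a job finishes or fails."""
        with self._lock:
            self._running -= 1
            self._reserved -= predicted_output_bytes
```

A request handler would call `try_admit` with the predicted output size, return the reason to the client on rejection, and call `release` when the job ends, whether it succeeded or failed.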

I'm intending to handle the long-running nature by having the initial request return a status resource that can be polled for progress. Once the calculation is complete this resource will provide the location of the output data, which the client can download (probably via FTP).

I'm less clear on how best to handle the other requirements. I'm considering some sort of "calculation pool" that maintains instances of the calculator and keeps track of which ones are currently being used, but I haven't figured out the details.

Does anyone with experience of similar situations have any suggestions? As long as the solution can run on a Windows box, all technology options can be considered.

Solution

I'd suggest splitting your application in two parts.

The web service itself. Its functionality:
- Get a work item from a client;
- Transfer this work to a backend service that performs the actual work;
- Report progress and the result.
The backend service. Its functionality:
- Process the requests from the web service;
- Perform the actual computation.

The reasons for this design are:
1) It is relatively difficult to control the workload inside a hosted application (ASP.NET) because the server (IIS) manages the resources, whereas a separate application gives you more direct control;
2) A two-tier design is more scalable: for instance, you could later move the backend to another physical machine (or several machines).

The web service should be stateless: after a request is accepted, the client gets back an ID and uses that ID to poll the service for the result.
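As a rough illustration of the ID-and-poll idea, here is a minimal sketch in Python (used for brevity, not .NET). `JobStatus`, `JobStore`, the state names and the field names are all hypothetical. The in-memory dictionary stands in for whatever shared store (a database, for example) the web tier and the backend would both use, so that the web service itself holds no state.

```python
import threading
import uuid
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class JobStatus:
    """What a client sees when polling (hypothetical field names)."""
    job_id: str
    state: str = "queued"  # queued | running | succeeded | failed
    progress: float = 0.0  # fraction complete, 0.0 to 1.0
    output_location: Optional[str] = None  # set once the job has succeeded
    error: Optional[str] = None  # set if the job has failed


class JobStore:
    """In-memory stand-in for a store shared by the web tier and the backend."""

    def __init__(self):
        self._jobs = {}
        self._lock = threading.Lock()

    def create(self):
        """Register a new job and return the ID handed back to the client."""
        job = JobStatus(job_id=uuid.uuid4().hex)
        with self._lock:
            self._jobs[job.job_id] = job
        return job.job_id

    def update(self, job_id, **changes):
        """Called by the backend as the job progresses."""
        with self._lock:
            job = self._jobs[job_id]
            for name, value in changes.items():
                setattr(job, name, value)

    def get(self, job_id):
        """Return a snapshot for the polling client, or None for an unknown ID."""
        with self._lock:
            job = self._jobs.get(job_id)
            return asdict(job) if job else None
```

The submit endpoint would return `create()` to the client; the poll endpoint would return `get(job_id)`, answering with a "not found" response when it yields `None`.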

The backend service will probably need to maintain a queue of requests and a set of worker threads that process them. The workers should monitor the available resources and take care not to overload the machine (and, of course, gracefully handle all possible error conditions).
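The queue-plus-workers shape could look something like the following sketch, again in Python for brevity. `Backend`, `submit`, `shutdown` and the result format are hypothetical names. The bounded queue is one simple way to refuse work instead of overloading the machine; real resource monitoring is left out.

```python
import queue
import threading


class Backend:
    """Bounded request queue drained by a fixed set of worker threads."""

    def __init__(self, calculate, worker_count, max_queued):
        self._calculate = calculate  # the long-running function (assumed callable)
        self._queue = queue.Queue(maxsize=max_queued)
        self._lock = threading.Lock()
        self.results = {}  # job_id -> (state, detail)
        self._threads = [
            threading.Thread(target=self._work, daemon=True)
            for _ in range(worker_count)
        ]
        for thread in self._threads:
            thread.start()

    def submit(self, job_id, params):
        """Queue a job; return False (a graceful denial) if the queue is full."""
        try:
            self._queue.put_nowait((job_id, params))
            return True
        except queue.Full:
            return False

    def _work(self):
        while True:
            item = self._queue.get()
            if item is None:  # shutdown sentinel
                self._queue.task_done()
                return
            job_id, params = item
            try:
                outcome = ("succeeded", self._calculate(params))
            except MemoryError:
                outcome = ("failed", "out of memory")
            except Exception as exc:  # report the failure, keep the worker alive
                outcome = ("failed", str(exc))
            with self._lock:
                self.results[job_id] = outcome
            self._queue.task_done()

    def shutdown(self):
        """Wait for queued work to finish, then stop every worker."""
        self._queue.join()
        for _ in self._threads:
            self._queue.put(None)
        for thread in self._threads:
            thread.join()
```

Python threads do not run CPU-bound work in parallel, so this shows only the control flow; .NET threads would each occupy a core. Because an out-of-memory failure inside a shared process can destabilise the other calculations, running each calculation in its own worker process may be the safer variant of this design.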