Tags: r, amazon-ec2, amazon-emr, amazon-swf

Amazon EC2 On-Demand Workers for Short Tasks


I am looking to build a web application which needs to run resource-intensive MCMC (Markov chain Monte Carlo) calculations on-demand in R to generate some probability graphs for the user.

Constraints:

  1. Obviously I don't want to run the resource-intensive calculations on the same server as the web app front-end, so these tasks need to be handed off to a worker instance.

  2. These calculations take a good amount of CPU to run and I'd like to keep latency as low as possible (hopefully seconds, not minutes), so I would prefer to run the calculations on beefier hardware.

  3. I cannot afford to run a beefy EC2 instance at ~66¢/hr x 24hrs/day, so on-demand or spot request instances are probably necessary.
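
To put a rough number on constraint 3, here is a back-of-the-envelope calculation. The ~66¢/hr rate is the figure quoted above; the 30-day month is an assumption made only for illustration.

```python
# Rough always-on cost for one instance at the hourly rate quoted above.
HOURLY_RATE_USD = 0.66   # ~66 cents/hr, from constraint 3
HOURS_PER_DAY = 24
DAYS_PER_MONTH = 30      # assumption: a 30-day month


def always_on_cost(hourly_rate, days):
    """Cost of keeping one instance running around the clock for `days` days."""
    return hourly_rate * HOURS_PER_DAY * days


daily = always_on_cost(HOURLY_RATE_USD, 1)
monthly = always_on_cost(HOURLY_RATE_USD, DAYS_PER_MONTH)
print(f"per day:   ${daily:.2f}")
print(f"per month: ${monthly:.2f}")
```

That works out to roughly $15.84 a day, or about $475 over a 30-day month, for a single always-on instance.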

Here are the options I've come up with:

  1. Run a cheap worker instance 24 hours a day that takes one task at a time, with tasks managed by Amazon SWF (or SQS).

    Cons:

    • high latency - Cheaper hardware, longer wait times.



  2. Spawn a beefier worker instance per-task (spun up whenever a job is added to the queue) and terminate the instance upon completion.

    Cons:

    • expensive/wasteful - I'd be paying for an hour on the server each time and only using seconds for my calculation

    • startup overhead - Would spinning up a new EC2 instance on-demand introduce non-negligible latency (offsetting the whole purpose of utilizing beefier hardware)?



  3. Like #2 but with low-bid EC2 spot requests.

    Cons:

    • startup overhead - See #2

    • inconsistency? - I've never worked with spot requests before, so I have no idea how volatile or hands-on such a solution would be. Do I have to continually adjust my bids to make sure I can still get tasks done at peak hours? Also, I suppose I'd have to monitor my processes closely to make sure they aren't interrupted mid-calculation.



  4. Some kind of hybrid solution where I actively monitor beefy-hardware worker instances and their loads, and intelligently spin up and terminate instances on the hour to maintain an optimal balance of cost and availability.

    Cons:

    • complicated and costly setup - Unless there's a good managed service out there to handle stuff like this, I'd have to set all of that infrastructure up myself...
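
To make option 4 a bit more concrete, here is a minimal sketch of the "terminate on the hour" decision, assuming hourly billing as described above. The function names, the 5-minute shutdown window, and the idea of passing in an idle flag and queue depth are all hypothetical choices for illustration, not part of any AWS API; actually launching or terminating instances is left out.

```python
BILLING_PERIOD_S = 3600   # hourly billing, as described above
SHUTDOWN_WINDOW_S = 300   # assumption: only decide in the last 5 minutes of a paid hour


def seconds_left_in_billed_hour(launched_at, now):
    """Seconds remaining in the hour that has already been paid for."""
    elapsed = now - launched_at
    return BILLING_PERIOD_S - (elapsed % BILLING_PERIOD_S)


def should_terminate(launched_at, now, is_idle, queue_depth):
    """Terminate only an idle worker, with nothing queued, near the end of its paid hour."""
    if not is_idle or queue_depth > 0:
        return False
    return seconds_left_in_billed_hour(launched_at, now) <= SHUTDOWN_WINDOW_S


# Timestamps are in seconds (e.g. from time.time()); plain numbers are used here.
print(should_terminate(launched_at=0, now=1800, is_idle=True, queue_depth=0))
print(should_terminate(launched_at=0, now=3400, is_idle=True, queue_depth=0))
```

The idea is that an idle worker whose hour is already paid for costs nothing extra to keep around, so it only makes sense to shut it down just before the next hour would be billed.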

I wish there were a service where I could pay for highly available on-demand hardware by the minute rather than by the hour.
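
To illustrate why the billing granularity matters so much here, this is a rough comparison for a single short task. The 30-second task length and the per-minute pricing model are assumptions for illustration only; the ~66¢/hr rate is the figure from the constraints, and instance startup time is ignored.

```python
import math

HOURLY_RATE_USD = 0.66  # the ~66 cents/hr figure from the constraints


def billed_cost(task_seconds, granularity_seconds, hourly_rate=HOURLY_RATE_USD):
    """Cost when usage is rounded up to whole billing units of `granularity_seconds`."""
    units = math.ceil(task_seconds / granularity_seconds)
    return units * granularity_seconds * hourly_rate / 3600


task = 30  # assumption: a 30-second MCMC run
print(f"hourly billing:     ${billed_cost(task, 3600):.4f}")
print(f"per-minute billing: ${billed_cost(task, 60):.4f}")
```

Under these assumptions, a task that spins up its own instance pays for a full hour (about $0.66) under hourly billing, versus roughly a cent under per-minute billing.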

So my questions are the following:

  • How would you recommend solving this problem?

  • Is there a good EC2 instance-management solution that could sit on top of Amazon SWF and help me load balance and terminate idle workers?

  • Would spot-request bids solve my problem or are they more suited to tasks which don't necessarily need to be completed right away?


Solution

  • There's another option that you may not be aware of. I actually just stumbled upon it: http://multyvac.com

    I have no experience using it (so I can't vouch for it), but it looks like the first solution I've seen that actually offers true "utility computing". It began with just Python but now supports any language.