amazon-ec2 mapreduce elastic-map-reduce

Re-use Amazon Elastic MapReduce instance


I tried a simple Map/Reduce task using Amazon Elastic MapReduce, and it took just 3 minutes to complete. Is it possible to re-use the same instance to run another task?

Even though I used the instance for only 3 minutes, Amazon will charge for a full hour, so I want to use the remaining 57 minutes to run several other tasks.


Solution

  • The answer is yes.

    Here's how to do it using the command line client:

    When you create the cluster (job flow), pass the --alive flag. This tells EMR to keep the cluster running after your job has finished.

    Then you can submit more tasks to the cluster:

    elastic-mapreduce --jobflow <job-id> --stream --input <s3dir> --output <s3dir> --mapper <script1> --reducer <script2>
    

    To terminate the cluster later, simply run:

    elastic-mapreduce --jobflow <job-id> --terminate
    

    Try running elastic-mapreduce --help to see all the commands you can run.

    If you don't have the command line client, get it here.
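
The steps above can be strung together in a small wrapper script. This is only a sketch, assuming the elastic-mapreduce client is on your PATH and already configured with credentials. The function names, the DRY_RUN switch, the cluster name, and the example bucket paths are all made up for illustration; only the elastic-mapreduce flags come from the client itself. By default it prints each command instead of running it, so you can check what would be sent before paying for anything.

```shell
#!/bin/sh
# Sketch of the cluster re-use workflow with the elastic-mapreduce CLI.
# DRY_RUN=1 (the default here) only prints each command; set DRY_RUN=0 to run it.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# 1. Start a cluster that stays up after its steps finish (--alive).
#    Argument: a cluster name of your choosing.
create_cluster() {
  run elastic-mapreduce --create --alive --name "$1"
}

# 2. Add a streaming step to the running cluster.
#    Arguments: job flow ID, input S3 dir, output S3 dir, mapper, reducer.
submit_step() {
  run elastic-mapreduce --jobflow "$1" --stream \
    --input "$2" --output "$3" --mapper "$4" --reducer "$5"
}

# 3. Terminate the cluster once you are done with it.
#    Argument: job flow ID.
terminate_cluster() {
  run elastic-mapreduce --jobflow "$1" --terminate
}

# Example calls (placeholder ID and paths, not real resources):
#   create_cluster reuse-test
#   submit_step j-EXAMPLE s3://my-bucket/in s3://my-bucket/out map.py reduce.py
#   terminate_cluster j-EXAMPLE
```

Call submit_step as many times as you like within the hour, giving each step its own output directory, and call terminate_cluster at the end. A cluster started with --alive keeps running until you terminate it yourself.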