I tried a simple Map/Reduce task using Amazon Elastic MapReduce,
and it took just 3 minutes to complete. Is it possible to reuse the same instance to run another task?
Even though I used the instance for only 3 minutes, Amazon will charge for a full hour,
so I want to use the remaining 57 minutes to run several other tasks.
The answer is yes.
Here's how you do it using the command line client:
When you create the cluster, pass the --alive flag. This tells EMR to keep the cluster running after your job has finished.
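As a rough sketch, creating such a cluster might look like the following. The emr wrapper function, the RUN variable, and the cluster name are hypothetical and only there so the command can be previewed without touching AWS; --create, --alive and --name are the client's own options.

```shell
# Hypothetical dry-run wrapper (not part of the client): prints the command
# unless RUN=1 is set, in which case it calls the real elastic-mapreduce.
emr() {
  if [ "${RUN:-0}" = "1" ]; then
    elastic-mapreduce "$@"
  else
    echo "elastic-mapreduce $*"
  fi
}

# --create starts a new job flow; --alive keeps it running after its steps finish.
# The name "reusable-cluster" is only an example.
emr --create --alive --name "reusable-cluster"
```

When run for real, the client should report the ID of the new job flow; that ID is what goes in place of <job-id> in the other commands.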
Then you can submit more tasks to the cluster:
elastic-mapreduce --jobflow <job-id> --stream --input <s3dir> --output <s3dir> --mapper <script1> --reducer <script2>
To terminate the cluster later, simply run:
elastic-mapreduce --jobflow <job-id> --terminate
Try running elastic-mapreduce --help to see all the commands you can run.
If you don't have the command line client, get it here.