python, amazon-web-services, amazon-emr, mrjob

mrjob on EMR runs only 1 of 3 MRSteps, then the cluster shuts down


According to the AWS console, the EMR cluster terminated right after executing step 1 of the mrjob job.

[Image: log of the first step in AWS]

The error looks something like this:

Terminating cluster: j-SDOP2KOKWYZM

botocore.exceptions.ClientError: An error occurred (ValidationException) when calling the AddJobFlowSteps operation: A job flow that is shutting down, terminated, or finished may not be modified.


Solution

  • The error shows that the cluster starts terminating after step 1, so the AddJobFlowSteps call that would add the next step fails. This seems to be because the botocore package is deprecated.

    A solution to this could be:

    1. Start a persistent cluster
    2. Use that cluster ID to run the mrjob job on EMR

    Commands:

    mrjob create-cluster
    

    Make sure you have configured the cluster settings in your mrjob.conf file. The command above creates a persistent cluster and prints its ID.

    python3 MovieSimilarities.py -r emr --cluster-id "your-cluster-id" \
        --items=ml-100k/u.item ml-100k/u.data > sims2t.txt
    

    Here, --cluster-id specifies the persistent cluster that the job should run on, so all steps run on that cluster.
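If you would rather script the two steps than copy the cluster ID by hand, something like the following might work. It is only a sketch: the helper names (extract_cluster_id, build_job_command, run_on_persistent_cluster) are made up for illustration, and it assumes that `mrjob create-cluster` prints the new cluster ID to standard output and that the ID has the same `j-...` shape as the one in the error above.

```python
import re
import subprocess

# EMR cluster IDs look like "j-" followed by uppercase letters and digits,
# e.g. the j-SDOP2KOKWYZM seen in the error output.
CLUSTER_ID_RE = re.compile(r"\bj-[A-Z0-9]+\b")


def extract_cluster_id(output):
    """Return the last EMR cluster ID found in `output`, or None."""
    matches = CLUSTER_ID_RE.findall(output)
    return matches[-1] if matches else None


def build_job_command(script, cluster_id, items_path, data_path):
    """Build the argv list for running an mrjob script on an existing cluster."""
    return [
        "python3", script,
        "-r", "emr",
        "--cluster-id", cluster_id,
        f"--items={items_path}",
        data_path,
    ]


def run_on_persistent_cluster(script, items_path, data_path, output_path):
    """Create a persistent cluster, then run the job on it (hypothetical helper)."""
    # Assumption: `mrjob create-cluster` prints the new cluster ID to stdout.
    created = subprocess.run(
        ["mrjob", "create-cluster"],
        capture_output=True, text=True, check=True,
    )
    cluster_id = extract_cluster_id(created.stdout)
    if cluster_id is None:
        raise RuntimeError("no cluster ID found in the create-cluster output")

    # Equivalent of the shell redirect `> sims2t.txt`.
    with open(output_path, "w") as out:
        subprocess.run(
            build_job_command(script, cluster_id, items_path, data_path),
            stdout=out, check=True,
        )
    return cluster_id


# Example call (needs mrjob installed and AWS credentials configured):
# run_on_persistent_cluster(
#     "MovieSimilarities.py", "ml-100k/u.item", "ml-100k/u.data", "sims2t.txt")
```

Keep in mind that a persistent cluster keeps running, and keeps costing money, until you terminate it yourself, so remember to shut it down when the job is done.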