I am trying to create an EMR cluster from Java, but I can neither find it in the EMR cluster list nor see the requested instances on EC2.
EMR roles do exist:
sqlInjection@VirtualBox:~$ aws iam list-roles | grep EMR
"RoleName": "EMR_DefaultRole",
"Arn": "arn:aws:iam::removed:role/EMR_DefaultRole"
"RoleName": "EMR_EC2_DefaultRole",
"Arn": "arn:aws:iam::removed:role/EMR_EC2_DefaultRole"
And here is my Java code:
AWSCredentials awsCredentials = new BasicAWSCredentials(awsKey, awsKeySecret);
AmazonElasticMapReduce emr = new AmazonElasticMapReduceClient(awsCredentials);
StepFactory stepFactory = new StepFactory();
StepConfig enabledebugging = new StepConfig()
    .withName("Enable debugging")
HadoopJarStepConfig hadoopConfig1 = new HadoopJarStepConfig()
    .withMainClass("com.strackoverflow.DriverFoo") // optional main class; can be omitted if the jar above has a manifest
    .withArgs("--input=s3://foo.bucket/logs/,s3://foo.bucket/morelogs/", "--output=s3://foo.bucket/myEMROutput", "--inputType=text"); // my custom Java code handles the --input, --output and --inputType parameters
StepConfig customStep = new StepConfig("Step1", hadoopConfig1);
Collection<StepConfig> steps = new ArrayList<StepConfig>();
JobFlowInstancesConfig instancesConfig = new JobFlowInstancesConfig()
    .withEc2KeyName("fookey") // key pair name, not the fookey.pem file
    .withKeepJobFlowAliveWhenNoSteps(false) // the AWS example sets this to true
RunJobFlowRequest request = new RunJobFlowRequest()
    .withName("java programatic request")
    .withSteps(steps) // the Amazon example launches debugging and Hive; here it is debugging and a jar
RunJobFlowResult result = emr.runJobFlow(request);
System.out.println("toString "+ result.toString());
System.out.println("getJobFlowId "+ result.getJobFlowId());
System.out.println("hashCode "+ result.hashCode());
Where is my cluster? I cannot see it in the cluster list, the output folder is not created, the logs folder stays empty, and no instances are visible on EC2.
Yet the program outputs this:
toString {JobFlowId: j-2xxxxxxU}
getJobFlowId j-2xxxxxU
hashCode -1xxxxx4
I followed the instructions here to create the cluster: http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/calling-emr-with-java-sdk.html
And these to create the Java job: http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/emr-common-programming-sample.html
The Amazon example does not configure a region.
After I configured the region on the client, the cluster launched properly.
AmazonElasticMapReduce emr = new AmazonElasticMapReduceClient(awsCredentials);
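For anyone hitting the same thing: since a job flow ID came back, the request itself most likely succeeded, and the cluster was probably created in the client's default region (us-east-1 for this generation of the SDK, as far as I know) while I was looking at a different region in the console. Below is a minimal sketch of the region step, not the exact code I ran. The region `eu-west-1`, the class name `EmrRegionSketch` and the helper `emrEndpointFor` are placeholders of mine, and the endpoint pattern assumes the standard `amazonaws.com` partition. The two SDK calls are shown only as comments so the sketch compiles without the SDK on the classpath.

```java
import java.util.Locale;

public class EmrRegionSketch {

    // Builds the regional EMR endpoint URL for a region name such as "eu-west-1".
    // Assumption: standard AWS partition, where EMR hosts look like
    // elasticmapreduce.<region>.amazonaws.com (other partitions use other suffixes).
    public static String emrEndpointFor(String regionName) {
        if (regionName == null || regionName.trim().isEmpty()) {
            throw new IllegalArgumentException("regionName must not be empty");
        }
        String region = regionName.trim().toLowerCase(Locale.ROOT);
        return "https://elasticmapreduce." + region + ".amazonaws.com";
    }

    public static void main(String[] args) {
        // Placeholder region; use the one you have selected in the AWS console.
        String region = args.length > 0 ? args[0] : "eu-west-1";
        System.out.println(emrEndpointFor(region));

        // With the AWS SDK for Java (v1) available, either of these would go
        // right after the client is created and before runJobFlow is called:
        //   emr.setEndpoint(emrEndpointFor(region));
        //   emr.setRegion(Region.getRegion(Regions.EU_WEST_1));
    }
}
```

Either call should have the same effect. `setRegion` is probably the safer choice because the SDK resolves the endpoint itself.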