I have a Pig script that accepts some arguments, and I need to run it using AWS PowerShell cmdlets only. I am able to create a cluster with Pig installed using the command below:
$app = New-Object Amazon.ElasticMapReduce.Model.Application
$app.Name="Pig"
$jobid = Start-EMRJobFlow -Name "Pig Job" -Application $app -Instances_MasterInstanceType "m3.xlarge" -Instances_KeepJobFlowAliveWhenNoSteps $true -Instances_InstanceCount 1 -LogUri "s3://mybucket/logs" -VisibleToAllUsers $true -ReleaseLabel "emr-5.7.0" -SecurityConfiguration "my-sec-grp" -JobFlowRole "EMR_EC2_DefaultRole" -ServiceRole "EMR_DefaultRole"
But I am not able to add a step for the Pig job. The articles I found are either very old or use a custom JAR to submit the job. I just need to submit a Pig script that accepts some parameters. Any help will be highly appreciated.
Note: I need PowerShell-specific commands. I am already able to do this using the AWS CLI.
I found a way to submit Pig scripts from PowerShell. I was following this link, but it covers Hive scripts, where the step arguments are built as:
$runhivescriptargs = @("s3://us-east-1.elasticmapreduce/libs/hive/hive-script", `
"--base-path", "s3://us-east-1.elasticmapreduce/libs/hive", `
"--hive-versions","latest", `
"--run-hive-script", `
"--args", `
"-f", "s3://elasticmapreduce/samples/hive-ads/libs/join-clicks-to-impressions.q", `
"-d", "SAMPLE=s3://elasticmapreduce/samples/hive-ads",`
"-d", "DAY=2009-04-13", `
"-d", "HOUR=08", `
"-d", "NEXT_DAY=2009-04-13", `
"-d", "NEXT_HOUR=09",`
"-d", "INPUT=s3://elasticmapreduce/samples/hive-ads/tables", `
"-d", "OUTPUT=s3://my-output-bucket/joinclick1", `
"-d", "LIB=s3://elasticmapreduce/samples/hive-ads/libs")
I followed the same steps, but for Pig scripts the parameters must be passed with the -p option rather than -d. So my step arguments look like this:
$runpigscriptargs = @("s3://us-east-1.elasticmapreduce/libs/pig/pig-script", `
"--base-path", "s3://us-east-1.elasticmapreduce/libs/pig", `
"--run-pig-script", `
"--args", `
"-f", $scriptfile, `
"-p", "Id=$Id",`
"-p", "jarPath=$jarPath",`
"-p", "inputPath=$newInputPath", `
"-p", "outputPath=$outputPath")
I am not specifying the Pig version because the EMR cluster I created already has the latest version of Pig installed.
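For completeness, here is a minimal sketch of how the $runpigscriptargs array could be wrapped in a step and submitted to the running cluster. It assumes the AWS Tools for PowerShell EMR cmdlets are loaded and that $jobid holds the ID returned by Start-EMRJobFlow. The script-runner.jar path (shown for us-east-1), the step name, and the failure action are illustrative choices, not something confirmed above, so adjust them for your region and needs.

```powershell
# Assumes $runpigscriptargs and $jobid are already defined as shown earlier.
# Assumption: the us-east-1 script-runner.jar location; change it for other regions.
$jarStep = New-Object Amazon.ElasticMapReduce.Model.HadoopJarStepConfig
$jarStep.Jar = "s3://us-east-1.elasticmapreduce/libs/script-runner/script-runner.jar"
# Copy the argument array into the list type that the Args property expects.
$jarStep.Args = [System.Collections.Generic.List[string]]$runpigscriptargs

# Wrap the JAR step in a step configuration.
$stepConfig = New-Object Amazon.ElasticMapReduce.Model.StepConfig
$stepConfig.Name = "Run Pig Script"              # illustrative step name
$stepConfig.ActionOnFailure = "CANCEL_AND_WAIT"  # illustrative choice
$stepConfig.HadoopJarStep = $jarStep

# Submit the step to the existing cluster.
Add-EMRJobFlowStep -JobFlowId $jobid -Step $stepConfig
```

Because the cluster was started with -Instances_KeepJobFlowAliveWhenNoSteps $true, it should stay up after the step finishes, so further steps can be added the same way.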