Search code examples
amazon-web-serviceshadoopmapreduceapache-pig

STREAM keyword in pig script that runs in Amazon Mapreduce


I have a pig script, that activates another python program. I was able to do so in my own hadoop environment, but I always fail when I run my script in Amazon map reduce WS.

The log say:

org.apache.pig.backend.executionengine.ExecException: ERROR 2090: Received Error while processing the reduce plan: '' failed with exit status: 127 at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.runPipeline(PigMapReduce.java:347) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.processOnePackageOutput(PigMapReduce.java:288) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.reduce(PigMapReduce.java:260) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.reduce(PigMapReduce.java:142) at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:321) at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2216)

Any Idea?


Solution

  • Problem solved! All I need is to use the cache('s3://') option when defining the streaming command