I'm working through a Pluralsight video tutorial on Amazon EMR. I'm stuck and cannot proceed because I get this error:
Not a valid JAR: /home/hadoop/contrib/streaming/hadoop-streaming.jar
Please note that the tutorial is old and uses an older EMR version. I am using the latest version; is that a problem?
After entering my credentials in PuTTY, I took these steps:
1) Hadoop
2) mkdir streamingCode
3) wget -o ./streamingCode/wordSplitter.py s3://elasticmapreduce/samples/wordcount/wordSplitter.py
4) hadoop jar contrib/streaming/hadoop-streaming.jar -files streamingCode/wordSplitter.py -mapper wordSplitter.py input s3://elasticmapreduce/samples/wordcount/input -output streamingCode/wordCountOut -reducer aggregate
I cannot execute step 4 because I get the error below:
Not a valid JAR: /home/hadoop/contrib/streaming/hadoop-streaming.jar
The Hadoop streaming JAR is still available in the latest release of EMR Hadoop. Starting with EMR release 4.0.0, it can be found at /usr/lib/hadoop-mapreduce/hadoop-streaming.jar.
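As a rough sketch, step 4 from the question could be rerun against the new location along these lines. The helper function `find_streaming_jar` and the variable `STREAMING_JAR` are names made up for this illustration. The script checks the EMR 4.0.0+ path first and falls back to the old tutorial path. It also assumes the bare `input` in the original command was meant to be the streaming option `-input`. It has not been run on a cluster, so treat it as a starting point.

```shell
#!/bin/sh
# Hypothetical helper: print the first argument that is an existing file.
find_streaming_jar() {
    for candidate in "$@"; do
        if [ -f "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Try the EMR 4.0.0+ location first, then the path used in the old tutorial.
STREAMING_JAR=$(find_streaming_jar \
    /usr/lib/hadoop-mapreduce/hadoop-streaming.jar \
    /home/hadoop/contrib/streaming/hadoop-streaming.jar) || {
    echo "hadoop-streaming.jar not found in the expected locations" >&2
}

# Run the job only if a JAR was found.
# Assumption: "-input" replaces the bare "input" from the question.
if [ -n "$STREAMING_JAR" ]; then
    hadoop jar "$STREAMING_JAR" \
        -files streamingCode/wordSplitter.py \
        -mapper wordSplitter.py \
        -input s3://elasticmapreduce/samples/wordcount/input \
        -output streamingCode/wordCountOut \
        -reducer aggregate
fi
```

If you prefer to type the command by hand, the only change strictly needed for the "Not a valid JAR" error is swapping contrib/streaming/hadoop-streaming.jar for the full path above.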
Another good resource for differences between versions can be found at http://docs.aws.amazon.com/ElasticMapReduce/latest/ReleaseGuide/emr-release-differences.html.