There are similar questions on Stack Overflow, but none of them answers this one. According to http://grepalex.com/2013/02/25/hadoop-libjars/, export HADOOP_CLASSPATH=/path/jar1:/path/jar2 has to be run for the -libjars option to work. So how can I execute export HADOOP_CLASSPATH=/path/jar1:/path/jar2 on EMR so that -libjars works?
I have implemented the job with ToolRunner. It works perfectly on Hadoop with HDFS.
I tried executing it on EMR using a custom jar, but it throws java.lang.NoClassDefFoundError: org/json/simple/parser/JSONParser.
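A NoClassDefFoundError like this one means the class was not visible to the JVM that tried to use it. A minimal sketch of a check that could be run from the job's driver to see whether the class is loadable there; the class name ClassPathCheck and the method isLoadable are hypothetical names chosen for illustration, not part of Hadoop or EMR:

```java
public class ClassPathCheck {

    /** Returns true if the named class can be found by this class's class loader. */
    static boolean isLoadable(String className) {
        try {
            // 'false' means: locate the class but do not run its static initializers.
            Class.forName(className, false, ClassPathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults to the class named in the exception above.
        String name = args.length > 0 ? args[0] : "org.json.simple.parser.JSONParser";
        System.out.println(name + " loadable: " + isLoadable(name));
    }
}
```

This only reports on the JVM it runs in. The driver and the map/reduce tasks have separate classpaths, so the check would have to be repeated inside a task to cover that side.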
This is what I ran on EMR. I am using MultipleInputs plus a file to parse, which is why there are multiple paths as arguments. The same arguments work when running on Hadoop.
Alert -libjars s3n://akshayhazari/jars/json-simple-1.1.1.jar -D mapred.output.compress=true -D mapred.output.compression.type=BLOCK -D io.seqfile.compression.type=BLOCK -D mapred.output.compression.codec=org.apache.hadoop.io.compress.GzipCodec s3n://akshayhazari/rule/rule1.json s3n://akshayhazari/Alert/input/data.txt.gz s3n://akshayhazari/Alert/input/data1.txt.gz s3n://akshayhazari/Alert/output
Any help is appreciated.
Can you try building a fat jar and running that? Create a single jar that bundles the job classes together with the dependency, then run it on EMR. It should work.
In an Ant build you can do it as below:
<zip destfile="/lib/abc-fatjar.jar">
    <zipgroupfileset dir="lib" includes="jobcustomjar.jar,json-simple-1.1.1.jar" />
</zip>
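To make it clearer what the fat-jar step does, here is a rough Java sketch of the same idea: copy the entries of several jars into one archive. FatJarMerger and merge are hypothetical names, and skipping duplicate entry names (keeping the first copy) is an assumption of this sketch, not necessarily what Ant does. It uses only java.util.zip from the JDK.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

public class FatJarMerger {

    /** Copies the entries of every source jar into dest and returns the number written. */
    static int merge(Path dest, List<Path> sources) throws IOException {
        Set<String> seen = new HashSet<>();
        byte[] buffer = new byte[8192];
        int written = 0;
        try (OutputStream out = Files.newOutputStream(dest);
             ZipOutputStream zos = new ZipOutputStream(out)) {
            for (Path src : sources) {
                try (InputStream in = Files.newInputStream(src);
                     ZipInputStream zis = new ZipInputStream(in)) {
                    ZipEntry entry;
                    while ((entry = zis.getNextEntry()) != null) {
                        // Assumption: the first copy of a name wins (e.g. META-INF/MANIFEST.MF).
                        if (!seen.add(entry.getName())) {
                            continue;
                        }
                        zos.putNextEntry(new ZipEntry(entry.getName()));
                        int n;
                        // read() returns -1 at the end of the current entry, not of the whole jar.
                        while ((n = zis.read(buffer)) != -1) {
                            zos.write(buffer, 0, n);
                        }
                        zos.closeEntry();
                        written++;
                    }
                }
            }
        }
        return written;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.err.println("usage: FatJarMerger <dest.jar> <source.jar>...");
            return;
        }
        List<Path> sources = new ArrayList<>();
        for (int i = 1; i < args.length; i++) {
            sources.add(Paths.get(args[i]));
        }
        System.out.println("entries written: " + merge(Paths.get(args[0]), sources));
    }
}
```

List the job jar first so that its manifest is the one kept. With the dependency classes inside the job jar itself, the -libjars argument should no longer be needed for this dependency.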