Can anyone help me figure out why I am getting an error when using REGISTER to register the 'elephant bird' jar files to load JSON data?
I am running Pig 0.16 in local mode and get the error: /home/shanky/Downloads/elephant-bird-hadoop-compat-4.1.jar' does not exist. /home/shanky/Downloads/elephant-bird-pig-4.1.jar' does not exist.
Code to load json data:
REGISTER '/home/shanky/elephant-bird-hadoop-compat-4.1.jar';
REGISTER '/home/shanky/Downloads/elephant-bird-pig-4.1.jar';
REGISTER '/home/shanky/Downloads/json-simple-1.1.1.jar';
load_tweets = LOAD '/home/shanky/Downloads/data.json' USING com.twitter.elephantbird.pig.load.JsonLoader('-nestedLoad') AS myMap;
dump load_tweets;
I tried changing the REGISTER statements by removing the quotes and by adding hdfs://, but nothing worked for me.
The quotes shouldn't be included per the Pig documentation (https://pig.apache.org/docs/r0.16.0/basic.html#register-jar), but your syntax did work for me (I'm using 0.12.0-cdh5.12.0 though).
Since you said you tried it without the quotes, some thoughts:
* You mention trying to add hdfs://. Are these dependencies on HDFS by any chance? It doesn't seem like it, since they have Downloads in the path, but if they are, you won't be able to locate them when running Pig in local mode. If they are on your local filesystem, you should be able to access them with the path as you have it, whether you run it locally or not.
* Are the files actually there? Are the permissions right? Etc.
* Assuming you just want to get around the issue for now, have you tried any of the other methods of registering a jar, such as: -Dpig.additional.jars.uris=/home/shanky/elephant-bird-hadoop-compat-4.1.jar,/home/shanky/Downloads/elephant-bird-pig-4.1.jar
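To make the "are the files actually there?" check concrete, here is a minimal shell sketch. The helper name check_jar is just something I made up for illustration, and the three paths are copied as-is from the REGISTER statements in the question, so adjust them to wherever the jars really live.

```shell
# check_jar is a hypothetical helper: it reports whether a path
# exists and is readable by the current user.
check_jar() {
  if [ -r "$1" ]; then
    echo "ok: $1"
  else
    echo "missing or unreadable: $1"
  fi
}

# Paths taken verbatim from the REGISTER statements in the question.
for jar in \
  /home/shanky/elephant-bird-hadoop-compat-4.1.jar \
  /home/shanky/Downloads/elephant-bird-pig-4.1.jar \
  /home/shanky/Downloads/json-simple-1.1.1.jar
do
  check_jar "$jar"
done
```

If any line comes back as "missing or unreadable", compare that path character by character with the one in the error message and check the file with ls -l before changing anything in the script.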