Tags: hadoop, hive, apache-pig, hcatalog

Pig script "does not exist" error, even though I can see it in HDFS


I am trying to run a Pig script with the -useHCatalog and -f options, but it fails. Pig says the script does not exist, even though I can see that the file is present in HDFS. See below.

[hdfs@ip-xx-xx-xx-x-xx ec2-user]$ pig -useHCatalog -f   /user/admin/pig/scripts/hcat1.pig  
    WARNING: Use "yarn jar" to launch YARN applications.  
    16/04/01 13:44:13 INFO pig.ExecTypeProvider: Trying ExecType : LOCAL  
    16/04/01 13:44:13 INFO pig.ExecTypeProvider: Trying ExecType : MAPREDUCE  
    16/04/01 13:44:13 INFO pig.ExecTypeProvider: Picked MAPREDUCE as the ExecType  
    2016-04-01 13:44:13,645 [main] INFO  org.apache.pig.Main - Apache Pig version 0.15.0.2.3.4.0-3485 (rexported) compiled Dec 16 2015, 04:30:33  
    2016-04-01 13:44:13,645 [main] INFO  org.apache.pig.Main - Logging error messages to: /tmp/hsperfdata_hdfs/pig_1459532653643.log  
    2016-04-01 13:44:14,184 [main] ERROR org.apache.pig.Main - ERROR 2997: Encountered IOException. File /user/admin/pig/scripts/hcat1.pig does not exist  
    Details at logfile: /tmp/hsperfdata_hdfs/pig_1459532653643.log  
    2016-04-01 13:44:14,203 [main] INFO  org.apache.pig.Main - Pig script completed in 753 milliseconds (753 ms)

[hdfs@ip-xxx-xx-xx-xx ec2-user]$ hadoop fs -cat /user/admin/pig/scripts/hcat1.pig  
    a = load 'trucks' using org.apache.hive.hcatalog.pig.HCatLoader();  
    b = filter a by truckid == 'A1';  
    store b INTO '/user/admin/pig/scritps/outputb1';  

Solution

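  • A plain path passed to pig is looked up on the local filesystem, not in HDFS, which is why the script is reported as missing. To run a script that is stored in HDFS, you need to specify its complete HDFS URI.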

    Here is what you need:

    $ pig -useHCatalog hdfs://namenode_hostname:port/user/admin/pig/scripts/hcat1.pig
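
If you do not know the namenode host and port offhand, one option is to read the default filesystem URI from the client configuration and prepend it to the script path. The following is only a sketch: it assumes that fs.defaultFS on this machine points at the cluster holding the script, and build_script_uri is a hypothetical helper name, not part of Pig or Hadoop.

```shell
#!/bin/sh
# Hypothetical helper: join a filesystem URI and an absolute HDFS path.
# One trailing slash is stripped from the filesystem URI so the result
# does not contain a doubled slash before the path.
build_script_uri() {
    fs_uri=${1%/}
    printf '%s%s\n' "$fs_uri" "$2"
}

# Only attempt the real run when the Hadoop and Pig clients are on PATH.
if command -v hdfs >/dev/null 2>&1 && command -v pig >/dev/null 2>&1; then
    # Assumption: fs.defaultFS is the filesystem that holds the script.
    default_fs=$(hdfs getconf -confKey fs.defaultFS)
    pig -useHCatalog -f "$(build_script_uri "$default_fs" /user/admin/pig/scripts/hcat1.pig)"
fi
```

Another route that avoids the URI altogether is to copy the script to the local filesystem first (for example with hadoop fs -get) and pass the local path to -f.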