
Pig cannot find a Hive table through HCatalog


I am using Pig to access a table, batting_data, that was created via HCatalog. Pig fails with an error saying the table was not found, even though batting_data exists in Hive. I understand that when no database name is given, the default database is assumed.

ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1115: Table not found : default.batting_data table not found

  1. I have set up hive-site.xml as below. Please note that I am not using a remote metastore server; the metastore is local and backed by MySQL.

    <configuration>
    <property>
            <name>javax.jdo.option.ConnectionURL</name>
            <value>jdbc:mysql://localhost/metastore?createDatabaseIfNotExist=true</value>
            <description>the URL of the MySQL database</description>
    </property>
    
    <property>
            <name>javax.jdo.option.ConnectionDriverName</name>
            <value>com.mysql.jdbc.Driver</value>
    </property>
    
    <property>
            <name>javax.jdo.option.ConnectionUserName</name>
            <value>root</value>
    </property>
    
    <property>
            <name>javax.jdo.option.ConnectionPassword</name>
            <value>root</value>
    </property>
    
    <property>
            <name>hive.hwi.listen.host</name>
            <value>0.0.0.0</value>
    </property>
    <property>
            <name>hive.hwi.listen.port</name>
            <value>9999</value>
    </property>
    <property>
            <name>hive.hwi.war.file</name>
            <value>lib/hive-hwi-0.12.0.war</value>
    </property>
    
    <property>
            <name>hive.metastore.local</name>
            <value>true</value>
    </property>
    </configuration>
    

  2. I have added the following to my .bashrc to integrate Pig with Hive and HCatalog.

    export PIG_OPTS=-Dhive.metastore.local=true
    export PIG_CLASSPATH=$HCAT_HOME/share/hcatalog/:$HIVE_HOME/lib/

  3. The statements below are loaded by default when the Grunt shell starts.

    REGISTER /home/shiva/hive-0.12.0/hcatalog/share/hcatalog/hcatalog-core-0.12.0.jar;
    REGISTER /home/shiva/hive-0.12.0/lib/hive-exec-0.12.0.jar;
    REGISTER /home/shiva/hive-0.12.0/lib/hive-metastore-0.12.0.jar;


The complete log of the error message is below. Any help in fixing this would be appreciated. Thanks.

grunt> a = LOAD 'batting_data' USING org.apache.hcatalog.pig.HCatLoader();         
2015-01-01 01:06:33,849 [main] INFO  org.apache.hadoop.hive.metastore.HiveMetaStore - 0: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
2015-01-01 01:06:33,865 [main] INFO  org.apache.hadoop.hive.metastore.ObjectStore - ObjectStore, initialize called
2015-01-01 01:06:34,049 [main] INFO  DataNucleus.Persistence - Property datanucleus.cache.level2 unknown - will be ignored
2015-01-01 01:06:34,365 [main] WARN  com.jolbox.bonecp.BoneCPConfig - Max Connections < 1. Setting to 20
2015-01-01 01:06:35,470 [main] INFO  org.apache.hadoop.hive.metastore.ObjectStore - Setting MetaStore object pin classes with hive.metastore.cache.pinobjtypes="Table,StorageDescriptor,SerDeInfo,Partition,Database,Type,FieldSchema,Order"
2015-01-01 01:06:35,501 [main] INFO  org.apache.hadoop.hive.metastore.ObjectStore - Initialized ObjectStore
2015-01-01 01:06:36,265 [main] WARN  com.jolbox.bonecp.BoneCPConfig - Max Connections < 1. Setting to 20
2015-01-01 01:06:36,506 [main] INFO  org.apache.hadoop.hive.metastore.HiveMetaStore - 0: get_database: NonExistentDatabaseUsedForHealthCheck
2015-01-01 01:06:36,506 [main] INFO  org.apache.hadoop.hive.metastore.HiveMetaStore.audit - ugi=shiva   ip=unknown-ip-addr  cmd=get_database: NonExistentDatabaseUsedForHealthCheck 
2015-01-01 01:06:36,512 [main] ERROR org.apache.hadoop.hive.metastore.RetryingHMSHandler - NoSuchObjectException(message:There is no database named nonexistentdatabaseusedforhealthcheck)
    at org.apache.hadoop.hive.metastore.ObjectStore.getMDatabase(ObjectStore.java:431)
    at org.apache.hadoop.hive.metastore.ObjectStore.getDatabase(ObjectStore.java:441)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.hive.metastore.RetryingRawStore.invoke(RetryingRawStore.java:124)
    at com.sun.proxy.$Proxy6.getDatabase(Unknown Source)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_database(HiveMetaStore.java:628)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:103)
    at com.sun.proxy.$Proxy7.get_database(Unknown Source)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getDatabase(HiveMetaStoreClient.java:810)
    at org.apache.hcatalog.common.HiveClientCache$CacheableHiveMetaStoreClient.isOpen(HiveClientCache.java:277)
    at org.apache.hcatalog.common.HiveClientCache.get(HiveClientCache.java:147)
    at org.apache.hcatalog.common.HCatUtil.getHiveClient(HCatUtil.java:547)
    at org.apache.hcatalog.pig.PigHCatUtil.getHiveMetaClient(PigHCatUtil.java:150)
    at org.apache.hcatalog.pig.PigHCatUtil.getTable(PigHCatUtil.java:186)
    at org.apache.hcatalog.pig.HCatLoader.getSchema(HCatLoader.java:194)
    at org.apache.pig.newplan.logical.relational.LOLoad.getSchemaFromMetaData(LOLoad.java:175)
    at org.apache.pig.newplan.logical.relational.LOLoad.<init>(LOLoad.java:89)
    at org.apache.pig.parser.LogicalPlanBuilder.buildLoadOp(LogicalPlanBuilder.java:853)
    at org.apache.pig.parser.LogicalPlanGenerator.load_clause(LogicalPlanGenerator.java:3479)
    at org.apache.pig.parser.LogicalPlanGenerator.op_clause(LogicalPlanGenerator.java:1536)
    at org.apache.pig.parser.LogicalPlanGenerator.general_statement(LogicalPlanGenerator.java:1013)
    at org.apache.pig.parser.LogicalPlanGenerator.statement(LogicalPlanGenerator.java:553)
    at org.apache.pig.parser.LogicalPlanGenerator.query(LogicalPlanGenerator.java:421)
    at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:188)
    at org.apache.pig.PigServer$Graph.validateQuery(PigServer.java:1648)
    at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1621)
    at org.apache.pig.PigServer.registerQuery(PigServer.java:575)
    at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:1093)
    at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:501)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:198)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:173)
    at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:69)
    at org.apache.pig.Main.run(Main.java:541)
    at org.apache.pig.Main.main(Main.java:156)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:160)

2015-01-01 01:06:36,514 [main] INFO  org.apache.hadoop.hive.metastore.HiveMetaStore - 0: get_table : db=default tbl=batting_data
2015-01-01 01:06:36,514 [main] INFO  org.apache.hadoop.hive.metastore.HiveMetaStore.audit - ugi=shiva   ip=unknown-ip-addr  cmd=get_table : db=default tbl=batting_data 
2015-01-01 01:06:36,516 [main] INFO  DataNucleus.Datastore - The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
2015-01-01 01:06:36,516 [main] INFO  DataNucleus.Datastore - The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
2015-01-01 01:06:36,795 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1115: Table not found : default.batting_data table not found
Details at logfile: /home/shiva/pig_1420054544179.log

Solution

  • I fixed it. Here is what I changed:

    1. I had not set PIG_OPTS to the address of the Hive Thrift server, so Pig could not connect to the Hive metastore and therefore reported the table as not found. I changed it to PIG_OPTS=-Dhive.metastore.uris=thrift://localhost:10000

    2. Start the HiveServer service using

      $ bin/hive --service hiveserver

    These two changes fixed the issue, and I can now connect Pig to Hive. Thanks.
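
The two steps above can be sketched as a small shell snippet. This is only an illustration under stated assumptions: the host and port are the ones from this answer, the helper names metastore_opt and port_open are hypothetical, and the reachability check relies on bash's /dev/tcp redirection, so it needs bash rather than plain sh.

```shell
#!/usr/bin/env bash
# Sketch only: host and port are the values used in the answer; adjust for your setup.
METASTORE_HOST=localhost   # assumption: the Thrift service runs on this machine
METASTORE_PORT=10000       # port used in the answer

# Hypothetical helper: builds the -D option that tells Pig where the Thrift service is.
metastore_opt() {
    printf -- '-Dhive.metastore.uris=thrift://%s:%s' "$1" "$2"
}

# Hypothetical helper: succeeds if something accepts TCP connections on host:port.
# Uses bash's /dev/tcp redirection inside a subshell so the descriptor is closed on exit.
port_open() {
    (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null
}

# This export would replace the old PIG_OPTS line in .bashrc.
export PIG_OPTS="$(metastore_opt "$METASTORE_HOST" "$METASTORE_PORT")"

# Start the Thrift service first (command from the answer), from the Hive install directory:
#   bin/hive --service hiveserver

if port_open "$METASTORE_HOST" "$METASTORE_PORT"; then
    echo "Thrift service reachable; PIG_OPTS=$PIG_OPTS"
else
    echo "Nothing listening on $METASTORE_HOST:$METASTORE_PORT; start hiveserver before running Pig"
fi
```

Running the check before opening the Grunt shell helps separate a service that is not running from a table that is genuinely missing.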