hadoop, hive, apache-pig, hcatalog

Cannot write to Hive table from Pig


I have a weird situation. When I run a Pig script as the test1 user, the script executes successfully:

 pig -param_file /tmp/pig_parameters.param -param DBNAME=default -param TABLENAME=test_pig_table_orc -param FPATH=/data/170622164344.csv /tmp/test.pig

2017-10-31 14:40:40,968 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!
2017-10-31 14:40:41,057 [Thread-7] INFO  hive.metastore - Closed a connection to metastore, current connections: 1
2017-10-31 14:40:41,058 [Thread-7] INFO  hive.metastore - Closed a connection to metastore, current connections: 0

The script simply loads data from a CSV file and stores it in a Hive table.

But when I connect to the server as another user, test2, and run the same script, I get this exception:

Pig Stack Trace
---------------
ERROR 1115: org.apache.hive.hcatalog.common.HCatException : 2001 : Error setting output information. Cause : org.apache.thrift.transport.TTransportException

org.apache.pig.impl.plan.VisitorException: ERROR 1115: 
<line 27, column 0> Output Location Validation Failed for: 'default.test_pig_table_orc More info to follow:
org.apache.hive.hcatalog.common.HCatException : 2001 : Error setting output information. Cause : org.apache.thrift.transport.TTransportException
    at org.apache.pig.newplan.logical.rules.InputOutputFileValidator$InputOutputFileVisitor.visit(InputOutputFileValidator.java:75)
    at org.apache.pig.newplan.logical.relational.LOStore.accept(LOStore.java:66)
    at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:64)
    at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:66)
    at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:66)
    at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:66)
    at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:66)
    at org.apache.pig.newplan.DepthFirstWalker.walk(DepthFirstWalker.java:53)
    at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:52)
    at org.apache.pig.newplan.logical.rules.InputOutputFileValidator.validate(InputOutputFileValidator.java:45)
    at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.compile(HExecutionEngine.java:311)
    at org.apache.pig.PigServer.compilePp(PigServer.java:1392)
    at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1317)
    at org.apache.pig.PigServer.execute(PigServer.java:1309)
    at org.apache.pig.PigServer.executeBatch(PigServer.java:387)
    at org.apache.pig.PigServer.executeBatch(PigServer.java:365)
    at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:140)
    at org.apache.pig.tools.grunt.GruntParser.processScript(GruntParser.java:504)
    at org.apache.pig.tools.pigscript.parser.PigScriptParser.Script(PigScriptParser.java:1014)
    at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:550)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:198)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:173)
    at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:69)
    at org.apache.pig.Main.run(Main.java:547)
    at org.apache.pig.Main.main(Main.java:158)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: org.apache.pig.PigException: ERROR 1115: org.apache.hive.hcatalog.common.HCatException : 2001 : Error setting output information. Cause : org.apache.thrift.transport.TTransportException
    at org.apache.hive.hcatalog.pig.HCatStorer.setStoreLocation(HCatStorer.java:196)
    at org.apache.pig.newplan.logical.rules.InputOutputFileValidator$InputOutputFileVisitor.visit(InputOutputFileValidator.java:68)
    ... 30 more
Caused by: org.apache.hive.hcatalog.common.HCatException : 2001 : Error setting output information. Cause : org.apache.thrift.transport.TTransportException
    at org.apache.hive.hcatalog.mapreduce.HCatOutputFormat.setOutput(HCatOutputFormat.java:220)
    at org.apache.hive.hcatalog.mapreduce.HCatOutputFormat.setOutput(HCatOutputFormat.java:70)
    at org.apache.hive.hcatalog.pig.HCatStorer.setStoreLocation(HCatStorer.java:191)
    ... 31 more
Caused by: org.apache.thrift.transport.TTransportException
    at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
    at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
    at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:429)
    at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:318)
    at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:219)
    at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:77)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_table(ThriftHiveMetastore.java:1254)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_table(ThriftHiveMetastore.java:1240)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getTable(HiveMetaStoreClient.java:1263)
    at org.apache.hive.hcatalog.common.HCatUtil.getTable(HCatUtil.java:180)
    at org.apache.hive.hcatalog.mapreduce.HCatOutputFormat.setOutput(HCatOutputFormat.java:91)
    ... 33 more

Both users are members of supergroup and have equal permissions. The script runs from the same server. I tried placing the .pig script file both locally and on HDFS, with the same error.

Another important point: the script runs successfully from every worker node, but not from the master node. The cluster uses Kerberos authentication.

I'm stuck on this issue. Please suggest what I could try to fix it.


Solution

  • Solved by removing hive-site.xml from the test2 user's home folder. Alternatively, simply run the script from another directory.

    In my case there was an old hive-site.xml without Kerberos configuration parameters in the test2 user's home folder. When this user ran the Pig script, it applied by default the configuration parameters from files in the home folder (not only Hive's), if any are located there.
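
    To check whether the same thing is happening for another user, a quick look for a stray hive-site.xml can help. The following is only a rough sketch: check_hive_site is a hypothetical helper name, and treating a file that never mentions hive.metastore.sasl.enabled as "suspect" is an assumed heuristic (a file can mention the property and still be wrong for the cluster).

```shell
#!/bin/sh
# Hypothetical helper: report whether a directory holds a hive-site.xml and
# whether that file mentions the metastore SASL (Kerberos) switch at all.
check_hive_site() {
    dir="$1"
    file="$dir/hive-site.xml"
    if [ ! -f "$file" ]; then
        echo "none: $dir"
    elif grep -qF 'hive.metastore.sasl.enabled' "$file"; then
        echo "kerberos-aware: $file"
    else
        # No SASL setting found: possibly an old, non-Kerberos copy.
        echo "suspect: $file"
    fi
}

# Check the places named in the answer: the directory the script is
# launched from and the user's home folder.
check_hive_site "$PWD"
check_hive_site "$HOME"
```

    A "suspect" result in the directory the script is launched from would match the situation described above. Moving the file away, or running pig from a different directory, are the two fixes mentioned.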