Following the guide at https://cwiki.apache.org/confluence/display/Hive/HBaseIntegration#HBaseIntegration-HiveMAPtoHBaseColumnFamily
I'm trying to integrate Hive and HBase. I have this configuration in hive-site.xml:
<property>
<name>hive.aux.jars.path</name>
<value>
file:///$HIVE_HOME/lib/hive-hbase-handler-2.0.0.jar,
file:///$HIVE_HOME/lib/hive-ant-2.0.0.jar,
file:///$HIVE_HOME/lib/protobuf-java-2.5.0.jar,
file:///$HIVE_HOME/lib/hbase-client-1.1.1.jar,
file:///$HIVE_HOME/lib/hbase-common-1.1.1.jar,
file:///$HIVE_HOME/lib/zookeeper-3.4.6.jar,
file:///$HIVE_HOME/lib/guava-14.0.1.jar
</value>
</property>
Then I created a table named 'ts:testTable' in HBase:
hbase> create 'ts:testTable','pokes'
hbase> put 'ts:testTable', '10000', 'pokes:value','val_10000'
hbase> put 'ts:testTable', '10001', 'pokes:value','val_10001'
...
hbase> scan 'ts:testTable'
ROW COLUMN+CELL
10000 column=pokes:value, timestamp=1462782972084, value=val_10000
10001 column=pokes:value, timestamp=1462783514212, value=val_10001
....
And then I created an external table in Hive:
Hive> CREATE EXTERNAL TABLE hbase_test_table(key int, value string )
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key, pokes:value")
TBLPROPERTIES ("hbase.table.name" = "ts:testTable",
"hbase.mapred.output.outputtable" = "ts:testTable");
So far so good. But when I tried to select data from the table, an exception was thrown:
Hive> select * from hbase_test_table;
FAILED: RuntimeException java.lang.ClassNotFoundException: NULL::character varying
Error: Error while compiling statement: FAILED: RuntimeException java.lang.ClassNotFoundException: NULL::character varying (state=42000,code=40000)
Am I missing anything?
I'm using Hive 2.0.0 with HBase 1.2.1.
OK, I figured it out. The "NULL::character varying" string is not part of Hive; it comes from PostgreSQL, which I'm using as the Metastore backend. The problem is that Hive does not recognize this value from PostgreSQL as null. This is the relevant code in Hive 2.0.0:
300: if (inputFormatClass == null) {
301:   try {
302:     String className = tTable.getSd().getInputFormat();
303:     if (className == null) {
304:       if (getStorageHandler() == null) {
305:         return null;
306:       }
307:       inputFormatClass = getStorageHandler().getInputFormatClass();
308:     } else {
309:       inputFormatClass = (Class<? extends InputFormat>)
310:         Class.forName(className, true, Utilities.getSessionSpecifiedClassLoader());
         }
Line 302 does not return null as it is supposed to; it returns the string "NULL::character varying" instead. As a result, line 310 tries to load a class that does not exist, and that is why the query fails.
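The failure can be reproduced outside Hive. Here is a minimal sketch using only the JDK. The class name `ClassNameLoadDemo` and the method `load` are made up for illustration, and wrapping the exception in a RuntimeException is my assumption about the surrounding catch block, based on the error message above:

```java
public class ClassNameLoadDemo {

    // Hypothetical helper that mimics the branches in the excerpt:
    // null is passed through, any other string is loaded as a class name.
    static Class<?> load(String className) {
        if (className == null) {
            return null; // the branch Hive expects to take for this table
        }
        try {
            return Class.forName(className, true,
                    ClassNameLoadDemo.class.getClassLoader());
        } catch (ClassNotFoundException e) {
            // Assumption: the checked exception is wrapped, as the Hive message suggests.
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(load(null));               // prints: null
        System.out.println(load("java.lang.String")); // prints: class java.lang.String
        try {
            load("NULL::character varying");
        } catch (RuntimeException e) {
            // The cause is a ClassNotFoundException for the bogus name.
            System.out.println(e.getCause());
        }
    }
}
```

Because the string is not null, the null check is skipped and the loader is asked for a class literally named "NULL::character varying".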
I believe it is a compatibility bug. The proper fix would be to change the database, which I'd rather not do. So I simply replaced line 303 with
if (className == null || className.toLowerCase().startsWith("null::")) {
Then I did the same thing in the getOutputFormat() method and recompiled the jar. That's it.
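For anyone who wants to try the check in isolation before patching Hive, here is a small standalone sketch of the same condition. The class `MetastoreNullGuard` and the method `isMissingClassName` are hypothetical names of mine, not part of Hive:

```java
import java.util.Locale;

public class MetastoreNullGuard {

    // Hypothetical helper: true when the stored class name is absent, either as
    // a real null or as literal text starting with "NULL::" (case-insensitive).
    static boolean isMissingClassName(String className) {
        return className == null
                || className.toLowerCase(Locale.ROOT).startsWith("null::");
    }

    public static void main(String[] args) {
        System.out.println(isMissingClassName(null));                      // true
        System.out.println(isMissingClassName("NULL::character varying")); // true
        System.out.println(
            isMissingClassName("org.apache.hadoop.mapred.TextInputFormat")); // false
    }
}
```

The only difference from my one-line patch is `Locale.ROOT`, which keeps the lower-casing independent of the JVM's default locale. Putting the condition in one helper would also avoid repeating it in both the input-format and output-format methods.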