I have been trying since yesterday to figure out why my table creation is not working. Since I can't link Impala to HBase, I can't run queries on my Twitter stream :/
Do I need a special JAR for the SerDe properties, like with Hive?
Here is my command:
CREATE EXTERNAL TABLE HB_IMPALA_TWEETS ( id int, id_str string, text string, created_at timestamp, geo_latitude double, geo_longitude double, user_screen_name string, user_location string, user_followers_count string, user_profile_image_url string )
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ( "hbase.columns.mapping" = ":key,tweet:id_str,tweet:text,tweet:created_at,tweet:geo_latitude,tweet:geo_longitude, user:screen_name,user:location,user:followers_count,user:profile_image_url" ) TBLPROPERTIES("hbase.table.name" = "tweets");
But I get an error at the STORED BY clause:
Query: create EXTERNAL TABLE HB_IMPALA_TWEETS ( id int, id_str string, text string, created_at timestamp, geo_latitude double, geo_longitude double, user_screen_name string, user_location string, user_followers_count string, user_profile_image_url string ) STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES ( "hbase.columns.mapping" = ":key,tweet:id_str,tweet:text,tweet:created_at,tweet:geo_latitude,tweet:geo_longitude, user:screen_name,user:location,user:followers_count,user:profile_image_url" ) TBLPROPERTIES("hbase.table.name" = "tweets")
ERROR: AnalysisException: Syntax error in line 1:
...image_url string ) STORED BY 'org.apache.hadoop.hive.h...
Encountered: BY
Expected: AS
CAUSED BY: Exception: Syntax error
For info, I followed this page: https://github.com/AronMacDonald/Twitter_Hbase_Impala/blob/master/README.md
Thanks for helping me :)
Well, it seems that Impala still does not support custom SerDes (serialization/deserialization):
"You create the tables on the Impala side using the Hive shell, because the Impala CREATE TABLE statement currently does not support custom SerDes and some other syntax needed for these tables: You designate it as an HBase table using the STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' clause on the Hive CREATE TABLE statement."
So just run the CREATE TABLE command in the Hive shell (or in the Hive editor in Hue). Then, in Impala, run 'invalidate metadata' so that Impala picks up the new table definition. After that the table shows up in 'show tables'.
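For reference, the two steps might look roughly like this. It is a sketch, not a tested script. The CREATE statement is the one from the question, except that I removed the stray space before user:screen_name in the column mapping, because as far as I know Hive treats whitespace between mapping entries as part of the column name. It also assumes the HBase table 'tweets' already exists with the 'tweet' and 'user' column families. The final SELECT is only a hypothetical sanity check.

```sql
-- Step 1: run this in the Hive shell (or the Hive editor in Hue), NOT in impala-shell.
-- Impala's CREATE TABLE rejects the STORED BY clause, Hive accepts it.
CREATE EXTERNAL TABLE HB_IMPALA_TWEETS (
  id int,
  id_str string,
  text string,
  created_at timestamp,
  geo_latitude double,
  geo_longitude double,
  user_screen_name string,
  user_location string,
  user_followers_count string,
  user_profile_image_url string
)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES (
  -- One entry per table column, in order; :key maps the HBase row key to id.
  -- No spaces between entries (assumption: Hive would read them as part of the name).
  "hbase.columns.mapping" = ":key,tweet:id_str,tweet:text,tweet:created_at,tweet:geo_latitude,tweet:geo_longitude,user:screen_name,user:location,user:followers_count,user:profile_image_url"
)
TBLPROPERTIES ("hbase.table.name" = "tweets");

-- Step 2: run these in impala-shell, after the Hive statement has succeeded.
INVALIDATE METADATA;   -- make Impala reload table definitions from the metastore
SHOW TABLES;           -- hb_impala_tweets should now be listed

-- Hypothetical sanity check: read a few rows through Impala.
SELECT id, id_str, user_screen_name FROM hb_impala_tweets LIMIT 5;
```

If the table does not appear after step 2, it is worth checking that Hive and Impala point at the same metastore and that you are in the same database in both shells.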
So for this part the problem seems solved.