hadoop load hbase flat-file biginsights

How to load a flat file (not a delimited file) into HBase?


I am new to HBase and I have a flat file (not a delimited file) that I would like to load into a single HBase table.

Here is a preview of a row in my file:

0107E07201512310015071C11100747012015123100

I know, for example, that positions 1 to 7 hold an id and positions 7 to 15 hold a date, and so on.

The problem is how to build a schema that corresponds to my file. Alternatively, is there a way to convert it to a delimited file, or to read such a file using Jaql? I'm working with InfoSphere BigInsights.

Any help would be greatly appreciated.

Thanks in advance.


Solution

  • Create a Hive table using RegexSerDe. Each capture group in the regex maps to one column, so adjust the group widths to your field positions:

    CREATE EXTERNAL TABLE testtable (col1 STRING, col2 STRING, col3 STRING)
    ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'
    WITH SERDEPROPERTIES ("input.regex" = "(.{5})(.{6})(.{3}).*")
    LOCATION '<hdfs-file-location>';
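
As a rough illustration for the row shown in the question, the regex could be adapted along these lines. This is only a sketch: it assumes the id is the first 7 characters and the date is the next 8 characters (in the sample row, characters 8 to 15 read 20151231), which is a guess from the sample rather than a confirmed layout. The table name flat_rows and the column names id, date_str and remainder are hypothetical.

```sql
-- Sketch only: group widths assume a 7-character id followed by an
-- 8-character yyyyMMdd date; everything after that lands in `remainder`.
-- Table and column names are made up for illustration.
CREATE EXTERNAL TABLE flat_rows (id STRING, date_str STRING, remainder STRING)
ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'
WITH SERDEPROPERTIES ("input.regex" = "(.{7})(.{8})(.*)")
LOCATION '<hdfs-file-location>';

-- Quick check that the fields are split as expected
SELECT id, date_str FROM flat_rows LIMIT 5;
```

Add one (.{n}) group and one STRING column per remaining field. Depending on the installation, the hive-contrib jar may need to be added to the session before this SerDe can be used.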
    

    Then create a Hive table that points to HBase. Instructions are here: http://hortonworks.com/blog/hbase-via-hive-part-1/

    Finally, use INSERT OVERWRITE TABLE to load the data from the Hive table into the HBase-backed table: https://cwiki.apache.org/confluence/display/Hive/GettingStarted#GettingStarted-SELECTSandFILTERS
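
The last two steps might look roughly like the following. It is a minimal sketch, not a tested setup: the Hive table name hbase_testtable, the HBase table name testtable_hbase and the column family cf are hypothetical, and it assumes col1 is unique enough to serve as the HBase row key.

```sql
-- Hive table stored in HBase via the HBase storage handler.
-- ":key" maps the first Hive column to the HBase row key; the other
-- columns go to qualifiers in the (hypothetical) column family "cf".
CREATE TABLE hbase_testtable (col1 STRING, col2 STRING, col3 STRING)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf:col2,cf:col3")
TBLPROPERTIES ("hbase.table.name" = "testtable_hbase");

-- Copy the rows parsed by the RegexSerDe table into HBase
INSERT OVERWRITE TABLE hbase_testtable
SELECT col1, col2, col3 FROM testtable;
```

Rows that share the same col1 value would overwrite each other in HBase, so if the id is not unique, a composite key (for example id concatenated with the date) may be a better choice.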