hadoop, hbase, apache-pig, bigdata

Moving data to HBASE using Pig


I am trying to move 851 records into HBase. I created the HBase table with the command below:

create 'customers', 'customers_data'

I moved the files using a Pig script. My Pig script is:

STOCK_A = LOAD '/user/cloudera/xxx' USING PigStorage('|');
data = FILTER STOCK_A BY ( $0 matches '.*MH.*');
MH_DATA = FOREACH data GENERATE $1, $3, $4;
STORE MH_DATA into 'hbase://customers' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('customers_data:firstname, customers_data:lastname, customers_data:age');

The Pig script produces 851 records. My data is:

    (aman,george,22)
    (aman,george,22)
    (aman,george,22)
    ...
    (851 records in total)

But when I try to put this data into HBase using the command below:

PIG_CLASSPATH=/usr/lib/hbase/hbase.jar:/usr/lib/zookeeper/zookeeper-3.4.5-cdh4.4.0.jar /usr/bin/pig /home/cloudera/remot/pighl7

the data that gets stored in HBase is:

ROW                                         COLUMN+CELL                                                                                                                 
 \xB5~\x5C&                                 column=customers_data:firstname, timestamp=1478700582076, value=george
 \xB5~\x5C&                                 column=customers_data:lastname, timestamp=1478700582076, value=22

I can't find my 851 records, and the third column is missing as well. I don't know what I am doing wrong. Please help.


Solution

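  • After a lot of research and trial and error, I solved the problem by changing the row key from the name to a timestamp. HBaseStorage uses the first field of each tuple as the row key, so the name became the key and only the remaining two fields were stored as columns (which is why the values appear shifted and the third column is missing). Because every record had the same name, each write went to the same row and overwrote the previous one.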
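
For anyone hitting the same issue, here is a minimal sketch of the idea, not the exact script I ended up with. It assumes Pig 0.11 or later (where the RANK operator is available) and uses a sequential number as the row key instead of a timestamp. The field names firstname, lastname, and age and the alias RANKED are my own illustrative choices. Called without a BY clause, RANK prepends a sequential value to each tuple, so every record gets a distinct first field and therefore its own HBase row.

```pig
-- Load and filter exactly as in the question.
STOCK_A = LOAD '/user/cloudera/xxx' USING PigStorage('|');
data = FILTER STOCK_A BY ($0 matches '.*MH.*');

-- Name the three fields so the relation has a schema (names are illustrative).
MH_DATA = FOREACH data GENERATE $1 AS firstname, $3 AS lastname, $4 AS age;

-- Assumption: Pig 0.11+. RANK without BY prepends a sequential number,
-- which becomes the first field and therefore the HBase row key.
RANKED = RANK MH_DATA;

-- First field (the rank) is the row key; the next three fields map to the
-- three columns listed here, in order.
STORE RANKED INTO 'hbase://customers' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('customers_data:firstname, customers_data:lastname, customers_data:age');
```

After running it, counting the rows of 'customers' in the HBase shell should show one row per input record rather than a single row. Any field that is unique per record (a customer ID, for example) would work as the row key in the same way, as long as it comes first in the tuple.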