sql-server hadoop hbase sqoop

How to see 3 versions of data in HBase


I have a SQL Server table with 6 columns: "row_id", "customer_id", "f_name", "l_name", "location", and "last_update_date".

1) I created an HBase table from the above SQL table using Sqoop. The Sqoop command is below:

sqoop import --connect "jdbc:sqlserver://server:port;databaseName=db" --username xxx --password xxx --table xxx --hbase-table xxx --column-family amitesh --hbase-row-key row_id,customer_id --hbase-create-table -m 1

In the above import, I built the HBase row key by concatenating 2 columns, and it works fine. Below is the HBase "scan" output:

hbase(main):036:0> scan 'xxx'
ROW                                          COLUMN+CELL
 111_emp1                                    column=amitesh:f_name, timestamp=1497365606380, value=dev
 111_emp1                                    column=amitesh:l_name, timestamp=1497365606380, value=saha
 111_emp1                                    column=amitesh:last_update_date, timestamp=1497365606380, value=2017-06-12
 111_emp1                                    column=amitesh:location, timestamp=1497365606380, value=hyd
 112_emp1                                    column=amitesh:f_name, timestamp=1497365606380, value=hari
 112_emp1                                    column=amitesh:l_name, timestamp=1497365606380, value=sri
 112_emp1                                    column=amitesh:last_update_date, timestamp=1497365606380, value=2017-06-13
 112_emp1                                    column=amitesh:location, timestamp=1497365606380, value=bng

2) When I ran "describe" on the HBase table, I found that "VERSIONS" was set to 1, as you can see below:

hbase(main):025:0> describe 'xxx'
Table HBASE_SQOOP is ENABLED
HBASE_SQOOP
COLUMN FAMILIES DESCRIPTION
{NAME => 'amitesh', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION
 => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}

So, to change the number of versions to 2, I executed the HBase command below:

alter 'xxx', {NAME => 'amitesh', VERSIONS => 2}

It ran successfully, and "describe" then showed VERSIONS as 2.

3) Now, to keep 2 versions of f_name and l_name for the HBase row key 111_emp1, I updated the SQL Server table for row_id 111 twice and re-sqooped it, but I could only see the updated values, not the current and past versions together. Below is the "get" output:

hbase(main):038:0> get 'xxx', '111_emp1',{COLUMN=> 'amitesh:f_name',VERSION=>2}
COLUMN                                       CELL
 amitesh:f_name                              timestamp=1497365606380, value=dev
1 row(s) in 0.0040 seconds


hbase(main):047:0> get 'xx', '111_emp1',{COLUMN=> 'amitesh:f_name',VERSION=>2}
COLUMN                                       CELL
 amitesh:f_name                              timestamp=1497365863181, value=Raj
1 row(s) in 0.0110 seconds

As you can see from the 2 "get" outputs above, the value of f_name is "dev" in the first "get" and "Raj" in the second. But I expected to see both "dev" and "Raj" in the output, since after my "alter" command HBase should hold both of them together, and that is not happening.

What am I missing?


Solution

  • It is VERSIONS => 2, not VERSION => 2 as in the "get" commands above. The correct syntax to get more than one version is:

    get 'xxx', '111_emp1',{COLUMN=> 'amitesh:f_name',VERSIONS=>2}
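
To illustrate why the read has to ask for versions explicitly, here is a minimal Python sketch of the versioning behaviour. It is a toy in-memory model, not the HBase client API: the class `ToyColumnFamily` and its methods are hypothetical names made up for this example. It assumes only what the question shows: cells are identified by row key, column and timestamp, the column family keeps up to VERSIONS cells per column, and a read returns just the newest cell unless more versions are requested.

```python
from collections import defaultdict


class ToyColumnFamily:
    """Toy model of one HBase column family (hypothetical, not the HBase API)."""

    def __init__(self, max_versions=1):
        # Mirrors VERSIONS => n set on the column family with "alter".
        self.max_versions = max_versions
        # (row_key, qualifier) -> {timestamp: value}
        self._cells = defaultdict(dict)

    def put(self, row, qualifier, timestamp, value):
        versions = self._cells[(row, qualifier)]
        versions[timestamp] = value
        # Simplification: drop excess versions immediately. Real HBase
        # removes them later, but reads never return more than max_versions.
        for ts in sorted(versions, reverse=True)[self.max_versions:]:
            del versions[ts]

    def get(self, row, qualifier, versions=1):
        # Like the shell "get": newest first, one version unless asked for more.
        stored = self._cells.get((row, qualifier), {})
        return sorted(stored.items(), reverse=True)[:versions]


# Timestamps and values taken from the two "get" outputs in the question.
cf = ToyColumnFamily(max_versions=2)
cf.put("111_emp1", "f_name", 1497365606380, "dev")
cf.put("111_emp1", "f_name", 1497365863181, "Raj")

print(cf.get("111_emp1", "f_name"))              # newest version only
print(cf.get("111_emp1", "f_name", versions=2))  # both versions, newest first
```

In this model both values are stored once the family allows 2 versions, but the default read still returns only "Raj". Passing `versions=2` plays the role of `VERSIONS=>2` in the shell command above.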