
HBase - select distinct query against the rowkey


I have an HBase table called "users" whose rowkey consists of three parts:

  1. userid
  2. messageid
  3. timestamp

The rowkey looks like: ${userid}_${messageid}_${timestamp}

Given that I can hash the userid to make that field a fixed length, is there any way to run something like this SQL query:

select distinct(userid) from users

If the rowkey doesn't allow this kind of query, does that mean I need to create a separate table that contains only the user ids? I guess that if I do that, inserting a record won't be atomic anymore, because I would be writing to two tables without a transaction.


Solution

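  • You can do that, but as a MapReduce job rather than a direct query: the job scans the rowkeys, extracts the userid prefix from each one, and emits every distinct userid once.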
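
To make the idea concrete, here is a minimal sketch of the logic such a job would run, written as plain Java over an in-memory list of rowkeys instead of a real HBase scan. The class and method names (`DistinctUserIds`, `userIdOf`, `distinctUserIds`) and the sample keys are made up for illustration, and the code assumes the userid itself never contains `_`. In an actual job, the "map" step would probably live in a `TableMapper`, likely with a `FirstKeyOnlyFilter` on the scan so that only rowkeys are shipped, and the "reduce" step would do the de-duplication.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class DistinctUserIds {

    // "Map" step: pull the userid prefix out of a rowkey shaped like
    // ${userid}_${messageid}_${timestamp}.
    // Assumption: the userid contains no '_' (true for a fixed-length hex hash).
    static String userIdOf(String rowKey) {
        int sep = rowKey.indexOf('_');
        return sep < 0 ? rowKey : rowKey.substring(0, sep);
    }

    // "Reduce" step: collapse duplicate userids, keeping first-seen order.
    static List<String> distinctUserIds(Iterable<String> rowKeys) {
        Set<String> seen = new LinkedHashSet<>();
        for (String rowKey : rowKeys) {
            seen.add(userIdOf(rowKey));
        }
        return new ArrayList<>(seen);
    }

    public static void main(String[] args) {
        // Hypothetical rowkeys standing in for the result of a table scan.
        List<String> rowKeys = List.of(
            "u001_m1_1700000000",
            "u001_m2_1700000050",
            "u002_m1_1700000100",
            "u003_m9_1700000200",
            "u003_m9_1700000300");
        System.out.println(distinctUserIds(rowKeys)); // prints [u001, u002, u003]
    }
}
```

Because HBase stores rows sorted by rowkey, all rows for one userid are adjacent, so a real implementation could also just compare each key's prefix with the previous one instead of keeping a set in memory.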