How to count rows with a filter in HBase?


I only need the row count of a scan result. The problem with the code below is that it returns the row key and the column key for every match, which makes it very slow because all of that data is shipped to the client. I only want to send the row count to the client. Is there a way to do this directly?

scan 'consumer-data', {FILTER => "
PrefixFilter('test_row') 
AND KeyOnlyFilter() 
AND FirstKeyOnlyFilter() 
AND (ColumnPrefixFilter('test_col:test1') 
AND ValueFilter(=, 'binary:test 1')) 
"}

Any help would be appreciated.


Solution

  • The scan you wrote is very slow. First, a scan works sequentially (no map/reduce), so it is slow to begin with. On top of that, you use two slow filters: one that looks at column names and, worse, one that looks at values. What you get is a one-by-one sequential read that examines every column, plus the value of every matching column.

    If you expect to run queries like this on a regular basis, you should rethink your row key. Also, redo this as a map/reduce job, so that the work is at least divided among parallel tasks.
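
    As a rough sketch of what a cheaper count might look like from the shell: recent HBase versions let the `count` command take the same scan options as `scan` (run `help 'count'` to check what your version supports). The version below makes two assumptions that are not in the question: that `test_col` is the column family and `test1` the qualifier, and that bounding the scan with STARTROW/STOPROW is an acceptable replacement for PrefixFilter.

    ```
    # Assumption: test_col is the family and test1 the qualifier.
    # STOPROW 'test_rox' is 'test_row' with its last byte incremented, so the
    # scan only touches rows whose key starts with 'test_row' (stop row is exclusive).
    count 'consumer-data', {
      STARTROW => 'test_row',
      STOPROW  => 'test_rox',
      COLUMNS  => ['test_col:test1'],
      FILTER   => "ValueFilter(=, 'binary:test 1')",
      CACHE    => 1000
    }
    ```

    This still counts on the client, so matching rows are streamed back one batch at a time, but the shell prints only the running total and the row-key range keeps the scan away from the rest of the table. The value comparison remains the expensive part, which is why moving that value into the row key, or running the count as a map/reduce job, is the better long-term fix.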