
How can I scan for rows based on a row pattern in HBase shell?


I want to scan an HBase table from the HBase shell and return only the rows whose row key matches some pattern.

For example, I have the following table data:

    row:r1_t1  column:cf:a, timestamp=1461911995948,value=v1
    row:r2_t2  column:cf:a, timestamp=1461911995949,value=v2
    row:s1_t1  column:cf:a, timestamp=1461911995950,value=q1
    row:s2_t2  column:cf:a, timestamp=1461911995951,value=q2

Based on the above data, I want to find the rows whose key contains 't1':

    row:r1_t1  column:cf:a, timestamp=1461911995948,value=v1
    row:s1_t1  column:cf:a, timestamp=1461911995950,value=q1

I know I can scan the table with PrefixFilter, but that only returns rows whose key starts with the specified prefix.

    scan 'test', {FILTER => "PrefixFilter('s')"}

Is there a similar way to scan the table that filters rows by a pattern appearing in the middle of the row key?


Solution

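  • hbase(main):003:0> scan 'test', {ENDROW => 't1'}

    Note that ENDROW only sets an exclusive upper bound on the scanned key range; it does not match 't1' in the middle of a row key. With the sample data, all four keys sort below 't1', so this scan returns every row.

    In general, using a PrefixFilter can be slow because it performs a table scan until it reaches the prefix.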

    You can also use a RowFilter with a SubstringComparator, as shown below:

    hbase(main):003:0> import org.apache.hadoop.hbase.filter.CompareFilter
    hbase(main):005:0> import org.apache.hadoop.hbase.filter.SubstringComparator
    hbase(main):006:0> scan 'test', {FILTER => org.apache.hadoop.hbase.filter.RowFilter.new(CompareFilter::CompareOp.valueOf('EQUAL'),SubstringComparator.new("searchkeyword"))}
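
To make the difference between the approaches above concrete, here is a minimal plain-Ruby sketch that models which of the question's sample row keys each strategy would select. It does not talk to HBase; the helper names (`before_stop_row`, `with_prefix`, `with_substring`) are hypothetical and only illustrate the matching semantics, and the case-insensitive substring match is an assumption about how SubstringComparator compares.

```ruby
# Plain-Ruby model of three row-key matching strategies (no HBase connection).
# Row keys are taken from the sample table in the question.
ROW_KEYS = %w[r1_t1 r2_t2 s1_t1 s2_t2].freeze

# Stop-row semantics (ENDROW): keep keys that sort strictly below the stop row.
def before_stop_row(keys, stop_row)
  keys.select { |key| (key <=> stop_row) < 0 }
end

# PrefixFilter semantics: keep keys that start with the given prefix.
def with_prefix(keys, prefix)
  keys.select { |key| key.start_with?(prefix) }
end

# RowFilter + SubstringComparator semantics: keep keys that contain the substring.
# Assumption: the comparison is case-insensitive.
def with_substring(keys, substring)
  keys.select { |key| key.downcase.include?(substring.downcase) }
end

STOP_ROW_MATCHES  = before_stop_row(ROW_KEYS, 't1')
PREFIX_MATCHES    = with_prefix(ROW_KEYS, 's')
SUBSTRING_MATCHES = with_substring(ROW_KEYS, 't1')

puts "stop row 't1':  #{STOP_ROW_MATCHES.inspect}"
puts "prefix 's':     #{PREFIX_MATCHES.inspect}"
puts "substring 't1': #{SUBSTRING_MATCHES.inspect}"

# In the HBase shell, the substring case can likely also be written with the
# filter-string syntax instead of constructing the Java objects by hand:
#   scan 'test', {FILTER => "RowFilter(=, 'substring:t1')"}
```

Only the substring match selects exactly the two rows the question asks for (r1_t1 and s1_t1); the stop-row and prefix variants select by key order and key start, respectively.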