I want to scan an HBase table from the HBase shell and return only the rows whose row key matches some pattern.
For example, I have the following table data:
row:r1_t1 column:cf:a, timestamp=1461911995948,value=v1
row:r2_t2 column:cf:a, timestamp=1461911995949,value=v2
row:s1_t1 column:cf:a, timestamp=1461911995950,value=q1
row:s2_t2 column:cf:a, timestamp=1461911995951,value=q2
Based on the above data, I want to find the rows whose key contains 't1':
row:r1_t1 column:cf:a, timestamp=1461911995948,value=v1
row:s1_t1 column:cf:a, timestamp=1461911995950,value=q1
I know I can scan the table with PrefixFilter, but it only returns rows whose key starts with the given prefix:
scan 'test', {FILTER => "PrefixFilter('s')"}
Is there a similar way to scan the table and filter on a pattern that appears in the middle of the row key?
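hbase(main):003:0> scan 'test', {ENDROW => 't1'}
Note that ENDROW only sets the exclusive stop row of the scan; it does not select rows whose key ends with 't1'.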
In general, using a PrefixFilter on its own can be slow because it performs a table scan until it reaches the prefix.
You can also use a RowFilter with a SubstringComparator, like below:
hbase(main):003:0> import org.apache.hadoop.hbase.filter.CompareFilter
hbase(main):005:0> import org.apache.hadoop.hbase.filter.SubstringComparator
hbase(main):006:0> scan 'test', {FILTER => org.apache.hadoop.hbase.filter.RowFilter.new(CompareFilter::CompareOp.valueOf('EQUAL'),SubstringComparator.new("searchkeyword"))}
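To make the difference between the two filters concrete, here is a minimal plain-Ruby sketch of their matching semantics, applied to the row keys from the question. It does not call HBase at all: the helper names prefix_match and substring_match are hypothetical, and they only model which row keys each filter would let through. SubstringComparator matches case-insensitively, so the sketch lowercases both sides.

```ruby
# Row keys from the sample table in the question.
ROW_KEYS = %w[r1_t1 r2_t2 s1_t1 s2_t2].freeze

# Models PrefixFilter('s'): keep keys that start with the prefix.
# (Hypothetical helper, not part of the HBase API.)
def prefix_match(keys, prefix)
  keys.select { |key| key.start_with?(prefix) }
end

# Models RowFilter(EQUAL, SubstringComparator): keep keys that contain
# the substring anywhere, ignoring case.
# (Hypothetical helper, not part of the HBase API.)
def substring_match(keys, needle)
  keys.select { |key| key.downcase.include?(needle.downcase) }
end

puts prefix_match(ROW_KEYS, 's').inspect     # => ["s1_t1", "s2_t2"]
puts substring_match(ROW_KEYS, 't1').inspect # => ["r1_t1", "s1_t1"]
```

So replacing "searchkeyword" with "t1" in the RowFilter scan above should return rows r1_t1 and s1_t1. In the shell's filter-string syntax, the same filter can usually be written without the imports as scan 'test', {FILTER => "RowFilter(=, 'substring:t1')"}. Either way the filter is evaluated against every row in the scanned range, so on a large table it helps to narrow the scan with STARTROW and STOPROW where the key layout allows it.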