HBase filter rows using SingleColumnValueFilter


I have an HBase table with a column qualifier that stores the created time as a long (converted to a byte array). I need to count the rows whose created time falls between two specified dates. Below is my Java code.

    int count = 0;
    SimpleDateFormat dateFormat = new SimpleDateFormat("YYYY-MM-DD");
    HTable table = (HTable)connection.getTable(TableName.valueOf(tableName));
    long startTime = dateFormat.parse(startDate).getTime();
    long endTime = dateFormat.parse(endDate).getTime();

    Scan scan = new Scan();
    SingleColumnValueFilter filter1 = new SingleColumnValueFilter(ConstantsTruthy.CF_DETAIL_BYTES, ConstantsTruthy.QUAL_CREATE_TIME_BYTES, CompareFilter.CompareOp.GREATER_OR_EQUAL, Bytes.toBytes(startTime));
    filter1.setFilterIfMissing(true);
    SingleColumnValueFilter filter2 = new SingleColumnValueFilter(ConstantsTruthy.CF_DETAIL_BYTES, ConstantsTruthy.QUAL_CREATE_TIME_BYTES, CompareFilter.CompareOp.LESS_OR_EQUAL, Bytes.toBytes(endTime));
    filter2.setFilterIfMissing(true);
    FilterList fl = new FilterList(FilterList.Operator.MUST_PASS_ALL);
    fl.addFilter(filter1);
    fl.addFilter(filter2);
    scan.addFamily(ConstantsTruthy.CF_DETAIL_BYTES);
    scan.setFilter(fl);
    ResultScanner rs = table.getScanner(scan);
    for (Result result = rs.next(); result != null; result = rs.next()) {
        count++;
    }
    System.out.println("Count : " + count);
    rs.close();
    table.close();

This code runs without any errors, but the rows it returns all belong to one specific time. It does not return all the rows in the time range. Could someone help me figure out the problem with my filters?


Solution

  • The filter logic looks correct.

    We tried the same filters and they work for us.

    Could you please double-check your input values for "startDate" and "endDate"?

    Also, which HBase version are you using?
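
One input-side check that needs no cluster is the date pattern itself. In `SimpleDateFormat`, `Y` means week year and `D` means day in year, while `y` and `d` mean calendar year and day in month. With the pattern from the question, `"YYYY-MM-DD"`, two different dates in the same year may therefore parse to the same instant, which would make `startTime` equal to `endTime` and could explain why only rows from one specific time match. The sketch below is only a way to compare the two patterns. The class name, the helper `parseUtc`, the sample dates, and the fixed UTC zone and US locale are assumptions made for reproducibility and are not part of the original code.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Locale;
import java.util.TimeZone;

public class DatePatternCheck {

    // Hypothetical helper: parses text with the given pattern and returns epoch milliseconds.
    // UTC and Locale.US are fixed here (an assumption) so the result does not depend on the machine.
    static long parseUtc(String pattern, String text) {
        SimpleDateFormat format = new SimpleDateFormat(pattern, Locale.US);
        format.setTimeZone(TimeZone.getTimeZone("UTC"));
        try {
            return format.parse(text).getTime();
        } catch (ParseException e) {
            throw new IllegalArgumentException("Cannot parse: " + text, e);
        }
    }

    public static void main(String[] args) {
        // Sample inputs, chosen only for illustration.
        String startDate = "2016-03-15";
        String endDate = "2016-09-20";

        // Pattern as written in the question: week year (Y) and day in year (D).
        long questionStart = parseUtc("YYYY-MM-DD", startDate);
        long questionEnd = parseUtc("YYYY-MM-DD", endDate);

        // Calendar year (y) and day in month (d).
        long calendarStart = parseUtc("yyyy-MM-dd", startDate);
        long calendarEnd = parseUtc("yyyy-MM-dd", endDate);

        System.out.println("YYYY-MM-DD start=" + questionStart + " end=" + questionEnd);
        System.out.println("yyyy-MM-dd start=" + calendarStart + " end=" + calendarEnd);
        System.out.println("question pattern collapses the range: " + (questionStart == questionEnd));
    }
}
```

If the first line prints the same value for start and end, both filters in the scan compare against a single timestamp, so switching the pattern to `"yyyy-MM-dd"` would be the first thing to try. Printing `startTime` and `endTime` in the original code gives the same information against the real inputs.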