How to perform a simple select from HBase with criteria (Where clause)


I have a simple table that I created by following this tutorial: https://hortonworks.com/hadoop-tutorial/introduction-apache-hbase-concepts-apache-phoenix-new-backup-restore-utility-hbase/#start-hbase

using these HBase shell commands:

create 'driver_dangerous_event','events'
put 'driver_dangerous_event','4','events:driverId','78'
put 'driver_dangerous_event','4','events:driverName','Carl'
put 'driver_dangerous_event','4','events:eventTime','2016-09-23 03:25:03.567'
put 'driver_dangerous_event','4','events:eventType','Normal'
put 'driver_dangerous_event','4','events:latitudeColumn','37.484938'
put 'driver_dangerous_event','4','events:longitudeColumn','-119.966284'
put 'driver_dangerous_event','4','events:routeId','845'
put 'driver_dangerous_event','4','events:routeName','Santa Clara to San Diego'
put 'driver_dangerous_event','4','events:truckId','637'

I need to query this row using a filter, like a WHERE clause (for future use). A REST API or Thrift API is running on my server.

I tried the REST API but could not get it to work. Is it possible?

I also tried this NuGet package: https://hbasenet.codeplex.com/releases/view/133288 but I can't work out how to filter the data with a WHERE clause; I can only select a specific row:

Hbase.Client c = new Hbase.Client(serverHostName, port, 10000);
var res = c.Scan<Driver>("driver_dangerous_event", "events", "1");

Is there any way to run a simple filtered query with the REST API, the Thrift API, or some other C# library?


Solution

  • I used Microsoft.HBase.Client to perform a simple query (https://github.com/hdinsight/hbase-sdk-for-net):

    // Connection: point the client at the HBase REST server
    RequestOptions scanOptions = RequestOptions.GetDefaultOptions();
    scanOptions.Port = int.Parse(hbaseDataConnection.Port);
    scanOptions.AlternativeEndpoint = "/";
    var nodeIPs = new List<string>();
    nodeIPs.Add(hbaseDataConnection.Address);
    HBaseClient client = new HBaseClient(null, scanOptions, new LoadBalancerRoundRobin(nodeIPs));

    // Filter: keep rows whose driverName column contains the search text.
    // SubstringComparator does a substring match, not an exact comparison.
    var f1 = new SingleColumnValueFilter(
        Encoding.UTF8.GetBytes(ColumnFamilyName),
        Encoding.UTF8.GetBytes("driverName"),
        CompareFilter.CompareOp.Equal,
        new SubstringComparator(fld.Values[0].ToString()));

    var filter = new FilterList(FilterList.Operator.MustPassAll, f1);

    // Attach the filter first, then create the scanner once
    Scanner scanner = new Scanner { batch = 10 };
    scanner.filter = filter.ToEncodedString();
    ScannerInformation scanInfo = await client.CreateScannerAsync(_tableName, scanner, scanOptions);

    // RetrieveResults (not shown) is expected to read the matching rows from the scanner
    result = RetrieveResults(client, scanInfo, scanOptions).ToList();
    

    Make sure the REST API is running on the HBase machine, e.g.

    hbase rest start -p 20050 --infoport 20051
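
With the REST server running, a filtered scan can probably also be sent without the SDK, since the SDK itself talks to that same REST interface. The following is only a sketch under assumptions: the class name RestScanSketch and its two methods are hypothetical, and the JSON field names of the filter are my understanding of what HBase's REST filter model expects, so compare them with the string returned by filter.ToEncodedString() before relying on them. It uses BinaryComparator for an exact match rather than the SubstringComparator used above.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;
using System.Xml.Linq;

public static class RestScanSketch
{
    // Byte-array fields are sent as Base64 text.
    static string B64(string s) => Convert.ToBase64String(Encoding.UTF8.GetBytes(s));

    // Builds the scanner definition for PUT /<table>/scanner.
    // ASSUMPTION: the JSON field names below match the server's filter model.
    public static string BuildScannerXml(string family, string qualifier, string value, int batch)
    {
        string json =
            "{\"type\":\"SingleColumnValueFilter\",\"op\":\"EQUAL\"," +
            "\"family\":\"" + B64(family) + "\"," +
            "\"qualifier\":\"" + B64(qualifier) + "\"," +
            "\"ifMissing\":true,\"latestVersion\":true," +
            "\"comparator\":{\"type\":\"BinaryComparator\",\"value\":\"" + B64(value) + "\"}}";

        // XElement takes care of escaping the JSON text for XML.
        return new XElement("Scanner",
            new XAttribute("batch", batch),
            new XElement("filter", json)).ToString(SaveOptions.DisableFormatting);
    }

    // Creates the scanner, reads batches until the server answers 204 No Content,
    // then deletes the scanner. Returns the raw JSON batches, one per line.
    public static async Task<string> ScanAsync(string baseUrl, string table, string scannerXml)
    {
        using (var http = new HttpClient())
        {
            var body = new StringContent(scannerXml, Encoding.UTF8, "text/xml");
            HttpResponseMessage created = await http.PutAsync(baseUrl + "/" + table + "/scanner", body);
            created.EnsureSuccessStatusCode();
            Uri scannerUri = created.Headers.Location; // URL of the new scanner

            var batches = new StringBuilder();
            try
            {
                while (true)
                {
                    var request = new HttpRequestMessage(HttpMethod.Get, scannerUri);
                    request.Headers.Accept.ParseAdd("application/json");
                    HttpResponseMessage response = await http.SendAsync(request);
                    if (response.StatusCode == HttpStatusCode.NoContent) break; // scanner exhausted
                    response.EnsureSuccessStatusCode();
                    batches.AppendLine(await response.Content.ReadAsStringAsync());
                }
            }
            finally
            {
                await http.DeleteAsync(scannerUri); // release the scanner on the server
            }
            return batches.ToString();
        }
    }
}
```

For the table above, a call could look like ScanAsync("http://<host>:20050", "driver_dangerous_event", BuildScannerXml("events", "driverName", "Carl", 10)). Row keys, column names and values in the JSON that comes back are Base64-encoded, so they need decoding before use. The Location header may contain a host name that is not reachable from the client, which is likely what the SDK's AlternativeHost option is meant to work around.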