
Filter using HBase REST API


Does anyone know anything about the HBase REST API? I'm currently writing a program that inserts into and reads from HBase using curl commands. To read, I use a curl GET request, e.g.

curl -X GET 'http://server:9090/test/Row-1/Action:ActionType/' -H 'Accept: application/json'

This returns the column Action:ActionType from Row-1. However, I am stuck on how to do the equivalent of a WHERE clause with a GET request, and I'm not sure it's even possible. For example, how would I find all records where Action:ActionType = 1? Help is appreciated!


Solution

  • You can do this by using a filter (here a SingleColumnValueFilter) in your curl request.

    First, create an XML file (myscanner.xml) describing your scan. Here we filter on the value of one column (family and qualifier) using the EQUAL operator:

    <Scanner batch="10">
        <filter>
            {
                "type": "SingleColumnValueFilter",
                "op": "EQUAL",
                "family": "<FAMILY_BASE64>",
                "qualifier": "<QUALIFIER_BASE64>",
                "latestVersion": true,
                "comparator": {
                    "type": "BinaryComparator",
                    "value": "<SEARCHED_VALUE_BASE64>"
                }
            }
        </filter>
    </Scanner>
    

    Replace <FAMILY_BASE64>, <QUALIFIER_BASE64> and <SEARCHED_VALUE_BASE64> with your own values. The values must be base64-encoded; you can do this with echo -en "${FAMILY}" | base64.
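
As a rough sketch of that encoding step, the snippet below encodes the column from the question (Action:ActionType, value 1) and prints a filled-in scanner document. The helper names b64 and scanner_xml are made up for this example and are not part of HBase; printf is used instead of echo so that no trailing newline ends up in the encoded value.

```shell
# Hypothetical helper: base64-encode a string without adding a trailing newline.
b64() {
    printf '%s' "$1" | base64
}

# Hypothetical helper: print a scanner definition for
# "family:qualifier EQUAL value", with all three parts base64-encoded.
# Usage: scanner_xml FAMILY QUALIFIER VALUE
scanner_xml() {
    cat <<EOF
<Scanner batch="10">
    <filter>
        {
            "type": "SingleColumnValueFilter",
            "op": "EQUAL",
            "family": "$(b64 "$1")",
            "qualifier": "$(b64 "$2")",
            "latestVersion": true,
            "comparator": {
                "type": "BinaryComparator",
                "value": "$(b64 "$3")"
            }
        }
    </filter>
</Scanner>
EOF
}

# Values taken from the question: column Action:ActionType, searched value 1.
echo "$(b64 'Action') $(b64 'ActionType') $(b64 '1')"
# prints: QWN0aW9u QWN0aW9uVHlwZQ== MQ==

# Print the filled-in document; redirect to myscanner.xml to save it.
scanner_xml 'Action' 'ActionType' '1'
```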

    Then submit a curl request to the HBase REST API with this XML file as the request body:

    curl -vi -X PUT \
        -H "Content-Type:text/xml" \
        -d @myscanner.xml \
        "http://${HOST}:${REST_API_PORT}/${TABLE_NAME}/scanner/"
    

    The response should include a Location header containing the URL of the new scanner, like:

    [...]
    Location: http://${HOST}:${REST_API_PORT}/${TABLE_NAME}/scanner/149123344543470bea57a
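
If you are scripting this, you will probably want to capture that URL instead of copying it by hand. The sketch below is one way to do it; scanner_url_from_headers is a made-up helper name, and it assumes the response headers are piped in on stdin (for example from curl -s -D - -o /dev/null).

```shell
# Hypothetical helper: read HTTP response headers on stdin and print the
# value of the Location header. Header names are case-insensitive and
# header lines end in CRLF, so the carriage returns are stripped first.
scanner_url_from_headers() {
    tr -d '\r' | awk 'tolower($1) == "location:" { print $2; exit }'
}

# Possible usage against a live REST server (not executed here):
#   SCANNER_URL=$(curl -s -D - -o /dev/null -X PUT \
#       -H "Content-Type: text/xml" \
#       -d @myscanner.xml \
#       "http://${HOST}:${REST_API_PORT}/${TABLE_NAME}/scanner/" \
#       | scanner_url_from_headers)
```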
    

    Then use that scanner URL to iterate through the results (repeat the request to fetch successive batches):

    curl -vi -X GET \
        -H "Accept: text/xml" \
        "http://${HOST}:${REST_API_PORT}/${TABLE_NAME}/scanner/149123344543470bea57a"
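
The repeated GET can be wrapped in a loop. The sketch below rests on two assumptions you should check against your HBase version: that the REST server answers 200 with a batch while rows remain and a different status (204 No Content, as far as I know) once the scanner is exhausted, and that a DELETE on the scanner URL releases it. drain_scanner is a made-up function name.

```shell
# Hypothetical helper: fetch batches from an existing scanner until it stops
# returning 200, print each batch, then delete the scanner.
# $1 is the scanner URL taken from the Location header.
drain_scanner() {
    scanner_url=$1
    while :; do
        # -w appends the HTTP status code on its own line after the body.
        response=$(curl -s -H "Accept: application/json" \
            -w '\n%{http_code}' "$scanner_url")
        status=$(printf '%s\n' "$response" | tail -n 1)
        body=$(printf '%s\n' "$response" | sed '$d')

        # Assumption: anything other than 200 means there is nothing more to read.
        [ "$status" = "200" ] || break
        printf '%s\n' "$body"
    done
    # Assumption: DELETE on the scanner URL frees the server-side scanner.
    curl -s -X DELETE "$scanner_url" > /dev/null
}

# Possible usage (not executed here):
#   drain_scanner "http://${HOST}:${REST_API_PORT}/${TABLE_NAME}/scanner/149123344543470bea57a"
```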
    

    You can also request "application/json" instead of XML. Note that the results are base64-encoded.
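
To read the returned values you have to decode them again. A minimal sketch, using a made-up helper name b64decode and sample strings that are simply the encoded forms of the column and value from the question:

```shell
# Hypothetical helper: decode a base64 string taken from a scanner response.
# The long option --decode is used because the short flag has differed
# between base64 implementations.
b64decode() {
    printf '%s' "$1" | base64 --decode
}

b64decode 'QWN0aW9uVHlwZQ=='; echo   # prints: ActionType
b64decode 'MQ=='; echo               # prints: 1
```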

    Sources:

    HBase REST Filter (SingleColumnValueFilter)

    A list of filters you can use: https://gist.github.com/stelcheck/3979381

    Cloudera documentation about the HBase REST API: https://www.cloudera.com/documentation/enterprise/5-9-x/topics/admin_hbase_rest_api.html