
HBase table splitting


I am trying to create a pre-split HBase table. The row key is a combination of a bucket number, the schema and the pkid. I can already pre-split the table with the ranges {'00000000000000000', '10000000000000000', '20000000000000000', ..., 'f0000000000000000'}. Is there a way to do this automatically using an auto-splitting policy? That would let me include the schema name in the split keys as well, like '0MD5(schema1)000000000...', '1MD5(schema1)000000000...', ..., 'fMD5(schema1)000000000...', '0MD5(schema2)000000000...', ...

Splitting this way would give me a better design. We can't fix the number of schemas up front: right now we are creating the table for 10 schemas, and more will be added in the future. We need to insert data for all of them into this table, so I am looking for a better design for the split policy.

I have also looked at KeyPrefixRegionSplitPolicy. It looks like it would help, but I am not sure.

Can anyone help me with this?


Solution

  • KeyPrefixRegionSplitPolicy can meet your needs.
    Here is a code example that may help:

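        // Assumes `conf` is an existing HBase Configuration and the table "test" already exists.
        HBaseAdmin admin = new HBaseAdmin(conf);
        HTable hTable = new HTable(conf, "test");
        // Copy the current descriptor so the existing table settings are preserved.
        HTableDescriptor newHtd = new HTableDescriptor(hTable.getTableDescriptor());
        newHtd.setValue(HTableDescriptor.SPLIT_POLICY, KeyPrefixRegionSplitPolicy.class.getName());
        // Group rows by the first byte of the row key when choosing split points.
        newHtd.setValue("prefix_split_key_policy.prefix_length", "1");
        // The table must be disabled while its descriptor is modified.
        admin.disableTable("test");
        admin.modifyTable(Bytes.toBytes("test"), newHtd);
        admin.enableTable("test");
        hTable.close();
        admin.close();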
    

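    Now the table 'test' will split its regions automatically, grouping rows by the first character of the row key (prefix length 1), so a region is never split in the middle of a group of rows that share that prefix.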
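
To cover the bucket-plus-schema layout described in the question, the split keys and the matching prefix length could be derived roughly as follows. This is only a sketch under assumptions: `SplitKeySketch`, `md5Hex`, `splitKeys` and `prefixLength` are hypothetical names that are not part of HBase, and it assumes the row key starts with one hex bucket character followed by the MD5 of the schema name written as 32 hex characters.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of HBase: builds split keys of the form
// <bucket hex digit><MD5(schema) as 32 hex characters>.
class SplitKeySketch {
    static final String BUCKETS = "0123456789abcdef";

    // Hex-encoded MD5 of the schema name (assumed to be UTF-8 encoded).
    static String md5Hex(String schema) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(schema.getBytes(StandardCharsets.UTF_8));
            StringBuilder sb = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
    }

    // One split key per (bucket, schema) pair, sorted so the keys can serve
    // as ascending region boundaries.
    static List<String> splitKeys(List<String> schemas) {
        List<String> keys = new ArrayList<>();
        for (char bucket : BUCKETS.toCharArray()) {
            for (String schema : schemas) {
                keys.add(bucket + md5Hex(schema));
            }
        }
        Collections.sort(keys);
        return keys;
    }

    // Prefix length to configure under the assumed layout:
    // 1 bucket character + 32 MD5 hex characters.
    static int prefixLength() {
        return 1 + 32;
    }

    public static void main(String[] args) {
        List<String> schemas = new ArrayList<>();
        schemas.add("schema1");
        schemas.add("schema2");
        for (String key : splitKeys(schemas)) {
            System.out.println(key);
        }
        System.out.println("prefix length: " + prefixLength());
    }
}
```

Under these assumptions, the keys could be converted with Bytes.toBytes and passed as the byte[][] split keys when the table is created, and the prefix length in the table descriptor would be set to "33" instead of "1". The trailing zeros from the question are not needed, because a key without them still sorts before every row key that starts with it. Schemas added later would not be pre-split, but automatic splits would still keep each bucket-and-schema prefix within one region.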