We're trying to scale up HBase writes on a cluster using Thrift. (Our HBase application is in Python, and hence needs Thrift.)
Despite increasing the number of nodes in the cluster, we are seeing the same write speeds.
First off, is the recommended strategy to run Thrift on: 1. The client? 2. The HBase master? 3. HBase region servers?
If on #1 or #2, will the client or HBase master take care of splitting the requests to the various region servers? It doesn't appear to in our case.
If #3, then I have to modify the client to write to the specific region servers, and randomize the writes. I can do this, but it seems to defeat the purpose of using HBase.
Any other tips on read/write scaling (especially with Thrift) are greatly appreciated.
I found the answer at these two questions, it looks like we'll go with #3 (write to the specific region servers, and randomize the writes):