Apache Phoenix lets you create salted tables, which distribute data across the region servers, e.g.:
CREATE TABLE table (a_key VARCHAR PRIMARY KEY, a_col VARCHAR) SALT_BUCKETS = 20;
To use this feature, a number of salt buckets must be chosen. How should I choose this number? Should it be based on the number of region servers? What if I plan to add more region servers later?
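An HBase table is divided into regions, and a RegionServer can hold a few hundred regions. So, ideally, the number of buckets should depend on:
How much random distribution do you want in your data?
More buckets means rows are spread more randomly, i.e. better load balancing. But you also lose some flexibility for range-based scans: keys that were adjacent end up in different buckets, so a range scan has to visit every bucket instead of one contiguous stretch of the table.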
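To make that trade-off concrete, here is a minimal Python sketch of the idea behind salting. It is an illustration only, not Phoenix's implementation: CRC32 stands in for whatever hash Phoenix uses internally, and the names `salt_bucket`, `salted_key` and `buckets_for_range` are made up for this example.

```python
import zlib

SALT_BUCKETS = 20  # same value as in the CREATE TABLE example


def salt_bucket(row_key: str, buckets: int = SALT_BUCKETS) -> int:
    """Map a row key to a bucket. CRC32 is an assumed stand-in hash."""
    return zlib.crc32(row_key.encode("utf-8")) % buckets


def salted_key(row_key: str, buckets: int = SALT_BUCKETS) -> bytes:
    """Prepend the one-byte bucket number to the original key bytes."""
    return bytes([salt_bucket(row_key, buckets)]) + row_key.encode("utf-8")


def buckets_for_range(keys, start: str, stop: str, buckets: int = SALT_BUCKETS):
    """Return the buckets holding at least one key in [start, stop)."""
    return sorted({salt_bucket(k, buckets) for k in keys if start <= k < stop})


# Monotonically increasing keys: without salting, new rows all land in one region.
keys = [f"user{i:06d}" for i in range(10_000)]

counts = [0] * SALT_BUCKETS
for k in keys:
    counts[salt_bucket(k)] += 1

touched = buckets_for_range(keys, "user001000", "user001500")
print("rows per bucket:", counts)
print("buckets touched by a 500-key range scan:", len(touched))
```

The per-bucket counts show the load-balancing side, and the number of buckets touched by a small key range shows the cost: the range is no longer one contiguous stretch, so the scan fans out across buckets.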
In theory, you should be able to increase SALT_BUCKETS in the future, whereas you won't be able to decrease it. So I would suggest starting with a modest number of buckets. (Note: I am not sure whether Phoenix actually allows increasing the number of buckets.)
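It is worth checking before relying on a later change, because the salt byte is part of the stored row key. The sketch below estimates how many rows would map to a different bucket if the bucket count went from 20 to 40. It assumes the bucket is a hash of the key modulo the bucket count, with CRC32 as a stand-in hash; `salt_bucket` and `fraction_moved` are hypothetical helpers, not Phoenix APIs.

```python
import zlib


def salt_bucket(row_key: str, buckets: int) -> int:
    """Map a row key to a bucket. CRC32 is an assumed stand-in hash."""
    return zlib.crc32(row_key.encode("utf-8")) % buckets


def fraction_moved(keys, old_buckets: int, new_buckets: int) -> float:
    """Fraction of keys whose bucket differs between the two bucket counts."""
    moved = sum(
        1 for k in keys
        if salt_bucket(k, old_buckets) != salt_bucket(k, new_buckets)
    )
    return moved / len(keys)


keys = [f"user{i:06d}" for i in range(10_000)]
print(f"20 -> 40 buckets: {fraction_moved(keys, 20, 40):.0%} of rows change bucket")
```

Under these assumptions, every row that changes bucket would have to be rewritten under a new key, so changing the count is closer to reloading the table than to flipping a setting.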