Are there any texts, known algorithms, or strategies for database sharding?

I am building a scalable solution and therefore need to shard my data. I know the usage map of my present shards, and I want to break them up and create new shards based on that usage map. [Higher-usage key ranges get broken down into smaller parts and distributed to different machines to equalize load across nodes.]

Is there any theory, text, or algorithm that gives the most efficient sharding strategy (sharding as such, without breaking the keys' sequence/index) if it is known which key ranges are used the most?

Solution

It is best to match the sharding algorithm or strategy to the business scenario.
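For the scenario in the question, where the load per key range is already known, one way to match the strategy to the workload is to cut range boundaries by cumulative load instead of by key count. The following is a minimal greedy sketch, not an optimal or standard algorithm; the function names and the `(range_start_key, load)` input format are assumptions made for illustration. It keeps keys in order, so each shard still holds one contiguous key range.

```python
from bisect import bisect_right

def balanced_boundaries(usage, num_shards):
    """Pick shard boundaries so each shard gets roughly equal load.

    usage: list of (range_start_key, load) tuples, sorted by key
           (hypothetical format for the usage map).
    Returns the first key of shards 1..num_shards-1.
    """
    total = sum(load for _, load in usage)
    target = total / num_shards
    boundaries = []
    running = 0.0
    for key, load in usage:
        # Open a new shard at this key once the load seen so far
        # has reached the cumulative share of the shards already filled.
        filled = len(boundaries) + 1
        if running >= target * filled and len(boundaries) < num_shards - 1:
            boundaries.append(key)
        running += load
    return boundaries

def shard_for(key, boundaries):
    """Route a key to its shard index; key order is preserved across shards."""
    return bisect_right(boundaries, key)

# Hot range 200..400 is recorded in finer buckets than the cold ranges.
usage_map = [(0, 10), (100, 10), (200, 20), (250, 20),
             (300, 20), (350, 20), (400, 10), (500, 10)]
cuts = balanced_boundaries(usage_map, 3)
print(cuts)                  # → [250, 350]
print(shard_for(120, cuts))  # → 0
print(shard_for(300, cuts))  # → 1
```

The result is only as balanced as the bucket granularity allows: a single bucket that carries more than one shard's share of load cannot be split by this sketch, so hot ranges need to be measured in smaller buckets first.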

There are some standard algorithms, such as: Hash, Range, Mod, Tag, HashMod, Time, etc.
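To make a few of these concrete, here is a rough sketch of what mod, hash, range, and time routing can look like. These are generic illustrations with hypothetical function names, not the implementations used by any particular library.

```python
import hashlib
from bisect import bisect_right
from datetime import date

def mod_shard(key, num_shards):
    """Mod: numeric key modulo the shard count."""
    return key % num_shards

def hash_shard(key, num_shards):
    """Hash: stable hash of a string key, then modulo the shard count."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards

def range_shard(key, boundaries):
    """Range: boundaries[i] is the first key that belongs to shard i + 1."""
    return bisect_right(boundaries, key)

def month_shard(day):
    """Time: one shard per calendar month, named by a YYYYMM suffix."""
    return day.strftime("%Y%m")

print(mod_shard(10, 4))                # → 2
print(range_shard(15, [10, 20]))       # → 1
print(month_shard(date(2024, 5, 17)))  # → 202405
```

Hash and mod spread load evenly but scatter neighbouring keys across shards, while range and time keep keys in sequence, which is the property the question asks to preserve.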

You may also need customized algorithms, for example: use user_id mod for database sharding and order_id mod for table sharding.
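The user_id/order_id example could be sketched roughly as follows. The shard counts and the `ds_` / `t_order_` naming are assumptions chosen for illustration only.

```python
def route(user_id, order_id, num_databases=2, tables_per_database=4):
    """Pick the database by user_id and the table by order_id.

    The 'ds_' and 't_order_' name prefixes are hypothetical.
    """
    database = f"ds_{user_id % num_databases}"
    table = f"t_order_{order_id % tables_per_database}"
    return database, table

print(route(7, 1001))  # → ('ds_1', 't_order_1')
```

With this layout, all orders of one user land in the same database, while the orders inside that database are spread across its tables.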

You may want to take a look at Apache ShardingSphere; the project defines some standard sharding algorithms and lets developers customize their own.

The related documentation is: https://shardingsphere.apache.org/document/current/en/dev-manual/sharding/

The source code, for reference: https://github.com/apache/shardingsphere/blob/master/shardingsphere-features/shardingsphere-sharding/shardingsphere-sharding-core/src/main/resources/META-INF/services/org.apache.shardingsphere.sharding.spi.ShardingAlgorithm