I was reading this post, and the author advises making the shard count a power of two.
What benefit do we get from that? Why can't it be a simple number like 500, 150, or 1000?
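A typical growth pattern for distributed data systems is to double the cluster size when needed. If the shard count is a power of two, the shards still divide evenly across the nodes after every doubling, which allows for more even rebalancing of data and minimizes the effect of any hotspots.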
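To make the arithmetic concrete, here is a minimal sketch that spreads a fixed number of shards over a cluster that keeps doubling. The function names (`shards_per_node`, `imbalance`) and the simple "spread as evenly as possible" assignment are assumptions for illustration, not how any particular product places shards.

```python
def shards_per_node(shard_count, node_count):
    """Spread shards as evenly as possible; return the shard count held by each node."""
    base, extra = divmod(shard_count, node_count)
    # The first `extra` nodes each take one leftover shard.
    return [base + 1 if i < extra else base for i in range(node_count)]


def imbalance(shard_count, node_count):
    """Difference between the most and least loaded node (0 means perfectly even)."""
    counts = shards_per_node(shard_count, node_count)
    return max(counts) - min(counts)


if __name__ == "__main__":
    # Hypothetical growth path: the cluster doubles from 1 node up to 16 nodes.
    for shard_count in (512, 500):
        for nodes in (1, 2, 4, 8, 16):
            print(shard_count, "shards on", nodes, "nodes: imbalance",
                  imbalance(shard_count, nodes))
```

With 512 shards every doubling leaves each node with exactly the same number of shards. With 500 shards the split stops being even at 8 nodes, where some nodes hold 62 shards and others 63. The difference is small here, but it grows in relative terms as the number of shards per node shrinks.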
Here is an in-depth discussion on database sharding that you may find useful. (Disclosure: dbShards is one of my company's products)