
Can a Guid be a good partition key?


I have to store many gigabytes of data across multiple machines. Each file is uniquely identified by a Guid, and a file can be hosted on only one machine. I was wondering whether I could use the Guid as a partition key to determine which machine should store the data. If so, what would my partition function be?

Otherwise, how could I partition my data so that all the machines get a very similar load?

Thanks!

P.S. I am not using SQL Server, Oracle or any other DB. This is all in-house code. P.P.S. The Guids are generated using the .NET function Guid.NewGuid().


Solution

  • As James said in his comment, you need a key that has a good, uniform distribution. Guids are unique, but uniqueness alone does not guarantee that their bits are evenly distributed. I would recommend hashing instead, even something as simple as a hash of the Guid itself.

    A SHA-1 hash is well distributed, so taking the hash modulo the number of machines should spread the files evenly. I wouldn't recommend even/odd hashing unless you plan on distributing between only 2 machines.
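
To make the suggestion concrete, here is a minimal sketch of a partition function along those lines. The names GuidPartitioner, GetPartition and machineCount are made up for illustration, and reducing the digest to its first four bytes is just one reasonable choice, not the only way to do it.

```csharp
using System;
using System.Security.Cryptography;

// Hypothetical helper: the class and method names are illustrative only.
public static class GuidPartitioner
{
    // Maps a Guid to a machine index in [0, machineCount) via a SHA-1 hash.
    public static int GetPartition(Guid id, int machineCount)
    {
        if (machineCount <= 0)
            throw new ArgumentOutOfRangeException(nameof(machineCount));

        using (SHA1 sha1 = SHA1.Create())
        {
            // Hash the 16 raw bytes of the Guid; SHA-1 yields a 20-byte digest.
            byte[] hash = sha1.ComputeHash(id.ToByteArray());

            // Build an unsigned 32-bit value from the first four digest bytes.
            // The bytes are combined explicitly so the result does not depend
            // on the endianness of the machine running the code.
            uint value = ((uint)hash[0] << 24)
                       | ((uint)hash[1] << 16)
                       | ((uint)hash[2] << 8)
                       | (uint)hash[3];

            // Reduce to a machine index.
            return (int)(value % (uint)machineCount);
        }
    }
}

public static class Program
{
    public static void Main()
    {
        Guid fileId = Guid.NewGuid();
        int machine = GuidPartitioner.GetPartition(fileId, 8);
        Console.WriteLine("File " + fileId + " goes to machine " + machine);
    }
}
```

The same Guid always maps to the same index for a given machineCount, so the function can be used both when storing a file and when looking it up later. Note that changing machineCount changes the mapping for most files, so this sketch assumes the number of machines is fixed.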