I have a structure that describes an address. It looks like this:
class Address
{
    public string AddressLine1 { get; set; }
    public string AddressLine2 { get; set; }
    public string City { get; set; }
    public string Zip { get; set; }
    public string Country { get; set; }
}
I'm looking for a way to create a unique identifier for this structure (I assume it should also be of type string) which depends on all of the structure's properties (e.g. a change to AddressLine1 will also cause a change to the structure's identifier).
I know I could just concatenate all the properties together, but that gives an identifier that is too long. I'm looking for something significantly shorter.
I also assume that the number of different addresses will not exceed 100 million.
Any ideas on how this identifier can be generated?
Thanks in advance.
Some background:
There are several different tables in the database which hold some information + address data. The data is stored in the format similar to the one described above.
Unfortunately, moving the address data into a separate table is very costly right now, but I hope it will be done in the future.
I need to associate some additional properties with the address data, and I am going to create a separate table for this. That's why I need to uniquely identify the address data.
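Serialize all fields into a single binary value, for example by concatenating them with proper domain separation (such as a length prefix on each field), so that the field pairs ("ab", "c") and ("a", "bc") do not serialize to the same bytes.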
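Then hash that value with a cryptographic hash of sufficient length. I prefer 256 bits, but 128 bits are probably fine: by the birthday bound, n = 100M addresses and a 128-bit hash give a collision probability of roughly n^2 / 2^129, which is on the order of 10^-23. Collisions are extremely rare with good hashes; with a 256-bit hash like SHA-256 they are practically impossible.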
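A minimal sketch of the two steps above, assuming SHA-256 and a hex-encoded result. The `AddressId` class and its `Compute` method are hypothetical names introduced here for illustration. The length prefix comes from `BinaryWriter.Write(string)`, which writes each string's byte length before its UTF-8 bytes; that is one possible way to get domain separation, not the only one.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;
using System.Text;

class Address
{
    public string AddressLine1 { get; set; }
    public string AddressLine2 { get; set; }
    public string City { get; set; }
    public string Zip { get; set; }
    public string Country { get; set; }
}

// Hypothetical helper, not part of the original question.
static class AddressId
{
    public static string Compute(Address a)
    {
        using (var ms = new MemoryStream())
        using (var writer = new BinaryWriter(ms, Encoding.UTF8))
        {
            var fields = new[] { a.AddressLine1, a.AddressLine2, a.City, a.Zip, a.Country };
            foreach (var field in fields)
            {
                // Write(string) emits a length prefix followed by the UTF-8 bytes,
                // so field boundaries stay unambiguous after concatenation.
                // Assumption: a null field is treated the same as an empty one.
                writer.Write(field ?? string.Empty);
            }
            writer.Flush();

            using (var sha = SHA256.Create())
            {
                byte[] hash = sha.ComputeHash(ms.ToArray());
                // 32 bytes -> 64 lowercase hex characters.
                return BitConverter.ToString(hash).Replace("-", "").ToLowerInvariant();
            }
        }
    }
}
```

Calling `AddressId.Compute(address)` returns a 64-character string regardless of how long the address fields are. If that is still too long, taking the first 32 hex characters keeps 128 bits of the hash. Note that this sketch does no normalization, so addresses that differ only in case or whitespace get different identifiers.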