I have a lot of URLs that serve as keys in an HBase table. Since they "all" start with http://, HBase puts them on the same node. Thus I end up with one node at 100%+ load and the others idle.
So I need to map each URL to something hash-like, but reversible. Is there a simple, standard, and fast way to do that in Java 8?
I am looking for a random (uniform) distribution of prefixes.
Note:
Reversing the URL does not help, since a lot of URLs end with /, ? or =, which risks unbalancing the distribution.
I do not need encryption, but I can accept it.
I am not looking for compression, but it is welcome if possible :)
Thanks, Costin
There's not a single, standard way.
One thing you can do is to prefix the key with its hash. Something like:
a01cc0fe http://...
That's easily reversible (just snip off the hash chars, which you can make a fixed length) and will get you good distribution.
The hash code for a string is stable and consistent across JVMs. The algorithm for computing it is specified in the documentation of String.hashCode, so you can consider it part of the contract of how a String works.
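A minimal sketch of that idea, assuming the key is stored as a plain string and the prefix is the 32-bit String.hashCode rendered as 8 zero-padded hex digits. The class and method names (HashPrefixedKey, encode, decode) are made up for illustration, and I left out the space separator since the fixed length already tells you where the URL starts:

```java
public final class HashPrefixedKey {

    // A 32-bit hash always fits in 8 hex digits (an assumption of this sketch:
    // we use String.hashCode, not some other hash function).
    private static final int PREFIX_LENGTH = 8;

    private HashPrefixedKey() {
    }

    /** Builds the row key: 8 hex digits of the URL's hash, then the URL itself. */
    public static String encode(String url) {
        // %08x prints the int as unsigned hex, zero-padded to 8 characters,
        // so negative hash codes also yield exactly 8 digits.
        return String.format("%08x", url.hashCode()) + url;
    }

    /** Recovers the original URL by dropping the fixed-length prefix. */
    public static String decode(String key) {
        return key.substring(PREFIX_LENGTH);
    }

    public static void main(String[] args) {
        String url = "http://example.com/index.html";
        String key = encode(url);
        System.out.println(key);          // hash prefix followed by the URL
        System.out.println(decode(key));  // the original URL again
    }
}
```

To look a row up later you recompute the prefix from the URL with encode, so nothing extra has to be stored. The prefix makes the key 8 characters longer, so this does not help with compression.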