java, java-8, hbase, bijection

Convert a string into something reversible, in Java


I have a lot of URLs that serve as keys in an HBase table. Since they "all" start with http://, HBase puts them on the same node. Thus I end up with one node at 100%+ load while the rest sit idle.

So I need to map each URL to something hash-like, but reversible. Is there a simple, standard, and fast way to do that in Java 8?

I am looking for a random (uniform) distribution of prefixes.

Note:

  • Reversing the URL does not help, since a lot of URLs end with /, ? or =, which risks unbalancing the distribution.

  • I do not need encryption, but I can accept it.

  • I am not looking for compression, but it is welcome if possible :)

Thanks, Costin


Solution

  • There's not a single, standard way.

    One thing you can do is to prefix the key with its hash. Something like:

    a01cc0fe http://...
    

    That's easily reversible (just snip off the hash characters, which you can make a fixed length) and will get you good distribution.

    The hash code for a string is stable and consistent across JVMs. The algorithm for computing it is specified in String.hashCode's documentation, so you can consider it part of the contract of how a String works.
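A minimal sketch of the hash-prefix idea, in case it helps. The class and method names (HashPrefixedKey, encode, decode) are made up for illustration, and the 8-character zero-padded hex prefix followed by one space is just one possible fixed-length layout, not anything HBase requires.

```java
// Hypothetical helper: prefixes a URL with its String.hashCode() so keys spread out,
// and strips the prefix again to recover the original URL.
final class HashPrefixedKey {
    // String.hashCode() is a 32-bit int, so 8 hex digits always suffice.
    static final int HASH_LEN = 8;

    // Builds "<8 hex chars> <url>"; %08x zero-pads and prints negative ints
    // as their unsigned two's-complement hex form.
    static String encode(String url) {
        return String.format("%08x", url.hashCode()) + " " + url;
    }

    // Drops the fixed-length hash and the separating space.
    static String decode(String key) {
        return key.substring(HASH_LEN + 1);
    }

    public static void main(String[] args) {
        String url = "http://example.com/page?id=1"; // sample URL, assumed for the demo
        String key = encode(url);
        System.out.println(key);
        System.out.println(decode(key)); // prints the original URL
    }
}
```

Because the prefix length is fixed, decoding never has to search for a delimiter, so it works even if the URL itself contains spaces.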