Search code examples
javahashcodemultiset

Java: get a unique property of an object (like hashcode, but collision proof)


I have a task for which it is necessary to generate a unique value for every object in a set. using the hashcode would be perfect, if collisions weren't allowed in the hashcode contract.

One idea: Record every object's hashcode into a multiset. Then, use hashcodes as the unique identifier, but if that hashcode is in the set more than once, use a different value that is also not in the set. But this feels bulky and awkward.

Better ideas?

Here's what I have already:

public static <V> void toGraphViz(final Graph<V, DefaultWeightedEdge> g, String filename) {

    // to avoid hashcode collisions
    final Set<Integer> hashcodes = new HashSet<Integer>(g.vertexSet().size());

    DOTExporter<V, DefaultWeightedEdge> dot = new DOTExporter<V, DefaultWeightedEdge>(new VertexNameProvider<V> () {

    // vertex name must be unqiue
    @Override
    public String getVertexName(V arg0) {
        int hash = arg0.hashCode();
        while (hashcodes.contains((hash))) {
            hash += 1;
        }
        return "" + hash;
    }
}

EDIT: I guess this wasn't originally clear, but the id number does somehow need to be a function of the object, because getVertexName(V) will get called several times, and it expects that for the same values of V, it will get the same results.

Also, the Vertex type is generic. So I can't make any modifications to a specific class to fix this.


Solution

  • What is the lifetime of this unique number? Just the lifetime of the program? In which case why not just a simple static counter in the class, accessed with suitable synchronisation? Increment it for each new object. No need to keep a list of the values you have used, just the highest value you have used.

    If unique across many executions (and maybe many simultaneous instances) then perhaps you can just use a Database which generates unqiue record ids.

    EDITED in response to clarification

    The piece I missed before was that we can't modify the class for which we want to generate the unique "hash".

    I think that working from the hash code of the class, which will have collisions is making life hard. Assuming that we can rely upon the Vertex classes in question having correctly implemented equals() then we can use the object itself as a key to the set of hashcodes we have used.

    public class Hasher {
    
        public  <V> void toGraphViz(final Graph<V, DefaultWeightedEdge> g, String filename) {
             final Map<V, Integer> hashcodes = new HashMap< V, Integer>();
             final int latestHashHolder[] = { 0 }; // array to allow access from inner class
    
             DOTExporter<V, DefaultWeightedEdge> dot 
                     = new DOTExporter<V, DefaultWeightedEdge>(new VertexNameProvider<V> ()) {
    
             // vertex name must be unqiue
                @Override
                public synchronized String getVertexName(V vertex) {
                    int hashcode;
                    if ( hashcodes.containsKey(vertex)){
                        hashcode = hashcodes.get(vertex);
                    } else {                
                        hashcode = latestHashHolder[0];
                        latestHashHolder[0]++;
                        hashcodes.put(vertex, (Integer)latestHashHolder[0]);
                    }
                    return "Vertex-" + hashcode;
                }
            };
        }
    }