Search code examples
javasortingcollectionstreeset

Ranking elements in TreeSet


I know a java treeset can't have identical elements and so I have to somehow differentiate an element from another even if they have the same "value". I want to be able to rank the elements and I'm noticing an interesting behavior.

TreeSet<Integer> set = new TreeSet<Integer>(new Comparator<Integer>()
        {
            public int compare(Integer arg0, Integer arg1) 
            {
                if(arg0 > arg1)
                    return -1;
                return 1;
            }
        });

        set.add(40);
            set.add(20);
        set.add(30);
            set.add(20);

        for(Integer i:set)
        {
            System.out.println("Rank: "+(set.headSet(i,false).size()+1)+" Number: "+i);
        } 

And this is the output:

Rank: 1 Number: 40
Rank: 3 Number: 30
Rank: 5 Number: 20
Rank: 5 Number: 20

This is what headset is supposed to do:

Returns a view of the portion of this set whose elements are less than (or equal to, if inclusive is true) toElement. The returned set is backed by this set, so changes in the returned set are reflected in this set, and vice-versa. The returned set supports all optional set operations that this set supports. 

I'm sorting in descending order so I think it should do the opposite. The first element has nothing greater than it, so it returns 0, and then I add 1 to get its rank. The second element has one thing larger than it, so I think it should return 1, adding 1 makes 2. That's kind of weird. I think I'm making a simple mistake. I also need to figure out how to deal with the two 20's. I want their rank to both be 3 but the treeset thinks they're different numbers. I suppose I could use TreeMultiSet or some other third party library.


Solution

  • I suppose I could use TreeMultiSet or some other third party library.

    Since you're violating one of the basic characteristics of a set, I'd say you shouldn't use a Set or TreeSet, at least directly. Options:

    • use a List (and keep it sorted with Collections.sort() and Collections.binarySearch())
    • use an IdentityHashMap and just use the value as the key mapped to itself
    • use a TreeMap and map the value to # of occurences (extract to a list and sort as needed)
    • use a 3rd party library (Bag or MultiSet)
    • implement your own bag/multiset

    Since I don't know more about the programming context, its hard to suggest a specific solution, but hopefully this brings up some other ideas to consider.