Search code examples
javasortingcomparatorpriority-queue

Issues of String sort with PriorityQueue (Java)


I was trying to use PriorityQueue to sort a list of Strings and remove the duplicates. Initially I used PriorityQueue, it doesn't change the order. After I change to TreeSet, it worked. However, I want to understand what is the issue with priority queue, with defined Comparator. Would love to hear some explanations.

Not working code:

public class RemoveDuplicateStrings {
    public static ArrayList<String> removeDuplicates(List<String> input) {
        PriorityQueue<String> pq = new PriorityQueue<>((a, b) -> a.compareTo(b));

        for (String s : input) {
            if (!pq.contains(s)) {
                pq.add(s);
            }
        }
        return new ArrayList<String>(pq);
    }

    public static void main(String[] args) {
        List<String> output = removeDuplicates(List.of("Hey", "Hi", "Hello", "Hey", "Hello"));
        System.out.println(output);
    }
}

Result I got: [Hello, Hi, Hey], the correct order should be: hello, hey, hi.

It worked after I changed the data structure to TreeSet with the same Comparator.


Solution

  • You are using ArrayList constructor that copies elements from the collection that is passed as an argument, and invokes toArray method on it. For PriorityQueue it just makes the copy of the underlying array and those elements are in no particular order. From the PriorityQueue::toArray docs :

    Returns an array containing all of the elements in this queue. The elements are in no particular order.

    However for a TreeSet::toArray (implementation inherited from AbstractCollection):

    Returns an array containing all of the elements in this collection. If this collection makes any guarantees as to what order its elements are returned by its iterator, this method must return the elements in the same order

    And actually the TreeSet makes guarantes about the order of elements returned by it's iterator. From TreeSet::iterator docs :

    Returns an iterator over the elements in this set in ascending order.

    That is why you get such results. To get what you want you would have to poll your queue to receive elements in comparator defined order :

    public static ArrayList<String> removeDuplicates(List<String> input) {
            PriorityQueue<String> pq = new PriorityQueue<>((a, b) -> a.compareTo(b));
    
            for (String s : input) {
                if (!pq.contains(s)) {
                    pq.add(s);
                }
            }
    
            ArrayList<String> result = new ArrayList<>();
            while (!pq.isEmpty()) {
                result.add(pq.poll());
            }
            return result;
    }
    

    The key here is that the iterator of PriorityQueue does not return elements i praticular order, however for TreeSet the order is ascending (taking into account the comparator).