I've implemented HashPriorityQueue
class, which combines HashMap
(for fast lookup) and PriorityQueue
(for sorting the data).
Another improvement I made is to guarantee ordered iteration.
Because PriorityQueue
does not promise such a thing, from JavaDocs:
The Iterator provided in method
iterator()
is not guaranteed to traverse the elements of the priority queue in any particular order.
I also want to guarantee ordered iteration and should work in a multithreaded environment.
To make HashPriorityQueue
class work I've already done the following steps:
HashMap
class.private
field PriorityQueue
.HashMap
so that I can add and remove values from my queue.poll()
and peek()
.Iterator
for this new data structure that internally makes a copy of the queue and uses poll()
in the next()
method to maintain ordered iteration.Code:
public class HashPriorityQueue<K, V> extends HashMap<K, V> implements Iterable<AbstractMap.SimpleEntry<K, V>>{
private final PriorityQueue<K> queue;
/* CONSTRUCTORS */
public HashPriorityQueue(Comparator<K> comparator) {
queue = new PriorityQueue(comparator);
}
public HashPriorityQueue() {
queue = new PriorityQueue();
}
/* QUEUE METHODS */
public AbstractMap.SimpleEntry<K, V> poll() {
K key = queue.poll();
V val = remove(key);
return new AbstractMap.SimpleEntry<K, V>(key, val);
}
public AbstractMap.SimpleEntry<K, V> peek() {
K key = queue.peek();
V val = get(key);
return new AbstractMap.SimpleEntry<K, V>(key, val);
}
@Override
public V remove(Object key) {
queue.remove(key);
return super.remove(key);
}
public V remove(AbstractMap.SimpleEntry<V, K> entry) {
return remove(entry.getKey());
}
@Override
public V put(K key, V value) {
queue.add(key);
return super.put(key, value);
}
@Override
public Iterator<AbstractMap.SimpleEntry<K, V>> iterator() {
return new PriorityIterator();
}
private class PriorityIterator implements Iterator<AbstractMap.SimpleEntry<K, V>>{
PriorityQueue<K> keys;
K cursor;
public PriorityIterator() {
keys = new PriorityQueue<>(HashPriorityQueue.this.queue);
}
@Override
public boolean hasNext() {
return !keys.isEmpty();
}
@Override
public AbstractMap.SimpleEntry<K, V> next() {
cursor = keys.poll();
V v = HashPriorityQueue.this.get(cursor);
return new AbstractMap.SimpleEntry<>(cursor,v);
}
@Override
public void remove() {
HashPriorityQueue.this.remove(cursor);
}
}
}
Currently, the iterator creates a copy of the queue and iterates over the keys by polling them from the copy of the queue. The corresponding value is being obtained from the map using get()
.
The iterator isn't aware of any structural and non-structural medications of the map.
So the actual question is:
How can I ensure that my Collection has not been modified during the process of iteration? I was thinking of adding a boolean
flag isChanged
, would it be a good approach?
I would also appreciate other suggestions regarding this implementation.
which will also guarantee ordered iteration and should work in a multi-threaded environment.
Firstly, let's start with a single-threaded implementation and then expand it.
Disclaimer: I don't treat this problem as having practical application, but rather as a project having a value from the educational perspective.
The rationale behind this remark would be clear if we enumerate the time complexities of the operations available with this custom Collection in comperison with TreeMap
:
iterator()
- O(n * log n) (worse than TreeMap
);remove()
- O(n) (worse than TreeMap
);poll()
- O(log n);put()
- O(log n);peek()
- O(1);get()
- O(1) (better than TreeMap
).Operations with Iterator
:
next()
- O(log n) (worse than iterator returned by the TreeSet
);hasNext()
- O(1);get()
- is the only method in this collection that performs better then corresponding methods of the TreeMap
. All the rest have the same or even worse performance due to being effected by the ovehead of maintaining the PriorityQueue
which has a logorithmic time complexity for enqueuing and dequeuing methods.
When you need to be able to use the functionality of a particular class in another class, inheritance isn't the first approach you have to think about, rather the last resort.
Why? Inheritance creates tight coupling, like in the code you've provided your new Collection happens to be completely dependent on the implementation of HashMap
.
In fact, your custom class is broken. You didn't override all the methods that allow to modify the map structurally, for instance compute()
, merge()
, etc. But there's more to that, since it's a subclass of HashMap
, it can be altered by removing elements from the collection of values obtained via values()
or by removing the entries from the entry set, in these case there would be not possible to change the PriorityQueue
accordingly.
Sure, values()
and entrySet()
can be overridden by making them return unmodifiable collections. So it turns out you need override almost each and every method. And nevertheless a new functionality which can be introduced in the future or changes of the existing methods might break your code.
For more information on that see the Item "Favor composition over inheritance" of the "Effective Java" by Joshua Bloch (former Sun-employee which has developed many features of Java including Collections framework).
The better approach would be to use Composition, i.e. replace IS-A relationship with HAS-A relationship by creating a class that uses a HashMap
, i.e. has a field of type HashMap
, rather than being a HashMap
itself. And that would be a Loose coupling because your Collection no longer depends on the HashMap
implementation, methods of your collection would delegate calls to the HashMap
allowing it to do whatever it does internally.
You are not bound with any contracts and can expose only a limited number of methods that allow to interact with your Collection.
Note that unfortunately you can't implement Map
interface because it might cause problems if interface would be extended.
That how it might be implemented:
class HashPriorityQueueNonSync<K, V> implements Iterable<Map.Entry<K, V>> {
private final PriorityQueue<K> queue;
private final Map<K, V> map = new HashMap<>();
private int modCount = 0;
/* CONSTRUCTORS */
public HashPriorityQueueNonSync(Comparator<K> comparator) {
queue = new PriorityQueue<>(comparator);
}
public HashPriorityQueueNonSync() {
queue = new PriorityQueue<>();
}
/* QUEUE METHODS */
public Map.Entry<K, V> poll() {
modCount++;
K key = queue.poll();
V val = remove(key);
return Map.entry(key, val);
}
public Map.Entry<K, V> peek() {
K key = queue.peek();
V val = map.get(key);
return Map.entry(key, val);
}
/* MAP METHODS */
public V get(Object key) {
return map.get(key);
}
public V put(K key, V value) {
modCount++;
queue.add(key);
return map.put(key, value);
}
public V remove(Object key) {
modCount++;
queue.remove(key);
return map.remove(key);
}
public V remove(Map.Entry<K, V> entry) {
modCount++;
queue.remove(entry.getKey());
return map.remove(entry.getKey());
}
@Override
public Iterator<Map.Entry<K, V>> iterator() {
return new PriorityIterator();
}
private class PriorityIterator implements Iterator<Map.Entry<K, V>> {
private PriorityQueue<K> keys;
private K cursor;
private final int expectedModCount;
public PriorityIterator() {
this.keys = new PriorityQueue<>(HashPriorityQueueNonSync.this.queue);
this.expectedModCount = HashPriorityQueueNonSync.this.modCount;
}
@Override
public boolean hasNext() {
return !keys.isEmpty();
}
@Override
public Map.Entry<K, V> next() {
if (expectedModCount != modCount) {
throw new ConcurrentModificationException();
}
cursor = keys.poll();
V v = HashPriorityQueueNonSync.this.get(cursor);
return Map.entry(cursor, v);
}
@Override
public void remove() {
if (expectedModCount != modCount) {
throw new ConcurrentModificationException();
}
HashPriorityQueueNonSync.this.remove(cursor);
}
}
}
The common practice to ensure that iterator would reflect the actual state of the collection is to introduce two counters of modifications: one as the instance field of the collection (modCount
in the code above), another in the Iterator
(expectedModCount
).
When these variables are not equal, that means that there has been a modification after the iterator was created that was done other than the means of iterator. All the implementations of Iterator
from the JDK would throw ConcurrentModificationException
in such a case.
First of all, there two concepts to consider:
Atomicity - when a thread is modifying the state of an object (in this case changes the underlying queue and map of the custom collection), all other threads can not observe intermediate changes, they can see either both queue and map fully modified or not. We can ensure that all actions on both queue and map would occur atomically by using synchronized
keyword on methods of this custom Collection. The approach shown below is very basic and can be observed in legacy collections like Vector
and Hashtable
, but because under the hood we have two separate underlying collections which need to be accessed atomically it hard apply anything more neat than synchronized
.
Happens-before Order - describes visibility of subsequent changes. If one action happens-before another, then the first is visible to and ordered before the second. One of the ways to ensure this is by using the volatile
keyword.
Note that methods of the iterator doesn't require synchronization (it's mean to be used withing a single thread only), but we need to synchronize the method of the collection responsible for obtaining the iterator.
That's how concurrent implementation might look like:
public class HashPriorityQueue<K, V> implements Iterable<Map.Entry<K, V>> {
private final PriorityQueue<K> queue;
private final Map<K, V> map = new HashMap<>();
private volatile int modCount = 0;
/* CONSTRUCTORS */
public HashPriorityQueue(Comparator<K> comparator) {
queue = new PriorityQueue<>(comparator);
}
public HashPriorityQueue() {
queue = new PriorityQueue<>();
}
/* QUEUE METHODS */
public synchronized Map.Entry<K, V> poll() {
modCount++;
K key = queue.poll();
V val = remove(key);
return Map.entry(key, val);
}
public synchronized Map.Entry<K, V> peek() {
K key = queue.peek();
V val = map.get(key);
return Map.entry(key, val);
}
/* MAP METHODS */
public synchronized V get(Object key) {
return map.get(key);
}
public synchronized V put(K key, V value) {
modCount++;
queue.add(key);
return map.put(key, value);
}
public synchronized V remove(Object key) {
modCount++;
queue.remove(key);
return map.remove(key);
}
public synchronized V remove(Map.Entry<K, V> entry) {
modCount++;
queue.remove(entry.getKey());
return map.remove(entry.getKey());
}
@Override
public synchronized Iterator<Map.Entry<K, V>> iterator() {
return new PriorityIterator();
}
private class PriorityIterator implements Iterator<Map.Entry<K, V>> {
private PriorityQueue<K> keys;
private K cursor;
private final int expectedModCount;
public PriorityIterator() {
this.keys = new PriorityQueue<>(HashPriorityQueue.this.queue);
this.expectedModCount = HashPriorityQueue.this.modCount;
}
@Override
public boolean hasNext() {
return !keys.isEmpty();
}
@Override
public Map.Entry<K, V> next() {
if (expectedModCount != modCount) {
throw new ConcurrentModificationException();
}
cursor = keys.poll();
V v = HashPriorityQueue.this.get(cursor);
return Map.entry(cursor, v);
}
@Override
public void remove() {
if (expectedModCount != modCount) {
throw new ConcurrentModificationException();
}
HashPriorityQueue.this.remove(cursor);
}
}
}
main()
public static void main(String[] args) {
HashPriorityQueue<Integer, Integer> hpq = new HashPriorityQueue<>();
ExecutorService executor = Executors.newFixedThreadPool(3);
executor.submit(() -> { for (int i = 3; i < 7; i++) hpq.put(i, 1 << i); });
executor.submit(() -> { for (int i = 0; i < 3; i++) hpq.put(i, 1 << i); });
executor.submit(() -> { for (int i = 7; i < 10; i++) hpq.put(i, 1 << i); });
try {
executor.awaitTermination(3, TimeUnit.SECONDS);
} catch (InterruptedException e) {
e.printStackTrace();
}
executor.shutdown();
hpq.forEach(System.out::println);
}
Output:
0=1
1=2
2=4
3=8
4=16
5=32
6=64
7=128
8=256
9=512