Javadoc from ConcurrentHashMap#computeIfAbsent
says
The computation should be short and simple, and must not attempt to update any other mappings of this map.
But, from what I see, using remove()
and clear()
methods inside mappingFunction
works fine. For example this
Key element = elements.computeIfAbsent(key, e -> {
if (usages.size() == maxSize) {
elements.remove(oldest);
}
return loader.load(key);
});
What bad consequences of using remove() method inside mappingFunction
could be?
The javadoc explains clearly the cause :
Some attempted update operations on this map by other threads may be blocked while computation is in progress, so the computation should be short and simple, and must not attempt to update any other mappings of this map.
You have not to forget that ConcurrentHashMap
is designed to provide a way of using a thread safe Map without being as locking as it is the case for older thread safe Map classes as HashTable
.
When a modification on the map occurs, it locks only the concerned mapping and not the whole map.
ConcurrentHashMap
is a hash table supporting full concurrency of retrievals and high expected concurrency for updates.
computeIfAbsent()
is a new method added in Java 8.
If badly used, that is, if in the body of computeIfAbsent()
that already locks the mapping of the key passed to the method, you lock another key, you enter in a path where you may defeat the purpose of the ConcurrentHashMap
as finally you will lock two mapping.
Imagine the problem if you lock more mapping inside computeIfAbsent()
and that the method is not short at all. Concurrency access on the map would become slow.
So the javadoc of computeIfAbsent()
stresses on this potential issue by reminding the principles of ConcurrentHashMap
: keep it simple and fast.
Here is example code illustrating the issue.
Suppose we have an instance of ConcurrentHashMap<Integer, String>
.
We will start two threads that use it :
thread1
that invokes computeIfAbsent()
with the key 1
thread2
that invokes computeIfAbsent()
with the key 2
thread1
executes a fast enough task but it doesn't follow advise of the computeIfAbsent()
javadoc : it updates the key 2
in computeIfAbsent()
, that is another mapping of which one used in the current context of the method (that is key 1
).
thread2
executes a long enough task. It invokes computeIfAbsent()
with the key 2
by following advises of the javadoc : it doesn't update any other mapping in the implementation of it.
To simulate the long task, we can use the Thread.sleep()
method with 5000
as parameter.
For this specific situation, if thread2
starts before thread1
, the invocation ofmap.put(2, someValue);
in thread1
will be blocked while thread2
is not returned of computeIfAbsent()
that locks the mapping of the key 2
.
Finally, we get a ConcurrentHashMap
instance that blocks the mapping of the key 2
during 5 seconds while computeIfAbsent()
is invoked with the mapping of the key 1
.
It is misleading, not effective and it goes against the ConcurrentHashMap
intention and the computeIfAbsent()
description which the intention is computing the value for the current key :
if the specified key is not already associated with a value, attempts to compute its value using the given mapping function and enters it into this map unless null
The sample code :
import java.util.concurrent.ConcurrentHashMap;
public class BlockingCallOfComputeIfAbsentWithConcurrentHashMap {
public static void main(String[] args) throws InterruptedException {
ConcurrentHashMap<Integer, String> map = new ConcurrentHashMap<>();
Thread thread1 = new Thread() {
@Override
public void run() {
map.computeIfAbsent(1, e -> {
String valueForKey2 = map.get(2);
System.out.println("thread1 : get() returns with value for key 2 = " + valueForKey2);
String oldValueForKey2 = map.put(2, "newValue");
System.out.println("thread1 : after put() returns, previous value for key 2 = " + oldValueForKey2);
return map.get(2);
});
}
};
Thread thread2 = new Thread() {
@Override
public void run() {
map.computeIfAbsent(2, e -> {
try {
Thread.sleep(5000);
} catch (Exception e1) {
e1.printStackTrace();
}
String value = "valueSetByThread2";
System.out.println("thread2 : computeIfAbsent() returns with value for key 2 = " + value);
return value;
});
}
};
thread2.start();
Thread.sleep(1000);
thread1.start();
}
}
As output we always get :
thread1 : get() returns with value for key 2 = null
thread2 : computeIfAbsent() returns with value for key 2 = valueSetByThread2
thread1 : after put() returns, previous value for key 2 = valueSetByThread2
This is written fast as reads on the ConcurrentHashMap
are not blocking :
thread1 : get() returns with value for key 2 = null
but this :
thread1 : after put() returns, previous value for key 2 = valueSetByThread2
is output only when thread2 is returned of computeIfAbsent()
.