I wanted to discuss a specific use I have of a concurrent map to sense check my logic...
If I used ConcurrentHashMap
, I can do the familar
private final ConcurrentHashMap<K, V> map = new ConcurrentHashMap<K, V>();
public V getExampleOne(K key) {
map.putIfAbsent(key, new Object());
return map.get(key);
}
but I realise that a race condition exists whereby if I remove the item from the map between the putIfAbsent
and the get
, the method above would return something that no longer exists in the collection. This may or may not be fine, but lets assume that for my use case, it's not ok.
What I'd really like is to have the whole thing atomic. So,
public V getExampleTwo(K key) {
return map.putIfAbsent(key, new Object());
}
but as this expands out to
if (!map.containsKey(key))
return map.put(key, value); [1]
return map.get(key);
which for line [1] will return null
for first usage (ie, map.put
will return the previous value, which for first time use is null
).
I can't have it return null in this instance
Which leaves me with something like;
public V getExampleThree(K key) {
Object object = new Object();
V value = locks.putIfAbsent(key, object);
if (value == null)
return object;
return value;
}
So, finally, my question; how do the examples above differ in semantics?. Does getExampleThree
ensure atomicity like getExampleTwo
but avoid the null return correctly? Are there other problems with getExampleThree
?
I was hoping for a bit of discussion around the choices. I realise I could use a non ConcurrentHashMap
and synchronize around clients calling my get
method and a method to remove from the map but that seems to defeat the purpose (non blocking nature) of the ConcurrentHashMap. Is that my only choice to keep the data accurate?
I guess that's a bit part of why you'd choose ConcurrentHashMap; that its visible/up-to-date/acurrate at the point you interact with it, but there may be an impact further down the line if old data is going to be a problem...
It sounds like you are trying to create a global lock object for a key.
Instead of deleting an entry with the possibility of have it re-created almost immediately, I would only delete the entry when you pretty sure its not needed ever again.
Otherwise, if you are using this for a lock, you can have two thread locking on different objects for the same key.
If its not ok, you can busy loop it.
public V getExampleOne(K key) {
for(Object o = null, ret = null; (ret = map.get(key)) == null; )
map.putIfAbsent(key, o == null ? o = new Object() : o);
return ret;
}
it can still be removed or replaced as soon as the loop exists so its effectively much the same as.
public V getExampleThree(K key) {
Object o = new Object();
map.putIfAbsent(key, o);
Object ret = map.get(key);
return ret == null ? o : ret;
}
So, finally, my question; how do the examples above differ in semantics?.
The difference is only the obvious.
Does getExampleThree ensure atomicity like getExampleTwo but avoid the null return correctly?
Yes.
Are there other problems with getExampleThree?
Only if you believe the very next call might not give you a different value (if you believe it can be removed in another thread)