Search code examples
javaconcurrencyconcurrenthashmap

Weakly consistent iterator by ConcurrentHashMap


The Java Concurrency in Practice mentions that:

The iterator returned by the ConcurrentHashMap are weakly consistent than fail-fast. A weakly consistent iterator can tolerate the concurrent modifications, traverses elements as they existed when the iterator was constructed, and may (but is not guaranteed to) reflect modifications to the collection after the construction of the iterator.

  1. How making the iterator weakly consistent or fail-safe helps in the concurrent environment because still state of the ConcurrentHashMap will be modified. The only thing is that it'll not throw the ConcurrentModificationException.
  2. Why fail-fast iterator is returned by the Collections when creating the fail-safe iterator is good for concurrency.

Solution

  • Correctness in your particular case

    Please keep in mind that Fail Fast iterator iterates over the original collection.

    In contrast Fail Safe (a.k.a weakly consistent) iterator iterates over a copy of the original collection. Therefore any changes to the original collection go unnoticed, and that's how it guarantees lack of ConcurrentModificationExceptions.


    To answer your questions:

    1. Using Fail Safe iterator helps concurrency as you don't have to block on the reading threads on the whole collection. Collection can be modified underneath while the reading happens. The drawback is that the reading thread will see the state of the collection as a snapshot taken at the time when the iterator got created.
    2. If the above limitation is not good for your particular use case (your readers should always see the same state of the collection) you have to use Fail Fast iterator and keep the concurrent access to the collection controlled tighter.

    As you can see it's a trade-off between correctness of your use case and speed.

    ConcurrentHashMap

    ConcurrentHashMap (CHM) exploits multiple tricks in order to increase concurrency of access.

    • Firstly CHM is actually a grouping of multiple maps; each MapEntry gets stored in one of the number of segments each itself being a hashtable which can be concurrently read (read methods do not block).
    • The number of segments is the last argument in the 3 argument constructor and it is called concurrencyLevel (default 16). The number of segments determines the number of concurrent writers across the whole of the data. The equal spread of entries between the segments is ensured by additional internal hashing algorithm.
    • Each HashMapEntrys value is volatile thereby ensuring fine grain consistency for contended modifications and subsequent reads; each read reflects the most recently completed update
    • Iterators and Enumerations are Fail Safe - reflecting the state at some point since the creation of iterator/enumeration; this allows for simultaneous reads and modifications at the cost of reduced consistency.