Search code examples
javamemory-leaksconcurrenthashmap

Memory Leakage during removal of an entry in concurrent hashmap


Lets say we remove an entry <k,v> from ConcurrentHashMap. The remove operation in ConcurrentHashMap will clone the nodes preceding the node which has to be deleted.

Now My question is how Java garbage collect the original nodes preceding the nodes which has to be deleted.

Lets say 1--->2--->3--->4--->5--->6 is a hashentry list which is maintained by ConcurrentHashMap. Now we want to remove 3.

The following code is the code snippet of the remove method in Java ConcurrentHashMap

HashEntry newFirst = e.next;
for (HashEntry p = first; p != e; p = p.next) {
    newFirst = new HashEntry(p.key, p.hash, newFirst, p.value);
    tab[index]= newFirst;
}
  • After 1st iteration

    1--->2--->3--->4--->5--->6

    1A--->2A--->4--->5--->6

    A new node 1A will be created which will point to 4. The Original Node 3 is still pointing to node 4. Thus node 4 is pointed by 2 nodes

  • After 2nd Iteration

    1--->2--->3--->4--->5--->6

    1A--->2A--->4--->5--->6

    The node 4 is a pointer to two list now (1--->2--->3) and (1A--->2A). The original nodes preceding Node 4 (1--->2--->3) are never removed from list of hashentries.

Ain't this is the case of memory leakage. How the GC will collect (1--->2--->3) as they are still being referenced by ConcurrentHashMap ?


Solution

  • I think you misread that code a bit. First of all, the loop looks like this:

    HashEntry newFirst = e.next; // the element after the deleted one, in our case: 4
    for (HashEntry p = first; p != e; p = p.next) {
        newFirst = new HashEntry(p.key, p.hash, newFirst, p.value);
    }
    tab[index]= newFirst; // this is outside the loop
    

    Secondly, this is a singly-linked list, so the iteration goes like this:

    Step 0: tab[index] --> 1 --> 2 --> 3 --> 4 --> 5 --> 6
              newFirst ----------------------^
    Step 1: tab[index] --> 1 --> 2 --> 3 --> 4 --> 5 --> 6
              newFirst --------------> 1A ---^
    Step 2: tab[index] --> 1 --> 2 --> 3 --> 4 --> 5 --> 6
              newFirst -------> 2A --> 1A ---^
    Step 3: tab[index] --> 2A--> 1A--> 4 --> 5 --> 6
                     1 --> 2 --> 3 ----^
    

    (Step 0 is the initial state, steps 1 and 2 are the two iterations of the loop and step 3 is tab[index] = newFirst)

    As you can see, after step 3 there's nothing pointing to 1, so it's eligible for GC and as a result, so are 2 and 3.

    P.s.: Also note how the order of 2A and 1A is reversed in the new list. This should have no practical impact unless you have way too many collisions, but in that case the collisions themselves are a far bigger problem than the slight timing inconsistency this introduces.