Search code examples
javaguava

HashBiMap get getting blocked


We are using HashBiMap in our application and we created it like this

HashBiMap<String, String> map = HashBiMap.create();

and we have guava-26.0-jre version of guava.

Recently noticed our application getting hung and not able to process any other requests. Got a thread dump and noticed things like these

Thread.State: RUNNABLE
native=false, suspended=false, blockCount=1, waitCount=4, cpu=9h, 12m, 45s, 434ms, user=9h, 10m, 59s, 990ms
    com.google.common.collect.HashBiMap.seekByKey(HashBiMap.java:223)
    com.google.common.collect.HashBiMap.get(HashBiMap.java:254)

Thread.State: RUNNABLE
native=false, suspended=false, blockCount=1, waitCount=6, cpu=9h, 11m, 49s, 453ms, user=9h, 10m, 3s, 760ms
    com.google.common.collect.HashBiMap.seekByKey(HashBiMap.java:223)
    com.google.common.collect.HashBiMap.get(HashBiMap.java:254)

Thread.State: RUNNABLE
native=false, suspended=false, blockCount=274, waitCount=6615, cpu=22h, 31m, 29s, 966ms, user=22h, 27m, 30s, 540ms
    com.google.common.collect.HashBiMap.seekByKey(HashBiMap.java:223)
    com.google.common.collect.HashBiMap.get(HashBiMap.java:254)

Thread.State: RUNNABLE
native=false, suspended=false, blockCount=91, waitCount=2443, cpu=22h, 29m, 51s, 541ms, user=22h, 25m, 54s, 140ms
    com.google.common.collect.HashBiMap.seekByKey(HashBiMap.java:223)
    com.google.common.collect.HashBiMap.get(HashBiMap.java:254)

There were several other threads like above that got blocked on a call to get, but the longest that was waiting was this with cpu=22h, 31m, 29s, 966ms

Thread.State: RUNNABLE
native=false, suspended=false, blockCount=3, waitCount=32, cpu=5h, 46m, 7s, 733ms, user=5h, 45m, 500ms
    com.google.common.collect.HashBiMap.seekByValue(HashBiMap.java:234)
    com.google.common.collect.HashBiMap.put(HashBiMap.java:274)
    com.google.common.collect.HashBiMap.forcePut(HashBiMap.java:301)

There was only one thread thats waiting on the forcePut like above.

Would there be any reason for why HashBiMap.get would go into a loop to find the value for a key and never returns.


Solution

  • TL;DR

    As suggested by Xaerxess, uses Maps#synchronizedBiMap in case the map is accessed by multiple threads. You never knows what can happens when there is multiple threads.

    For someone who is curious on whats happened

    Would there be any reason for why HashBiMap.get would go into a loop to find the value for a key and never returns.


    It is an interesting example on how multiple thread create "unexpected" result.
    Let's take a look in line HashBiMap.java:223 of method seekByKey

    private BiEntry<K, V> seekByKey(@Nullable Object key, int keyHash) {
      for (BiEntry<K, V> entry = hashTableKToV[keyHash & mask];
          entry != null;
          entry = entry.nextInKToVBucket) {
        if (keyHash == entry.keyHash && Objects.equal(key, entry.key)) {
          return entry;
        }
      }
      return null;
    }
    

    And line 223 is

          entry = entry.nextInKToVBucket) {
    

    Blocking in this line means there is an infinite loop, which is due to circular reference of entry and entry.nextInKToVBucket.

    One of the possible case is that: In the put method,

    private V put(@Nullable K key, @Nullable V value, boolean force) {
      ...
    
      BiEntry<K, V> newEntry = new BiEntry<>(key, keyHash, value, valueHash);
      if (oldEntryForKey != null) {
        ...
      } else {
        insert(newEntry, null);
        rehashIfNecessary();
        return null;
      }
    }
    

    suppose unfortunately there are two calls with same key and value from two threads simultaneously, created two new entry A and B. Then in insert method,

    private void insert(BiEntry<K, V> entry, @Nullable BiEntry<K, V> oldEntryForKey) {
      int keyBucket = entry.keyHash & mask; // 1
      entry.nextInKToVBucket = hashTableKToV[keyBucket]; // Step 2
      hashTableKToV[keyBucket] = entry; // 3
      ...
    }
    

    Suppose A completes first, hashTableKToV[keyBucket] = A, A.nextInKToVBucket = null. when B comes, and completes Step 2, B.nextInKToVBucket = A. Suppose before execute step 3, Thread of A is executing rehashIfNecessary, and unfortunately rehash is required.

    private void rehashIfNecessary() {
      BiEntry<K, V>[] oldKToV = hashTableKToV;
      if (Hashing.needsResizing(size, oldKToV.length, LOAD_FACTOR)) {
        int newTableSize = oldKToV.length * 2;
    
        this.hashTableKToV = createTable(newTableSize);  // Step 4
        ...
    
        for (BiEntry<K, V> entry = firstInKeyInsertionOrder; 
            entry != null;
            entry = entry.nextInKeyInsertionOrder) {
          insert(entry, entry); // step 5
        }
        ...
      }
    }
    

    When step 4 is completed, hashTableKToV is cleared. It is unlucky that Thread of B execute step 3 in this moment, and hashTableKToV[keyBucket] = B. Thread of A continues with step 5, which insert A again, and A.nextInKToVBucket = A after step 2, causing a circular reference. And hence infinite loop in seekByKey.

    Here is an example for how to reproduce above case (not 100%, may need to try several times):

    import java.util.ArrayList;
    import java.util.Collection;
    import java.util.List;
    import java.util.concurrent.Callable;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.Future;
    
    import com.google.common.collect.BiMap;
    import com.google.common.collect.HashBiMap;
    
    public class HashBiMapConcurrentTest {
        public static void main(String[] args) throws InterruptedException {
            BiMap<String, String> biMap = HashBiMap.create();
            ExecutorService executors = Executors.newFixedThreadPool(4);
            Collection<Callable<Integer>> tasks = new ArrayList<>();
            Callable<Integer> task = () -> {
                for (int i = 0; i < 1000; i++) {
                    biMap.put("A" + i, "B" + i);
                    biMap.get("A" + i);
                }
                System.out.println("Done");
                return 0;
            };
            tasks.add(task);
            tasks.add(task);
            List<Future<Integer>> futures = executors.invokeAll(tasks);
            for (Future<Integer> future : futures) {
                while (!future.isDone()) {
                    Thread.sleep(10);
                }
            }
            executors.shutdown();
        }
    }