Search code examples
javamultithreadingconcurrencythread-safetydeadlock

Java Deadlock during a synchronized on a local resource?


I'm seeing a problem with multiple Threads deadlocking on the same line of code. I cannot reproduce the problem locally or in any test, but yet Thread Dumps from Production have show the problem quite clearly.

I can't see why the Threads would become blocked on the synchronized line below, since there is no other synchronization on the Object in the call stack or in any other Thread. Does anyone have any idea what is going on, or how I can even reproduce this issue (Currently trying with 15 Threads all hitting trim() in a loops, while processing 2000 tasks through my Queue - But unable to reproduce)

In the Thread dump below, I think the multiple Threads with the 'locked' status may be a manifestation of Java Bug: http://bugs.java.com/view_bug.do?bug_id=8047816 where JStack reports Threads in wrong state. (I'm using JDK Version: 1.7.0_51)

Cheers!

Here is a view of the Threads in the Thread dump.....

"xxx>Job Read-3" daemon prio=10 tid=0x00002aca001a6800 nid=0x6a3b waiting for monitor entry [0x0000000052ec4000]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at com.mycompany.collections.CustomQueue.remove(CustomQueue.java:101)
    - locked <0x00002aae6465a650> (a java.util.ArrayDeque)
    at com.mycompany.collections.CustomQueue.trim(CustomQueue.java:318)
    at com.mycompany.collections.CustomQueue.itemProcessed(CustomQueue.java:302)
    at com.mycompany.collections.CustomQueue.trackCompleted(CustomQueue.java:147)
    at java.util.concurrent.ThreadPoolExecutor.afterExecute(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)

   Locked ownable synchronizers:
    - <0x00002aaf5f9c2680> (a java.util.concurrent.ThreadPoolExecutor$Worker)

"xxx>Job Read-2" daemon prio=10 tid=0x00002aca001a5000 nid=0x6a3a waiting for monitor entry [0x0000000052d83000]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at com.mycompany.collections.CustomQueue.remove(CustomQueue.java:101)
    -  locked <0x00002aae6465a650> (a java.util.ArrayDeque)
    at com.mycompany.collections.CustomQueue.trim(CustomQueue.java:318)
    at com.mycompany.collections.CustomQueue.itemProcessed(CustomQueue.java:302)
    at com.mycompany.collections.CustomQueue.trackCompleted(CustomQueue.java:147)
    at java.util.concurrent.ThreadPoolExecutor.afterExecute(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)

   Locked ownable synchronizers:
    - <0x00002aaf5f9ed518> (a java.util.concurrent.ThreadPoolExecutor$Worker)

"xxx>Job Read-1" daemon prio=10 tid=0x00002aca00183000 nid=0x6a39 waiting for monitor entry [0x0000000052c42000]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at com.mycompany.collections.CustomQueue.remove(CustomQueue.java:101)
    - waiting to lock <0x00002aae6465a650> (a java.util.ArrayDeque)
    at com.mycompany.collections.CustomQueue.trim(CustomQueue.java:318)
    at com.mycompany.collections.CustomQueue.itemProcessed(CustomQueue.java:302)
    at com.mycompany.collections.CustomQueue.trackCompleted(CustomQueue.java:147)
    at java.util.concurrent.ThreadPoolExecutor.afterExecute(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)

   Locked ownable synchronizers:
    - <0x00002aaf5f9ecde8> (a java.util.concurrent.ThreadPoolExecutor$Worker)


"xxx>Job Read-0" daemon prio=10 tid=0x0000000006a83000 nid=0x6a36 waiting for monitor entry [0x000000005287f000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at com.mycompany.collections.CustomQueue.remove(CustomQueue.java:101)
    - waiting to lock <0x00002aae6465a650> (a java.util.ArrayDeque)
    at com.mycompany.collections.CustomQueue.trim(CustomQueue.java:318)
    at com.mycompany.collections.CustomQueue.itemProcessed(CustomQueue.java:302)
    at com.mycompany.collections.CustomQueue.trackCompleted(CustomQueue.java:147)
    at java.util.concurrent.ThreadPoolExecutor.afterExecute(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)

Here is the Java code extracted, which shows where the error is...

public class Deadlock {
        final Deque<Object> delegate  = new ArrayDeque<>();
        final long maxSize = Long.MAX_VALUE;

        private final AtomicLong totalExec = new AtomicLong();
        private final Map<Object, AtomicLong> totals = new HashMap<>();
        private final Map<Object, Deque<Long>> execTimes = new HashMap<>();

        public void trim() {
            //Possible optimization is evicting in chunks, segmenting by arrival time
            while (this.totalExec.longValue() > this.maxSize) {
                final Object t = this.delegate.peek();
                final Deque<Long> execTime = this.execTimes.get(t);
                final Long exec = execTime.peek();
                if (exec != null && this.totalExec.longValue() - exec > this.maxSize) {
                    //If Job Started Inside of Window, remove and re-loop
                    remove();
                }
                else {
                    //Otherwise exit the loop
                    break;
                }
            }
        }

        public Object remove() {
            Object removed;
            synchronized (this.delegate) { //4 Threads deadlocking on this line !
                removed = this.delegate.pollFirst();
            }
            if (removed != null) {
                itemRemoved(removed);
            }
            return removed;
        }

        public void itemRemoved(final Object t) {
            //Decrement Total & Queue
            final AtomicLong catTotal = this.totals.get(t);
            if (catTotal != null) {
                if (!this.execTimes.get(t).isEmpty()) {
                    final Long exec = this.execTimes.get(t).pollFirst();
                    if (exec != null) {
                        catTotal.addAndGet(-exec);
                        this.totalExec.addAndGet(-exec);
                    }
                }
            }
        }
    }

Solution

  • Thanks to the responses here, it became clear that the issue was none Thread Safe usage of multiple Collections.

    To resolve the issue, I've made the trim method synchronized and replaced usage of HashMap with ConcurrentHashMap and ArrayDeque with LinkedBlockingDeque (Concurrent Collections FTW!)

    A further planned enhancement is to change the usage of 2 separate Maps into a single Map containing a Custom Object, that way keeping the operations (in itemRemoved) atomic.