
Ignite service hangs when calling cache remove inside another cache's invoke processor: "Possible starvation in striped pool"?


The Ignite logs show starvation warnings and the node stops serving requests:

[12:55:22,080][WARNING][grid-timeout-worker-#71][G] >>> Possible starvation in striped pool.
    Thread name: sys-stripe-25-#26
    Deadlock: false
    Completed: 16272032
Thread [name="sys-stripe-25-#26", id=51, state=WAITING, blockCnt=79, waitCnt=15616666]
    at sun.misc.Unsafe.park(Native Method)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
    at o.a.i.i.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:177)
    at o.a.i.i.util.future.GridFutureAdapter.get(GridFutureAdapter.java:140)
    at o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.remove0(GridDhtAtomicCache.java:716)
    at o.a.i.i.processors.cache.GridCacheAdapter.remove(GridCacheAdapter.java:3084)
    at o.a.i.i.processors.cache.GridCacheAdapter.remove(GridCacheAdapter.java:3065)
    at o.a.i.i.processors.cache.IgniteCacheProxyImpl.remove(IgniteCacheProxyImpl.java:1131)
    at o.a.i.i.processors.cache.GatewayProtectedCacheProxy.remove(GatewayProtectedCacheProxy.java:998)
    at com.test.info.TestInfoBasicExecutor.handleCurrentLevel(TestInfoBasicExecutor.java:281)
    at com.test.info.TestInfoBasicExecutor$infoEntryProcessor.process(TestInfoBasicExecutor.java:514)
    at com.test.info.TestInfoBasicExecutor$infoEntryProcessor.process(TestInfoBasicExecutor.java:453)
    at o.a.i.i.processors.cache.GridCacheMapEntry$AtomicCacheUpdateClosure.runEntryProcessor(GridCacheMapEntry.java:5142)
    at o.a.i.i.processors.cache.GridCacheMapEntry$AtomicCacheUpdateClosure.call(GridCacheMapEntry.java:4550)
    at o.a.i.i.processors.cache.GridCacheMapEntry$AtomicCacheUpdateClosure.call(GridCacheMapEntry.java:4367)
    at o.a.i.i.processors.cache.persistence.tree.BPlusTree$Invoke.invokeClosure(BPlusTree.java:3051)
    at o.a.i.i.processors.cache.persistence.tree.BPlusTree$Invoke.access$6200(BPlusTree.java:2945)
    at o.a.i.i.processors.cache.persistence.tree.BPlusTree.invokeDown(BPlusTree.java:1717)
    at o.a.i.i.processors.cache.persistence.tree.BPlusTree.invoke(BPlusTree.java:1600)
    at o.a.i.i.processors.cache.IgniteCacheOffheapManagerImpl$CacheDataStoreImpl.invoke(IgniteCacheOffheapManagerImpl.java:1199)
    at o.a.i.i.processors.cache.persistence.GridCacheOffheapManager$GridCacheDataStore.invoke(GridCacheOffheapManager.java:1357)
    at o.a.i.i.processors.cache.IgniteCacheOffheapManagerImpl.invoke(IgniteCacheOffheapManagerImpl.java:345)
    at o.a.i.i.processors.cache.GridCacheMapEntry.innerUpdate(GridCacheMapEntry.java:1767)
    at o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateSingle(GridDhtAtomicCache.java:2420)
    at o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.update(GridDhtAtomicCache.java:1883)
    at o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAllAsyncInternal0(GridDhtAtomicCache.java:1736)
    at o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAllAsyncInternal(GridDhtAtomicCache.java:1628)
    at o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.processNearAtomicUpdateRequest(GridDhtAtomicCache.java:3055)
    at o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.access$400(GridDhtAtomicCache.java:130)
    at o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$5.apply(GridDhtAtomicCache.java:266)
    at o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$5.apply(GridDhtAtomicCache.java:261)
    at o.a.i.i.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:1060)
    at o.a.i.i.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:579)
    at o.a.i.i.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:378)
    at o.a.i.i.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:304)
    at o.a.i.i.processors.cache.GridCacheIoManager.access$100(GridCacheIoManager.java:99)
    at o.a.i.i.processors.cache.GridCacheIoManager$1.onMessage(GridCacheIoManager.java:293)
    at o.a.i.i.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1555)
    at o.a.i.i.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:1183)
    at o.a.i.i.managers.communication.GridIoManager.access$4200(GridIoManager.java:126)
    at o.a.i.i.managers.communication.GridIoManager$9.run(GridIoManager.java:1090)
    at o.a.i.i.util.StripedExecutor$Stripe.run(StripedExecutor.java:505)
    at java.lang.Thread.run(Thread.java:745)

I use invoke to update cache A. Inside cache A's entry processor (which I know is already invoked under a lock), I check the value of the cache A entry and, based on that value, update entries in cache B with put or remove. In my tests put works, but remove seems to cause the service to hang:

    at com.test.info.TestInfoBasicExecutor.handleCurrentLevel(TestInfoBasicExecutor.java:281)
    at com.test.info.TestInfoBasicExecutor$infoEntryProcessor.process(TestInfoBasicExecutor.java:514)
    at com.test.info.TestInfoBasicExecutor$infoEntryProcessor.process(TestInfoBasicExecutor.java:453)

======================================================

Update 0702:

To prevent the starvation, I changed my code:

In Ignite Service A's execute function:

cacheA.invoke(record){ // do process to record

igniteQueue.put(processed_record);

}

In Ignite Service B's execute function:

saved_processed_record = igniteQueue.take();

=================

I tried this approach to prevent the starvation. It runs smoothly at the low TPS where the old code starved, but at high TPS the "Possible starvation in striped pool" warning comes back.

It seems that using igniteQueue inside cache.invoke is just as incorrect as using another cache inside cache.invoke.

What I want is to process each record in the cache and then, based on the processed record, update other caches. Is that not possible?


Solution

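  • You should avoid performing cache operations inside an entry processor, even when those operations target other caches. All of these operations are served by the same thread pool, so a striped-pool thread that blocks on a nested operation can starve the pool. Instead, have the entry processor return what it decided, and perform the put or remove on the other cache from the calling thread after invoke returns.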
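
The mechanism can be reproduced without Ignite. The sketch below is a simplified model, not Ignite code: a single-thread `ExecutorService` stands in for one stripe of the striped pool, and the class and method names (`StripeStarvationDemo`, `nestedCallHangs`, `followUpFromCaller`) are made up for illustration. The first method blocks a pool thread on work queued to the same pool, which is roughly what the `GridFutureAdapter.get` frame in the stack trace is doing. The second returns a decision to the caller and lets the caller issue the follow-up operation.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class StripeStarvationDemo {

    /** Anti-pattern: a task on the only pool thread waits for a task queued on the same pool. */
    static boolean nestedCallHangs() throws Exception {
        // Assumption: one thread models one stripe of the striped pool.
        ExecutorService stripe = Executors.newSingleThreadExecutor();
        try {
            Future<String> outer = stripe.submit(() -> {
                // Stand-in for a blocking operation on another cache issued from the entry processor.
                Future<String> inner = stripe.submit(() -> "removed");
                // Never completes: the only thread is busy running this very task.
                return inner.get();
            });
            try {
                outer.get(500, TimeUnit.MILLISECONDS);
                return false; // would mean the nested call completed
            } catch (TimeoutException e) {
                return true;  // the pool thread is stuck waiting on itself
            }
        } finally {
            stripe.shutdownNow(); // interrupt the stuck thread so the JVM can exit
        }
    }

    /** Fix: the pooled task only returns a decision; the caller issues the follow-up operation. */
    static String followUpFromCaller() throws Exception {
        ExecutorService stripe = Executors.newSingleThreadExecutor();
        try {
            // Stand-in for the entry processor: inspect the entry, return what should happen next.
            String decision = stripe.submit(() -> "remove").get(500, TimeUnit.MILLISECONDS);
            // The follow-up is submitted by the caller, after the pool thread is free again.
            return stripe.submit(() -> decision + ":done").get(500, TimeUnit.MILLISECONDS);
        } finally {
            stripe.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("nested call timed out: " + nestedCallHangs());
        System.out.println("follow-up from caller: " + followUpFromCaller());
    }
}
```

In Ignite terms, the second method corresponds to returning a value from the entry processor and calling put or remove on the other cache only after `invoke` has returned. A real cluster has many stripes and nodes, so the nested call may often succeed by landing on a free stripe, which would be consistent with the problem only showing up under load.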