
Ehcache - java.io.EOFException for disk persistent cache for removeAll operation


We are using Ehcache in our Spring Boot application. Our Spring Boot version is 2.0.3.RELEASE, and spring-boot-starter-cache 2.0.3.RELEASE uses Ehcache 3.5.2.

Our motivation for using Ehcache was that it is both JSR-107 compliant and supports off-heap storage.

Below is our spring config:

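import java.io.IOException;
import java.util.Properties;

import javax.cache.Caching;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.autoconfigure.condition.ConditionalOnWebApplication;
import org.springframework.cache.annotation.EnableCaching;
import org.springframework.cache.jcache.JCacheCacheManager;
import org.springframework.cache.jcache.JCacheManagerFactoryBean;
import org.springframework.context.ApplicationContext;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.io.Resource;

@Configuration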
@ConditionalOnWebApplication
@EnableCaching
public class CacheConfig {
    @Autowired
    private ApplicationContext context;

    @Bean
    public JCacheManagerFactoryBean jCacheManagerFactoryBean() throws IOException {
        JCacheManagerFactoryBean jCacheManagerFactoryBean = new JCacheManagerFactoryBean();
        Resource resource = context.getResource("classpath:mts/ehcache.xml");
        jCacheManagerFactoryBean.setCacheManagerUri(resource.getURI());
        return jCacheManagerFactoryBean;
    }

    @Bean
    public JCacheCacheManager ehCacheCacheManager() throws IOException {
        Properties props = System.getProperties();
        props.setProperty(Caching.JAVAX_CACHE_CACHING_PROVIDER, "org.ehcache.jsr107.EhcacheCachingProvider");
        JCacheCacheManager jCacheCacheManager = new JCacheCacheManager();
        jCacheCacheManager.setCacheManager(jCacheManagerFactoryBean().getObject());
        jCacheCacheManager.setTransactionAware(true);
        return jCacheCacheManager;
    }
}

The problem we face in production is that, for a moderately large disk-persistent cache, the removeAll operation fails with the java.io.EOFException below:

Error : RuntimeException: java.io.EOFException 
java.lang.RuntimeException: java.io.EOFException
        at org.terracotta.offheapstore.disk.storage.FileBackedStorageEngine$FileChunk.readKeyBuffer(FileBackedStorageEngine.java:541) ~[ehcache-3.5.2.jar!/:3.5.2 7941fa2573343b31ae56a12564404552c6d6eff0]
        at org.terracotta.offheapstore.disk.storage.FileBackedStorageEngine.readKeyBuffer(FileBackedStorageEngine.java:265) ~[ehcache-3.5.2.jar!/:3.5.2 7941fa2573343b31ae56a12564404552c6d6eff0]
        at org.terracotta.offheapstore.storage.PortabilityBasedStorageEngine.readKey(PortabilityBasedStorageEngine.java:119) ~[ehcache-3.5.2.jar!/:3.5.2 7941fa2573343b31ae56a12564404552c6d6eff0]
        at org.terracotta.offheapstore.OffHeapHashMap$DirectEntry.<init>(OffHeapHashMap.java:1540) ~[ehcache-3.5.2.jar!/:3.5.2 7941fa2573343b31ae56a12564404552c6d6eff0]
        at org.terracotta.offheapstore.OffHeapHashMap$EntryIterator.create(OffHeapHashMap.java:1518) ~[ehcache-3.5.2.jar!/:3.5.2 7941fa2573343b31ae56a12564404552c6d6eff0]
        at org.terracotta.offheapstore.OffHeapHashMap$EntryIterator.create(OffHeapHashMap.java:1511) ~[ehcache-3.5.2.jar!/:3.5.2 7941fa2573343b31ae56a12564404552c6d6eff0]
        at org.terracotta.offheapstore.OffHeapHashMap$HashIterator.next(OffHeapHashMap.java:1407) ~[ehcache-3.5.2.jar!/:3.5.2 7941fa2573343b31ae56a12564404552c6d6eff0]
        at org.terracotta.offheapstore.AbstractLockedOffHeapHashMap$LockedEntryIterator.next(AbstractLockedOffHeapHashMap.java:399) ~[ehcache-3.5.2.jar!/:3.5.2 7941fa2573343b31ae56a12564404552c6d6eff0]
        at org.terracotta.offheapstore.AbstractLockedOffHeapHashMap$LockedEntryIterator.next(AbstractLockedOffHeapHashMap.java:392) ~[ehcache-3.5.2.jar!/:3.5.2 7941fa2573343b31ae56a12564404552c6d6eff0]
        at org.terracotta.offheapstore.concurrent.AbstractConcurrentOffHeapMap$AggregateIterator.next(AbstractConcurrentOffHeapMap.java:553) ~[ehcache-3.5.2.jar!/:3.5.2 7941fa2573343b31ae56a12564404552c6d6eff0]
        at org.ehcache.impl.internal.store.offheap.AbstractOffHeapStore$1.next(AbstractOffHeapStore.java:499) ~[ehcache-3.5.2.jar!/:3.5.2 7941fa2573343b31ae56a12564404552c6d6eff0]
        at org.ehcache.impl.internal.store.offheap.AbstractOffHeapStore$1.next(AbstractOffHeapStore.java:489) ~[ehcache-3.5.2.jar!/:3.5.2 7941fa2573343b31ae56a12564404552c6d6eff0]
        at org.ehcache.core.EhcacheBase$Jsr107CacheBase.removeAll(EhcacheBase.java:708) ~[ehcache-3.5.2.jar!/:3.5.2 7941fa2573343b31ae56a12564404552c6d6eff0]
        at org.ehcache.jsr107.Eh107Cache.removeAll(Eh107Cache.java:304) ~[ehcache-3.5.2.jar!/:3.5.2 7941fa2573343b31ae56a12564404552c6d6eff0]
        at com.mycompany.myproject.services.cache.service.impl.CacheService.doClearCacheWithName(CacheService.java:56) ~[MyProjectServicesCache_classes.jar!/:?]  
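
For readers unfamiliar with the exception itself: java.io.EOFException signals that a read asked for more bytes than the underlying source still holds. The sketch below is not Ehcache code and makes no claim about how FileBackedStorageEngine reads its file; it only reproduces the same exception class with the JDK's DataInputStream.readFully, using the hypothetical names TruncatedReadDemo and hitsEof.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;

final class TruncatedReadDemo {

    /**
     * Tries to read exactly {@code wanted} bytes from {@code stored}.
     * Returns true if the data ran out first (EOFException), false otherwise.
     */
    static boolean hitsEof(byte[] stored, int wanted) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(stored))) {
            in.readFully(new byte[wanted]); // throws EOFException if fewer bytes remain
            return false;
        } catch (EOFException e) {
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] stored = new byte[8];
        System.out.println(hitsEof(stored, 8));  // false: all 8 bytes are available
        System.out.println(hitsEof(stored, 16)); // true: only 8 of 16 bytes exist
    }
}
```

In the trace above the exception surfaces while the iterator reads a key buffer from the disk store, which is consistent with a read that expected more stored data than it found.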

There is nothing special in the code that calls removeAll. It just looks up the cache by name and calls removeAll on it:

private void doClearCacheWithName(String cacheName) {
    Cache<Object, Object> cache = cacheManager.getCache(cacheName);
    if (cache == null) {
        throw new MyProjectException(String.format("Cache with name : %s does not exist!", cacheName));
    }
    logger.info(String.format("Clearing cache with name : %s", cacheName));
    cache.removeAll();
}
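
Because the EOFException arrives wrapped in a RuntimeException, a caller that wants to recognize this specific failure (for logging or alerting, say) has to look at the cause chain rather than the top-level type. A minimal sketch, assuming a hypothetical helper class CauseChain that is not part of our code base or of Ehcache:

```java
import java.io.EOFException;

final class CauseChain {

    /** Returns true if {@code t} or any of its causes is an instance of {@code type}. */
    static boolean hasCause(Throwable t, Class<? extends Throwable> type) {
        // Bound the walk so that a cyclic cause chain cannot loop forever.
        for (int depth = 0; t != null && depth < 32; depth++) {
            if (type.isInstance(t)) {
                return true;
            }
            t = t.getCause();
        }
        return false;
    }

    public static void main(String[] args) {
        // Same shape as the production error: RuntimeException wrapping EOFException.
        RuntimeException wrapped = new RuntimeException(new EOFException());
        System.out.println(hasCause(wrapped, EOFException.class));
        System.out.println(hasCause(new RuntimeException("other"), EOFException.class));
    }
}
```

In doClearCacheWithName this could be used inside a catch (RuntimeException e) block around cache.removeAll() to log the cache name before rethrowing. It only identifies the failure; it does not work around it.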

Here is the production config for ourBigCache:

<cache alias="ourBigCache">
    <expiry>
        <ttl unit="seconds">21600</ttl>
    </expiry>
    <resources>
        <heap unit="entries">1000</heap>
        <disk unit="MB">4096</disk>
    </resources>
</cache>
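
To put the numbers in that config into more familiar units, here is a small conversion sketch. The class and method names are hypothetical, and it assumes that the MB unit is binary (1 MB = 1024 * 1024 bytes).

```java
import java.time.Duration;

final class CacheConfigUnits {

    /** Converts a TTL given in seconds to whole hours. */
    static long ttlHours(long ttlSeconds) {
        return Duration.ofSeconds(ttlSeconds).toHours();
    }

    /** Converts megabytes to bytes, assuming binary megabytes (1 MB = 1024 * 1024 bytes). */
    static long megabytesToBytes(long megabytes) {
        return megabytes * 1024L * 1024L;
    }

    public static void main(String[] args) {
        System.out.println(ttlHours(21600));        // 6 (hours)
        System.out.println(megabytesToBytes(4096)); // 4294967296 (bytes, i.e. 4 GB)
    }
}
```

So entries expire 6 hours after they are written, at most 1000 entries are kept on heap, and the disk tier may grow to about 4 GB.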

We could not reproduce this either locally or in the test environment.

Please note that this cache is very heavily used (the read count is very high in production), but I guess that should not make any difference.

I could not find any similar reported issue. Some very old disk problems are mentioned, but they are too old and not similar:

https://sourceforge.net/p/ehcache/discussion/322278/thread/e7a62df3/
http://forums.terracotta.org/forums/posts/list/2694.page

Any help would be greatly appreciated.

Regards


Solution

  • Copied from my reply on ehcache-users:

    Most likely this is just another symptom of https://github.com/ehcache/ehcache3/issues/2542 which was fixed in 74239a93e14eb7477841fffa36c971ef9e930686

    Unfortunately this fix hasn’t been merged anywhere yet. If you want to pick this up you could cut your own build of master (which would probably be a good thing to verify). Otherwise you'll have to wait for the first dot-release on the 3.7 line (timeline unknown at this point).

    Chris