java caching infinispan

Infinispan 9, Replicated Cache is Expiring Entries but never allows them to be removed from JVM heap


While doing some internal testing of a clustering solution built on top of Infinispan/JGroups, I noticed that expired entries never became eligible for GC because the expiration reaper kept a reference to them. This happens when the cluster has more than one node, with expiration enabled and eviction disabled. Due to some system constraints, the following versions are being used:

  • JDK 1.8
  • Infinispan 9.4.20
  • JGroups 4.0.21

In my example I use a simple Java main program that puts a fixed number of entries into the cache and expects them to expire after a set time period. Expiration does happen, as confirmed both by accessing an expired entry and by the corresponding event listener (if it is configured), but the entries never seem to be removed from memory, even after an explicit GC or when the JVM is close to an OOM error.

So the question is:

Is this really the expected default behavior, or am I missing a critical configuration for cluster replication / expiration / serialization?

Example :

Cache Manager :

return new DefaultCacheManager("infinispan.xml");

infinispan.xml :

  <jgroups>
     <stack-file name="udp" path="jgroups.xml" />
  </jgroups>

  <cache-container default-cache="default">
     <transport stack="udp" node-name="${nodeName}" />
     <replicated-cache name="myLeakyCache" mode="SYNC">
        <expiration interval="30000" lifespan="3000" max-idle="-1"/>
     </replicated-cache>
  </cache-container>

Default UDP jgroups xml as in the packaged example :

.....

<UDP
        mcast_addr="${jgroups.udp.mcast_addr:x.x.x.x}"
        mcast_port="${jgroups.udp.mcast_port:46655}"
        bind_addr="${jgroups.bind.addr:y.y.y.y}"
        tos="8"
        ucast_recv_buf_size="200k"
        ucast_send_buf_size="200k"
        mcast_recv_buf_size="200k"
        mcast_send_buf_size="200k"
        max_bundle_size="64000"
        ip_ttl="${jgroups.udp.ip_ttl:2}"
        enable_diagnostics="false"
        bundler_type="old"
        thread_naming_pattern="pl"
        thread_pool.enabled="true"
        thread_pool.max_threads="30"
        />

The dummy cache entry :

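import java.io.Serializable;
import java.util.Date;

// Dummy cache value: Serializable, but with no equals()/hashCode() overrides.
public class CacheMemoryLeak implements Serializable {
    private static final long serialVersionUID = 1L;
    Date date = new Date();
}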

An example usage from the "service" :

Cache<String, Object> cache = cacheManager.getCache("myLeakyCache");
cache.put(key, new CacheMemoryLeak());

Some info / tryouts :

  • When there is only one node in the cluster, or when the nodes are restarted sequentially, the references are cleared.
  • Enabling max-idle shows the same behavior (which makes sense, since the expiration reaper is the same).
  • Enabling eviction does not resolve the issue; it only keeps the count of "expired" references within the maximum limit. If that limit is reached quickly, live entries get evicted at random as well (default remove strategy)!
  • If I change the cache value to a plain String, the infinispan.MortalCacheEntries are removed from the heap on the next GC cycle after they expire and are marked by the expiration reaper, unlike with the custom object!
  • Enabling the expiration reaper on only one node didn't resolve the issue, and might break the failover mechanism.
  • Upgraded to Infinispan 10.1.8.Final, but faced the same issue.

Solution

  • It seems that no one else has run into this issue, or that others use primitive objects as cache entries and so haven't noticed it. After reproducing it and, fortunately, tracing the root cause, the following points came up:

    • Always implement Serializable / hashCode / equals for custom objects that will end up being transmitted through a replicated/synchronized cache.
    • Never put primitive arrays in the cache, as hashCode / equals would not be calculated efficiently.
    • Don't enable eviction with the remove strategy on replicated caches: upon reaching the maximum limit, entries are removed randomly (based on TinyLFU) rather than by their expiration timer, and they never get removed from the JVM heap.
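
To illustrate the first two points, here is a minimal sketch using only the JDK. It assumes that plain Java serialization is a fair stand-in for whatever marshaller the cluster uses, and the names CacheEntryValue, BytesValue, IdentityOnlyValue and EntryEqualityDemo are made up for this example (IdentityOnlyValue mirrors the CacheMemoryLeak class from the question). It shows that a deserialized copy of an object without equals/hashCode is not equal to the original, while value-based equals/hashCode, or a wrapper around a byte[], keeps the copies equal.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Arrays;
import java.util.Date;
import java.util.Objects;

// Mirrors the question's CacheMemoryLeak: inherits identity-based equals/hashCode from Object.
final class IdentityOnlyValue implements Serializable {
    private static final long serialVersionUID = 1L;
    Date date = new Date();
}

// Hypothetical replacement: same field, plus value-based equals/hashCode.
final class CacheEntryValue implements Serializable {
    private static final long serialVersionUID = 1L;
    private final Date date;

    CacheEntryValue(Date date) {
        this.date = Objects.requireNonNull(date);
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof CacheEntryValue)) return false;
        return date.equals(((CacheEntryValue) o).date);
    }

    @Override
    public int hashCode() {
        return date.hashCode();
    }
}

// Hypothetical wrapper that gives a byte[] payload content-based equality.
final class BytesValue implements Serializable {
    private static final long serialVersionUID = 1L;
    private final byte[] bytes;

    BytesValue(byte[] bytes) {
        this.bytes = bytes.clone(); // defensive copy so later changes to the caller's array do not leak in
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof BytesValue && Arrays.equals(bytes, ((BytesValue) o).bytes);
    }

    @Override
    public int hashCode() {
        return Arrays.hashCode(bytes);
    }
}

final class EntryEqualityDemo {
    // Serializes and deserializes a value, producing a distinct copy of it.
    static <T extends Serializable> T roundTrip(T value) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
                @SuppressWarnings("unchecked")
                T copy = (T) in.readObject();
                return copy;
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("round trip failed", e);
        }
    }

    public static void main(String[] args) {
        IdentityOnlyValue plain = new IdentityOnlyValue();
        System.out.println("identity-only copy equal: " + plain.equals(roundTrip(plain)));

        CacheEntryValue value = new CacheEntryValue(new Date());
        System.out.println("value-based copy equal:   " + value.equals(roundTrip(value)));

        byte[] raw = {1, 2, 3};
        System.out.println("raw byte[] copy equal:    " + raw.equals(raw.clone()));
        System.out.println("wrapped byte[] equal:     " + new BytesValue(raw).equals(roundTrip(new BytesValue(raw))));
    }
}
```

With a class like CacheEntryValue, the usage from the question would become cache.put(key, new CacheEntryValue(new Date())). Whether this alone clears the reaper reference in a given setup is something to verify with a heap dump, as described in the question.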