Infinispan cluster in Invalidation mode - get(key) returns NULL though some nodes have the value


I have the following topology: an Infinispan cluster in Invalidation mode, where puts are performed on one node and gets are performed on the others. When the cluster consists of only two nodes, everything works well: when a key/value pair is inserted on one node, the other node, when asked for the key for the first time, queries that node and fetches the value from it. If the key is updated or removed, an invalidation message is sent.

The problems start when there are more than two nodes in the cluster: after the key is inserted on one node, another node that is asked for that key sometimes returns the value and sometimes returns NULL.

This makes sense, from a certain point of view, since the node queries its neighbours and some of them have the value while the others don't. Whichever neighbour replies first determines whether the response is NULL or the real value.
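The suspected race can be illustrated with a toy model in plain Java. This is only a sketch of "first reply wins" and does not use or reproduce any Infinispan code; the class `FirstResponseModel` and the method `takeFirstResponse` are hypothetical names, and a `null` element stands for a neighbour that does not hold the key.

```java
import java.util.Arrays;
import java.util.List;

public class FirstResponseModel {
    // Toy model (assumption): each element is one neighbour's reply,
    // listed in arrival order; null means "I do not hold this key".
    static String takeFirstResponse(List<String> repliesInArrivalOrder) {
        // Whatever arrives first is returned, even if it is null.
        return repliesInArrivalOrder.isEmpty() ? null : repliesInArrivalOrder.get(0);
    }

    public static void main(String[] args) {
        // The node that holds the value answers first.
        List<String> ownerFirst = Arrays.asList("value", null);
        // A node without the value answers first.
        List<String> emptyNodeFirst = Arrays.asList(null, "value");

        System.out.println(takeFirstResponse(ownerFirst));     // prints "value"
        System.out.println(takeFirstResponse(emptyNodeFirst)); // prints "null"
    }
}
```

With two nodes there is only one neighbour to ask, so the arrival order cannot matter; with three or more, the same get can succeed or miss depending on which reply arrives first.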

Although it makes sense, this behaviour renders this operation mode quite useless, which leads me to think that perhaps I'm missing something. Here is my configuration:

<infinispan xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
            xsi:schemaLocation="urn:infinispan:config:7.0 http://www.infinispan.org/schemas/infinispan-config-7.0.xsd"
            xmlns="urn:infinispan:config:7.0">
    <jgroups>
        <stack-file name="tcp" path="jgroups-tcp.xml" />
    </jgroups>
    <cache-container name="SampleCacheManager" statistics="true" default-cache="invalidatedWithClusterCacheLoaderCache" shutdown-hook="DEFAULT">
        <transport stack="tcp" cluster="clustered" node-name="NodeA" />
        <serialization marshaller="org.infinispan.marshall.core.VersionAwareMarshaller" version="1.0" />
        <jmx domain="org.infinispan" />
        <invalidation-cache name="invalidatedWithClusterCacheLoaderCache" mode="SYNC" remote-timeout="20000">
            <persistence>
                <cluster-loader remote-timeout="20000" preload="false" />
            </persistence>
        </invalidation-cache>
    </cache-container>
</infinispan>

jgroups-tcp.xml:

<config xmlns="urn:org:jgroups"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="urn:org:jgroups http://www.jgroups.org/schema/JGroups-3.4.xsd">
    <TCP bind_port="7800" port_range="10"
         recv_buf_size="20000000"
         send_buf_size="640000"
         loopback="false"
         max_bundle_size="64k"
         bundler_type="sender-sends-with-timer"
         enable_diagnostics="true"
         thread_naming_pattern="cl"

         timer_type="new"
         timer.min_threads="4"
         timer.max_threads="10"
         timer.keep_alive_time="3000"
         timer.queue_max_size="1000"
         timer.wheel_size="200"
         timer.tick_time="50"

         thread_pool.enabled="true"
         thread_pool.min_threads="2"
         thread_pool.max_threads="8"
         thread_pool.keep_alive_time="5000"
         thread_pool.queue_enabled="true"
         thread_pool.queue_max_size="100000"
         thread_pool.rejection_policy="discard"

         oob_thread_pool.enabled="true"
         oob_thread_pool.min_threads="1"
         oob_thread_pool.max_threads="8"
         oob_thread_pool.keep_alive_time="5000"
         oob_thread_pool.queue_enabled="false"
         oob_thread_pool.queue_max_size="100"
         oob_thread_pool.rejection_policy="discard"/>

    <MPING bind_addr="${jgroups.bind_addr:127.0.0.1}" break_on_coord_rsp="true"
           mcast_addr="${jgroups.mping.mcast_addr:228.2.4.6}"
           mcast_port="${jgroups.mping.mcast_port:43366}"
           ip_ttl="${jgroups.udp.ip_ttl:2}"
           num_initial_members="2" timeout="2000"/>

    <MERGE3/>

    <FD_SOCK/>
    <FD_ALL interval="2000" timeout="5000" />
    <VERIFY_SUSPECT timeout="500"  />
    <BARRIER />
    <pbcast.NAKACK use_mcast_xmit="false"
                   retransmit_timeout="100,300,600,1200"
                   discard_delivered_msgs="true" />
    <UNICAST3 conn_expiry_timeout="0"/>

    <pbcast.STABLE stability_delay="1000" desired_avg_gossip="50000"
                   max_bytes="10m"/>
    <pbcast.GMS print_local_addr="true" join_timeout="5000"
                max_bundling_time="30"
                view_bundling="true"/>
    <MFC max_credits="2M"
         min_threshold="0.4"/>
    <FRAG2 frag_size="60000"  />
    <pbcast.STATE_TRANSFER  />
</config>

To summarize my question: is it supposed to work this way or is it misconfigured in my case?


Solution

  • An invalidation cache does not retrieve remote values by itself, as described in [1]. It only retrieves values held locally in memory.

    The remote lookup is performed by the cluster-loader you configured in your persistence configuration, which asks all the other nodes in the cluster for the value. I tweaked one of the existing Infinispan tests to use more than two caches and, as you experienced, the remote lookup sometimes missed. It appears that the cache loader returns null if a node without the value responds before one that has it (it takes the first response).

    I logged an issue [2] to look into this.

    [1] http://infinispan.org/docs/7.0.x/user_guide/user_guide.html#_invalidation_mode
    [2] https://issues.jboss.org/browse/ISPN-5134
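
For illustration only, here is one way a lookup could avoid the miss described above: skip null replies instead of accepting whichever reply comes first. This is a sketch in plain Java under stated assumptions, not the Infinispan cluster loader and not necessarily how ISPN-5134 was resolved; `FirstNonNullLookup` and `firstNonNull` are hypothetical names, and each `CompletableFuture` stands in for one node's pending reply.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.concurrent.CompletableFuture;

public class FirstNonNullLookup {
    // Hypothetical helper: joins the replies in list order and returns the
    // first non-null one; returns null only if every node answered null.
    static <V> V firstNonNull(List<CompletableFuture<V>> replies) {
        return replies.stream()
                .map(CompletableFuture::join)  // block until this node has answered
                .filter(Objects::nonNull)      // ignore nodes that do not hold the key
                .findFirst()
                .orElse(null);
    }

    public static void main(String[] args) {
        // A node without the value has already answered; the owner answers later.
        List<CompletableFuture<String>> replies = Arrays.asList(
                CompletableFuture.<String>completedFuture(null),
                CompletableFuture.supplyAsync(() -> "value"));

        System.out.println(firstNonNull(replies)); // prints "value"
    }
}
```

The trade-off is latency: a genuine miss is only reported once every queried node has answered, which is slower than returning on the first reply.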