
GridGain: how to redistribute a partitioned-mode cache when a node is brought down


I am new to GridGain and we are doing a POC with it. We built some simple examples using a partitioned cache, and they work well. However, we found that when we bring a node down, the cache data from that node is gone. So my question is: if we keep using partitioned mode, is there any way to redistribute the cache when a node (or several nodes) is undeployed? If not, is there a good way to do this? Thanks!

Configuration code:

<context:component-scan base-package="com.test" />
<bean id="hostGrid" class="org.gridgain.grid.GridSpringBean">
    <property name="configuration">
        <bean class="org.gridgain.grid.GridConfiguration">
            <property name="localHost" value="127.0.0.1"/>
            <property name="peerClassLoadingEnabled" value="false"/>
            <property name="marshaller">
                <bean class="org.gridgain.grid.marshaller.optimized.GridOptimizedMarshaller">
                    <property name="requireSerializable" value="false"/>
                </bean>
            </property>
            <property name="cacheConfiguration">
                <list>
                    <bean class="org.gridgain.grid.cache.GridCacheConfiguration">
                        <property name="name" value="CACHE"/>
                        <property name="cacheMode" value="PARTITIONED"/>
                        <property name="store">
                            <bean class="com.test.CacheJdbcPOCStore"></bean>
                        </property>
                    </bean>
                </list>
            </property>
        </bean>
    </property>
</bean>

We deployed the same WAR (using the above configuration) to 3 Tomcat 7 servers. We did not specify the number of backups, so it should be 1 by default.

Follow-up

I solved this problem by putting backups=1 in the configuration. It looks like previously no backup copy was created, although one copy should have been made, since that is supposed to be the default. Also, when I brought down 2 nodes at the same time, part of the cache was gone, so I set backups=2 and saw no cache loss this time. So it looks like, in the worst case where all nodes except the main node crash, we need (number of nodes - 1) backups to prevent data loss. But if I do that, it is just like replicated mode, and replicated mode has fewer restrictions on queries and transactions. So my question is: if we want to take advantage of parallel computation and at the same time prevent data loss when nodes crash, what is the best practice?
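
For reference, the backups change described above would look roughly like the fragment below. This is a sketch only: it reuses the cache bean from the configuration above and assumes the setting is exposed as a `backups` property on `GridCacheConfiguration`, which may differ between GridGain versions.

```xml
<!-- Sketch: same cache bean as in the original configuration, plus an explicit backup count. -->
<bean class="org.gridgain.grid.cache.GridCacheConfiguration">
    <property name="name" value="CACHE"/>
    <property name="cacheMode" value="PARTITIONED"/>
    <!-- Assumed property name; keeps one backup copy of each partition on another node. -->
    <property name="backups" value="1"/>
    <property name="store">
        <bean class="com.test.CacheJdbcPOCStore"/>
    </property>
</bean>
```

With backups=1 the cache survived one node going down in our test; losing 2 nodes at once needed backups=2.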

Thanks!


Solution

    1. The number of backups is 0 by default. The documentation has been fixed.
    2. You are right about REPLICATED mode. If you cannot tolerate any data loss, REPLICATED mode is the only way to guarantee that none occurs. The disadvantage is that writes get slower, because every node in the cluster has to be updated. The advantage is that the data is available on every node, so you can easily access it from your computations without worrying about which node to send them to.
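
As an illustration of point 2, switching the cache from the question to REPLICATED mode might look like the fragment below. Treat it as a minimal sketch: it reuses the bean, cache name, and store class from the question's configuration and changes only the `cacheMode` value.

```xml
<!-- Sketch: the question's cache bean with the mode switched to REPLICATED. -->
<bean class="org.gridgain.grid.cache.GridCacheConfiguration">
    <property name="name" value="CACHE"/>
    <!-- Every node holds a full copy of the cache, so no backup count is set. -->
    <property name="cacheMode" value="REPLICATED"/>
    <property name="store">
        <bean class="com.test.CacheJdbcPOCStore"/>
    </property>
</bean>
```

If full replication makes writes too slow, the alternative is to stay in PARTITIONED mode and set the number of backups explicitly to the number of simultaneous node failures you need to survive.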