java cassandra datastax ignite datastax-java-driver

DataStax driver keeps reconnecting to Cassandra from Apache Ignite ("you may want to increase the driver number of per-host connections")

Components: Apache Ignite + Apache Cassandra, using the default DataStax driver. After several operations (about 3-5 billion entities put to the cache), we end up in a situation where the DataStax driver constantly reconnects to Cassandra from Ignite.

2017-02-16 13:29:21.287  INFO 160487 --- [ sys-#441%null%] m.r.t.d.c.m.p.c.c.p.RetryPolicyImpl      :  init cluster
2017-02-16 13:29:21.288  INFO 160487 --- [ sys-#441%null%] com.datastax.driver.core.Cluster         : New Cassandra host <our host> added
2017-02-16 13:29:21.307  INFO 160487 --- [ sys-#441%null%] m.r.t.d.c.m.p.c.c.p.RetryPolicyImpl      :  close cluster
2017-02-16 13:29:23.516  INFO 160487 --- [ sys-#441%null%] com.datastax.driver.core.ClockFactory    : Using native clock to generate timestamps.
2017-02-16 13:29:23.537  INFO 160487 --- [ sys-#441%null%] c.d.d.c.p.DCAwareRoundRobinPolicy        : Using data-center name 'datacenter1' for DCAw

This process repeats endlessly; it can be interrupted by restarting the server.

Infrastructure: 1 Ignite server node (~Xmx30g, 8 cores), 25 Ignite client nodes (~Xmx1g, 8 cores), 1 Cassandra node. The batch size (entities that are put to the cache and then written to Cassandra) is about 1-2M.

DataSource config =>

 public DataSource dataSource() {
        DataSource dataSource = new DataSource();
        dataSource.setUser(login);
        dataSource.setPassword(pass);
        dataSource.setPort(port);
        dataSource.setContactPoints(host);
        dataSource.setRetryPolicy(retryPolicy);
        dataSource.setFetchSize(10_000);
        dataSource.setReconnectionPolicy(new ConstantReconnectionPolicy(1000));
        dataSource.setLoadBalancingPolicy(DCAwareRoundRobinPolicy.builder().withUsedHostsPerRemoteDc(0).build());
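        // 10_000 ms connect timeout (same value as the original literal 100_00, with the digit grouping fixed)
        dataSource.setSocketOptions(new SocketOptions().setReadTimeoutMillis(100_000).setConnectTimeoutMillis(10_000));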
        return dataSource;
    }

Cache config =>

 CacheConfiguration<KeyIgnite, Long> cfg = new CacheConfiguration<>();
        cfg
                .setName(area)
                .setRebalanceMode(CacheRebalanceMode.SYNC)
                .setStartSize(1_000_000)
                .setAtomicityMode(CacheAtomicityMode.ATOMIC)
                .setIndexedTypes(KeyIgnite.class, Long.class)
                .setCacheMode(CacheMode.PARTITIONED)
                .setBackups(0);

        if (!clientMode) {

            CassandraCacheStoreFactory<KeyIgnite, Long> csFactory = new CassandraCacheStoreFactory<>();
            csFactory.setDataSource(ds);
            csFactory.setPersistenceSettings(kv);

//            CassandraCacheStoreFactoryDwh<KeyIgnite, Long> csFactory = new CassandraCacheStoreFactoryDwh<>(ds, kv,params);

            cfg
                    .setCacheStoreFactory(csFactory)
                    .setReadThrough(true)
                    .setWriteThrough(true);
        }

        cfg.setExpiryPolicyFactory(TouchedExpiryPolicy.factoryOf(new Duration(TimeUnit.DAYS, 5)));
        return cfg;

Ignite config =>

   TcpDiscoveryMulticastIpFinder finder = new TcpDiscoveryMulticastIpFinder();
        finder.setAddresses(adresses);

        return Ignition.start(
                new IgniteConfiguration()
                        .setClientMode(clientMode)
                        .setDiscoverySpi(new TcpDiscoverySpi().setIpFinder(finder).setNetworkTimeout(10000))
                        .setFailureDetectionTimeout(50000)
                        .setPeerClassLoadingEnabled(false)
                        .setLoadBalancingSpi(new RoundRobinLoadBalancingSpi())

        );

After several iterations of putting items into the cache, we ran into this situation.

After enabling debug-level logging, I got this record:

2017-02-17 17:48:41.570 DEBUG 24816 --- [ sys-#184%null%] com.datastax.driver.core.RequestHandler  : [1071994376-1] Error querying ds-inmemorydb-02t/10.216.28.34:9042 : com.datastax.driver.core.exceptions.DriverException: Timeout while trying to acquire available connection (you may want to increase the driver number of per-host connections)
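
The message in this record suggests that requests wait for a free connection slot and give up after a timeout. As a rough illustration of that behaviour, here is a toy model in plain Java. It is only a sketch under that assumption, not the DataStax driver's actual implementation; `BoundedPool` and its methods are hypothetical names.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

/** Toy model of a bounded connection pool; NOT the DataStax driver's code. */
public class BoundedPool {
    // Each permit stands for one free slot on a pooled connection.
    private final Semaphore slots;

    public BoundedPool(int maxInFlight) {
        this.slots = new Semaphore(maxInFlight);
    }

    /** Waits up to timeoutMillis for a free slot; returns false if none frees up in time. */
    public boolean tryAcquire(long timeoutMillis) {
        try {
            return slots.tryAcquire(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    /** Returns a slot to the pool once the request has completed. */
    public void release() {
        slots.release();
    }

    public static void main(String[] args) {
        BoundedPool pool = new BoundedPool(1);
        System.out.println(pool.tryAcquire(50)); // true: the only slot is taken
        System.out.println(pool.tryAcquire(50)); // false: times out, the pool is saturated
        pool.release();
        System.out.println(pool.tryAcquire(50)); // true: the slot is free again
    }
}
```

In this model, a caller that writes faster than slots are released keeps hitting the `false` branch. Either a longer wait or more slots reduces those failures, which matches the hint in the error message.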

Solution

Increasing the timeout on the driver helped.
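
The answer does not say which timeout was raised. Given the error text, one plausible candidate is the pool acquisition timeout in the driver's `PoolingOptions`, possibly combined with more connections per host. The following is a minimal sketch, assuming DataStax Java driver 3.x; the class name `PoolingConfig` is hypothetical and the numbers are illustrative assumptions, not recommendations.

```java
import com.datastax.driver.core.HostDistance;
import com.datastax.driver.core.PoolingOptions;

public class PoolingConfig {
    /** Hypothetical helper; every value here is an illustrative assumption. */
    public static PoolingOptions poolingOptions() {
        return new PoolingOptions()
                // Wait longer for a free connection before the request fails.
                .setPoolTimeoutMillis(30_000)
                // Allow more connections to each local-DC host (core = 2, max = 8).
                .setConnectionsPerHost(HostDistance.LOCAL, 2, 8);
    }
}
```

With the Ignite Cassandra `DataSource` from the question, this could be wired in via `dataSource.setPoolingOptions(PoolingConfig.poolingOptions());` next to the existing `setSocketOptions` call. A larger timeout only hides the pressure on a single Cassandra node, so it may also be worth reducing the batch size or write concurrency.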