I have configured my Apache Ignite.NET instance to run as a server node:
    var cfg = new IgniteConfiguration
    {
        CommunicationSpi = new TcpCommunicationSpi
        {
            LocalPort = config.CommunicationPort,
            LocalPortRange = config.CommunicationPortRange,
            MaxConnectTimeout = TimeSpan.FromMilliseconds(10000),
            ConnectTimeout = TimeSpan.FromMilliseconds(1000)
        },
        AutoGenerateIgniteInstanceName = true,
        ClientMode = false,
        IsActiveOnStart = true,
        DiscoverySpi = new TcpDiscoverySpi
        {
            LocalPort = config.DiscoveryPort,
            LocalPortRange = config.DiscoveryPortRange,
            ForceServerMode = true,
            LocalAddress = localAddress,
            IpFinder = new TcpDiscoveryStaticIpFinder
            {
                Endpoints = config.ClusterEndPoints
            }
        },
        Localhost = config.LocalAddress,
    };
I set ForceServerMode = true, and the Endpoints list of the DiscoverySpi's IpFinder contains my local IP along with the IPs of the other nodes in my cluster.
What I'm seeing is that, for some reason, Ignite's join requests time out. Here's the exception log I get:
Level: [Error], Message:[Exception on direct send: connect timed out] Native:[java.net.SocketTimeoutException: connect timed out
at java.net.DualStackPlainSocketImpl.waitForConnect(Native Method)
at java.net.DualStackPlainSocketImpl.socketConnect(DualStackPlainSocketImpl.java:85)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:172)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:589)
at org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.openSocket(TcpDiscoverySpi.java:1376)
at org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.openSocket(TcpDiscoverySpi.java:1339)
at org.apache.ignite.spi.discovery.tcp.ServerImpl.sendMessageDirectly(ServerImpl.java:1159)
at org.apache.ignite.spi.discovery.tcp.ServerImpl.sendJoinRequestMessage(ServerImpl.java:1006)
at org.apache.ignite.spi.discovery.tcp.ServerImpl.joinTopology(ServerImpl.java:851)
at org.apache.ignite.spi.discovery.tcp.ServerImpl.spiStart(ServerImpl.java:358)
at org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.spiStart(TcpDiscoverySpi.java:1834)
at org.apache.ignite.internal.managers.GridManagerAdapter.startSpi(GridManagerAdapter.java:297)
at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager.start(GridDiscoveryManager.java:837)
at org.apache.ignite.internal.IgniteKernal.startManager(IgniteKernal.java:1770)
at org.apache.ignite.internal.IgniteKernal.start(IgniteKernal.java:977)
at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start0(IgnitionEx.java:1896)
at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start(IgnitionEx.java:1648)
at org.apache.ignite.internal.IgnitionEx.start0(IgnitionEx.java:1076)
at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:574)
at org.apache.ignite.internal.processors.platform.PlatformAbstractBootstrap.start(PlatformAbstractBootstrap.java:48)
at org.apache.ignite.internal.processors.platform.PlatformIgnition.start(PlatformIgnition.java:76)
]
That's fine: maybe there is a network issue, partitioning, a firewall, etc. I can figure that out.
What I don't understand is why the call that starts the Ignite node hangs. I expect it to try to connect to those endpoints and, if it cannot, to just start a local node. Here's how I start my node:
Ignition.Start(cfg);
Instead, it keeps trying to join, the timeout logs above keep being written, and it never stops, so the application hangs indefinitely.
Am I missing some configuration that makes Ignite give up trying to connect and either start in local mode or fail altogether?
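To make the behaviour I'm after concrete, here is a minimal sketch of a bounded startup. It assumes that TcpDiscoverySpi.JoinTimeout (which defaults to zero, meaning wait forever) is the relevant setting and that a join timeout surfaces in .NET as an IgniteException. I have not confirmed either for this scenario. BoundedJoinSketch and StartOrFail are hypothetical names used only for illustration.

```csharp
using System;
using Apache.Ignite.Core;
using Apache.Ignite.Core.Common;
using Apache.Ignite.Core.Discovery.Tcp;

// Hypothetical helper, not part of Ignite: illustrates the "fail instead of hang" behaviour.
public static class BoundedJoinSketch
{
    public static IIgnite StartOrFail(IgniteConfiguration cfg, TimeSpan joinTimeout)
    {
        // Assumption: cfg.DiscoverySpi is the TcpDiscoverySpi from the configuration above.
        var discovery = (TcpDiscoverySpi) cfg.DiscoverySpi;

        // Assumption: a non-zero JoinTimeout bounds how long the node keeps retrying
        // the addresses from the IP finder before startup is aborted.
        discovery.JoinTimeout = joinTimeout;

        try
        {
            return Ignition.Start(cfg);
        }
        catch (IgniteException ex)
        {
            // Assumption: a failed join is reported as IgniteException on the .NET side.
            Console.Error.WriteLine("Ignite node failed to start: " + ex.Message);
            throw;
        }
    }
}
```

I would call it as BoundedJoinSketch.StartOrFail(cfg, TimeSpan.FromSeconds(30)) in place of the plain Ignition.Start(cfg); the 30 second value is arbitrary.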
[Edit] This only happens when other apps with Ignite are already running in a cluster and this new node tries to join the existing cluster via static IPs (and its VM has a bad network config that prevents it from talking to the existing cluster). If I start this new node when no Ignite instances are running, it does NOT hang; it just goes ahead and starts a local Ignite node.