I am using JGroups reliable multicast for communication. I have a structure as follows.
When I start the applications, the first app to join on each node joins the cluster without problems, and these members can communicate with each other. The remaining nodes are unstable: their behaviour changes on every startup. Sometimes they join the cluster, sometimes they do not. I could not find any pattern.
One log message I can share says something like "... is not a member, discarding message", so it appears that the node has not joined the cluster. The protocol stack I use is:
PING
MERGE2
FD_SOCK
FD_ALL with values "timeout"=12000, "interval"=3000
VERIFY_SUSPECT
BARRIER
NAKACK
UNICAST2
STABLE
GMS
UFC
MFC
FRAG2
How can I handle this problem? (Version 3.6.1.Final)
I upgraded to JGroups version 4.* and replaced the protocols with their newer versions, adding one new protocol, as follows.
PING
MERGE3
FD_SOCK
FD_ALL with values "timeout"=15000, "interval"=3000
VERIFY_SUSPECT
BARRIER
NAKACK2
UNICAST3
STABLE
GMS with value "max_join_attempts"=0
UFC
MFC
FRAG2
STATE_TRANSFER
I am not sure which change solved the problem, but "max_join_attempts"=0 may be the key point, since with this setting nodes never give up trying to join the cluster.
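In case it helps, the 4.* stack listed above could be written as a JGroups XML configuration roughly like this. It is only a sketch, not my exact file: the UDP transport at the top is an assumption (the list above does not name a transport), and every attribute not mentioned above is left at its default.

```xml
<config xmlns="urn:org:jgroups">
    <!-- Assumption: UDP multicast transport; the original list does not name one -->
    <UDP/>
    <PING/>
    <MERGE3/>
    <FD_SOCK/>
    <!-- Heartbeat every 3 s, suspect a member after 15 s of silence -->
    <FD_ALL timeout="15000" interval="3000"/>
    <VERIFY_SUSPECT/>
    <BARRIER/>
    <pbcast.NAKACK2/>
    <UNICAST3/>
    <pbcast.STABLE/>
    <!-- 0 means a joiner keeps retrying instead of giving up -->
    <pbcast.GMS max_join_attempts="0"/>
    <UFC/>
    <MFC/>
    <FRAG2/>
    <pbcast.STATE_TRANSFER/>
</config>
```

The order matters: protocols are listed bottom-up, from the transport to the one closest to the application, which is why the transport comes first.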