We have a two-node cluster: Node A and Node B.
Service X is running on Node A; Node B is the DC.
We are using the Corosync stack with Pacemaker. failure-timeout is 10 seconds and target-role is Started.
The events happen in this order:
1. Node A sends an event to Node B: Service X is down.
2. Node B prints "Ignoring expired failure for Service X".
3. After this, Service X is never restarted by the cluster.
Now, the questions are:
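One possible reason is a time difference between the two servers (the DC and the other machine).
In that case the DC treats the failure as already older than the failure-timeout and ignores it. Please synchronize the clocks on both nodes (for example with NTP) and then try to reproduce the issue.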
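
To illustrate the clock-skew explanation, here is a minimal sketch of how a failure could look expired to the DC. It is a simplified model, not Pacemaker's actual code: the function is_expired, the 15-second skew, and the timestamps are all assumptions made up for the example. Only the 10-second failure-timeout comes from the question.

```python
from datetime import datetime, timedelta

# failure-timeout from the question
FAILURE_TIMEOUT = timedelta(seconds=10)


def is_expired(failure_time, dc_now, failure_timeout=FAILURE_TIMEOUT):
    """Hypothetical check: a failure counts as expired once its age,
    measured against the DC's clock, reaches the failure-timeout."""
    return dc_now - failure_time >= failure_timeout


# Node A stamps the failure using its own clock (illustrative value).
failure_time = datetime(2024, 1, 1, 12, 0, 0)

# The DC evaluates the failure one second of real time later.
real_delay = timedelta(seconds=1)

# Assumed skew: the DC's clock runs 15 seconds ahead of Node A's.
skew = timedelta(seconds=15)

dc_now_skewed = failure_time + real_delay + skew
dc_now_synced = failure_time + real_delay

expired_with_skew = is_expired(failure_time, dc_now_skewed)
expired_when_synced = is_expired(failure_time, dc_now_synced)

print("expired with skewed clocks:", expired_with_skew)
print("expired with synced clocks:", expired_when_synced)
```

With the clocks in sync, the one-second-old failure is well inside the 10-second window and would be acted on. With the assumed skew it appears 16 seconds old, so it would be ignored as expired. This matches the "Ignoring expired failure" message in the question.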