Both nodes stay in island mode even though they are reachable

From DocWiki


Problem Summary

Due to network instability and high packet loss (>30%), communication between the two nodes may fail, and both nodes can remain in island mode.

CVD tries to connect to the remote node every 10 seconds if the remote node is not yet connected. If network connectivity is unstable and packet loss is high, the cluster can be left in a state from which it cannot converge.

Error Message

Subscriber node tries to join.

MCVD-LIB_TPL-7-UNK:source=com.hazelcast.cluster.TcpIpJoiner, message=
[10.78.92.214] :5900 [UccxCvdCluster-1382423947000] sending join request for
Address[10.78.92.213]:5900
MCVD-LIB_TPL-7-UNK:source=com.hazelcast.cluster.TcpIpJoiner, message=
[10.78.92.214] :5900 [UccxCvdCluster-1382423947000] Address[10.78.92.214]:5900
couldn't find a master! but there was connections
available: [Address[10.78.92.213]: 5900]
MCVD-LIB_TPL-7-UNK:source=com.hazelcast.cluster.TcpIpJoiner, message=
[10.78.92.214] :5900 [UccxCvdCluster-1382423947000] Rebooting after 10 seconds.


Remote node fails to process JOIN request sent by the joining node.

MCVD-LIB_TPL-7-UNK:source=com.hazelcast.cluster.ClusterService, message=
[10.78.92.213]: 5900 [UccxCvdCluster-1382423947000] Handling join from
Address[10.78.92.214]:5900, inProgress: true, timeToStart: -51788455
MCVD-LIB_TPL-7-UNK:source=com.hazelcast.cluster.ClusterService, message=
[10.78.92.213]: 5900 [UccxCvdCluster-1382423947000] Handling join from
Address[10.78.92.214]:5900, inProgress: true, timeToStart: -51827630
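
The two log signatures above can be searched for in a saved copy of the CVD log text. The following is a minimal sketch only, not a Cisco tool: the function name scan_cvd_log and the returned keys are hypothetical, and the only strings taken from this page are "couldn't find a master" and "Handling join from ... inProgress: true". It assumes the log has already been read into a single string.

```python
import re

# Substring copied from the joiner-side log excerpt on this page.
NO_MASTER = "couldn't find a master"

# Pattern for the remote-side message. \s+ and \s* tolerate the line wrap
# and the stray spaces seen in the excerpts.
JOIN_RE = re.compile(
    r"Handling join from\s+Address\[(?P<ip>[\d.]+)\]:\s*(?P<port>\d+),"
    r"\s*inProgress:\s*(?P<in_progress>true|false)"
)


def scan_cvd_log(text: str) -> dict:
    """Count the two island-mode symptoms in a block of log text.

    Hypothetical helper for illustration; the keys below are not part of
    any product API.
    """
    stuck_sources = {
        m.group("ip")
        for m in JOIN_RE.finditer(text)
        if m.group("in_progress") == "true"
    }
    return {
        # How often a joining node reported that it found no master.
        "no_master_count": text.count(NO_MASTER),
        # Nodes whose join requests were seen while a join was in progress.
        "stuck_join_sources": sorted(stuck_sources),
    }
```

If both values are non-empty for the same time window, the logs match the pattern described in this article; this is a hint to check the network, not proof of the cause.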


Possible Cause


The problem is most likely caused by network instability or high packet loss between the two nodes.


Recommended Action

Identify and fix any network issues between the two nodes, such as instability or high packet loss.
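
One way to check whether packet loss is in the range described in the problem summary is to run a ping test between the two nodes and compare the reported loss with the 30% figure. The sketch below is illustrative only: it assumes the ping output has been captured as text and that it contains a summary phrase of the form "N% packet loss", as printed by common Linux ping implementations. The function names are hypothetical.

```python
import re
from typing import Optional

# Threshold taken from the problem summary on this page (>30% loss).
LOSS_THRESHOLD_PERCENT = 30.0

# Assumed summary format, e.g. "10 packets transmitted, 6 received, 40% packet loss".
LOSS_RE = re.compile(r"(\d+(?:\.\d+)?)% packet loss")


def packet_loss_percent(ping_output: str) -> Optional[float]:
    """Return the loss percentage found in ping output, or None if absent."""
    match = LOSS_RE.search(ping_output)
    return float(match.group(1)) if match else None


def exceeds_threshold(ping_output: str,
                      threshold: float = LOSS_THRESHOLD_PERCENT) -> bool:
    """True when the reported loss is strictly above the threshold."""
    loss = packet_loss_percent(ping_output)
    return loss is not None and loss > threshold
```

A single short ping run can miss intermittent loss, so repeated or longer tests give a more reliable picture.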

Release: Release 10.0(1)
Associated CDETS #: N/A
