How do you programmatically configure Hazelcast for the multicast discovery mechanism?
The problem apparently is that the cluster starts (and stops) without waiting until enough members have joined. You can set the hazelcast.initial.min.cluster.size property to prevent this from happening.
You can set 'hazelcast.initial.min.cluster.size' programmatically using:
Config config = new Config();
config.setProperty("hazelcast.initial.min.cluster.size","3");
Your configuration is correct, but you have set a very long multicast timeout of 200 seconds, where the default is 2 seconds. Setting a smaller value will solve it.
From the Hazelcast Java API documentation for MulticastConfig#setMulticastTimeoutSeconds(int):
Specifies the time in seconds that a node should wait for a valid multicast response from another node running in the network before declaring itself as master node and creating its own cluster. This applies only to the startup of nodes where no master has been assigned yet. If you specify a high value, e.g. 60 seconds, it means until a master is selected, each node is going to wait 60 seconds before continuing, so be careful with providing a high value. If the value is set too low, it might be that nodes are giving up too early and will create their own cluster.
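To tie this back to the question, here is a minimal sketch of a programmatic multicast setup with a short timeout. It assumes the Hazelcast library is on the classpath; the class name MulticastMember is made up for illustration, and the group and port values are what I understand to be Hazelcast's defaults, so adjust them to your network.

```java
import com.hazelcast.config.Config;
import com.hazelcast.config.JoinConfig;
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;

// Hypothetical class name, used only for this sketch.
public class MulticastMember {
    public static void main(String[] args) {
        Config config = new Config();
        JoinConfig join = config.getNetworkConfig().getJoin();

        // Discover members via multicast only, so disable the TCP/IP member list.
        join.getTcpIpConfig().setEnabled(false);
        join.getMulticastConfig()
            .setEnabled(true)
            .setMulticastGroup("224.2.2.3")   // assumed default group
            .setMulticastPort(54327)          // assumed default port
            .setMulticastTimeoutSeconds(2);   // default mentioned above, instead of 200

        HazelcastInstance instance = Hazelcast.newHazelcastInstance(config);
        System.out.println("Members: " + instance.getCluster().getMembers());
    }
}
```

Start this on each machine; if multicast traffic gets through, the printed member list should grow as the nodes find each other.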
It seems you are using TCP/IP clustering, so that is good. Try the following (from the Hazelcast book):
If you are making use of iptables, the following rule can be added to allow for outbound traffic from ports 33000-31000:
iptables -A OUTPUT -p TCP --dport 33000:31000 -m state --state NEW -j ACCEPT
and to control incoming traffic from any address to port 5701:
iptables -A INPUT -p tcp -d 0/0 -s 0/0 --dport 5701 -j ACCEPT
and to allow incoming multicast traffic:
iptables -A INPUT -m pkttype --pkt-type multicast -j ACCEPT
Connectivity test
If you are having trouble because machines won't join a cluster, you might check the network connectivity between the 2 machines. You can use a tool called iperf for that. On one machine you execute:
iperf -s -p 5701
This means that you are listening at port 5701.
At the other machine you execute the following command:
iperf -c 192.168.1.107 -d -p 5701
Where you replace '192.168.1.107' with the IP address of your first machine. If you run the command and you get output like this:
------------------------------------------------------------
Server listening on TCP port 5701
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
------------------------------------------------------------
Client connecting to 192.168.1.107, TCP port 5701
TCP window size: 59.4 KByte (default)
------------------------------------------------------------
[ 5] local 192.168.1.105 port 40524 connected with 192.168.1.107 port 5701
[ 4] local 192.168.1.105 port 5701 connected with 192.168.1.107 port 33641
[ ID] Interval Transfer Bandwidth
[ 4] 0.0-10.2 sec 55.8 MBytes 45.7 Mbits/sec
[ 5] 0.0-10.3 sec 6.25 MBytes 5.07 Mbits/sec
Then you know the 2 machines can connect to each other. However, if you are seeing something like this:
Server listening on TCP port 5701
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
connect failed: No route to host
Then you know that you might have a network connection problem on your hands.
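If iperf is not installed, a rough equivalent of the reachability part of this test can be done from the JVM itself. The following is only a sketch using the standard java.net classes; the class name PortCheck and the method canConnect are hypothetical names chosen for this example. It checks whether a TCP connection to the port can be opened at all. It does not measure bandwidth and it says nothing about multicast.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper, not part of Hazelcast or iperf.
public class PortCheck {

    // Returns true if a TCP connection to host:port can be opened within the timeout.
    public static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Covers "connection refused", "no route to host" and timeouts.
            return false;
        }
    }

    public static void main(String[] args) {
        // Pass the other machine's address and port, or fall back to the values from the text.
        String host = args.length > 0 ? args[0] : "192.168.1.107";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 5701;
        boolean reachable = canConnect(host, port, 3000);
        System.out.println(host + ":" + port + (reachable ? " is reachable" : " is NOT reachable"));
    }
}
```

Something has to be listening on the target port for this to succeed, for example a running Hazelcast member or the iperf -s -p 5701 command from above.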