What does it mean to bind a multicast (UDP) socket?
To bind a UDP socket when receiving multicast means to specify an address and port from which to receive data (NOT a local interface, as is the case for TCP acceptor bind). The address specified in this case has a filtering role, i.e. the socket will only receive datagrams sent to that multicast address & port, no matter what groups are subsequently joined by the socket. This explains why when binding to INADDR_ANY (0.0.0.0) I received datagrams sent to my multicast group, whereas when binding to any of the local interfaces I did not receive anything, even though the datagrams were being sent on the network to which that interface corresponded.
Quoting from UNIX® Network Programming Volume 1, Third Edition: The Sockets Networking API by W.R Stevens. 21.10. Sending and Receiving
[...] We want the receiving socket to bind the multicast group and port, say 239.255.1.2 port 8888. (Recall that we could just bind the wildcard IP address and port 8888, but binding the multicast address prevents the socket from receiving any other datagrams that might arrive destined for port 8888.) We then want the receiving socket to join the multicast group. The sending socket will send datagrams to this same multicast address and port, say 239.255.1.2 port 8888.
The "bind" operation is basically saying, "use this local UDP port for sending and receiving data. In other words, it allocates that UDP port for exclusive use for your application. (Same holds true for TCP sockets).
When you bind to "0.0.0.0" (INADDR_ANY
), you are basically telling the TCP/IP layer to use all available adapters for listening and to choose the best adapter for sending. This is standard practice for most socket code. The only time you wouldn't specify 0 for the IP address is when you want to send/receive on a specific network adapter.
Similarly if you specify a port value of 0 during bind, the OS will assign a randomly available port number for that socket. So I would expect for UDP multicast, you bind to INADDR_ANY on a specific port number where multicast traffic is expected to be sent to.
The "join multicast group" operation (IP_ADD_MEMBERSHIP
) is needed because it basically tells your network adapter to listen not only for ethernet frames where the destination MAC address is your own, it also tells the ethernet adapter (NIC) to listen for IP multicast traffic as well for the corresponding multicast ethernet address. Each multicast IP maps to a multicast ethernet address. When you use a socket to send to a specific multicast IP, the destination MAC address on the ethernet frame is set to the corresponding multicast MAC address for the multicast IP. When you join a multicast group, you are configuring the NIC to listen for traffic sent to that same MAC address (in addition to its own).
Without the hardware support, multicast wouldn't be any more efficient than plain broadcast IP messages. The join operation also tells your router/gateway to forward multicast traffic from other networks. (Anyone remember MBONE?)
If you join a multicast group, all the multicast traffic for all ports on that IP address will be received by the NIC. Only the traffic destined for your binded listening port will get passed up the TCP/IP stack to your app. In regards to why ports are specified during a multicast subscription - it's because multicast IP is just that - IP only. "ports" are a property of the upper protocols (UDP and TCP).
You can read more about how multicast IP addresses map to multicast ethernet addresses at various sites. The Wikipedia article is about as good as it gets:
The IANA owns the OUI MAC address 01:00:5e, therefore multicast packets are delivered by using the Ethernet MAC address range 01:00:5e:00:00:00 - 01:00:5e:7f:ff:ff. This is 23 bits of available address space. The first octet (01) includes the broadcast/multicast bit. The lower 23 bits of the 28-bit multicast IP address are mapped into the 23 bits of available Ethernet address space.
Correction for What does it mean to bind a multicast (udp) socket? as long as it partially true at the following quote:
The "bind" operation is basically saying, "use this local UDP port for sending and receiving data. In other words, it allocates that UDP port for exclusive use for your application
There is one exception. Multiple applications can share the same port for listening (usually it has practical value for multicast datagrams), if the SO_REUSEADDR
option applied. For example
int sock = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP); // create UDP socket somehow
...
int set_option_on = 1;
// it is important to do "reuse address" before bind, not after
int res = setsockopt(sock, SOL_SOCKET, SO_REUSEADDR, (char*) &set_option_on,
sizeof(set_option_on));
res = bind(sock, src_addr, len);
If several processes did such "reuse binding", then every UDP datagram received on that shared port will be delivered to each of the processes (providing natural joint with multicasts traffic).
Here are further details regarding what happens in a few cases:
attempt of any bind ("exclusive" or "reuse") to free port will be successful
attempt to "exclusive binding" will fail if the port is already "reuse-binded"
attempt to "reuse binding" will fail if some process keeps "exclusive binding"