Hi,
We're experiencing strange behaviour with IGMP multicast using udpcast.
- the sender (quite an old version, 20081213) runs on 64-bit Linux SLES 11 in a VMware ESX guest
- the receivers (same version) run on WinPE 2.0 x86 computers
- sender and receivers are on the same VLAN
This used to work amazingly well until, for some reason (to fix a bug, actually), there was a firmware update on a 10Gb Dell switch where the blade server is running all the VMware VMs. Now it doesn't work anymore:
- the CMD_HELLO part works as expected, and the receivers are given their ID and the multicast data address to join
- then, when the receiver actually sends CMD_GO, the receivers don't even see that the transfer has begun, and the sender finally quits after the normal timeout with no answer from the participants
It seems the clients never receive the data sent to the multicast data address. So we ran some tests using iperf as a multicast packet generator and rtpqual as a receiver: rtpqual was able to receive the packets only if an rtpqual was also running on the machine where iperf was generating the packets! Then we ran the same test with udp-sender and udp-receiver (2 receivers) and guess what? It started to work as soon as we started an rtpqual process on the sender side!
We could observe that when rtpqual is not running on the sender machine, IGMP prunes the tree on the Dell switch and the receivers no longer get the stream.
It seems udp-sender should also bind to the multicast data address and portbase. From the udpcast source code (socklib.c):
/**
 * Set socket to listen on given multicast address Not 100% clean, it
 * would be preferable to make a new socket, and not only subscribe it
 * to the multicast address but also _bind_ to it. Indeed, subscribing
 * alone is not enough, as we may get traffic destined to multicast
 * address subscribed to by other apps on the machine. However, for
 * the moment, we skip this concern, as udpcast's main usage is
 * software installation, and in that case it runs on an otherwise
 * quiet system.
 */
static int mcastListen(int sock, net_if_t *net_if, struct sockaddr_in *addr) {
    return mcastOp(sock, net_if, getSinAddr(addr), IP_ADD_MEMBERSHIP,
                   "Subscribe to multicast group");
}
However, mcastListen is called from makeSocket() (socklib.c), which does actually bind the socket. The latest version of udpcast (20100130) doesn't seem to change anything here and behaves the same.
So, I tried to create an additional socket in startSender() (udps_negociate.c) and bind to RECEIVER_PORT(portBase):
mysock = makeSocket(ADDR_TYPE_MCAST, net_config->net_if, net_config->dataMcastAddr, RECEIVER_PORT(net_config->portBase));
But I got the same behaviour :(
So I pasted the whole openMC() function from rtpqual.c to socklib.c:
#define NOERROR(val,msg) {if (((int)(val)) < 0) {perror(msg);exit(1);}}

int openMC(name, port)
    char *name;
    int port;
{
    struct sockaddr_in sin;
    struct ip_mreq mreq;
    struct hostent *hp;
    int fd, one=1;

    bzero(&sin, sizeof(struct sockaddr_in));
    if (isdigit(*name)) {
        sin.sin_addr.s_addr = inet_addr(name);
    } else if (hp = gethostbyname(name)) {
        bcopy(hp->h_addr, (char *)&sin.sin_addr, hp->h_length);
    } else {
        printf("I Don't understand session name %s\n",name);
        exit(1);
    }
    sin.sin_family = AF_INET;
    sin.sin_port = port;

    if (!IN_MULTICAST(ntohl(sin.sin_addr.s_addr))) {
        printf("%s is not a multicast session\n", name);
        exit(1);
    }
    mreq.imr_multiaddr = sin.sin_addr;
    mreq.imr_interface.s_addr = INADDR_ANY;

    NOERROR(fd = socket(AF_INET, SOCK_DGRAM, 0), "socket");
    NOERROR(setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &one, sizeof(one)), "SO_REUSEADDR");

    if (bind(fd, (const struct sockaddr *) &sin, sizeof(sin)) == -1) {
        perror("Using INADDR_ANY because");
        sin.sin_addr.s_addr = INADDR_ANY;
        NOERROR(bind(fd, (const struct sockaddr *) &sin, sizeof(sin)), "bind");
    }

    NOERROR(setsockopt(fd, IPPROTO_IP, IP_ADD_MEMBERSHIP, &mreq, sizeof(mreq)), "IP_ADD_MEMBERSHIP");

    return(fd);
}
I call it from the main() function in udp-sender.c. myMcastDataAddr is a string copy of the command-line argument: the net_config structure only stores sockaddr_in values, and this way I can use the openMC() function directly without any modification.
/* -- */
printf("addr = %s\n", myMcastDataAddr);
printf("port = %d\n", RECEIVER_PORT(net_config.portBase));
printf("openMC()\n");
mysock = openMC(myMcastDataAddr, RECEIVER_PORT(net_config.portBase));
/* -- */
It does work now, even though this is a really ugly hack! However, openMC() seems to do the same thing as makeSocket(): create the socket, bind it to the specified multicast address and port, and call setsockopt() for IP_ADD_MEMBERSHIP.
So now it works for me, but it would be great if a correct patch (not mine) were integrated into the source code (others may need it too).
EDIT: it works with makeSocket() only if I apply htons() to the port number:
mysock = makeSocket(ADDR_TYPE_MCAST, net_config->net_if, &net_config->dataMcastAddr, htons(RECEIVER_PORT(net_config->portBase)));
Other calls to makeSocket() don't use htons() to swap the bytes, so initially I wasn't doing it either ...
thanks Alban
-- Alban Rodriguez Centre de Ressources Informatiques Université de La Rochelle http://cri.univ-lr.fr
Hello Rodriguez
You really should try with the most recent version of udpcast to check whether this problem has already been solved over the past 2 years.
Additionally, I would recommend triple-checking the switch's configuration. In my experience, in 99 out of 100 cases a network misconfiguration is behind problems with udpcast.
Just my 0,5€.
Kind regards Jens
On Tue, Feb 9, 2010 at 3:49 PM, Rodriguez Alban alban.rodriguez@univ-lr.fr wrote:
Hi, We're experiencing a strange behaviour with multicast igmp using udpcast.
Rodriguez Alban wrote:
EDIT: it works with makeSocket() only if I apply htons() to the port number:
mysock = makeSocket(ADDR_TYPE_MCAST, net_config->net_if, &net_config->dataMcastAddr, htons(RECEIVER_PORT(net_config->portBase)));
Other calls to makeSocket() don't use htons() to swap the bytes, so initially I wasn't doing it either ...
makeSocket already does the byte-swapping itself (in the called function initSockAddress). So, by doing it here you really make it bind to a different port than the intended one.
The effect would be the same as doing:
mysock = makeSocket(ADDR_TYPE_MCAST, net_config->net_if, &net_config->dataMcastAddr, RECEIVER_PORT(net_config->portBase)+2);
... or even suppressing that call altogether... Could you try whether these 2 changes (or one of them) work with your hardware?
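To illustrate the double conversion, here is a small sketch. It is not udpcast code: callee_stores_port() is a hypothetical stand-in for a callee that applies htons() itself (as described above for makeSocket() and initSockAddress()), and the other names are made up for the example as well.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a callee that converts the port itself
 * before storing it in sin_port. */
static uint16_t callee_stores_port(uint16_t port_host_order)
{
    return htons(port_host_order);
}

/* Port number the socket ends up bound to, given the argument
 * the caller passed in. */
uint16_t bound_port(uint16_t port_arg)
{
    return ntohs(callee_stores_port(port_arg));
}

/* Non-zero if pre-swapping in the caller makes the socket bind to a
 * port other than the intended one. */
int double_swap_changes_port(uint16_t port)
{
    return bound_port(htons(port)) != port;
}

/* Sanity check: a port passed in host byte order always comes out unchanged. */
void check_roundtrip(uint16_t port)
{
    assert(bound_port(port) == port);
}
```

Taking 9000 as an example port: on a little-endian x86 host, bound_port(htons(9000)) is 10275 (0x2328 becomes 0x2823), so the socket is bound to a different port than intended; on a big-endian host htons() is a no-op and the port is unchanged.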
Regards,
Alain