On Tue 10 Dec 2002 23:45, you wrote:
What do you mean by "server"? The sender, or just the DHCP server?
The sender. The disk images will be stored on a machine on one subnet and all the receivers will be on another subnet.
ok
- If you mean the sender of data, you must be aware that multicast works best within the same subnet. It is possible to get it to work across different subnets by using the --ttl option on both the sender and the receivers. This sets the "time to live" on the multicast packets, allowing them to traverse routers. However, this only works if the router supports multicast routing, and unfortunately most don't :-(
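To illustrate what a "time to live" setting amounts to at the socket level, here is a minimal Python sketch. It is not udpcast's code; the helper name make_multicast_sender_socket and the TTL value 4 are made up for the example. It only shows the standard IP_MULTICAST_TTL socket option, whose usual default of 1 keeps multicast packets on the local subnet.

```python
import socket

def make_multicast_sender_socket(ttl):
    """Create a UDP socket whose outgoing multicast packets carry the given TTL.

    Hypothetical helper for illustration; udpcast itself is written in C.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    # Each router that forwards the packet decrements the TTL by one;
    # a packet whose TTL is used up is not forwarded any further.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, ttl)
    return sock

sock = make_multicast_sender_socket(ttl=4)  # 4 is an arbitrary example value
print(sock.getsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL))
sock.close()
```

A TTL slightly larger than the number of routers between sender and receivers is normally enough.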
Our router does support multicast; it is a Cisco 6500. I worked out the problem. The address range 224.0.0.0/24 is a reserved, non-forwardable range. I had to choose an address outside this range for the routers to forward the packets. I used the address 239.0.0.1.
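A quick way to check a candidate address against that reserved range is sketched below in Python, using the standard ipaddress module. The function name is_forwardable_multicast is made up for this example, and the check only covers the address range: whether a given router actually forwards a group still depends on its multicast routing configuration.

```python
import ipaddress

# 224.0.0.0/24 is reserved for link-local use; routers do not forward it.
LOCAL_CONTROL_BLOCK = ipaddress.ip_network("224.0.0.0/24")

def is_forwardable_multicast(address):
    """Return True if address is multicast and outside 224.0.0.0/24.

    Hypothetical helper for illustration only.
    """
    addr = ipaddress.ip_address(address)
    return addr.is_multicast and addr not in LOCAL_CONTROL_BLOCK

for candidate in ("224.0.0.1", "239.0.0.1", "192.168.1.1"):
    print(candidate, is_forwardable_multicast(candidate))
```

With these inputs, 224.0.0.1 is rejected because it lies in the reserved block, 239.0.0.1 passes, and 192.168.1.1 is rejected because it is not a multicast address at all.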
I also came across some other problems. Sometimes, all the receivers time out on the sender. I increased the timeout but that did not help. I think what is happening is that the receiver misses a packet and sends a notification to the sender. That notification gets lost, though, and both sender and receiver are then left waiting for the other to make a move.
The udpcast server sends packets in batches ("slices") of up to 1024 packets. When a slice is transmitted, the sender "asks" the receivers to send their confirmations. Receivers reply either with a confirmation that everything has been received (good), or with a bitmap of missing packets. Missing packets are then transmitted, and confirmation is again asked of those receivers which weren't ok the first time around. If a receiver does not reply at all, the sender will re-ask for confirmation every couple of seconds, until all confirmations are in. If a receiver has not replied for about a minute, the sender drops that receiver from its list, in order to be able to finish the transmission with the others.
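The confirmation step described above can be sketched roughly as follows in Python. This is only an illustration of the idea: the function names and the use of a plain integer as the bitmap are assumptions, not udpcast's actual wire format or code.

```python
MAX_SLICE_SIZE = 1024  # a slice holds up to 1024 packets, as described above

def missing_bitmap(received, slice_len):
    """Receiver side: build a bitmap with bit i set when packet i is missing.

    A bitmap of 0 means "everything received". (Assumed encoding.)
    """
    bitmap = 0
    for i in range(slice_len):
        if i not in received:
            bitmap |= 1 << i
    return bitmap

def packets_to_resend(bitmaps, slice_len):
    """Sender side: merge all receivers' bitmaps and list packets to resend."""
    combined = 0
    for bitmap in bitmaps:
        combined |= bitmap
    return [i for i in range(slice_len) if (combined >> i) & 1]

# Toy slice of 8 packets: receiver A has everything, receiver B lost 3 and 5.
slice_len = 8
reply_a = missing_bitmap(set(range(8)), slice_len)
reply_b = missing_bitmap({0, 1, 2, 4, 6, 7}, slice_len)
print(packets_to_resend([reply_a, reply_b], slice_len))
```

In this toy run only packets 3 and 5 are retransmitted, and only receiver B would be asked to confirm again. The sketch leaves out the re-asking and the dropping of silent receivers.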
If I limit the bit rate, this problem is less frequent, so I think the router has bandwidth restrictions and drops packets once they are exceeded. This would not be a problem, though, if the session did not time out.
Unfortunately for the moment the timeout duration is hardwired, and is 200. This is specified in line 921 of senddata.c:
if(rexmitSlice->rxmitId > 200)
You can change it there to a higher value. In a future version, this will become configurable.
If this is the case, how hard do you think it would be for the receiver to retransmit whatever it needs to get the transmission going again, if say it has not got a reply from the sender in so many milliseconds?
John.
Obviously, after a while, the router starts dropping all the packets in (at least) one direction. Once that happens, nothing the receiver sends will unblock the situation.
In order to debug this, you can try running the command-line variant of udpcast on a server and on (already installed) receiver(s), and run tcpdump along with it to check where the packets are dropped.
Alain