On Tue 10 Dec 2002 23:45, you wrote:
What do you mean by "server"? The sender, or just the DHCP server?
The sender. The disk images will be stored on a machine on one subnet
and all the receivers will be on another subnet.
1. If you mean the sender of data, you must be aware that multicast
works best within the same subnet. It is possible to get it to work
across different subnets by using the --ttl option on both the sender
and the receivers. This sets the "time to live" on the multicast
packets, allowing them to traverse routers. However, this only works
if the router supports multicast routing, which unfortunately most
don't :-(
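
For illustration, here is roughly what setting a multicast TTL looks like at
the socket level. This is a minimal sketch assuming a POSIX sockets
environment; the function names are made up for the example and it is not
taken from the udpcast sources.

```c
#include <assert.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: set the TTL used for outgoing multicast packets
 * on a UDP socket. Returns 0 on success, -1 on failure. */
int set_multicast_ttl(int sock, int ttl)
{
    unsigned char value;

    assert(ttl >= 0 && ttl <= 255);
    value = (unsigned char)ttl;
    return setsockopt(sock, IPPROTO_IP, IP_MULTICAST_TTL,
                      &value, sizeof(value));
}

/* Hypothetical helper: read the multicast TTL back, or -1 on failure. */
int get_multicast_ttl(int sock)
{
    unsigned char value = 0;
    socklen_t len = sizeof(value);

    if (getsockopt(sock, IPPROTO_IP, IP_MULTICAST_TTL, &value, &len) < 0)
        return -1;
    return value;
}
```

The default multicast TTL is normally 1, which keeps packets on the local
subnet; each router hop decrements it, so it has to be at least one more than
the number of routers to cross.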
Our router does support multicast, it is a Cisco 6500. I worked out the
problem. The address range 220.127.116.11/24 is a reserved, non-forwardable
range. I had to choose an address outside this range for the routers to
forward the packets. I used the address 18.104.22.168.
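
As a general point of reference, IANA reserves 224.0.0.0/24 as the Local
Network Control Block, and routers do not forward packets sent to it. A
minimal sketch of a check for that block follows; the function name is made
up for the example, and it only tests this one block, not every restricted
multicast range.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if the dotted-quad address lies in
 * 224.0.0.0/24, 0 if it does not, and -1 if it cannot be parsed. */
int is_local_control_multicast(const char *dotted)
{
    struct in_addr addr;
    uint32_t host;

    assert(dotted != NULL);
    if (inet_pton(AF_INET, dotted, &addr) != 1)
        return -1;

    host = ntohl(addr.s_addr);               /* host byte order */
    return (host & 0xFFFFFF00u) == 0xE0000000u; /* 224.0.0.x */
}
```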
I also came across some other problems. Sometimes, all the receivers
time out on the sender. I increased the timeout, but that did not help.
I think what is happening is the receiver misses a packet and sends a
notification to the sender. That packet gets lost though and both
sender and receiver are then left waiting for the other to make a move.
The udpcast server sends packets in batches ("slices") of up to 1024
packets. When a slice is transmitted, the sender "asks" the receivers
to send their confirmations. Receivers reply either with a
confirmation that everything has been received (good), or with a
bitmap of missing packets.
Missing packets are then transmitted, and confirmation is again asked
of those receivers which weren't ok the first time around.
If a receiver does not reply at all, the sender will re-ask for
confirmation every couple of seconds, until all confirmations are in.
If a receiver has not replied for about a minute, the sender drops
that receiver from its list, in order to be able to finish the
transmission with the others.
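
The receiver's side of this exchange can be sketched as follows. This is only
an illustration of the idea of a per-slice bitmap of missing packets; the
struct, the function names and the bit layout are assumptions made for the
example and do not reflect udpcast's actual data structures or wire format.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SLICE_MAX_PACKETS 1024               /* "up to 1024 packets" */
#define BITMAP_BYTES (SLICE_MAX_PACKETS / 8)

/* Hypothetical per-slice bookkeeping kept by a receiver. */
struct slice_state {
    int nr_packets;                  /* packets in this slice */
    uint8_t received[BITMAP_BYTES];  /* bit set = packet has arrived */
};

void slice_init(struct slice_state *s, int nr_packets)
{
    assert(nr_packets > 0 && nr_packets <= SLICE_MAX_PACKETS);
    s->nr_packets = nr_packets;
    memset(s->received, 0, sizeof(s->received));
}

void slice_mark_received(struct slice_state *s, int idx)
{
    assert(idx >= 0 && idx < s->nr_packets);
    s->received[idx / 8] |= (uint8_t)(1u << (idx % 8));
}

/* Build the reply to a confirmation request: sets one bit in `missing`
 * per packet not yet received and returns how many are missing.
 * A return value of 0 corresponds to "everything has been received". */
int slice_build_missing(const struct slice_state *s,
                        uint8_t missing[BITMAP_BYTES])
{
    int i, count = 0;

    memset(missing, 0, BITMAP_BYTES);
    for (i = 0; i < s->nr_packets; i++) {
        if (!(s->received[i / 8] & (1u << (i % 8)))) {
            missing[i / 8] |= (uint8_t)(1u << (i % 8));
            count++;
        }
    }
    return count;
}
```

The sender would then retransmit only the packets whose bits are set and ask
again, which is why a lost reply stalls things until the next re-ask.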
If I limit the bit rate, this problem is less frequent, so I think the
router has bandwidth restrictions and once they are exceeded it drops
packets. This would not be a problem though if the session did not
Unfortunately, for the moment the timeout is hardwired: it is a limit of
200 on rxmitId. This is specified in line 921 of senddata.c:
if(rexmitSlice->rxmitId > 200)
You can change it there to a higher value. In a future version, this
will become configurable.
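
One way such a limit could be made configurable is sketched below. Everything
here other than the value 200 is hypothetical (the struct, the function names,
and the idea of passing the limit in at startup); it is not the actual udpcast
code, only the shape of the change.

```c
#include <assert.h>

#define DEFAULT_MAX_REXMIT 200   /* the currently hardwired value */

/* Hypothetical runtime configuration for the sender. */
struct rexmit_config {
    int max_rexmit;
};

/* Use the requested limit if it is positive, else fall back to 200. */
void rexmit_config_init(struct rexmit_config *c, int requested)
{
    c->max_rexmit = requested > 0 ? requested : DEFAULT_MAX_REXMIT;
}

/* Same comparison as the hardwired test, but against the configured
 * limit: returns 1 once the counter has gone past it. */
int rexmit_limit_exceeded(const struct rexmit_config *c, int rxmit_id)
{
    assert(c->max_rexmit > 0);
    return rxmit_id > c->max_rexmit;
}
```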
If this is the case, how hard do you think it would be for the receiver
to retransmit whatever it needs to get the transmission going again, if
say it has not got a reply from the sender in so many milliseconds?
Apparently, after a while, the router starts dropping all the packets
in (at least) one direction. Once that happens, nothing the receiver
sends will unblock the situation.
In order to debug this, you can try running the command-line variant
of udpcast on a server and (already installed) receiver(s), and run
tcpdump along with it to check where the packets are dropped.