Hi all,
I was wondering if anyone knows, or could tell me, what the bottleneck is for udpcast multicast speeds?
We have been using udpcast satisfactorily for a while now in combination with systemimager. In our previous setup the maximum speed seemed to be around 20 Mbps (MAX_BITRATE). Setting the speed any higher would result in dropped slices in the cast, and some receivers would not get the image in the first cast. This was on a 100 Mbps network.
We now have a new setup where we image machines over a 1000 Mbps network, and the maximum speed for udpcast seems to be 40 Mbps. If we cast any faster, the number of machines failing to receive the cast successfully increases dramatically (and even at 40 Mbps, 3 out of 40 machines still dropped out in the first tests).
These are machines with 15,000 rpm SCSI disks. Both the disks and the network should, in theory, be able to sustain 100 MByte/s. However, we can't even get a stable cast at a tenth of that speed. It's not a very big issue, since we save a lot of time by imaging all machines at the same time, but I was still wondering what exactly the bottleneck for the multicast speed is.
Is it the UDP protocol, or the multicast technique, or could it still be a hardware issue?
Any opinions on the subject are appreciated; perhaps some of the authors of udpcast could give some insight?
Kind regards,
--
Ramon Bastiaans
SARA - Academic Computing Services
Kruislaan 415
1098 SJ Amsterdam
On Thursday 23 September 2004 09:31, Ramon Bastiaans wrote:
> Hi all,
> I was wondering if anyone knows, or could tell me, what the bottleneck is for udpcast multicast speeds?
> We have been using udpcast satisfactorily for a while now in combination with systemimager. In our previous setup the maximum speed seemed to be around 20 Mbps (MAX_BITRATE). Setting the speed any higher would result in dropped slices in the cast, and some receivers would not get the image in the first cast. This was on a 100 Mbps network.
Wow, that's slow. In comparison, udpcast can almost saturate a 100 Mbps network (80 Mbps or more is easily achievable). Often the bottleneck is not the network but the local hard disk, especially when operating in compressed mode.
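If you suspect the disks, a quick sanity check of raw throughput on a receiver could look something like this (standard Linux tools, nothing udpcast-specific; device and file names are examples, adjust them to your setup):

  # Raw read throughput of the target disk:
  hdparm -t /dev/sda

  # Rough write test: time writing 1 GB of zeros, including the final sync:
  time sh -c 'dd if=/dev/zero of=/tmp/ddtest bs=1M count=1024 && sync'
  rm /tmp/ddtest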
> We now have a new setup where we image machines over a 1000 Mbps network, and the maximum speed for udpcast seems to be 40 Mbps. If we cast any faster, the number of machines failing to receive the cast successfully increases dramatically (and even at 40 Mbps, 3 out of 40 machines still dropped out in the first tests).
I've no personal experience with 1000 Mbps networks, but from the reports I've received it seems that:
* it cannot saturate the network,
* but 500 Mbps is achievable,
* paradoxically, "optimum" speed is achieved by limiting the bitrate (!), i.e. add --max-bitrate 600m.
The reason for this seems to be that if data is sent too fast, 1000 Mbps equipment randomly drops frames rather than using flow control, which hurts throughput much more than limiting the bitrate at the source does. Fortunately, no such phenomenon exists on most 100 Mbps switches.
To find the optimal limit on a 1000 Mbps network, start with 400m and then increase the limit (in increments of 100m, for instance) until the speed you get out of it no longer rises (or even falls).
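For example, the sender and receivers could be started like this (file name, device and interface are placeholders, adjust them to your setup):

  # Cap the multicast bitrate at 600 Mbps on the sender:
  udp-sender --interface eth0 --max-bitrate 600m --file /images/disk.img

  # On each receiver:
  udp-receiver --interface eth0 --file /dev/sda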
> These are machines with 15,000 rpm SCSI disks. Both the disks and the network should, in theory, be able to sustain 100 MByte/s. However, we can't even get a stable cast at a tenth of that speed.
A tenth of the speed? This is very suspicious, as it puts the speed at 80 Mbps, i.e. a speed you could get on a 100 Mbps network. Make sure that there are no 100 Mbps devices directly or indirectly connected to the switch (even if they are not actually participating in the cast), because if flow control _is_ enabled, the switch might slow down communication so that even these "slow" ports can keep up. Yes, if IGMP snooping is enabled this should theoretically not be an issue, but there are a scary number of switches out there with buggy or missing IGMP support (i.e. even if a menu item "IGMP snooping" is present in the switch's management interface, it might be a no-op!). To find out whether this is the case, try disconnecting any equipment from the switch which is not directly participating in the multicast. If this is not feasible, try to switch off flow control just on the "slow" ports (many switches support a per-port flow control setting).
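As a side note, on the Linux hosts themselves you can inspect (and, if needed, disable) pause-frame flow control on the NIC with ethtool. Whether this helps depends on the NIC and driver, and the interface name below is just an example:

  # Show the current pause (flow control) settings of the NIC:
  ethtool -a eth0

  # Disable pause frames on this NIC (often requires autoneg off as well):
  ethtool -A eth0 autoneg off rx off tx off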
> It's not a very big issue, since we save a lot of time by imaging all machines at the same time, but I was still wondering what exactly the bottleneck for the multicast speed is.
If the issue is not 100 Mbps devices slowing down the transfer, try adding --max-bitrate 600m to the sender, and it should work.
> Is it the UDP protocol, or the multicast technique, or could it still be a hardware issue?
[If the issue is not the 100 Mbps device issue:] probably some kind of hardware issue, but I haven't yet been able to pinpoint exactly what it is. As said, the problem mostly seems to arise on Gbps networks. On 100 Mbps networks, udpcast can saturate the network just fine, without any special tweaks.
[the "slow device" issue, OTOH, also happens on 100 Mbps networks, mostly in connection with network printers that only support 10 Mbps]
> Any opinions on the subject are appreciated; perhaps some of the authors of udpcast could give some insight?
> Kind regards,
Regards,
Alain
On Thu, 23 Sep 2004, Ramon Bastiaans wrote:
> I was wondering if anyone knows, or could tell me, what the bottleneck is for udpcast multicast speeds?
> [...]
> Is it the UDP protocol, or the multicast technique, or could it still be a hardware issue?
> Any opinions on the subject are appreciated; perhaps some of the authors of udpcast could give some insight?
Disclaimer: I'm not an author of udpcast, but I have experience with multicasting large amounts of data in clusters. Furthermore, I wrote a reliable multicast protocol many years ago and, more recently, a tool similar to udpcast that works in a technically different way (Dolly [1]).
There are a number of possible bottlenecks in such a scenario. First, there are the obvious ones like disk speed and network throughput; with Gigabit Ethernet the network will almost certainly not be the bottleneck. Second, there are the less obvious bottlenecks like CPU, memory, the PCI bus, or protocol complexity.
Personally, I think that when using IP multicast (as udpcast does), the complexity of the whole protocol might be a limiting factor, because a single sender has to coordinate so many receivers. However, I don't have any data to substantiate this claim. The problem is that the sender has to send the data at the speed of the slowest receiver. The slowest receiver is not necessarily known in advance, and it might also change during the transmission. Adapting the speed correctly is not an easy task.
Thus, for our own cloning tool Dolly -- I'm sorry for the shameless plug on this mailing list -- we use TCP to transfer large data files (like whole partitions or disks) to many nodes in a cluster. Since TCP works only between a single sender and a single receiver, it can adapt much better to the maximal transmission throughput as well as to changing conditions. To link all the participating nodes together, we simply form a virtual ring of TCP connections. The data is then sent around this ring concurrently. It sounds counterintuitive, but it works remarkably well (and in fact better than any IP-multicast-based approach I have heard of so far).
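Just to illustrate the pipeline idea, here is a rough conceptual sketch using netcat. This is not how Dolly is actually implemented, only the same ring principle; host names, port and device are made up:

  # On the sender, stream the disk image to the first node in the ring:
  dd if=/dev/sda bs=1M | nc node01 9000

  # On each intermediate node: write the stream to disk while
  # forwarding it to the next node at the same time
  # (some netcat variants want "nc -l 9000" instead of "nc -l -p 9000"):
  nc -l -p 9000 | tee /dev/sda | nc node02 9000

  # On the last node, just write to disk without forwarding:
  nc -l -p 9000 > /dev/sda

Each node only ever talks TCP to its upstream and downstream neighbours, so flow control is handled per link by TCP itself.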
For example, in a cluster of 16 nodes with 1 GHz Pentium III processors interconnected by Gigabit Ethernet, we got up to approximately 60 MByte/s throughput with Dolly (without actually accessing the disks, for benchmarking purposes and to eliminate the trivial bottleneck). With udpcast we got about 45 MByte/s (also without accessing the disks) after tweaking the parameters (sometimes udpcast simply stopped transmitting).
Please note that I'm not saying udpcast is bad; it just has different application areas. Udpcast is much better (or even the only solution) if the network is not switched, is asymmetric, or is even unidirectional. For a tightly interconnected, switched high-speed network in a cluster, Dolly usually achieves better throughput. This is, by the way, why Dolly is used as the cloning tool for the 128-node Xibalba cluster at ETH Zurich [2].
In short, finding the bottleneck in such a scenario is more complex than it might seem at first. If you are interested, you will find some research papers at [3].
- Felix
[1] http://www.cs.inf.ethz.ch/CoPs/patagonia/#dolly
[2] http://www.xibalba.inf.ethz.ch/
[3] http://www.cs.inf.ethz.ch/CoPs/patagonia/#relmat
I think Alain provided an excellent tip on how to deal with gigabit Ethernet. However, I just want to add that in my experience the target client's disk drive is always the bottleneck. It really depends on what you are imaging and whether compression is used.
Remember that the transfer rate shown is the packet transfer rate, not the rate at which data is being written to disk.
We use a 100 Mbps network to image 40 GB client notebook drives. Given that most of the drive on the master machine has been zeroed, it compresses very well, down to an image of less than 3 GB. When we send the image out to the client notebooks, the sender disconnects from all clients, and the clients keep writing data (mostly highly compressed zeros) to disk from their 512 MB of available RAM for the next 4 or 5 minutes, which covers about 5 or 6 GB of client disk writes.
The only way I can imagine the network being a bottleneck is if:
- the network is already saturated with other traffic
- the image is not created as compressed
- the master disk is nearly full
- the master disk is not zeroed in the empty sectors
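For the last two points, a cast along these lines avoids the pitfalls (a sketch only, not our exact systemimager setup; device and path names are examples, and the piped form assumes udp-sender reads from stdin and udp-receiver writes to stdout when no --file is given):

  # On the master, before taking the image: fill the free space with
  # zeros so that the empty sectors compress to almost nothing:
  dd if=/dev/zero of=/zerofile bs=1M; rm -f /zerofile

  # Cast a compressed image straight from the master disk:
  dd if=/dev/sda bs=1M | gzip -c | udp-sender

  # On each client, decompress while writing to disk:
  udp-receiver | gzip -dc | dd of=/dev/sda bs=1M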
--Donald Teed