-----BEGIN PGP SIGNED MESSAGE-----
On 24 Apr 2008 at 11:06, Michael Holroyd wrote:
Date sent: Thu, 24 Apr 2008 11:06:03 -0400
From: Michael Holroyd <meekohi(a)cs.virginia.edu>
To: "Richard W.M. Jones" <rjones(a)redhat.com>
Copies to: udpcast(a)udpcast.linux.lu
Subject: Re: [Udpcast] Scaling udpcast
I tried to solve a problem much smaller than yours but still had
incredible difficulty. I was moving 10GB datasets out to 64 receivers
over a flat switched network using multicast. Unfortunately, for reasons
I never tracked down, files of this size would always get corrupted
along the way even though all the receivers had received all packets
(i.e. the md5sum would be different across all the different machines).
Eventually I ended up using small bittorrent clients instead of udpcast
since it checks the hash of each block. This also makes the process take
about twice as long, but better to get correct data slow than corrupt
I had a similar problem recently on a much smaller scale. I was testing
udpcast in a classroom to send an ntfsclone image file of just under 7GB to
various machines. I had one sender and 4 receivers and it worked fine twice.
Then I tried with 8 receivers, and it failed. There were no error messages,
and it failed the same way on both attempts. I am not sure what the cause is.
I am planning on doing some more testing, and it might be a problem with
SELinux and ports. To get it to work, I had to open UDP ports 9000 and 9001 on
the sender, and UDP port 9000 on the receiver.
Perhaps the receiver also needs 9001?
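
One way to check the port question without udpcast in the picture is a small
UDP probe. This is only a rough sketch, assuming Python 3 is available on both
machines and that udpcast is not running on the port while the probe runs. The
function names are made up for illustration, and it only tests plain unicast
UDP, so it says nothing about multicast delivery.

```python
import socket


def open_listener(port, bind_addr="0.0.0.0"):
    """Bind a UDP socket on the given port (run this on the machine under test)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((bind_addr, port))
    return sock


def wait_for_probe(sock, timeout=5.0):
    """Return the first datagram received, or None if nothing arrives in time."""
    sock.settimeout(timeout)
    try:
        data, _addr = sock.recvfrom(1024)
        return data
    except socket.timeout:
        return None


def send_probe(host, port, payload=b"probe"):
    """Send one UDP datagram to host:port (run this on the other machine)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))
```

On the receiver, open_listener(9001) followed by wait_for_probe(sock) would
wait for a datagram, while the sender calls send_probe with the receiver's
address and port 9001. If nothing arrives, something between the two machines
is dropping that port.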
The files on all the receivers seem to be the same size. Before this, I was
using a script to download the file via ftp using ncftp. On the successful runs, it
would udpcast the files to the Linux partition, and then run the script to restore
the new XP partition. The script would then show the file as being the same,
and skip the download, and go straight to the restore. On the failed batch, it
started downloading the file via ftp, since the files were not the same.
I've used udpcast to image 19 machines from one sender with no errors
using udpcast images, and have not noticed any errors, and have had systems run
the disk test on boot with no errors.
So, I don't know if it is a size problem, or ports, or a kernel option, or
something else. I'll try some more things, and try to see what is different
between a good file and a bad one, and see if the difference is always the same.
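
To see where a good file and a bad one differ, the per-block hash idea from
the bittorrent approach can be borrowed. The following is only a sketch,
assuming Python 3 on each machine. The function names and the 1 MB block size
are arbitrary choices for illustration. Each machine computes a list of MD5
digests, one per block, and the lists are then compared.

```python
import hashlib


def block_digests(path, block_size=1024 * 1024):
    """Return a list of MD5 hex digests, one per block_size chunk of the file."""
    digests = []
    with open(path, "rb") as f:
        while True:
            block = f.read(block_size)
            if not block:
                break
            digests.append(hashlib.md5(block).hexdigest())
    return digests


def differing_blocks(digests_a, digests_b):
    """Return the indices of blocks whose digests differ between the two lists.

    Blocks present in only one list (files of different length) also count
    as differing.
    """
    diffs = [i for i, (a, b) in enumerate(zip(digests_a, digests_b)) if a != b]
    shorter, longer = sorted((len(digests_a), len(digests_b)))
    diffs.extend(range(shorter, longer))
    return diffs
```

Multiplying a reported block index by the block size gives the byte offset
where the bad data starts, which would show whether the damage is always in
the same place or scattered through the file.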
Hope you have better luck,
Richard W.M. Jones wrote:
I'm looking at using udpcast to broadcast large disk images (10+ GB)
to a very large network of machines (1,000-10,000 receivers) over a
mostly switched, partially segmented gig-ethernet network.
Needless to say, the network of machines is all production-critical
and I cannot get access to perform real testing. However testing it
on my home network I can see some potential problems:
- If _any_ receiver is misbehaving or unreachable then this stops
all transmissions. Is there a way to get udpcast to drop
troublesome receivers in this situation (other than unicast)?
- Has anyone used the --ttl option to multicast over routers?
Does it work (the manpage is unclear)? Does it need special
- Any other scaling tips? Should I try to go for the full set of
machines at once or break up the broadcast into groups of machines?
If anyone has used udpcast on such large networks, can you share any
Udpcast mailing list
Michael D. Setzer II - Computer Science Instructor
Guam Community College Computer Center
Guam - Where America's Day Begins
Number of Seti Units Returned: 19,471
Processing time: 32 years, 290 days, 12 hours, 58 minutes
(Total Hours: 287,489)
SETI 5,269,727.070797 | EINSTEIN 1,573,038.609732 | ROSETTA
-----BEGIN PGP SIGNATURE-----
Version: PGP 6.5.8 -- QDPGP 2.61c
-----END PGP SIGNATURE-----