Hi,
On 11/03/2023 06:13, Robert Theron Brockman II wrote:
Howdy! I found a strange bug in the new version of Udpcast. When sending packets at speeds over around 80m, the program locks up. I traced the problem to the doRateLimit function in rate-limit.c (I've added some printfs):
sleepTime = rateLimit->queueSize * 8 * MILLION / rateLimit->bitrate;
printf("sleepTime1 %lu\n",sleepTime); printf("rateLimit->queueSize %lu\n",rateLimit->queueSize);
if(sleepTime > 40000 || rateLimit->queueSize >= 100000) { printf("sleepTime2 %lu\n",sleepTime);
sleepTime -= 10000;
printf("sleepTime3 %lu\n",sleepTime); printf("sleepTime3 % 10000 %u\n",sleepTime % 10000);
sleepTime -= sleepTime % 10000; printf("sleepTime4 %lu\n",sleepTime); usleep((useconds_t)sleepTime); }
What is happening is that sometimes rateLimit->queueSize >= 100000 but sleepTime is small, so subtracting 10000 from it wraps around to near maxint. The strange thing is that when the program then does sleepTime -= sleepTime % 10000 the result is *different* from the 20200328 version because of the change to long unsigned integers. In the old version sleepTime would become zero, but now it stays huge, leading to an extremely long usleep.
Indeed. I've now fixed this by adding another comparison to make sure that sleepTime is at least 20000, even if we are taking the branch due to large queueSize. Smaller values than 20000 lead to 0 usleep (not a catastrophe, but may throw timing off, because it still takes some time just to do the system call), and smaller values than 10000 lead to integer wraparound as you observed (worrysome, as this will lead to an extremely long wait).
if(sleepTime > 40000 || (rateLimit->queueSize >= 100000 && sleepTime >= 20000 )) {
This is now in version 20230319
[...]
It looks to me like the older version actually had offsetting bugs: sleepTime % 10000 *should* return a value less than 10000 but in the old code for some reason it just returned sleepTime, which then subtracted from itself gave zero.
In the old version, sleepTime was declared as a signed integer (even though a signed integer makes no sense as a parameter to usleep). However, this had the side effect of modulo (%) giving a *negative* result if sleeptime itself was negative. And this nicely brought it back to 0, rather than causing the wraparound.
In the new version the wraparound error from the sleepTime -= 10000 has been exposed, causing the hang.
Anyway, hopefully this isn't too hard to fix:
Indeed, the only thing needed was to add another test against 20000. I'm even wondering whether replacing the entire condition with just if(sleepTime > 20000) might work, but unfortunately this bit of code has been there since the oldest version in subversion, so no commit comment on why it was there.
your program is super-useful for us, and the old version merrily blasts udp packets at gigabit speeds!
I'm glad to hear that you like it, thanks,
Thanks,
Robert II
Regards,
Alain