[mirror] Suboptimal (for mirrors) download pattern by opensuse clients

21 Feb 2020

Hi,

Am I the only mirror admin who finds the current behavior of openSUSE 
clients suboptimal?

Requests from "ZYpp 17.11.4 (curl 7.60.0) openSUSE-Leap-15.1-x86_64" 
and similar clients always seem to use a 256 kB (262144-byte) chunk 
size. For example:

GET bytes=0-262143 /mirror/opensuse.org/tumbleweed/repo/oss/x86_64/libqt5-qtwebengine-5.14.1-1.5.x86_64.rpm
GET bytes=262144-524287 /mirror/opensuse.org/tumbleweed/repo/oss/x86_64/libqt5-qtwebengine-5.14.1-1.5.x86_64.rpm

That's a silly small size, since TCP won't have time to ramp up its 
window size and reach a good speed before those 256 kB are done. Also, 
we get about int($filesize/256k) entries in our logs for each download.
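
To put a number on the log noise, here is a small Python sketch. It is 
only an illustration of the arithmetic, not anything taken from ZYpp; 
the function names and the 100 MiB example file size are made up, and a 
final partial chunk is counted as a request too (so it rounds up).

```python
import math

CHUNK = 256 * 1024  # 262144 bytes, the chunk size seen in the requests above


def chunk_ranges(filesize, chunk=CHUNK):
    """Inclusive (first, last) byte ranges a fixed-chunk client would request."""
    return [(start, min(start + chunk, filesize) - 1)
            for start in range(0, filesize, chunk)]


def log_entries(filesize, chunk=CHUNK):
    """Number of GETs, and thus access-log lines, for one download."""
    return math.ceil(filesize / chunk)
```

With these assumptions, a hypothetical 100 MiB package costs 400 log 
lines at 256 kB per request, and 13 at 8 MiB per request.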

To make matters worse, the client seems to do some kind of round robin 
between mirrors, and this pattern looks like the most inefficient one 
from a mirror admin's standpoint:

GET bytes=2097152-2359295 /mirror/opensuse.org/tumbleweed/repo/oss/x86_64/libqt5-qtwebengine-5.14.1-1.5.x86_64.rpm
GET bytes=2621440-2883583 /mirror/opensuse.org/tumbleweed/repo/oss/x86_64/libqt5-qtwebengine-5.14.1-1.5.x86_64.rpm

Since the OS normally does read-ahead on file system reads, it will 
read ahead past byte 2359295 in preparation for the next read(). In 
this case, though, that is in vain: the request never comes, and the 
next read instead starts at byte 2621440. OS read-ahead is most 
commonly in the 64 kB to 1 MB range, so it's not unlikely that the 
entire 256 kB gap in between is read from disk without being used.
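
A rough way to estimate that waste, as a Python sketch. This is a 
deliberately simple model and an assumption on my part: the OS is taken 
to prefetch a fixed number of bytes after each range, and only the part 
that falls into the skipped gap counts as wasted. Real read-ahead is 
adaptive, and wasted_readahead() is a hypothetical helper.

```python
def wasted_readahead(ranges, readahead):
    """Bytes prefetched from disk that this client never asks this mirror for.

    ranges: inclusive (first, last) byte ranges, sorted by first byte.
    readahead: bytes the OS is assumed to prefetch after each range.
    """
    wasted = 0
    for (_, last), (next_first, _) in zip(ranges, ranges[1:]):
        gap = next_first - last - 1    # bytes skipped between two requests
        wasted += min(gap, readahead)  # prefetched but never served
    return wasted


# The two striped requests shown above.
striped = [(2097152, 2359295), (2621440, 2883583)]
```

Under this model, wasted_readahead(striped, 1024 * 1024) gives 262144, 
that is, the whole gap, while two back-to-back ranges give 0.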

Downloading files this way is just plain stupid, IMHO.

I don't know what problem this behavior is supposed to solve, but it's 
definitely not beneficial for us as a mirror, and I think it's hurting 
your end users as well.

If you want more bandwidth from us, request larger chunks (or whole 
files). The TCP window will grow and you'll get the performance 
(within the limits of 10 gigabit networking for one download).

If you want to spread the load between mirrors, use larger chunks, and 
specifically avoid small chunks and striped access.

In any case, merge requests! If you're going to request a number of 
consecutive chunks, do it in one request, preferably as one range, to 
make the most of the TCP connection you've set up.
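
Merging could look roughly like this Python sketch. It is not how ZYpp 
schedules its chunks; merge_ranges() and range_header() are hypothetical 
helpers that only show coalescing adjacent byte ranges into a single 
HTTP Range header value.

```python
def merge_ranges(ranges):
    """Coalesce adjacent or overlapping inclusive (first, last) byte ranges."""
    merged = []
    for first, last in sorted(ranges):
        if merged and first <= merged[-1][1] + 1:
            # Touches or overlaps the previous range: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], last))
        else:
            merged.append((first, last))
    return merged


def range_header(ranges):
    """Build the value of an HTTP Range header from inclusive byte ranges."""
    return "bytes=" + ",".join(f"{a}-{b}" for a, b in merge_ranges(ranges))
```

For the first two requests quoted earlier, 
range_header([(0, 262143), (262144, 524287)]) yields "bytes=0-524287", 
so one GET instead of two.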

My suggestion would be to bump the chunk size to multiple megabytes at 
the very least, possibly varying it with download performance, and to 
aim for each GET taking at least a couple of seconds. That gives TCP 
time to ramp up its speed (and reduces the noise in our logs). In 
extreme cases we're seeing multiple tens of GETs per second for some 
downloads. I'm guessing the rate is limited by the RTT latency (ping 
time) and not by any real bandwidth limit.
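
Both points can be sketched in a few lines of Python. The numbers are 
assumptions for illustration only: the 4 MiB floor, the 256 MiB cap, the 
2 second target and the 25 ms RTT are mine, and the bound assumes the 
chunks on a connection are fetched strictly one after another, each 
costing at least one round trip.

```python
def rtt_bound_rate(chunk, rtt):
    """Upper bound in bytes/s for sequential requests of `chunk` bytes,
    when each request takes at least one round trip of `rtt` seconds."""
    return chunk / rtt


def next_chunk_size(bytes_per_sec, target_seconds=2.0,
                    min_chunk=4 * 1024 * 1024,
                    max_chunk=256 * 1024 * 1024):
    """Pick a chunk size so one GET lasts about target_seconds
    at the measured download rate, clamped to [min_chunk, max_chunk]."""
    wanted = int(bytes_per_sec * target_seconds)
    return max(min_chunk, min(wanted, max_chunk))
```

With a 25 ms RTT, 256 kB chunks fetched one at a time cap out at 40 GETs 
per second, roughly 10 MiB/s, no matter how fast the link is. A client 
measuring 100 MB/s would, under these assumptions, move to chunks of 
about 200 MB.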

/Nikke - admin of ftp.acc.umu.se
-- 
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
  Niklas Edmundsson, Admin @ {acc,hpc2n}.umu.se      |     nikke@acc.umu.se
---------------------------------------------------------------------------
  NOW (n), adv: A moment in time that has already passed.
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=