On Thu, Sep 10, 2009 at 07:27:17AM -0500, Jon Nelson wrote:
On Thu, Sep 10, 2009 at 6:28 AM, Michael Matz wrote:
Hi,
On Thu, 10 Sep 2009, Klaus Kaempf wrote:
* Duncan Mac-Vicar Prett [Sep 10, 2009 13:05]: ZYpp head (what will be present in 11.2 and SLE SP1) has a policy for this, so it can be switched to download first.
Great! But how does it download? All in 'one go', or with a separate HTTP connection for each package?
IIRC, establishing the HTTP connection can take considerable time.
If used correctly, curl does all of this for you. If the server allows persistent connections (and most of them do), curl uses them. I don't know what the aria library is doing (and that feature might or might not interact with the redirection that download.o.o is doing).
The TCP and HTTP overhead itself is so small as to be nearly free. The presence or absence of persistent (keepalive) connections helps, but only at the very extreme: over thousands of downloads it is worth less than a few seconds in most cases. (That is not to say it isn't a very good idea to use it whenever possible, for lots of *other* reasons.)
You underestimate the effects of introduced latency. While your picture is correct for nearby, low-latency connections, it looks completely different for long-distance networks. The latency can be considerable, and it is incurred for each protocol step. Take 500 ms for a cross-Atlantic round trip, for instance, and you do see remarkable wall-clock delays. Keepalive makes a huge difference there. It's quite normal to see a 50% loss of speed when not using HTTP keepalive. [...]
My off-the-cuff guess is that the redirection download.o.o is doing is at least partially to blame. Looking at a single request, it appears that d.o.o takes approx. 0.29 s to issue the redirection. 0.29 times 60 is about 17 seconds. My guess is that 0.29 is too low as an average, especially during periods of high load.
No, what you see is latency. d.o.o needs merely 0.3 ms to completely respond to your request. Thus, it's a thousand times faster than it seems to you; the rest of the time is spent shuffling the little bits back and forth over the network.
In the case of downloading a *small* .rpm (approx. 18 KB or so), fulfilling the request actually took about 0.03 s, whereas the redirect took 10 times that.
That's why we don't redirect for tiny files (< 2 KB). They are just sent directly.
Perhaps it would be worth looking at the 302 redirection mechanism that is in place and trying to find some way to improve the response time.
Where are you? In the US? That accounts for roughly the 0.3 s of response time that you see. The only way to make this faster is to set up an additional server in the US. The concept is easy: put a server on each continent, and use GeoDNS to make clients use the one closest to them. Alas, the openSUSE project lacks the resources to implement that. Thus, we live with a single server in Europe, which is slower to reach from other places.

That's one reason why I have often suggested doing as few HTTP requests as possible, and avoiding unneeded ones, because the cost is high (and the effect on users considerable). And keepalive to d.o.o _does_ help, which is what yum does (I think). aria2c is spawned per request, which doesn't help in this regard. I think there is room for improvement here.

Peter
--
"WARNING: This bug is visible to non-employees. Please be respectful!"
SUSE LINUX Products GmbH, Research & Development