Mailinglist Archive: zypp-devel (23 mails)

Re: [zypp-devel] yum vs zypp performance
  • From: Peter Poeml <poeml@xxxxxxx>
  • Date: Thu, 10 Sep 2009 16:41:25 +0200
  • Message-id: <20090910144125.GC23473@xxxxxxx>
On Thu, Sep 10, 2009 at 07:27:17AM -0500, Jon Nelson wrote:
On Thu, Sep 10, 2009 at 6:28 AM, Michael Matz <matz@xxxxxxx> wrote:
Hi,

On Thu, 10 Sep 2009, Klaus Kaempf wrote:

* Duncan Mac-Vicar Prett <dmacvicar@xxxxxxx> [Sep 10. 2009 13:05]:

ZYpp head (what will be present in 11.2 and SLE SP1) has a policy for
this, so it can be switched to download first.

Great!
But how does it download? All in 'one go' or with a separate HTTP
connection for each package?

IIRC, establishing the HTTP connection can take considerable time.

If used correctly, curl does all of this for you. If the server allows
persistent connections (and most of them do), curl uses them. I don't
know what the aria lib is doing (and that feature might or might not
interact with the redirection that download.o.o is doing).

The TCP and HTTP overhead itself is so small as to be nearly free.
Persistent (keepalive) connections help, but only at the very extreme:
even over thousands of downloads they save less than a few seconds in
most cases. (That is not to say that it's not a very good idea to use
them whenever possible for lots of *other* reasons.)

You underestimate the effect of latency. While your picture is correct
for nearby, low-latency connections, it looks completely different on
long-distance networks. The latency can be considerable, and it is
incurred for each protocol step. Take 500 ms of transatlantic latency,
for instance, and you see remarkable wall-clock delays. Keepalive makes
a huge difference there. It's quite normal to see a 50% loss of speed
when not using HTTP Keepalive.
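To make the per-step cost concrete, here is a rough back-of-the-envelope
model in Python. It is only a sketch under simplifying assumptions that
are not from this thread: one round trip for the TCP handshake, one
round trip for the HTTP request/response, a fixed transfer time per
file, and no TLS, slow start, or pipelining. The function name and
parameters are made up for illustration.

```python
def total_time(n_requests, rtt_s, transfer_s, keepalive):
    """Estimate wall-clock seconds to fetch n_requests files sequentially.

    Assumed model (illustrative only):
      - opening a TCP connection costs one round trip (rtt_s)
      - each HTTP request/response costs one more round trip
      - each file then takes transfer_s seconds to transfer
    """
    if n_requests <= 0:
        return 0.0
    per_request = rtt_s + transfer_s  # request round trip + payload
    if keepalive:
        # One handshake, then every request reuses the same connection.
        return rtt_s + n_requests * per_request
    # A fresh connection (and handshake) for every single request.
    return n_requests * (rtt_s + per_request)


if __name__ == "__main__":
    for keepalive in (False, True):
        t = total_time(60, rtt_s=0.5, transfer_s=0.5, keepalive=keepalive)
        print(f"keepalive={keepalive}: {t:.1f} s")
```

With a small rtt_s the two results are nearly identical, which matches
the "nearly free" picture for nearby servers; the gap only becomes
visible as the round-trip time grows relative to the transfer time.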

[...]

My off-the-cuff guess is that the redirection that download.o.o is
doing is at least partially to blame. Taking a look at a single
request, it appears that d.o.o takes approx 0.29s to issue the
redirection. 0.29 times 60 is roughly 17 seconds. My guess is that 0.29
is too low as an average, especially during periods of high load.

No, what you see is latency.

d.o.o needs merely 0.3 ms to completely respond to your request.

Thus, it's a thousand times faster than it seems to you; the rest of
the time is spent shuffling the little bits back and forth over the
network.
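The arithmetic behind that statement can be checked with the figures
quoted in this thread (0.29 s observed per redirect, 0.3 ms of server
time, 60 requests). The helper names below are made up for this sketch.

```python
def network_share(observed_s, server_s):
    """Fraction of the observed response time spent outside the server."""
    return (observed_s - server_s) / observed_s


def redirect_overhead(n_requests, redirect_s):
    """Total seconds spent waiting for redirects, one per request."""
    return n_requests * redirect_s


if __name__ == "__main__":
    observed = 0.29    # seconds per redirect, as measured by the client
    server = 0.0003    # seconds the server needs per request (0.3 ms)
    print(f"network share: {network_share(observed, server):.2%}")
    print(f"total for 60 redirects: {redirect_overhead(60, observed):.1f} s")
```

Under these numbers, making the server itself faster would change the
total by only a few hundredths of a second; nearly all of it is round-trip
time.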

In the case of downloading a *small* .rpm (approx 18K or so)
fulfilling the request actually took about 0.03s whereas the redirect
was 10 times that.

That's why we don't redirect for tiny files (< 2K). They are just sent
directly.
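As an illustration of that policy only (this is not the actual d.o.o
code), a size-based decision could look like the following sketch. The
function name, the mirror URL, and the reading of "2K" as 2048 bytes
are assumptions.

```python
REDIRECT_MIN_BYTES = 2048  # assumption: "2K" taken as 2048 bytes


def choose_response(size_bytes, mirror_url):
    """Return (status, location) for a requested file.

    Tiny files are served directly (200, no Location), because the extra
    round trip for a redirect would cost more than sending the file.
    Larger files are redirected (302) to the given mirror.
    """
    if size_bytes < REDIRECT_MIN_BYTES:
        return (200, None)
    return (302, mirror_url)


if __name__ == "__main__":
    mirror = "http://mirror.example/path/to/file"  # hypothetical mirror
    print(choose_response(1500, mirror))       # tiny file: sent directly
    print(choose_response(18 * 1024, mirror))  # ~18K rpm: redirected
```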

Perhaps it would not be unreasonable to consider looking at the 302
redirection mechanism that is in place and try to find some way to
improve the response time.

Where are you? In the US? That accounts for the roughly 0.3s of
response time that you see.

The only way to make this faster is to set up an additional server in
the US.

The concept is easy: put a server on each continent, and use GeoDNS to
make the clients use the one that's closest to them. Alas, the openSUSE
project lacks the resources to implement that. Thus, we live with a
single server in Europe, which is slower to reach from other places.

That's one reason why I have often suggested making as few HTTP
requests as possible, and avoiding unneeded ones, because the cost is
high (and the effect on users considerable).

And Keepalive to d.o.o _does_ help, which is what yum is doing (I think).

aria2c is spawned per request, which doesn't help in this regard. I
think there is room for improvement here.
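The difference between one client per request and one long-lived client
can be demonstrated locally with Python's standard library. This is a
sketch, not how zypp, yum, or aria2c work internally: it starts a
throwaway HTTP/1.1 server on 127.0.0.1, fetches the same path several
times, and counts how many TCP connections the server had to accept.
The function name is made up for illustration.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


def count_connections(n_requests, reuse):
    """Fetch "/" n_requests times; return TCP connections the server saw."""

    class Handler(BaseHTTPRequestHandler):
        protocol_version = "HTTP/1.1"  # enables persistent connections
        accepted = 0

        def setup(self):
            # Called once per accepted TCP connection.
            super().setup()
            Handler.accepted += 1

        def do_GET(self):
            body = b"ok"
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, *args):
            pass  # keep the demo quiet

    server = ThreadingHTTPServer(("127.0.0.1", 0), Handler)
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    conn = None
    try:
        for _ in range(n_requests):
            if conn is None or not reuse:
                if conn is not None:
                    conn.close()
                conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
            conn.request("GET", "/")
            conn.getresponse().read()  # drain body so the socket is reusable
    finally:
        if conn is not None:
            conn.close()
        server.shutdown()
        server.server_close()
    return Handler.accepted


if __name__ == "__main__":
    print("fresh connection per request:", count_connections(5, reuse=False))
    print("one persistent connection:   ", count_connections(5, reuse=True))
```

On localhost the extra connections cost almost nothing; over a
long-distance link each one adds at least a round trip for the handshake.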

Peter
--
"WARNING: This bug is visible to non-employees. Please be respectful!"

SUSE LINUX Products GmbH
Research & Development