Mailinglist Archive: opensuse-features (365 mails)

[openFATE 120340] Run download and install in parallel
  • From: fate_noreply@xxxxxxx
  • Date: Mon, 1 Mar 2010 15:47:10 +0100 (CET)
  • Message-id: <feature-120340-137@xxxxxxxxxxxxxx>
Feature changed by: Duncan Mac-Vicar (dmacvicar)
Feature #120340, revision 137
Title: Run download and install in parallel

openSUSE-10.2: Rejected by Klaus Kämpf (kwk)
reject date: 2006-08-09 11:36:10
reject reason: Not possible in 10.2 timeframe
Priority
Requester: Important

openSUSE-10.3: Rejected by Stanislav Visnovsky (visnov)
reject date: 2007-08-01 10:25:01
reject reason: Out of time.
Priority
Requester: Important
Projectmanager: Important

openSUSE-11.0: Rejected by Jiri Srain (jsrain)
reject date: 2008-03-28 13:56:11
reject reason: Out of resources for 11.0.
Priority
Requester: Important
Projectmanager: Important

openSUSE-11.1: Rejected by Stanislav Visnovsky (visnov)
reject date: 2008-07-11 12:06:35
reject reason: This needs to wait. Postponing.
Priority
Requester: Important
Projectmanager: Important

openSUSE-11.2: Rejected by Christoph Thiel (cthiel1)
reject date: 2009-06-03 08:43:03
reject reason: No resources for 11.2.
Priority
Requester: Important
Projectmanager: Important

openSUSE-11.3: Evaluation
Priority
Requester: Important
Projectmanager: Important

Requested by: Klaus Kämpf (kwk)
Requested by: Peter Poeml (poeml)
Requested by: Reinhard Max (rmax)

Description:
Network installation could be improved by running package download and
package installation in parallel.

References:
https://bugzilla.novell.com/show_bug.cgi?id=60844
https://bugzilla.novell.com/show_bug.cgi?id=209799
https://bugzilla.novell.com/show_bug.cgi?id=128050
https://bugzilla.novell.com/show_bug.cgi?id=370457
https://bugzilla.novell.com/show_bug.cgi?id=370054
https://bugzilla.novell.com/show_bug.cgi?id=385711
http://en.opensuse.org/Libzypp/Failover
http://metalinker.org/
Fate #300660: Download package groups before install

Discussion:
#2: Klaus Kämpf (kwk) (2006-08-09 11:36:42)
Michael already looked at this. Should be further evaluated for 10.3

#3: Edith Parzefall (emapedl) (2007-07-31 16:56:12)
Problem is: what do we do if we lose the network connection when only
half the packages are installed?

#7: Reinhard Max (rmax) (2008-04-10 11:42:43) (reply to #3)
I think this is a general problem for network based installations. Can
you elaborate if/why you see this as a special problem when download
and installation are running in parallel?

#8: Peter Poeml (poeml) (2008-05-19 14:49:53) (reply to #7)
For reliable operation of a business critical server, it is a must that
packages are downloaded first, before the installation/update is
started. Otherwise a network outage can lead to a broken system which
is half updated.
So if parallel download/install is implemented (for desktop users,
maybe...) then please make sure that it is possible to disable it.

#9: Reinhard Max (rmax) (2008-05-19 16:01:15) (reply to #8)
As we just sorted out on IRC, the current implementation also doesn't
allow completing the download before starting installation, and the feature
proposed here wouldn't make this worse.
But as the wish of server admins to have all packages downloaded before
starting to install them is very valid, I think we should extend this
feature request to contain that requirement as well, and probably
rename it to "decouple download and installation".
If download and installation are properly decoupled, it should just be
a matter of the parameters the algorithm is called with to get one of
the following behaviours:
* Download and install packages one by one (as it is now)
* Keep downloading subsequent packages while installing (as originally
proposed here)
* Use a fixed-size download cache for flow control between download and
installation (in situations where download is faster than installation
and disk space is limited)
* Download all packages of an update or installation and install them
afterwards (as requested by Peter for server admins)
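
The four behaviours listed in comment #9 can be illustrated with a
minimal Python sketch. It is not ZYpp code: `run_transaction`,
`cache_slots` and `download_all_first` are hypothetical names, and
`download` and `install` are caller-supplied callables standing in for
the real work. One download thread feeds one installer, and a semaphore
bounds how many packages may be downloaded but not yet installed.

```python
import queue
import threading


def run_transaction(packages, download, install,
                    cache_slots=None, download_all_first=False):
    """Decoupled download/install driven only by parameters (a sketch).

    cache_slots: maximum number of packages that may be downloaded but
    not yet installed; None means no limit.
    download_all_first: do not start installing until every download
    has finished.
    """
    packages = list(packages)
    if cache_slots is None or download_all_first:
        # No flow control needed: the downloader must never block.
        cache_slots = max(1, len(packages))
    slots = threading.Semaphore(cache_slots)
    ready = queue.Queue()
    all_downloaded = threading.Event()

    def downloader():
        try:
            for pkg in packages:
                slots.acquire()           # wait for free cache space
                ready.put(download(pkg))  # hand the result to the installer
        finally:
            all_downloaded.set()
            ready.put(None)               # sentinel: nothing more to come

    worker = threading.Thread(target=downloader, daemon=True)
    worker.start()
    if download_all_first:
        all_downloaded.wait()
    while True:
        item = ready.get()
        if item is None:
            break
        install(item)
        slots.release()                   # free one cache slot
    worker.join()
```

Under these assumptions, `cache_slots=1` gives the one-by-one mode, the
default (no limit) keeps downloading while installing, a fixed
`cache_slots=N` models the bounded download cache, and
`download_all_first=True` gives the download-all-then-install mode.
Error handling is deliberately minimal here.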


#10: Jiri Srain (jsrain) (2008-05-20 10:22:10) (reply to #9)
We already have a similar feature: #300660: Download package groups
before install

#4: Edith Parzefall (emapedl) (2007-07-31 16:57:22)
Please reject for 10.3.

#5: Duncan Mac-Vicar (dmacvicar) (2008-01-10 16:23:46)
This is included in the ZYpp plan for 11.0, however implementation will
start after beta. Will keep in evaluation, it depends on other
milestones for beta1

#14: Chris Hills (chaz6) (2009-02-16 16:42:06)
Is there a feature request to install using both the base and updates
repo so that the latest packages available at install-time are used,
thus saving bandwidth?

#15: Jiri Srain (jsrain) (2009-02-17 08:45:38) (reply to #14)
This is how it has worked for quite some time. Repositories providing
patches also provide the packages so that they can be installed
directly from the update repositories (provided they contain a newer
version).

#16: Duncan Mac-Vicar (dmacvicar) (2009-06-02 15:18:00)
For 11.2 we will focus on flexibility of the download and install order
(like installing after everything is downloaded), however parallel
stuff should be done after we have this in place.
Please reject.

#20: Jörn Knie-von Allmen (phisiker) (2009-08-05 12:12:29) (reply to
#16)
Please do not reject. With a download protocol this should be solvable.
The install task can poll that protocol, or better, be informed by a
signal when a download has ended (or, if an IPC framework is preferred,
via D-Bus). I have often had to solve such problems (in Java), and
generally it isn't so difficult.

#21: Ralph Ulrich (ulenrich) (2009-08-05 13:03:25) (reply to #20)
Please reject forever: as operating systems are a sort of highly
integrated database, there should be commit-like functionality:
first download, then install (like Debian). Otherwise repositories
representing a rolling release (Factory) will potentially break an
updating system.

#22: Reinhard Max (rmax) (2009-08-06 16:18:54) (reply to #21)
Yes, there are situations where this (optional) feature should better
not be used, but that does not mean it should not be implemented,
because there are many situations where it can be very useful.
BTW, how would "download then install" give you anything near commit-
like functionality? Factory could still change on the server while you
are downloading and packages could still fail during installation,
leaving you with an inconsistent installation.

#24: Ralph Ulrich (ulenrich) (2009-08-11 23:42:59) (reply to #22)
Yes, "still fail during installation": there is no commit-like
functionality
But in case of a changed Factory repository there would be an error:
zypper dup --download-then-install || ( zypper ref && zypper dup) ||
...
In this more common case there would be commit-like functionality


#25: Reinhard Max (rmax) (2009-08-12 11:22:08) (reply to #24)
I think the most common case is end users installing from mostly static
repositories such as the original release and the official updates.
That's what I had in mind when I first suggested this feature many many
years ago when there was no openSUSE or Factory.
So, please don't insist on this feature being rejected just because you
don't have a usecase for it yourself.
Nobody will force you to use it if you don't want to.

#23: Reinhard Max (rmax) (2009-08-06 16:20:58) (reply to #16)
As I've explained in comment #9, the different modes of operation would
have been possible with a single implementation and different runtime
parameters.
I think this would actually be easier than implementing the different
modes separately, so why hasn't it been considered?

#17: Michal Papis (mpapis) (2009-07-04 20:58:23)
I think a great example here may be Gentoo. There you can choose the way
of getting packages: by default download and installation run in
parallel, but it is possible to fetch the sources first and then run
the build/installation.
If you are on the default and something breaks, e.g. the network, you can
always resume the last download and installation task.

#18: Giuseppe Salinaro (superpeppo89) (2009-07-16 19:37:27) (reply to
#17)
This idea of gentoo is good...

#19: Ján Kupec (jkupec) (2009-07-20 10:59:31) (reply to #18)
Guys, as Duncan wrote in c#16, we are already developing the "download
all, install all" mode (currently we only have "download one, install
one, ..."), and yes, users will be able to choose one of these. This
particular feature would be an improvement of the current mode:
"download one, install one while downloading others, ...". I hope this
clarifies the current status a bit.

#26: Eduard Avetisyan (dich) (2009-09-15 09:16:15) (reply to #19)
Sorry Jan, I think "download one, install one, download next,
install next..." and "download all, install all" are both MUCH slower
than "download one, install one, download next in parallel". Come on
guys, you focused on two SLOW modes and left the only fast one for the
future, and that has been going on for 5 versions! Why not reserve a couple of
hundred MB (customizable) of diskspace and cache the downloads there
while installing in the meantime. Hope it's not too late to think of it
for os11.2...

#27: Michael Kromer (mkromer) (2009-10-23 01:11:45) (reply to #26)
Absolutely: I agree with you Eduard. The optimized way is to run a
download process while rpm -{i,U} is running in the background, as all
resources (disk I/O, CPU, network) can be used in parallel. The only trick is
to keep the installation of following packages on hold, if there is
some kind of package which takes longer (like kernel-{default,source};
to keep --requires safe; example: kmp's which rely on the specific
kernel version to be installed first, however even this could be
developed in an optimized way, as package lists and therefore also
package requirements are available at time of download). I think there
would even be real great customization options possible like an option
such as --download-threads. So if the installation/upgrade can keep up
with the download speed you will get a *significant* performance gain. And even
if not -> it could never get slower.
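
The scheme described in comment #27 could look roughly like the Python
sketch below. It is an illustration, not zypper or rpm code:
`install_as_downloaded` and the `requires` mapping are hypothetical,
`download` and `install` are placeholder callables, and
`download_threads` is named after the option suggested in the comment.
Downloads run in a thread pool, while a package is installed only once
it has arrived and everything it requires is already installed.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def install_as_downloaded(requires, download, install, download_threads=4):
    """Install packages as they arrive, holding back unmet dependencies.

    requires maps each package name to the set of packages (also keys of
    the mapping) that must be installed before it.
    Returns the order in which packages were installed.
    """
    downloaded, installed, order = set(), set(), []
    with ThreadPoolExecutor(max_workers=download_threads) as pool:
        futures = {pool.submit(download, pkg): pkg for pkg in requires}
        for future in as_completed(futures):
            future.result()  # re-raise a download error, if any
            downloaded.add(futures[future])
            # Install everything that is now unblocked; downloads keep
            # running in the pool threads in the meantime.
            progress = True
            while progress:
                progress = False
                for pkg in sorted(downloaded - installed):
                    if requires[pkg] <= installed:
                        install(pkg)
                        installed.add(pkg)
                        order.append(pkg)
                        progress = True
    return order
```

With a mapping such as `{"kernel-default": set(), "some-kmp":
{"kernel-default"}}`, the kmp package is kept on hold until the kernel
package has been installed, regardless of which download finishes first.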

#31: Bart Otten (bartotten) (2009-11-08 13:46:41) (reply to #26)
Guess it is too late for 11.2 but I think Eduard and Michael made a
good point. This request is old, from even before openFATE was
public...

#32: Ján Kupec (jkupec) (2009-11-09 11:25:45) (reply to #26)
Well, whether we should do one thing first and then the other, that's
debatable, but the point is we do want this downloading while
installing. The only issues here are resources and priorities. Now
that we have the first, we can hopefully move on to the next for 11.3

#37: Robert Davies (robopensuse) (2009-12-18 15:19:01) (reply to #32)
#306966: [Beta2] No Shutdown/Suspend During Package Update
https://features.opensuse.org/306966 is proposing making the slow,
disk-hogging option DownloadInAdvance the default, claiming practical
integrity benefits, so this most popular feature request needs to be
implemented for 11.3 to mitigate end-user annoyance.

#38: Duncan Mac-Vicar (dmacvicar) (2009-12-18 16:03:54) (reply to #37)
Yeah, as if disk space were a big problem nowadays compared with
battery life

#39: Robert Davies (robopensuse) (2009-12-18 16:40:21) (reply to #38)
Some people have netbooks with relatively limited space, and SSDs are
likely to become more popular too.  So it is not just older machines
with smaller disks.  Improving battery life by reducing the time the
disk needs to be spun up rather than idle, is just another reason to
offer parallel download/install option.

#28: Reza Davoudi (rd1381) (2009-10-27 14:35:07)
plz make it so that it be an option in GUI not just command line
i am sick of going through cmdline flags and options

#29: John Bester (johnbester) (2009-10-29 07:13:05)
I think the proposed solution will help, but to me this is not the really
important issue. The most frustrating issue around installing a small
package using YaST is the repository upgrade that must complete before
you can do anything. (Unfortunately we do not all have the internet
speeds as are the norm in Europe and US). The worst is actually when
you have your laptop somewhere where you do not have internet access
and have to wait for connection timeouts before you can do an
installation. It happened a few times that I just wanted to install a
small tool which slipped my mind when I did a clean install (such as
midnight commander) quickly so that I can continue with what I am
doing. At this point I am not fussed about having the absolute latest
security fix - I just need to get going.
In any case, I would propose that when you open the package manager, it
should kick off a thread to download repository updates into a temp
folder. The next time you start the package manager, it can simply
replace those repositories for which there are updates available in the
temp folder. My reasoning is: If I opened the package manager yesterday
and I open it again today (and maybe again later today), why should I
always wait a few minutes before I can start selecting packages?
Switching off the refresh feature in "Software repositories" is not an
option, because it takes too long and you typically do not want it
switched off.

#30: Reinhard Max (rmax) (2009-10-29 09:59:10) (reply to #29)
Please open a new feature request for your proposal, as it is unrelated
to the feature that is being discussed here.

#33: Fco. Javier Nacher (xiscoj) (2009-12-07 09:39:04)
hi,
why don't we learn from other package managers? I usually use smart
(http://labix.org/smart), and I think its working
mode is the appropriate one. It downloads in parallel and, after downloading
all packages, installs them sequentially. It also allows updating
channels only on demand by pressing a button, so it opens faster.
I think that parallel install is not a good choice but parallel
download is more than recommended.

#34: Duncan Mac-Vicar (dmacvicar) (2009-12-07 11:01:03) (reply to #33)
You misunderstood the feature. The feature talks about downloading
multiple files at the same time, and installing what is possible to
install at the same time. It does not imply that multiple packages
are installed in parallel (however, it does not rule it out).
You give no reason when stating that "parallel install is not a good
choice".
Downloading in parallel is something we will do anyway. However, as we
have DownloadInAdvance and DownloadInHeaps, we can download groups of
packages that form a minimal dependency transaction. I doubt smart has
this information. This would allow us to download the maximum amount of
packages (either sequentially or in parallel) that can be downloaded and
installed while leaving the system in a consistent state. Those could
be installed in parallel while the next "heap" or group is downloaded.
Any error in the next download would leave the system still in a
consistent state.
So yes, we should learn from smart, yum and others, but we should not
just copy, as we don't have all the limitations they do have, and that
means we can actually try to improve things.
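
The heap pipeline described in comment #34 can be sketched in Python as
follows. This is only an illustration under stated assumptions, not
libzypp's DownloadInHeaps implementation: `install_in_heaps` is a
hypothetical name, `download_heap` and `install_heap` are placeholder
callables, and each heap is assumed to be a group of packages that
leaves the system consistent once installed.

```python
from concurrent.futures import ThreadPoolExecutor


def install_in_heaps(heaps, download_heap, install_heap):
    """Install heap k while heap k+1 is being downloaded.

    Returns (installed_heaps, error). If a download fails, no further
    heap is installed, so only complete heaps have been applied.
    """
    installed = []
    if not heaps:
        return installed, None
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = pool.submit(download_heap, heaps[0])
        for index, heap in enumerate(heaps):
            try:
                files = pending.result()  # wait for this heap's download
            except Exception as error:
                return installed, error   # stop before a partial heap
            if index + 1 < len(heaps):
                # Start fetching the next heap before installing this one.
                pending = pool.submit(download_heap, heaps[index + 1])
            install_heap(files)
            installed.append(heap)
    return installed, None
```

If the download of a later heap fails, the function returns the heaps
that were fully installed together with the error, which matches the
point that an error in the next download leaves the system in a
consistent state.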

#35: Robert Davies (robopensuse) (2009-12-07 21:21:45) (reply to #34)
Is it possible to group together the small "odds & sods" so that they start
being downloaded early in the install even if they are logically
independent?  The trouble I notice on a network install is that there are
many small, not particularly important packages, which don't
let the TCP/IP download get up to a decent speed, making things crawl early
on at nowhere near the usual bandwidth.
So a strategy like a browser's, of getting small files in parallel
whilst mainly wanting those large consistent sets you mention, might
make things look pleasingly snappy rather than naive.  Of course, if a
large package were being downloaded while the small ones arrive and are
installed, no one would complain.

#40: Fco. Javier Nacher (xiscoj) (2010-01-28 09:15:41) (reply to #34)
hi,
I think parallel download could be a good starting point, so once you have
all the packages you can install without dependency problems and you'll
be sure that the system will be consistent.
Of course, packages that do not depend on others could be
installed while downloading the rest. But I think this could be more
difficult to implement, so the first step for this feature could be
parallel download of packages and the next step parallel install.
Maybe the first step would be easier and faster to implement for 11.3,
and the second step can be delayed to the next openSUSE version.
Improving is always good :)


#36: Robert Davies (robopensuse) (2009-12-10 16:21:39)
Testing out Kubuntu 9.10, and one of the early things it does on starting
Konqi is suggest installing "restricted" multimedia codecs.  The
downloads were fast, starting off at the full 10Mb/s, just tailing off a
touch after a while.  Similarly, installing 110 bug fix updates took
only a few minutes, despite requiring a full kernel download.
By comparison, though zypper itself is fast at searching in the DB, getting
the repo info and a sequence of small files can really keep the end user
waiting with a lot of under-utilised bandwidth.



--
openSUSE Feature:
https://features.opensuse.org/120340
