[openFATE 120340] Run download and install in parallel
Feature changed by: Thiago Sayao (sayao)
Feature #120340, revision 144
Title: Run download and install in parallel

openSUSE-10.2: Rejected by Klaus Kämpf (kwk)
  reject date: 2006-08-09 11:36:10
  reject reason: Not possible in 10.2 timeframe
  Priority
  Requester: Important

openSUSE-10.3: Rejected by Stanislav Visnovsky (visnov)
  reject date: 2007-08-01 10:25:01
  reject reason: Out of time.
  Priority
  Requester: Important
  Projectmanager: Important

openSUSE-11.0: Rejected by Jiri Srain (jsrain)
  reject date: 2008-03-28 13:56:11
  reject reason: Out of resources for 11.0.
  Priority
  Requester: Important
  Projectmanager: Important

openSUSE-11.1: Rejected by Stanislav Visnovsky (visnov)
  reject date: 2008-07-11 12:06:35
  reject reason: This needs to wait. Postponing.
  Priority
  Requester: Important
  Projectmanager: Important

openSUSE-11.2: Rejected by Christoph Thiel (cthiel1)
  reject date: 2009-06-03 08:43:03
  reject reason: No resources for 11.2.
  Priority
  Requester: Important
  Projectmanager: Important

openSUSE-11.3: Evaluation
  Priority
  Requester: Important
  Projectmanager: Important

Requested by: Klaus Kämpf (kwk)
Requested by: Peter Poeml (poeml)
Requested by: Reinhard Max (rmax)
Developer: (Novell)

Description:
Network installation could be improved by running package download and package installation in parallel.

References:
https://bugzilla.novell.com/show_bug.cgi?id=60844
https://bugzilla.novell.com/show_bug.cgi?id=209799
https://bugzilla.novell.com/show_bug.cgi?id=128050
https://bugzilla.novell.com/show_bug.cgi?id=370457
https://bugzilla.novell.com/show_bug.cgi?id=370054
https://bugzilla.novell.com/show_bug.cgi?id=385711
http://en.opensuse.org/Libzypp/Failover
http://metalinker.org/
Fate #300660: Download package groups before install
Related Fate #307862: Download multiple packages in parallel

Discussion:

#2: Klaus Kämpf (kwk) (2006-08-09 11:36:42)
Michael already looked at this.
Should be further evaluated for 10.3.

#3: Edith Parzefall (emapedl) (2007-07-31 16:56:12)
Problem is: what do we do if we lose the network connection when only half the packages are installed?

#7: Reinhard Max (rmax) (2008-04-10 11:42:43) (reply to #3)
I think this is a general problem for network-based installations. Can you elaborate on if and why you see this as a special problem when download and installation are running in parallel?

#8: Peter Poeml (poeml) (2008-05-19 14:49:53) (reply to #7)
For reliable operation of a business-critical server, it is a must that packages are downloaded first, before the installation/update is started. Otherwise a network outage can lead to a broken system which is half updated. So if parallel download/install is implemented (for desktop users, maybe...), then please make sure that it is possible to disable it.

#9: Reinhard Max (rmax) (2008-05-19 16:01:15) (reply to #8)
As we just sorted out on IRC, the current implementation also doesn't allow completing the download before starting the installation, and the feature proposed here wouldn't make this worse. But as the wish of server admins to have all packages downloaded before starting to install them is very valid, I think we should extend this feature request to contain that requirement as well, and probably rename it to "decouple download and installation".
If download and installation are properly decoupled, it should just be a matter of the parameters the algorithm is called with to get one of the following behaviours:
* Download and install packages one by one (as it is now)
* Keep downloading subsequent packages while installing (as originally proposed here)
* Use a fixed-size download cache for flow control between download and installation (in situations where download is faster than installation and disk space is limited)
* Download all packages of an update or installation and install them afterwards (as requested by Peter for server admins)

#10: Jiri Srain (jsrain) (2008-05-20 10:22:10) (reply to #9)
We already have a similar feature: #300660: Download package groups before install

#4: Edith Parzefall (emapedl) (2007-07-31 16:57:22)
Please reject for 10.3.

#5: Duncan Mac-Vicar (dmacvicar) (2008-01-10 16:23:46)
This is included in the ZYpp plan for 11.0; however, implementation will start after beta. Will keep in evaluation, as it depends on other milestones for beta1.

#14: Chris Hills (chaz6) (2009-02-16 16:42:06)
Is there a feature request to install using both the base and updates repo so that the latest packages available at install time are used, thus saving bandwidth?

#15: Jiri Srain (jsrain) (2009-02-17 08:45:38) (reply to #14)
This is how it has worked for quite some time. Repositories providing patches also provide the packages, so that they can be installed directly from the update repositories (provided they contain a newer version).

#16: Duncan Mac-Vicar (dmacvicar) (2009-06-02 15:18:00)
For 11.2 we will focus on flexibility of the download and install order (like installing after everything is downloaded); however, the parallel stuff should be done after we have this in place. Please reject.

#20: Jörn Knie-von Allmen (phisiker) (2009-08-05 12:12:29) (reply to #16)
Please do not reject. With a download protocol this should be solvable.
The install task can poll that protocol (or be informed by a signal when a download has ended, which is better than polling; or, if you prefer an IPC framework, D-Bus. Never mind!). I have often had to solve such problems (in Java, though), and generally it isn't so difficult.

#21: Ralph Ulrich (ulenrich) (2009-08-05 13:03:25) (reply to #20)
Please reject forever: as operating systems are a sort of highly integrated database, there should be commit-like functionality: first download, then install (like Debian). Otherwise repositories representing a rolling release (Factory) will potentially break a system that is being updated.

#22: Reinhard Max (rmax) (2009-08-06 16:18:54) (reply to #21)
Yes, there are situations where this (optional) feature should better not be used, but that does not mean it should not be implemented, because there are many situations where it can be very useful.
BTW, how would "download then install" give you anything near commit-like functionality? Factory could still change on the server while you are downloading, and packages could still fail during installation, leaving you with an inconsistent installation.

#24: Ralph Ulrich (ulenrich) (2009-08-11 23:42:59) (reply to #22)
Yes, "still fail during installation": there is no commit-like functionality.
But in the case of a changed Factory repository there would be an error:
zypper dup --download-then-install || ( zypper ref && zypper dup) || ...
In this more frequent case there would be commit-like functionality.

#25: Reinhard Max (rmax) (2009-08-12 11:22:08) (reply to #24)
I think the most common case is end users installing from mostly static repositories such as the original release and the official updates. That's what I had in mind when I first suggested this feature many, many years ago, when there was no openSUSE or Factory.
So, please don't insist on this feature being rejected just because you don't have a use case for it yourself. Nobody will force you to use it if you don't want to.

#23: Reinhard Max (rmax) (2009-08-06 16:20:58) (reply to #16)
As I've explained in comment #9, the different modes of operation would have been possible with a single implementation and different runtime parameters. I think this would actually be easier than implementing the different modes separately, so why hasn't it been considered?

#17: Michal Papis (mpapis) (2009-07-04 20:58:23)
I think a great example here may be Gentoo: there you can choose the way of getting packages. By default, download and installation go in parallel, but it is possible to fetch the sources first and then run the build/installation. If you are on the default and something breaks, e.g. the network, you can always resume the last download & installation task.

#18: Giuseppe Salinaro (superpeppo89) (2009-07-16 19:37:27) (reply to #17)
This idea of Gentoo is good...

#19: Ján Kupec (jkupec) (2009-07-20 10:59:31) (reply to #18)
Guys, as Duncan wrote in c#16, we are already developing the "download all, install all" mode (currently we only have "download one, install one, ..."), and yes, users will be able to choose one of these. This particular feature would be an improvement of the current mode: "download one, install one while downloading others, ...". I hope this clarifies the current status a bit.

#26: Eduard Avetisyan (dich) (2009-09-15 09:16:15) (reply to #19)
Sorry Jan, I think "download one, install one, download next, install next..." and "download all, install all" are both MUCH slower than "download one, install one, download next in parallel". Come on guys, you focused on two SLOW modes and left the only fast one for the future, and that has been the case for 5 versions! Why not reserve a couple of hundred MB (customizable) of disk space and cache the downloads there while installing in the meantime? Hope it's not too late to think of it for os11.2...

#27: Michael Kromer (mkromer) (2009-10-23 01:11:45) (reply to #26)
Absolutely: I agree with you, Eduard.
The optimized way is to run a download process while rpm -{i,U} is running in the background, as all resources (disk I/O, CPU, net) can be used in parallel. The only trick is to keep the installation of following packages on hold if there is some kind of package which takes longer (like kernel-{default,source}), to keep --requires safe; example: KMPs, which rely on the specific kernel version being installed first. However, even this could be developed in an optimized way, as package lists and therefore also package requirements are available at download time. I think there would even be really great customization options possible, such as an option like --download-threads. So if the installation/upgrade can keep up with the download speed, you will get a *significant* performance gain. And even if not, it could never get slower.

#31: Bart Otten (bartotten) (2009-11-08 13:46:41) (reply to #26)
Guess it is too late for 11.2, but I think Eduard and Michael made a good point. This request is old; it dates from before openFATE was even public...

#32: Ján Kupec (jkupec) (2009-11-09 11:25:45) (reply to #26)
Well, whether we should do one thing first and then the other is debatable, but the point is we do want this downloading while installing. The only issues here are the resources and priorities. Now that we have the first, we can hopefully move on to the next for 11.3.

#37: Robert Davies (robopensuse) (2009-12-18 15:19:01) (reply to #32)
#306966: [Beta2] No Shutdown/Suspend During Package Update (https://features.opensuse.org/306966) is proposing to make the slow, disk-hogging option DownloadInAdvance the default, claiming practical integrity benefits, so this most popular feature request needs to be implemented for 11.3 to mitigate end-user annoyance.

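Editorial illustration of the decoupling idea from comment #9 and the download cache suggested in comments #26 and #27. This is a minimal Python sketch, not libzypp or zypper code: `run_transaction`, `slots`, `download_all_first` and the `download`/`install` callables are hypothetical names chosen for this example. It assumes the package list is already in a safe installation order (for instance as computed by the solver) and that packages are installed one at a time.

```python
import queue
import threading


def run_transaction(packages, download, install, slots=None,
                    download_all_first=False):
    """Hypothetical sketch: one pipeline, several modes via parameters.

    slots=1                  -> download one, install one (strictly serial)
    slots=N                  -> at most N packages fetched or being fetched
                                but not yet installed (fixed-size cache)
    slots=None               -> keep downloading while installing
    download_all_first=True  -> install only after every download succeeded
    """
    if download_all_first:
        slots = None                 # a bounded cache cannot hold everything
    free = threading.Semaphore(slots) if slots else None
    ready = queue.Queue()            # downloaded, waiting to be installed
    downloads_finished = threading.Event()
    errors = []
    END = object()                   # sentinel: no more packages will arrive

    def downloader():
        try:
            for name in packages:
                if free:
                    free.acquire()   # wait for a free cache slot
                ready.put(download(name))
        except Exception as exc:     # e.g. the network went away
            errors.append(exc)
        finally:
            downloads_finished.set()
            ready.put(END)

    worker = threading.Thread(target=downloader, daemon=True)
    worker.start()

    if download_all_first:
        downloads_finished.wait()
        if errors:
            raise errors[0]          # nothing has been installed yet

    while True:
        item = ready.get()
        if item is END:
            break
        install(item)                # packages are installed in list order
        if free:
            free.release()           # the cache slot can be reused
    worker.join()
    if errors:
        raise errors[0]              # earlier packages were already installed
```

In this sketch the same loop covers all four behaviours listed in comment #9; only the parameters differ. Error handling is deliberately minimal: in the streaming modes a failed download still leaves the already fetched packages installed, which is exactly the half-updated state the server administrators in this thread want to be able to avoid by choosing `download_all_first=True`.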
#38: Duncan Mac-Vicar (dmacvicar) (2009-12-18 16:03:54) (reply to #37)
Yeah, as if disk space were a big problem nowadays compared with battery duration.

#39: Robert Davies (robopensuse) (2009-12-18 16:40:21) (reply to #38)
Some people have netbooks with relatively limited space, and SSDs are likely to become more popular too. So it is not just older machines with smaller disks. Improving battery life by reducing the time the disk needs to be spun up rather than idle is just another reason to offer a parallel download/install option.

#28: Reza Davoudi (rd1381) (2009-10-27 14:35:07)
Please make this an option in the GUI, not just on the command line. I am sick of going through cmdline flags and options.

#29: John Bester (johnbester) (2009-10-29 07:13:05)
I think the proposed solution will help, but to me this is not the really important issue. The most frustrating issue around installing a small package using YaST is the repository refresh that must complete before you can do anything. (Unfortunately, we do not all have the internet speeds that are the norm in Europe and the US.) The worst is actually when you have your laptop somewhere where you do not have internet access and have to wait for connection timeouts before you can do an installation. It has happened a few times that I just wanted to quickly install a small tool which slipped my mind when I did a clean install (such as Midnight Commander) so that I could continue with what I was doing. At that point I am not fussed about having the absolute latest security fix; I just need to get going. In any case, I would propose that when you open the package manager, it should kick off a thread to download repository updates into a temp folder. The next time you start the package manager, it can simply replace those repositories for which there are updates available in the temp folder.
My reasoning is: if I opened the package manager yesterday and I open it again today (and maybe again later today), why should I always wait a few minutes before I can start selecting packages? Switching off the refresh feature in "Software repositories" is not an option, because it takes too long and you typically do not want it switched off.

#30: Reinhard Max (rmax) (2009-10-29 09:59:10) (reply to #29)
Please open a new feature request for your proposal, as it is unrelated to the feature that is being discussed here.

#33: Fco. Javier Nacher (xiscoj) (2009-12-07 09:39:04)
Hi, why don't we learn from other package managers? I usually use smart (http://labix.org/smart), and I think its working mode is the appropriate one. It downloads in parallel and, after downloading all packages, installs them sequentially. It also lets you update channels only on demand, by pressing a button, so it opens faster. I think that parallel install is not a good choice, but parallel download is more than recommended.

#34: Duncan Mac-Vicar (dmacvicar) (2009-12-07 11:01:03) (reply to #33)
You misunderstood the feature. The feature talks about downloading multiple files at the same time, and installing what is possible to install at the same time. It does not imply that multiple packages are installed in parallel (however, it does not rule it out). You give no reason when stating that "parallel install is not a good choice".
Downloading in parallel is something we will do anyway. However, as we have DownloadInAdvance and DownloadInHeaps, we can download groups of packages that form a minimal dependency transaction. I doubt smart has this information. This would allow us to download the maximum amount of packages (either sequentially or in parallel) that can be downloaded and installed while leaving the system in a consistent state. Those could be installed in parallel while the next "heap" or group is downloaded.
Any error in the next download would still leave the system in a consistent state.
So yes, we should learn from smart, yum and others, but we should not just copy, as we don't have all the limitations they have, and that means we can actually try to improve things.

#35: Robert Davies (robopensuse) (2009-12-07 21:21:45) (reply to #34)
Is it possible to grab the small "odds & sods" together so that they start being downloaded early in the install, even if they are logically independent? The trouble I notice on network installs is that there are many small, not particularly important packages which don't let the TCP/IP download get up to a decent speed, making things crawl early on at nowhere near the usual bandwidth. So a strategy like that of a browser, getting small files in parallel while mainly wanting those large consistent sets you mention, might make things look pleasingly snappy rather than naive. Of course, if a large package were to be downloaded as the small ones arrive and are installed, no one would complain.

#40: Fco. Javier Nacher (xiscoj) (2010-01-28 09:15:41) (reply to #34)
Hi, I think parallel download could be a good starting point, so that once you have all the packages you can install without dependency problems and be sure that the system will be consistent. Of course, if some packages do not depend on others, they could be installed while downloading the rest. But I think this could be more difficult to implement, so the first step for this feature could be parallel download of packages and the next step parallel install. The first step may be easier and faster to implement for 11.3, and the second step can be delayed until the next openSUSE version. Improving is always good :)

#36: Robert Davies (robopensuse) (2009-12-10 16:21:39)
I was testing out Kubuntu 9.10, and one of the early things it does on starting Konqi is suggest installing "restricted" multimedia codecs. The downloads were fast, starting off at the full 10Mb/s and just tailing off a touch after a while.
Similarly, installing 110 bug fix updates took only a few minutes, despite requiring a full kernel download. By comparison, though zypper itself is fast at searching in the DB, getting the repo info and the sequence of small files can really keep the end user waiting with a lot of under-utilised bandwidth.

+ #41: Thiago Sayao (sayao) (2010-06-03 03:29:13)
+ https://bugzilla.novell.com/show_bug.cgi?id=609276
+ Would anyone pick this up for Hack Week?
+ I think parallel downloads would speed things up a lot. It could keep
+ downloading while installing so the connection would not close; this
+ would save a lot of "hand-shaking" time.

--
openSUSE Feature: https://features.opensuse.org/120340