[openFATE 120340] Decouple download and installation
Feature changed by: Richard Brown (RBrownSUSE)
Feature #120340, revision 206
Title: Decouple download and installation

openSUSE Distribution: Done
Priority Requester: Desirable
Requested by: Klaus Kämpf (kwk), Peter Poeml (poeml), Reinhard Max (rmax)
Product Manager: Federico Lucifredi (flucifredi)
Partner organization: openSUSE.org

Description:
Network installation could be improved by running package download and package installation in parallel.

References / Relations:
- 60844 (novell/bugzilla) https://bugzilla.novell.com/show_bug.cgi?id=60844
- 209799 (novell/bugzilla) https://bugzilla.novell.com/show_bug.cgi?id=209799
- 128050 (novell/bugzilla) https://bugzilla.novell.com/show_bug.cgi?id=128050
- 370457 (novell/bugzilla) https://bugzilla.novell.com/show_bug.cgi?id=370457
- 370054 (novell/bugzilla) https://bugzilla.novell.com/show_bug.cgi?id=370054
- 385711 (novell/bugzilla) https://bugzilla.novell.com/show_bug.cgi?id=385711
- 638191 (novell/bugzilla) https://bugzilla.novell.com/show_bug.cgi?id=638191
- Libzypp/Failover (url: https://old-en.opensuse.org/Libzypp/Failover)
- Metalink (url: http://metalinker.org)
- Download package groups before install (feature/id: 300660)
- Download multiple packages in parallel (feature/id: 307862)
- Practical steps to parallel downloading of updates (feature/id: 318807)

Use Case:
This week I marked an
upgrade of Qt libs along with other things (installing texlive). YaST updated the Qt core packages and started downloading texlive (~250 MB). qt-x11 had not been upgraded yet, so I could not start any Qt application during the texlive install because Qt was broken. I use KDE, and almost all of my applications are Qt based.

Discussion:

#2: Klaus Kämpf (kwk) (2006-08-09 11:36:42)
Michael already looked at this. Should be further evaluated for 10.3.

#3: Edith Parzefall (emapedl) (2007-07-31 16:56:12)
Problem is: what do we do if we lose the network connection when only half the packages are installed?

#7: Reinhard Max (rmax) (2008-04-10 11:42:43) (reply to #3)
I think this is a general problem for network based installations. Can you elaborate if/why you see this as a special problem when download and installation are running in parallel?

#8: Peter Poeml (poeml) (2008-05-19 14:49:53) (reply to #7)
For reliable operation of a business critical server, it is a must that packages are downloaded first, before the installation/update is started. Otherwise a network outage can lead to a broken, half-updated system. So if parallel download/install is implemented (for desktop users, maybe...) then please make sure that it is possible to disable it.

#9: Reinhard Max (rmax) (2008-05-19 16:01:15) (reply to #8)
As we just sorted out on IRC, the current implementation also doesn't allow completing the download before starting installation, and the feature proposed here wouldn't make this worse. But as the wish of server admins to have all packages downloaded before starting to install them is very valid, I think we should extend this feature request to contain that requirement as well, and probably rename it to "decouple download and installation".
If download and installation are properly decoupled, it should just be a matter of the parameters the algorithm is called with to get one of the following behaviours:
* Download and install packages one by one (as it is now)
* Keep downloading subsequent packages while installing (as originally proposed here)
* Use a fixed-size download cache for flow control between download and installation (in situations where download is faster than installation and disk space is limited)
* Download all packages of an update or installation and install them afterwards (as requested by Peter for server admins)

#10: Jiri Srain (jsrain) (2008-05-20 10:22:10) (reply to #9)
We already have a similar feature: #300660: Download package groups before install

#42: Martin Seidler (pistazienfresser) (2010-06-09 20:35:25) (reply to #9)
That suggestion of Reinhard Max sounds good. I would suggest the last (and maybe most stable) variant, or at least the current situation, as the default. How much time could be saved by parallel installation, and how much time could be wasted on a system with packages that do not fit together because the download broke off during installation of the first packages?

#46: Artur Mustafin (hack2root) (2010-08-26 07:21:27) (reply to #3)
I think the whole installation technology should be revised: we should invent a globally visible, versioned library store, a new packaging system, etc. I think it should be a simple commit, like this:
- begin transaction
- install missing libraries/apps into the library store (multi-versioned, global, cached)
- commit transaction
- in case of any error: rollback transaction

#4: Edith Parzefall (emapedl) (2007-07-31 16:57:22)
Please reject for 10.3.

#5: Duncan Mac-Vicar (dmacvicar) (2008-01-10 16:23:46)
This is included in the ZYpp plan for 11.0, however implementation will start after beta.
Will keep in evaluation; it depends on other milestones for beta1.

#14: Chris Hills (chaz6) (2009-02-16 16:42:06)
Is there a feature request to install using both the base and updates repo, so that the latest packages available at install time are used, thus saving bandwidth?

#15: Jiri Srain (jsrain) (2009-02-17 08:45:38) (reply to #14)
This is how it has worked for quite some time. Repositories providing patches also provide the packages, so that they can be installed directly from the update repositories (provided they contain a newer version).

#16: Duncan Mac-Vicar (dmacvicar) (2009-06-02 15:18:00)
For 11.2 we will focus on flexibility of the download and install order (like installing after everything is downloaded); parallel stuff should be done after we have this in place. Please reject.

#20: Jörn Knie-von Allmen (phisiker) (2009-08-05 12:12:29) (reply to #16)
Please do not reject. With a download protocol this should be solvable. The install task can poll on that protocol (or be informed by a signal when a download has ended, which is better than polling; or, if you prefer an IPC framework: D-Bus). I have had to solve such problems a number of times (in Java, though), and generally it isn't that difficult.

#21: Ralph Ulrich (ulenrich) (2009-08-05 13:03:25) (reply to #20)
Please reject forever: as operating systems are a sort of highly integrated database, there should be commit-like functionality: first download, then install (like Debian). Otherwise repositories representing a rolling release (Factory) will potentially break an updating system.

#22: Reinhard Max (rmax) (2009-08-06 16:18:54) (reply to #21)
Yes, there are situations where this (optional) feature should better not be used, but that does not mean it should not be implemented, because there are many situations where it can be very useful. BTW, how would "download then install" give you anything near commit-like functionality?
Factory could still change on the server while you are downloading, and packages could still fail during installation, leaving you with an inconsistent installation.

#24: Ralph Ulrich (ulenrich) (2009-08-11 23:42:59) (reply to #22)
Yes, "still fail during installation": there is no commit-like functionality. But in the case of a changed Factory repository there would be an error:
zypper dup --download-then-install || ( zypper ref && zypper dup) || ...
In this more common case there would be commit-like functionality.

#25: Reinhard Max (rmax) (2009-08-12 11:22:08) (reply to #24)
I think the most common case is end users installing from mostly static repositories such as the original release and the official updates. That's what I had in mind when I first suggested this feature many, many years ago, when there was no openSUSE or Factory. So, please don't insist on this feature being rejected just because you don't have a use case for it yourself. Nobody will force you to use it if you don't want to.

#23: Reinhard Max (rmax) (2009-08-06 16:20:58) (reply to #16)
As I've explained in comment #9, the different modes of operation would have been possible with a single implementation and different runtime parameters. I think this would actually be easier than implementing the different modes separately, so why hasn't it been considered?

#17: Michal Papis (mpapis) (2009-07-04 20:58:23)
I think Gentoo is a great example here: there you can choose how packages are fetched. By default, download and installation run in parallel, but it is possible to fetch all sources first and then run the build/installation. If you are on the default and something breaks (e.g. the network), you can always resume the last download-and-install task.

#18: Giuseppe Salinaro (superpeppo89) (2009-07-16 19:37:27) (reply to #17)
This idea of Gentoo's is good...
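The four behaviours Reinhard enumerates in comment #9 map closely onto the commit.downloadMode setting that libzypp eventually grew (it is quoted from the 11.4 release notes later in this thread). A sketch of how a user selects between them; the value names below come from this thread's own quotes, but verify the exact spellings against /etc/zypp/zypp.conf and `man zypper` on your system, as they may differ between versions:

```
## /etc/zypp/zypp.conf -- persistent choice of commit download mode
##
## DownloadAsNeeded  : download one package, install it, fetch the next
##                     (the traditional one-by-one behaviour)
## DownloadInHeaps   : download a dependency-consistent group ("heap"),
##                     install it, then fetch the next heap
## DownloadInAdvance : download everything first, install afterwards
##                     (the server-safe mode Peter asked for in #8)
commit.downloadMode = DownloadInAdvance
```

zypper also accepts per-invocation overrides of the same modes (e.g. `zypper install --download-as-needed ...`), so the config file only sets the default.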
#19: Ján Kupec (jkupec) (2009-07-20 10:59:31) (reply to #18)
Guys, as Duncan wrote in c#16, we are already developing the "download all, install all" mode (currently we only have "download one, install one, ..."), and yes, users will be able to choose one of these. This particular feature would be an improvement of the current mode: "download one, install one while downloading others, ...". I hope this clarifies the current status a bit.

#26: Eduard Avetisyan (dich) (2009-09-15 09:16:15) (reply to #19)
Sorry Jan, I think "download one, install one, download next, install next..." and "download all, install all" are both MUCH slower than "download one, install one, download next in parallel". Come on guys, you focused on the two SLOW modes and left the only fast one for the future, and that's been the case for 5 releases! Why not reserve a couple of hundred MB (customizable) of disk space and cache the downloads there while installing in the meantime? Hope it's not too late to think of it for openSUSE 11.2...

#27: Michael Kromer (mkromer) (2009-10-23 01:11:45) (reply to #26)
Absolutely, I agree with you Eduard. The optimized way is to run a download process while rpm -{i,U} is running in the background, so that all resources (disk I/O, CPU, network) can be used in parallel. The only trick is to keep the installation of subsequent packages on hold if some package takes longer (like kernel-{default,source}; to keep --requires safe; example: KMPs which rely on the specific kernel version being installed first). Even this could be handled in an optimized way, as package lists, and therefore also package requirements, are available at download time. There could even be great customization options, such as a --download-threads switch. So if the installation/upgrade can keep up with the download speed, you get *significant* performance. And even if not, it could never get slower.
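The pipeline Michael describes, downloading ahead while the installer drains a bounded cache, is essentially a producer/consumer queue. A minimal sketch of the idea, not libzypp code (the package names are made up); the fixed-size download cache from comment #9 falls out of the queue's maxsize, since the downloader blocks whenever the cache is full:

```python
import queue
import threading

def pipeline(packages, cache_size=2):
    """Overlap 'download' and 'install' through a bounded package cache."""
    cache = queue.Queue(maxsize=cache_size)  # fixed-size download cache
    installed = []

    def downloader():
        for pkg in packages:
            # A real downloader would fetch the RPM here; put() blocks
            # while the cache is full, capping disk usage automatically.
            cache.put(pkg)
        cache.put(None)  # sentinel: no more packages

    t = threading.Thread(target=downloader)
    t.start()
    while True:
        pkg = cache.get()  # installer side: waits only if the cache is empty
        if pkg is None:
            break
        installed.append(pkg)  # a real installer would run rpm -U here
    t.join()
    return installed

print(pipeline(["libqt4", "texlive", "kernel-default"]))
```

With real network and rpm work behind put() and get(), both stages run concurrently, which is exactly the "download next in parallel" mode Eduard asks for, while the cap on the cache answers the disk-space objection.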
#31: Bart Otten (bartotten) (2009-11-08 13:46:41) (reply to #26)
Guess it is too late for 11.2, but I think Eduard and Michael made a good point. This request is old, from even before openFATE was public...

#32: Ján Kupec (jkupec) (2009-11-09 11:25:45) (reply to #26)
Well, whether we should do one thing first and then the other is debatable, but the point is we do want this downloading-while-installing. The only issues here are resources and priorities. Now that we have the first, we can hopefully move on to the next for 11.3.

#37: Robert Davies (robopensuse) (2009-12-18 15:19:01) (reply to #32)
#306966: [Beta2] No Shutdown/Suspend During Package Update (https://features.opensuse.org/306966) is proposing to make the slow, disk-hogging DownloadInAdvance option the default, claiming practical integrity benefits, so this most popular feature request needs to be implemented for 11.3 to mitigate end-user annoyance.

#38: Duncan Mac-Vicar (dmacvicar) (2009-12-18 16:03:54) (reply to #37)
Yeah, as if disk space were a big problem nowadays compared with battery duration.

#39: Robert Davies (robopensuse) (2009-12-18 16:40:21) (reply to #38)
Some people have netbooks with relatively limited space, and SSDs are likely to become more popular too. So it is not just older machines with smaller disks. Improving battery life by reducing the time the disk needs to be spun up rather than idle is just another reason to offer a parallel download/install option.

#28: Reza Davoudi (rd1381) (2009-10-27 14:35:07)
Please make it an option in the GUI, not just the command line. I am sick of going through command-line flags and options.

#29: John Bester (johnbester) (2009-10-29 07:13:05)
I think the proposed solution will help, but to me this is not the really important issue. The most frustrating issue around installing a small package using YaST is the repository refresh that must complete before you can do anything.
(Unfortunately we do not all have the internet speeds that are the norm in Europe and the US.) The worst is actually when you have your laptop somewhere without internet access and have to wait for connection timeouts before you can do an installation. It has happened a few times that I just wanted to quickly install a small tool that slipped my mind during the clean install (such as Midnight Commander) so that I could continue with what I was doing. At that point I am not fussed about having the absolute latest security fix - I just need to get going. In any case, I would propose that when you open the package manager, it should kick off a thread to download repository updates into a temp folder. The next time you start the package manager, it can simply replace those repositories for which updates are available in the temp folder. My reasoning is: if I opened the package manager yesterday and I open it again today (and maybe again later today), why should I always wait a few minutes before I can start selecting packages? Switching off the refresh feature in "Software Repositories" is not an option, because it takes too long and you typically do not want it switched off.

#30: Reinhard Max (rmax) (2009-10-29 09:59:10) (reply to #29)
Please open a new feature request for your proposal, as it is unrelated to the feature being discussed here.

#33: Fco. Javier Nacher (xiscoj) (2009-12-07 09:39:04)
Hi, why don't we learn from other package managers? I usually use Smart (http://labix.org/smart); I think its working mode is the appropriate one. It downloads in parallel and, after downloading all packages, installs them sequentially. It also allows updating channels on demand by pressing a button, so it opens faster. I think parallel install is not a good choice, but parallel download is more than recommended.

#34: Duncan Mac-Vicar (dmacvicar) (2009-12-07 11:01:03) (reply to #33)
You misunderstood the feature.
The feature talks about downloading multiple files at the same time, and installing whatever can be installed at the same time. It does not imply that multiple packages are installed in parallel (though it does not rule that out either). You give no reason when stating that "parallel install is not a good choice". Downloading in parallel is something we will do anyway. However, as we have DownloadInAdvance and DownloadInHeaps, we can download groups of packages that form a minimal dependency transaction. I doubt Smart has this information. This would allow us to download the maximum number of packages (either sequentially or in parallel) that can be downloaded and installed while leaving the system in a consistent state. Those could be installed in parallel while the next "heap" or group is downloaded. Any error in the next download would still leave the system in a consistent state. So yes, we should learn from Smart, yum and others, but we should not just copy, as we don't have all the limitations they have, and that means we can actually try to improve things.

#35: Robert Davies (robopensuse) (2009-12-07 21:21:45) (reply to #34)
Would it be possible to grab together the small "odds and sods" that start being downloaded early in the install, even if they are logically independent? The trouble I notice on network installs is that there are many small, not particularly important packages which don't let the TCP/IP download get up to decent speed, making things crawl early on at nowhere near the usual bandwidth. So a strategy like a browser's, fetching small files in parallel while mainly wanting those large consistent sets you mention, might make things look pleasingly snappy rather than naive. Of course, if a large package were being downloaded while the small ones arrive and are installed, no one would complain.

#40: Fco.
Javier Nacher (xiscoj) (2010-01-28 09:15:41) (reply to #34)
Hi, I think parallel download could be a good starting point: once you have all the packages, you can install without dependency problems and you'll be sure that the system will be consistent. Of course, packages that do not depend on others could be installed while downloading the rest. But I think this would be more difficult to implement, so the first step for this feature could be parallel download of packages and the next step parallel install; the first step would probably be easier and faster to implement for 11.3, and the second step can be delayed to the next openSUSE version. Improving is always good :)

#36: Robert Davies (robopensuse) (2009-12-10 16:21:39)
Testing out Kubuntu 9.10, one of the early things it does on starting Konqueror is suggest installing "restricted" multimedia codecs. The downloads were fast, starting off at the full 10 Mb/s and tailing off only a touch after a while. Similarly, installing 110 bug-fix updates took only a few minutes, despite requiring a full kernel download. By comparison, though zypper itself is fast at searching its DB, fetching the repo info and the sequence of small files can really keep the end user waiting, with a lot of under-utilised bandwidth.

#41: Thiago Sayao (sayao) (2010-06-03 03:29:13)
https://bugzilla.novell.com/show_bug.cgi?id=609276
Would anyone pick this up during Hack Week? I think parallel downloads would speed things up a lot. It could keep downloading while installing, so the connection would not close - this would save a lot of "hand-shaking" time.

#43: Emmanuel ESCARABAJAL (eescar) (2010-06-29 03:28:38)
I've been using SuSE for years, since 5.x, and have been wondering since then why it stops downloading packages while installing the one that just finished, compared to some other distros which keep downloading while installing what is ready!
OK, I must admit that in those early days DSL was just born, and on 56k connections every second of time saved was quite valuable; is that a reason, now that (almost) all of us have fat DSL connections, to tolerate such a waste? I personally only have a 1k DSL line, and it makes a great difference compared to the 10k line I have at work when it comes to the net-install experience... So please, make the download process uninterrupted, soon!

#44: nick skeen (ns89) (2010-08-01 17:47:22)
This is fine with me, as long as there's some way of limiting the number of simultaneous downloads, and perhaps some sort of scheduler to decide which ones to download at the same time.

#45: Ralph Ulrich (ulenrich) (2010-08-01 22:24:02)
To prevent ending up with a half-done update after a crash, if you use openSUSE Factory then edit /etc/zypp/zypp.conf:
commit.downloadMode = DownloadInAdvance

#47: Wallacy Freitas (wallacyf) (2010-09-06 18:52:33)
I know the installation order of packages is important. But that does not mean having to wait for a package to install before starting to download the next one. Downloading and installation are separate tasks. Package "A" is downloaded and goes into a "queue" for installation; meanwhile package "B" is downloaded, then "C", then "D", etc. Just use the term "cache": packages are downloaded regardless of whether the previous package has already been installed. You could even download multiple packages at once. It would also be possible to install more than one at the same time, respecting the dependencies of course. The biggest problem with downloading one by one is the time spent on HTTP requests just to start the download itself. On top of that, we have to wait for a package to be installed before starting the next request.

#48: Jeremy Thornton (leebrad) (2010-09-16 23:19:45)
Would it be possible to have a switch option depending on the usage? Simply, by default, especially for desktop use: 1. Download, 2. Queue for install (install), 3. Begin downloading the next file.
Call it REAL-TIME mode. Then offer an alternative called BATCH mode, which simply downloads all files needed and installs when the entire download is complete; the downloaded files are kept in the cache until all are verified as downloaded and installed. This option allows an administrator to choose based on the particular setting of the unit. It could look something like:
zypper -rt install whatever (REAL-TIME mode)
zypper -batch install whatever (BATCH mode)
and in YaST, a GUI option to choose at the top before downloading.

#49: Edward Cullen (screamingwithnosound) (2010-12-31 21:49:59)
As zypper already knows about dependencies, surely it would be (relatively) trivial to arrange download and install in such a way that if the connection were cut half-way through, the system would still remain consistent? That is, if A requires B and B requires A, one would download both before installing them. If, however, X required Z, but Z does not require anything, one would download and install Z while X was still downloading.

#50: Enea George (premamotion) (2011-01-23 15:18:59)
Like in the Ubuntu installation...

#51: Glenn Doig (doiggl) (2011-01-25 16:11:12)
Allow an option to select the openSUSE mirror. It may be in the same country as you, or have good bandwidth, which allows a large number of files to be processed in a short period of time. Cheers...

#52: Sławomir Lach (lachu) (2011-02-27 12:52:17)
openSUSE could use static (hard) links and union file systems to make transactions safe. We would do everything on the union filesystem, like decompressing packages. The union file system could be mounted in /var/* with the writable layer in /var/*. After the installation is performed on the union filesystem, we can ask to close some programs (if necessary), make hard links, clean the writable layer and unmount the union filesystem. We can also roll back changes or commit them.

#53: Duncan Mac-Vicar (dmacvicar) (2011-03-10 12:44:20) (reply to #52)
Hard links across filesystems?
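Comment #49's observation, that mutually dependent packages must be downloaded together while independent ones can be installed as they arrive, is the core of the "heaps" idea Duncan raises in comment #34. A toy sketch of that grouping (a hypothetical helper, not zypp's actual solver; the deps mapping is an invented input format): it slices a dependency graph into levels, where each level can be committed as soon as its own downloads finish, leaving the system consistent if a later download fails:

```python
def heaps(deps):
    """Group packages into install 'heaps': level N depends only on levels < N.

    deps maps each package name to the set of packages it requires.
    """
    levels, placed = [], set()
    remaining = {pkg: set(reqs) for pkg, reqs in deps.items()}
    while remaining:
        # Everything whose requirements are already placed forms the next heap.
        level = sorted(p for p, reqs in remaining.items() if reqs <= placed)
        if not level:
            # A requires B and B requires A: no package is installable alone,
            # so (simplified here) ship everything left as one final heap.
            levels.append(sorted(remaining))
            break
        levels.append(level)
        placed.update(level)
        for p in level:
            del remaining[p]
    return levels
```

For comment #49's example, heaps({"Z": set(), "X": {"Z"}}) yields [["Z"], ["X"]]: Z can be installed while X is still downloading, whereas a mutual pair A<->B lands in a single heap that must be fetched in full before either is installed.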
#54: Reinhard Max (rmax) (2011-03-10 12:49:36) (reply to #53)
Please, guys, open a separate feature request if you want to discuss transaction-safe installs. This request has way too many unrelated comments already.

#55: Carlos Lange (cflange) (2011-05-15 22:38:07)
We finally have this as a late addition to 11.4. Hurrah! A victory for openFATE. Thank you to the developers!

#58: Reinhard Max (rmax) (2011-05-16 09:43:30) (reply to #55)
Only some aspects that got added during the discussion here were implemented in 11.4. The original request, which was to speed up network installation by downloading the next packages while already-downloaded ones are being installed, is still open.

#59: Eduard Avetisyan (dich) (2011-09-25 20:59:59) (reply to #58)
Right. The proposal was to DownloadWayBeforeItsNeeded :)

#56: Robert Xu (bravoall1552) (2011-05-15 23:10:14) (reply to #55)
Oh? What option would that be? :O

#57: Carlos Lange (cflange) (2011-05-16 08:04:01) (reply to #56)
Simple zypper up, no options. From the updated Release Notes:
"Default Download Mode Needs More Space
The ZYpp download mode for package installation has been changed: it first downloads all packages, before it starts installing them. This means it now needs more disk space. The traditional behavior was to process one package after the other. If you are short on disk space, switch back to the traditional behavior by setting commit.downloadMode = DownloadAsNeeded in /etc/zypp/zypp.conf."
And indeed, it downloads and applies diffs first, then installs the patches. I am seeing the same behaviour with "zypper install" when I install a list of 50 packages at once as part of my install script: all were first downloaded, then installed.

#60: Roger Luedecke (shadowolf7) (2011-11-12 14:17:05)
I would not want this as default behavior. My computer overheated today right in the middle of downloads. But thankfully the way we do it now kept my system safe.
By all means offer this if you wish, but as an option, not as the default behavior. The only place this behavior should be the default is the NetInstall; and mind you, I don't mean any further than during the installation of the distro.

#61: Christoph Obexer (cobexer) (2011-11-13 11:05:40) (reply to #60)
Overheating computers have hardware problems. We should not punish every openSUSE user because a very few of them have broken hardware. Fast should be the default. If I wanted a slow system, I'd be using Windows.

#63: Eduard Avetisyan (dich) (2011-11-14 15:19:03) (reply to #61)
+1 to Christoph :)

#62: Eduard Avetisyan (dich) (2011-11-14 12:08:30)
Gosh, this poor feature is now 5 years old!

#64: Pete Eby (lewstherintelemon) (2011-12-11 22:25:15)
How about download and install in parallel? Seems like quite a valid request.

#65: Eduard Avetisyan (dich) (2012-07-17 23:33:00)
Alright, this feature now has the highest rating. Let's see for how many more releases it will be rejected. Just to emphasize it again: I was doing an online update some days ago, and one (only one!) of the repositories had a slow link. Unfortunately, the package it needed to install was somewhere in the middle of the list. Instead of doing the rest of the downloads and installs (in PARALLEL), zypper had to wait for ages till this poor package arrived, and THEN spend time again fetching the other packages one by one. Miserable behavior.

#66: fn ln (gwdg37) (2013-06-18 20:09:53)
This is a must-have feature. Actually, all updating should have optional elements letting the user decide how the package selection will be done (several handling apps available), if and where the packages will be saved, whether the actual update will be done on- or offline, etc. I've been forced to use a custom rsync script to periodically bring my local repos up to date, but this requires a lot of unnecessary downloading.
I have 4 computers in the house, each with 2 or more SUSE versions, and doing online updates on them one at a time is simply "not on"! Never was, never will be. The ideal would be a feature that lets the user (via YaST or any other utility) repopulate the local repo with only the wanted packages.

#67: Richard Brown (rbrownsuse) (2017-05-25 11:45:50)
Done, as much as can be done while ensuring a safe system, with the 'heaps' approach.

-- openSUSE Feature: https://features.opensuse.org/120340