Mailinglist Archive: opensuse-features (25 mails)

[New: openFATE 318807] practical steps to parallel downloading of updates
Feature added by: Jason Newton (jenewton)

Feature #318807, revision 1
Title: practical steps to parallel downloading of updates

openSUSE Distribution: Unconfirmed
Priority
Requester: Desirable

Requested by: Jason Newton (jenewton)
Partner organization: openSUSE.org

Description:
Yet another night of updates, and I watched 50-75 percent of that time being
squandered. This has gone on year after year (I have been using SuSE since
2008, after Ubuntu, Gentoo, and Slackware before it, in reverse order of
usage). With typically over 700 updates taking about 60 minutes, I think we can
establish that updates on openSUSE are slow, even given infinite bandwidth and
SSDs. I've cracked open the zypper sources more than once with the intention of
starting to rectify this, but I cannot find the time to accomplish it.

The problem is that the downloading done by zypper needs to be made parallel.
The reason is to overlap transfers so that downloads keep going: there is a
significant setup time to start an HTTP-based download, and it adds up over 700
packages in a typical development installation. It gets worse when patches come
up and shift execution to disk, throwing away precious internet bandwidth
timeslots. This is the case with zypper (d)up --download=in-advance.

The good news is that this doesn't have to involve threads. I have looked
before at adding support of this nature to zypper, and while the codebase is
all nice and OOPed, it is very tightly coupled to single-task-at-a-time
execution. It is, however, using curl if I recall correctly. Parallel
downloading with curl is very easy if you are already using curl, and it can be
done without threads.
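
To illustrate the "parallel without threads" idea, here is a minimal sketch. It
is not zypper code and it does not use curl: it swaps in Python's asyncio event
loop to show the same pattern, several transfers in flight on one thread with a
cap on how many run at once. The names download_all, fetch, fake_fetch, and
max_parallel are all hypothetical, and fake_fetch only models setup latency
with a sleep.

```python
import asyncio
from typing import Awaitable, Callable, Iterable

# Hypothetical fetcher type: takes a URL and returns the downloaded bytes.
Fetcher = Callable[[str], Awaitable[bytes]]


async def download_all(urls: Iterable[str], fetch: Fetcher,
                       max_parallel: int = 4) -> list[bytes]:
    """Run up to max_parallel transfers at once on a single thread."""
    limit = asyncio.Semaphore(max_parallel)

    async def one(url: str) -> bytes:
        async with limit:  # cap the number of transfers in flight
            return await fetch(url)

    # gather() returns results in the same order as the input URLs.
    return await asyncio.gather(*(one(u) for u in urls))


async def fake_fetch(url: str) -> bytes:
    """Stand-in for a real transfer; the sleep models connection setup time."""
    await asyncio.sleep(0.01)
    return url.encode()


if __name__ == "__main__":
    demo_urls = [f"http://example.invalid/pkg{i}.rpm" for i in range(20)]
    data = asyncio.run(download_all(demo_urls, fake_fetch, max_parallel=5))
    print(len(data), "packages fetched")
```

With a real fetcher, the waiting for each connection overlaps with the other
transfers instead of being paid once per package, one after another.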

Another way to do this without massively complicating the code base is to
compute the plan of what to do, as happens currently, and then either enter a
special "sync mode" (like rsync for updates) or export a shell script or text
file of links, so that a user can download the updates into the cache with an
external tool designed for efficient batched downloading with pipelining and
multiple outbound requests. Either approach isolates the parallel tasks in a
local module or hands the problem to an external tool. After the download tool
or "sync mode" has run, zypper just sees a lot of cache hits and proceeds as
normal.
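
As a rough sketch of the "export a text file of links" variant, the function
below writes the kind of input file that aria2c reads with -i: one URI line
followed by indented option lines (dir= and out=). The function name
aria2_input and the (url, target_dir) pairs are hypothetical. How zypper would
export its download plan, and where each file must land in the zypper cache,
are assumptions that would need checking against a real system.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit


def aria2_input(entries: list[tuple[str, str]]) -> str:
    """Build aria2c input-file text from (url, target_dir) pairs.

    target_dir is assumed to be the cache directory in which zypper would
    later look for the package; this sketch does not compute it.
    """
    lines = []
    for url, target_dir in entries:
        name = PurePosixPath(urlsplit(url).path).name  # file name from the URL
        lines.append(url)
        lines.append(f"  dir={target_dir}")  # option lines start with whitespace
        lines.append(f"  out={name}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    plan = [("http://example.invalid/repo/x86_64/foo-1.0.rpm",
             "/tmp/cache/repo/x86_64")]
    print(aria2_input(plan), end="")
```

The resulting file could then be handed to something like
"aria2c -i updates.txt -j 8", where -j sets the number of concurrent downloads.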

The epitome is having a separate download thread (pool) and connecting it all
up over something like DDS or ZMQ. You can build more of this in-house if
you're into that. As dependencies are resolved in the download threads, the
update is installed in the main thread. This way downloading and installation
overlap, which is the fastest that updates can be installed.
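
The overlap of downloading and installing can be sketched as below. This is
only an illustration under stated assumptions: it uses Python's in-process
thread pool rather than DDS or ZMQ, the update, download, and install names are
hypothetical, and it assumes the package list is already in an order that is
safe to install.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Sequence


def update(packages: Sequence[str],
           download: Callable[[str], str],
           install: Callable[[str], None],
           workers: int = 4) -> None:
    """Download in a thread pool while installing, in order, on this thread."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Queue every download up front; the pool runs `workers` at a time.
        pending = [pool.submit(download, name) for name in packages]
        for future in pending:
            # Wait only for this package; later downloads keep running in the
            # background while install() works on the calling thread.
            install(future.result())


if __name__ == "__main__":
    # Stand-in callables: "downloading" maps a name to a local file name.
    update(["a", "b", "c"], lambda n: n + ".rpm", print, workers=2)
```

Installation stays single-threaded and ordered, so only the download side has
to be made safe for concurrent use.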

I think the last bit should be put on the roadmap, but really this needs to be
addressed now rather than kicked down the road forever. So I urge the SUSE
zypper maintainers to provide a practical way for power users and SUSE admins
to do this basically now, if they were to add a factory zypper repo. If this
means "sync mode" or script/batch file generation for something like aria2c,
that is really not so bad.

Business case (Partner benefit):
openSUSE.org: Resolving the problem in the right way will increase efficiency
on SUSE's servers and reduce the time of the update process for both SUSE and
the user (read: money and time). Everybody wins, and SuSE admins, power users,
and developers get to spend more time on other tasks.


--
openSUSE Feature:
https://features.opensuse.org/318807
