[opensuse-buildservice] a broad appeal to fix openSUSE's package-management's repository/redirector failure & recovery mechanism
In my experience, package management's routine failure to find/use a 'healthy' repository is a long-standing production problem. I'm requesting a fix.

Here's a summary:

I run openSUSE 13.1 on numerous machines

    lsb_release -rd
        Description:    openSUSE 13.1 (Bottle) (x86_64)
        Release:        13.1

Current count is ~200. The machines are installed at multiple locations around the globe. They're connected to the 'net via a variety of different network providers. Some of the machines are directly connected to the 'net, some are behind LAN routers, switches & firewalls.

Package management for all of the machines is handled exclusively via the zypper CLI. Each machine has a common core of repositories defined in /etc/zypp/repos.d, and frequently has a number of additional @openSUSE dev (!'home') repos defined.

In ALL cases, the default repo sets have been installed with the meta-director as baseurl,

    baseurl=http://download.opensuse.org/...

Regular package maintenance consists of

    zypper clean --all
    zypper (d)up

The maintenance frequency is nominally 1/wk, often 1/dy, and in devs' cases, often more frequent.

In virtually ALL cases, the update process regularly fails @ retrieving/refreshing the repos' (meta)data. For example, a typical result is:

    ...
    Checking whether to refresh metadata for KDE4-Extra-Unstable
    Retrieving: repomd.xml ...........................................[error]
    File '/repodata/repomd.xml' not found on medium 'http://download.opensuse.org/repositories/KDE:/Unstable:/Extra/KDE_Current_o...'
    Abort, retry, ignore? [a/r/i/? shows all options] (a):
    ...

This occurs occasionally for any/all repos, whether the standard distribution repos (security, update, etc), core DM (e.g. KDE*) additional repos, or the more 'esoteric' !home OBS-hosted repos (e.g., security:netfilter).

The failure rate for overall update/upgrade process attempts is, very roughly, ~15%.

The error is NON-recoverable. 'Abort' & 'retry' *never* work.
Chats @ IRC re: the issue typically result in the same '(non)responses': "wait", "works for me", "prove it", etc.

The ONLY solution(s) that work are:

(1) wait some random amount of time -- typically hours, occasionally days -- until the system magically heals itself,

(2) visit the download.opensuse.org link for the repo, click 'details' for a target page, identify a specific working/available repo for the package(s) of interest, and manually edit baseurl= for the problematic repo.

Neither is tenable for a reliable operating environment. It is simply unmanageable, whether in a single, local environment or a widely-distributed one.

(2) is further confounded by the fact that, at any given time, a previously-working, manually-selected repo may, itself, fail, requiring -- yet again -- another manual intervention.

Within the scope of our environment, no other distro's package management system has anywhere near the failure rate demonstrated here. (We've ~600+ other machines running a mix of CentOS, Fedora, Debian & Ubuntu.)

This has been occurring for literally years, across multiple openSUSE versions, and remains unaddressed.

I know, without any doubt, that others experience similar/frequent failures -- it's been a frequent discussion with our partners, as well as in openSUSE* IRC channels.

This needs a fix.

As to what, specifically, that fix can/should be -- I'm unclear. If a solution already exists, I'm unaware of it.

One idea -- a fallback mechanism *within* a repo's definition would be useful.

For example, allow a given repo's def'n to have multiple, numbered baseurls

    baseurl1=http://direct/url/to/specific/site/1/...
    baseurl2=http://direct/url/to/specific/site/2/...
    baseurl3=http://download.opensuse.org/...
    ...
    baseurlN=http://direct/url/to/specific/site/3/...

and add a function to zypper so that, for each repo, the baseurls would be tried in order on any given failure.
By adding, e.g., a

    failcount2abort=X

to either/both a given repo's def'n, or /etc/zypp(er).conf, the overall process could be terminated if there were "X" # of subsequent fails, indicating a likely systemic problem requiring further intervention.

I'd appreciate hearing from "those responsible for keeping the redirector & repos working" re:

    * acknowledgement, or refusal thereof, of the failure issue
    * clarification as to why it occurs in the first place
    * ideas/suggestions as to what can/should be done to fix it

Thanks.

Grant

--
To unsubscribe, e-mail: opensuse-buildservice+unsubscribe@opensuse.org
To contact the owner, e-mail: opensuse-buildservice+owner@opensuse.org
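The fallback idea in the message above can be sketched in a few lines of Python. This is only an illustration of the proposed behaviour, not anything zypper or libzypp implements: the function names, the `SystemicFailure` exception and the `fetch` callback are all hypothetical, and "X # of subsequent fails" is read here as X repositories failing in a row.

```python
from typing import Callable, Dict, List, Optional


class SystemicFailure(RuntimeError):
    """Hypothetical error: too many repositories failed in a row."""


def first_working_url(urls: List[str], fetch: Callable[[str], bool]) -> Optional[str]:
    """Try the base URLs in the order given; return the first one that works."""
    for url in urls:
        try:
            if fetch(url):
                return url
        except OSError:
            # A network error counts as a failed attempt; move to the next URL.
            continue
    return None


def refresh_all(repos: Dict[str, List[str]],
                fetch: Callable[[str], bool],
                failcount2abort: int) -> Dict[str, str]:
    """Pick a working base URL per repo, aborting after N consecutive failures.

    `repos` maps a repo name to its ordered baseurl1..baseurlN list.
    `fetch` is a caller-supplied check (for example, "can repomd.xml be read?").
    """
    chosen: Dict[str, str] = {}
    consecutive_failures = 0
    for name, urls in repos.items():
        url = first_working_url(urls, fetch)
        if url is None:
            consecutive_failures += 1
            if consecutive_failures >= failcount2abort:
                raise SystemicFailure(
                    f"{consecutive_failures} repositories failed in a row")
            continue
        consecutive_failures = 0  # a success resets the counter
        chosen[name] = url
    return chosen
```

Passing the check in as `fetch` keeps the ordering and abort logic separate from the network code, so the same loop could be driven by a real HTTP probe or by a stub.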
On Wednesday 01 October 2014 16:10:32 grantksupport@operamail.com wrote:
One idea -- a fallback mechanism *within* a repos' definition would be useful
For example, allow in a given repo's def'n, having multiple, numbered baseurls
baseurl1=http://direct/url/to/specific/site/1/...
baseurl2=http://direct/url/to/specific/site/2/...
baseurl3=http://download.opensuse.org/...
...
baseurlN=http://direct/url/to/specific/site/3/...
and add function to zypper so that for each repo, the baseurls would be tried in order for any given failure.
By adding, e.g., a
failcount2abort=X
to either/both a given repo's defn, or /etc/zypp(er).conf, the overall process could be terminated if there were "X" # of subsequent fails, indicating a likely systemic problem requiring further intervention.
This should not be too hard to achieve. Since libzypp-8.8.0 (openSUSE 11.4), using a 'mirrorlist' URL instead of 'baseurl' within the .repo file is already supported:

    #baseurl=http://direct/url/to/specific/site
    mirrorlist=url://server/path/to/mirrorlist.file

Defining multiple URLs for a repo this way is possible. I can't remember any feedback related to mirrorlist, so this feature either works or isn't used. Probably the latter, as a quick check reveals that a local file can't be used as mirrorlist (file:/localpath/to/mirrorlist.file) and zypper does not switch non-interactively between the URLs on error.

I filed a bug report to track this. [https://bugzilla.suse.com/show_bug.cgi?id=899510]

--
cu,
    Michael Andres

+------------------------------------------------------------------+
Key fingerprint = 2DFA 5D73 18B1 E7EF A862 27AC 3FB8 9E3A 27C6 B0E4
+------------------------------------------------------------------+
Michael Andres   SUSE LINUX Products GmbH, Development,   ma@suse.de
GF:Jeff Hawn,Jennifer Guild,Felix Imendörffer, HRB16746(AG Nürnberg)
Maxfeldstrasse 5, D-90409 Nuernberg, Germany, ++49 (0)911 - 740 53-0
+------------------------------------------------------------------+