[Bug 411409] New: update publishing to download.opensuse.org is not atomic
https://bugzilla.novell.com/show_bug.cgi?id=411409

User poeml@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=411409#c405932

Summary: update publishing to download.opensuse.org is not atomic
Product: openSUSE.org
Version: unspecified
Platform: All
OS/Version: openSUSE 11.0
Status: ASSIGNED
Severity: Major
Priority: P5 - None
Component: Download Infrastructure
AssignedTo: poeml@novell.com
ReportedBy: poeml@novell.com
QAContact: adrian@novell.com
CC: gs@novell.com, mls@novell.com, poeml@novell.com, ro@novell.com, abittner@stud.fh-heilbronn.de, kmachalkova@novell.com
Depends on: 405932
Found By: Customer

+++ This bug was initially created as a clone of Bug #405932 +++

See there for details about the problem (in short, a considerable time window with a broken update repository during syncs) and my last comment:

Indeed, metadata went from the stage server to download.opensuse.org earlier than the packages. And an X update was transferred in between, which led to a respectable interval of ~2 minutes between availability of metadata and availability of yast2-ncurses-pkg-2.16.13_2.16.14-4.1_0.1.i586.delta.rpm. This surely hits quite a few people.

I am working on a fix.

I also want to solve a similar (but different) issue that can arise. Between a client's metadata download and its package download (time spent parsing metadata and on user interaction), metadata and packages can be replaced on the server. This does *not* happen in the update tree though; there are no deletions. But it affects the build service and Factory.

--
Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
https://bugzilla.novell.com/show_bug.cgi?id=411409
User poeml@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=411409#c1
--- Comment #1 from Peter Poeml
https://bugzilla.novell.com/show_bug.cgi?id=411409
User abittner@stud.fh-heilbronn.de added comment
https://bugzilla.novell.com/show_bug.cgi?id=411409#c2
--- Comment #2 from andreas bittner
https://bugzilla.novell.com/show_bug.cgi?id=411409
User poeml@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=411409#c3
--- Comment #3 from Peter Poeml
> I'm not an HTTP expert, but wasn't there an HTTP reply code that said something like "server busy for now", or "please revisit at a later time"? Meaning to delay the HTTP fetching of the metadata file.

The above fix pretty much avoids the need for this from the start. I am rather sure that the rest is negligible. I'd rather spend time on fixing the bigger problems with Factory and BS next. :)

> If I understand the whole mirror concept properly, this metadata comes from only one source: your openSUSE download server. Everybody gets their metadata from there and only gets redirected (if applicable) file-wise to mirrors and such.

Yes, that's right.

> So during the time period of updating the metadata file on this one master server, the httpd itself could actually block the delivery of the metadata file currently being updated with an HTTP reply telling the visiting client (yast, zypper, whatever) to revisit in 10 seconds or half a minute or whatever interval would fit.

Yes, that would be possible. I suggested something similar in the past for Factory updates. Since we pretty much control the client, this could be implemented. However, for the update tree we also need to take into account that clients other than libzypp access the tree, so it shouldn't disturb them. People sometimes have self-written scripts that check for updates and the like. A 5xx code as such wouldn't be handled by them (nor would it be by libzypp in its present form). For Factory it would be fine, I assume.

> Something like that? Maybe that would be the safest way to handle the metadata updates.
> 503 Service Unavailable? And maybe implement some delay to retry/refetch the metadata when a 503 is given.
I believe this is best handled at the application level. After all, libzypp might talk to an entirely separate server (a mirror, a local mirror, or a proxy cache) where it could encounter a similar situation, without provision for return codes with special meaning (if we did that). If libzypp handles it at the application level (in an "oops, something's wrong, let's just check this again" manner), it could work around these edge cases under various circumstances.
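The "let's just check this again" idea above can be sketched roughly as follows. This is only an illustration in Python, not libzypp code: the function name, the `FileNotFoundOnServer` exception, and the callback parameters are all hypothetical, and the retry count and delay are arbitrary assumptions.

```python
import time
from typing import Callable


class FileNotFoundOnServer(Exception):
    """Hypothetical error a fetcher raises when the server lacks a file."""


def fetch_package_with_refresh(
    fetch_metadata: Callable[[], dict],
    fetch_package: Callable[[str], bytes],
    pick_path: Callable[[dict], str],
    retries: int = 2,
    delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bytes:
    """Download a package; on a missing file, re-read metadata and retry.

    Works the same against the master server, a mirror, or a proxy cache,
    because it relies on no special return code, only on "file is missing".
    """
    for attempt in range(retries + 1):
        # Always work from fresh metadata: the copy from a previous
        # attempt may reference files that were replaced or not yet synced.
        metadata = fetch_metadata()
        path = pick_path(metadata)
        try:
            return fetch_package(path)
        except FileNotFoundOnServer:
            if attempt == retries:
                raise  # give up and let the caller ask the user
            sleep(delay)  # give the server time to finish its sync
    raise AssertionError("unreachable")
```

Only after the retries are exhausted would the user see an abort/retry/ignore prompt.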
https://bugzilla.novell.com/show_bug.cgi?id=411409
User poeml@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=411409#c4
--- Comment #4 from Peter Poeml
Downloading: aide-0.13.1-20.1_20.2.i586.delta.rpm [error]
File './rpm/i586/aide-0.13.1-20.1_20.2.i586.delta.rpm' not found on medium 'http://download.opensuse.org/update/11.0/'.
Abort, retry, ignore? [A/r/i]:
(Kindly reported by Christian Deckelmann.)

download.o.o pulled the metadata at 10:31, but no packages. The metadata was downloaded by clients and contained references to the aide packages.

I believe this is due to a race: The first sync (packages only) ran while putpatch had not started (and the packages were not there yet); and the second sync (metadata only) ran when putpatch was completely done. Thus, download.o.o pulled only the metadata. The packages came with the next sync then, 15 minutes later.

Since the time that putpatch needs to run is hardly predictable, some locking would be best to avoid this race. It probably doesn't occur often, I assume, but as we see it can happen.

Rudi, do you think you could touch a lockfile in opensuse-ftp/pub/opensuse/update/ before putpatch starts and remove it after the run?
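The lockfile proposal could look roughly like the sketch below. It is an assumption-laden illustration in Python: the lockfile name `.putpatch.lock` and both helper functions are hypothetical, and it assumes the sync side can see the same directory tree as the publishing side.

```python
import contextlib
from pathlib import Path
from typing import Iterator

# Hypothetical lockfile name; the thread does not specify one.
LOCK_NAME = ".putpatch.lock"


@contextlib.contextmanager
def publish_lock(tree: Path) -> Iterator[Path]:
    """Touch a lockfile before publishing and remove it afterwards."""
    lock = tree / LOCK_NAME
    lock.touch()
    try:
        yield lock
    finally:
        # Remove the lock even if the publishing step failed, so that
        # a crashed run does not block syncs forever.
        lock.unlink()


def safe_to_sync(tree: Path) -> bool:
    """A sync should be skipped while a publishing run holds the lock."""
    return not (tree / LOCK_NAME).exists()
```

The publishing side would wrap the putpatch run in `with publish_lock(tree): ...`, and each sync pass would check `safe_to_sync(tree)` first and simply wait for the next interval if it returns false. This narrows the window but does not close it entirely: a sync that started just before the lock was touched would still see a half-published tree.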
https://bugzilla.novell.com/show_bug.cgi?id=411409
User poeml@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=411409#c5
--- Comment #5 from Peter Poeml
https://bugzilla.novell.com/show_bug.cgi?id=411409
Peter Poeml
https://bugzilla.novell.com/show_bug.cgi?id=411409
User ro@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=411409#c6
--- Comment #6 from Ruediger Oertel
> The first sync (packages only) ran while putpatch had not started (and the packages were not there yet); and the second sync (metadata only) ran when putpatch was completely done. Thus, download.o.o pulled only the metadata.

Are you saying there is a double sync running, where the second one only looks at metadata?