[Bug 982665] New: zypper dup upgrade on low disk space fatal issue
http://bugzilla.opensuse.org/show_bug.cgi?id=982665

            Bug ID: 982665
           Summary: zypper dup upgrade on low disk space fatal issue
    Classification: openSUSE
           Product: openSUSE Tumbleweed
           Version: Current
          Hardware: x86-64
                OS: openSUSE 13.2
            Status: NEW
          Severity: Normal
          Priority: P5 - None
         Component: libzypp
          Assignee: zypp-maintainers@forge.provo.novell.com
          Reporter: lurodriguez@suse.com
        QA Contact: qa-bugs@suse.de
          Found By: ---
           Blocker: ---

I'm using openSUSE Factory. Recently I decided to run 'zypper dup'; I was low on disk space, but zypper continued. It seems to have failed while upgrading zypper itself, and this corrupted the libzypp library. After freeing space I had to upgrade a similar box and then scp its libzypp libraries and binaries over to my system. After this, 'zypper dup' still had issues upgrading; in particular, my GNOME environment came to its knees and wanted to abort. After this I could not boot regularly. To fix it, I had to boot into single-user mode (appending S to my kernel parameters), then dhcp, and run 'zypper dup' on the console while on Ethernet.

A few issues then:

1) zypper should compute the needed disk space and not install unless you clear the space. That would be the proactive solution.

2) We need a test plan that includes testing 'zypper dup' on low disk space; have it fail and then purposely corrupt the libzypp library. 'echo FOO > /usr/lib64/libzypp.so.1600' is an example.

3) There is an unknown upgrade issue with low disk space and GNOME where it can leave your system in a state in which, even if you have libzypp and zypper properly installed, 'zypper dup' eventually causes GNOME to crash, and a regular boot no longer works. I cannot be sure what exactly occurred to trigger such an issue, but I figure I had installed circa openSUSE 13.2 material, and I did a zypper dup. So *maybe* this can be reproduced by testing a 'zypper dup' upgrade from openSUSE 13.2 to Factory, failing on low disk space, corrupting libzypp...

I think addressing 1) properly would satisfy my concerns, as 2) and 3) are secondary issues.

-- 
You are receiving this mail because:
You are on the CC list for the bug.
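Item 1) of the report asks for a check that refuses to start when the disk is too full. The following is only a minimal sketch of that idea in Python, not how zypper or libzypp works: the function name `has_room`, the 2 GiB requirement and the 512 MiB reserve are all made up for illustration. It uses nothing beyond the standard `shutil.disk_usage` call.

```python
import shutil


def has_room(path, needed_bytes, reserve_bytes=0):
    """Return True if the filesystem holding `path` has at least
    `needed_bytes` plus `reserve_bytes` of free space.

    Hypothetical helper for illustration; not part of zypper.
    """
    free = shutil.disk_usage(path).free
    return free >= needed_bytes + reserve_bytes


if __name__ == "__main__":
    # Assumed figures: pretend the upgrade needs 2 GiB on "/" and we
    # want to keep 512 MiB spare so the running desktop is not starved.
    needed = 2 * 1024**3
    reserve = 512 * 1024**2
    if has_room("/", needed, reserve):
        print("enough free space, proceeding")
    else:
        print("not enough free space, refusing to start")
```

Keeping a reserve on top of the computed requirement is a design choice aimed at the failure described above, where other processes on the same filesystem also need room while the upgrade runs.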
http://bugzilla.opensuse.org/show_bug.cgi?id=982665
http://bugzilla.opensuse.org/show_bug.cgi?id=982665#c1

--- Comment #1 from Michael Andres <ma@suse.com> ---
Please attach the zypper logfile /var/log/zypper.log (or an older and compressed /var/log/zypper.log-YYYYMMDD.xz) that shows the reported behavior.

To see the execution dates and zypper commands included in a logfile you can install the zypper-log package (available since zypper-1.6.11) and execute:

    zypper-log

To extract the log of a specific command use

    zypper-log pid

See the manual page zypper-log(8) for details on how to read older (rotated) zypper-log files.
http://bugzilla.opensuse.org/show_bug.cgi?id=982665

Michael Andres <ma@suse.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |lurodriguez@suse.com
              Flags|                            |needinfo?(lurodriguez@suse.com)
http://bugzilla.opensuse.org/show_bug.cgi?id=982665
http://bugzilla.opensuse.org/show_bug.cgi?id=982665#c2

--- Comment #2 from Michael Andres <ma@suse.com> ---
We need to see what actually happened. The 'last resort' is rpm itself, which checks whether each package fits on disk before actually installing it. However, if you did not run 'zypper dup' on the console but within a terminal while your desktop crashed, that could explain the heavy damage...
http://bugzilla.opensuse.org/show_bug.cgi?id=982665
http://bugzilla.opensuse.org/show_bug.cgi?id=982665#c3

Luis Rodriguez <lurodriguez@suse.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
              Flags|needinfo?(lurodriguez@suse.com)|

--- Comment #3 from Luis Rodriguez <lurodriguez@suse.com> ---
Created attachment 679306
  --> http://bugzilla.opensuse.org/attachment.cgi?id=679306&action=edit
zypper.log.bz2

(In reply to Michael Andres from comment #1)
> Please attach the zypper logfile /var/log/zypper.log (or an older and compressed /var/log/zypper.log-YYYYMMDD.xz) that shows the reported behavior. To see the execution dates and zypper commands included in a logfile you can install the zypper-log package (available since zypper-1.6.11) and execute:
>     zypper-log
> To extract the log of a specific command use
>     zypper-log pid

This is great! It does show the version jump for sure:

2016-05-23 14:15 7814 1.12.33 zypper search ocaml
2016-05-23 14:39 8698 1.12.33 zypper dup
2016-05-25 10:06 11176 1.13.1 zypper
2016-05-25 10:06 11242 1.13.1 zypper dup
2016-05-25 10:23 846 1.13.1 zypper dup
2016-05-25 10:31 1107 1.13.1 zypper dup
2016-05-25 10:33 1144 1.13.1 zypper dup

The issue should have been in pid 8698, which bumped me from 1.12.33 to 1.13.1. The last package it tried to upgrade, and may have hit a wall with, was:

plymouth-branding-openSUSE-13.3-3.3.noarch.rpm

The tail of the 'zypper-log 8698' output:

2016-05-23 16:20:47 <1> ergon.do-not-panic.com(8698) [zypp++] ExternalProgram.cc(start_program):412 pid 766 launched
2016-05-23 16:20:48 <1> ergon.do-not-panic.com(8698) [zypp++] ExternalProgram.cc(checkStatus):513 Pid 766 successfully completed
2016-05-23 16:20:48 <1> ergon.do-not-panic.com(8698) [Progress++] ProgressData.cc(report):88 {#6263|Installing: texlive-amsfonts-fonts-2015.104.3.04svn29208-23.1.noarch}END
2016-05-23 16:20:48 <1> ergon.do-not-panic.com(8698) [zypp] PathInfo.cc(unlink):659 unlink /var/cache/zypp/packages/repo-oss/suse/noarch/texlive-amsfonts-fonts-2015.104.3.04svn29208-23.1.noarch.rpm
2016-05-23 16:20:48 <1> ergon.do-not-panic.com(8698) [zypp] RpmHeader.cc(readPackage):255 ReferenceCounted(@0x477afd0<=1){0x4784eb0}{plymouth-branding-openSUSE-13.3-3.3} from /var/cache/zypp/packages/repo-oss/suse/noarch/plymouth-branding-openSUSE-13.3-3.3.noarch.rpm
2016-05-23 16:20:48 <1> ergon.do-not-panic.com(8698) [zypp] RpmDb.cc(doInstallPackage):1927 RpmDb::installPackage(/var/cache/zypp/packages/repo-oss/suse/noarch/plymouth-branding-openSUSE-13.3-3.3.noarch.rpm,0x0000000c)
2016-05-23 16:20:48 <1> ergon.do-not-panic.com(8698) [zypp++] ExternalProgram.cc(start_program):249 Executing 'rpm' '--root' '/' '--dbpath' '/var/lib/rpm' '-U' '--percent' '--noglob' '--force' '--nodeps' '--' '/var/cache/zypp/packages/repo-oss/suse/noarch/plymouth-branding-openSUSE-13.3-3.3.noarch.rpm'
2016-05-23 16:20:48 <1> ergon.do-not-panic.com(8698) [zypp++] ExternalProgram.cc(start_program):412 pid 788 launched

I've also attached the full zypper.log compressed via bzip2.
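The version jump in the history listing of comment #3 can also be spotted mechanically. The sketch below assumes the column layout seen in that listing (date, time, pid, zypper version, command); that layout is inferred from the quoted output only, not from the zypper-log documentation, and the function names are hypothetical.

```python
# History lines copied from comment #3.
HISTORY = """\
2016-05-23 14:15 7814 1.12.33 zypper search ocaml
2016-05-23 14:39 8698 1.12.33 zypper dup
2016-05-25 10:06 11176 1.13.1 zypper
2016-05-25 10:06 11242 1.13.1 zypper dup
2016-05-25 10:23 846 1.13.1 zypper dup
2016-05-25 10:31 1107 1.13.1 zypper dup
2016-05-25 10:33 1144 1.13.1 zypper dup
"""


def parse_history(text):
    """Split each line into date, time, pid, version and command.

    Assumes five whitespace-separated columns; shorter lines are skipped.
    """
    entries = []
    for line in text.splitlines():
        parts = line.split(None, 4)
        if len(parts) < 5:
            continue
        date, time, pid, version, command = parts
        entries.append({
            "when": f"{date} {time}",
            "pid": int(pid),
            "version": version,
            "command": command,
        })
    return entries


def version_jumps(entries):
    """Return (before, after) pairs of consecutive entries whose
    zypper version differs."""
    return [(a, b) for a, b in zip(entries, entries[1:])
            if a["version"] != b["version"]]


if __name__ == "__main__":
    for before, after in version_jumps(parse_history(HISTORY)):
        print(f"pid {before['pid']} ran {before['version']}; "
              f"next run (pid {after['pid']}) was {after['version']}")
```

The last run on the old version is the one that performed the upgrade, which is why pid 8698 is the interesting log to extract.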
http://bugzilla.opensuse.org/show_bug.cgi?id=982665
http://bugzilla.opensuse.org/show_bug.cgi?id=982665#c4

Michael Andres <ma@suse.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ma@suse.com
          Component|libzypp                     |GNOME
           Assignee|zypp-maintainers@forge.provo.novell.com|bnc-team-gnome@forge.provo.novell.com

--- Comment #4 from Michael Andres <ma@suse.com> ---
Unfortunately the log does not contain hints. The dup had already processed 2100 packages out of 4100 total and there are no hints of problems. As the log shows, zypp was waiting for rpm, which was installing plymouth-branding-openSUSE-13.3-3.3.noarch.rpm, to return when it got killed. RPM's space check had not complained so far, so it also assumed the package could be unpacked to disk.

By now I see neither problem nor solution within zypp scope.

I'll forward the issue to the GNOME maintainers. Maybe they can help to investigate what actually killed your desktop and thereby interrupted the dup.

@corrupt libzypp library
Are you certain the libzypp.so.1600 file on disk was actually corrupted? The log says that libzypp had already been updated, while zypper was still pending. I assume that this is the reason why (the old) zypper was not working after the crash. A corrupted file, however, could also indicate HW or file system problems.

@live dup
Especially if you perform bigger 'live dup's, it's IMO a good idea to use the console and not the desktop, as the console is less vulnerable.

You should also try to 'zypper up zypper' before doing the 'dup'. This has the benefit that the new distro's zypper performs the upgrade, and not the old one.
http://bugzilla.opensuse.org/show_bug.cgi?id=982665
http://bugzilla.opensuse.org/show_bug.cgi?id=982665#c5

--- Comment #5 from Luis Rodriguez <lurodriguez@suse.com> ---
(In reply to Michael Andres from comment #4)
> Unfortunately the log does not contain hints. The dup had already processed 2100 packages out of 4100 total and there are no hints of problems. As the log shows, zypp was waiting for rpm, which was installing plymouth-branding-openSUSE-13.3-3.3.noarch.rpm, to return when it got killed. RPM's space check had not complained so far, so it also assumed the package could be unpacked to disk.
>
> By now I see neither problem nor solution within zypp scope.

OK, so if we tried to install a package without enough disk space, then it was the fault of rpm. Is that right? Does this mean rpm can provide some disk usage information to zypper per package? If so, can zypper compute the total amount of disk space needed for an upgrade before starting it, or was that already done?

> I'll forward the issue to the GNOME maintainers. Maybe they can help to investigate what actually killed your desktop and thereby interrupted the dup.

OK, thanks. I'm less concerned about myself; I'm more concerned about a system upgrade on SLES hitting the same issue.

> @corrupt libzypp library
> Are you certain the libzypp.so.1600 file on disk was actually corrupted?

Well, I am 100% sure it was either libzypp.so.1600 or the zypper binary. zypper wouldn't run, as it complained about the library. After replacing the library it still would not run, aborting somehow; I forget exactly how. I then replaced the binary and it worked.

> The log says that libzypp had already been updated, while zypper was still pending. I assume that this is the reason why (the old) zypper was not working after the crash. A corrupted file, however, could also indicate HW or file system problems.

True. I do not use btrfs but ext4, and that's pretty rock solid. I haven't had a single hardware failure on my system at all. This would be a first, but I also cannot rule it out, as anything is possible, even cosmic rays.

> @live dup
> Especially if you perform bigger 'live dup's, it's IMO a good idea to use the console and not the desktop, as the console is less vulnerable.

This is clear to me now, given that otherwise it was not able to complete. Enhancing the documentation for 'zypper dup' to cover this would be good. Likewise, detecting this situation, if possible, might be good as well.

> You should also try to 'zypper up zypper' before doing the 'dup'. This has the benefit that the new distro's zypper performs the upgrade, and not the old one.

If that's a generally good idea, can zypper make sure to do this first in a 'zypper dup'? Then it's just part of the packaging decision.

Luis
http://bugzilla.opensuse.org/show_bug.cgi?id=982665
http://bugzilla.opensuse.org/show_bug.cgi?id=982665#c6

--- Comment #6 from Michael Andres <ma@suse.com> ---
(In reply to Luis Rodriguez from comment #5)
> > By now I see neither problem nor solution within zypp scope.
>
> OK, so if we tried to install a package without enough disk space, then it was the fault of rpm. Is that right?

No. But RPM itself calculates whether the currently processed package fits on disk before actually trying to unpack it. The point is that I see no 'no space left on device' or similar error that made zypp die. So if the GNOME desktop actually died because there was no or too little disk space left, then it does not matter whether zypp or any other process ate up the space. You would always be in danger. That's why I'd like the GNOME maintainers to investigate it.

> Does this mean rpm can provide some disk usage information to zypper per package?

No, rpm simply refuses to install the package. Zypper would catch this error and ask you to abort/retry/ignore. This is just a final check; we don't actually want to get into this situation.

> If so, can zypper compute the total amount of disk space needed for an upgrade before starting it, or was that already done?

To compute in advance, disk usage information per package needs to be part of the repo metadata, so we can compute before all packages are downloaded. We recently enhanced our rpmmd repos to provide more detailed (per partition) disk usage information (susetags/DVD already do). Zypper is not yet adapted to this, so further enhancing the space estimation is possible and it's already on the todo list.

Nevertheless the desktop should not crash.
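The per-partition estimate described in comment #6 amounts to summing each package's disk usage per mount point and comparing the totals with the free space there. The sketch below only illustrates that arithmetic. The data layout (a mapping from package name to per-mount-point byte counts) and all names and numbers are invented for the example and do not reflect the actual rpmmd or susetags metadata format or any libzypp API.

```python
from collections import defaultdict


def total_per_mount(packages):
    """Sum the per-mount-point byte counts over all packages.

    `packages` maps a package name to {mount_point: bytes_needed}.
    This shape is an assumption made for illustration.
    """
    totals = defaultdict(int)
    for usage in packages.values():
        for mount, needed in usage.items():
            totals[mount] += needed
    return dict(totals)


def short_mounts(totals, free):
    """Return {mount_point: missing_bytes} for every mount point whose
    required total exceeds the free space listed in `free`."""
    return {mount: needed - free.get(mount, 0)
            for mount, needed in totals.items()
            if needed > free.get(mount, 0)}


if __name__ == "__main__":
    # Hypothetical packages and sizes, in bytes.
    packages = {
        "pkg-a": {"/usr": 100, "/var": 10},
        "pkg-b": {"/usr": 50, "/boot": 30},
    }
    free = {"/usr": 120, "/var": 100, "/boot": 30}
    print(short_mounts(total_per_mount(packages), free))
```

In a real check the `free` mapping would come from the running system (for example via `shutil.disk_usage` per mount point); it is passed in here so the arithmetic can be followed with fixed numbers.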
http://bugzilla.opensuse.org/show_bug.cgi?id=982665

Sebastian Wagner <sebix+novell.com@sebix.at> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |sebix+novell.com@sebix.at