[Bug 891168] New: Device names are changing dynamically, making the system disappear.
https://bugzilla.novell.com/show_bug.cgi?id=891168
https://bugzilla.novell.com/show_bug.cgi?id=891168#c0

           Summary: Device names are changing dynamically, making the system disappear.
    Classification: openSUSE
           Product: openSUSE Factory
           Version: 201408*
          Platform: x86-64
        OS/Version: Other
            Status: NEW
          Severity: Critical
          Priority: P5 - None
         Component: Basesystem
        AssignedTo: bnc-team-screening@forge.provo.novell.com
        ReportedBy: nrickert@ameritech.net
         QAContact: qa-bugs@suse.de
          Found By: ---
           Blocker: ---

User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Firefox/31.0

Today, I installed snapshot 20140807 using live media. My install was to an external drive.

When I boot from the internal drive, the devices are:

/dev/sda  internal drive
/dev/sdb through /dev/sde  memory card reader slots (usually no memory installed)
/dev/sdf  first USB drive
/dev/sdg  second USB drive

During install, I booted from live media (live KDE 20140807, and the live Gnome described in the factory mailing list message "http://lists.opensuse.org/opensuse-factory/2014-08/msg00048.html"). The install media was on a USB drive, so the devices were:

/dev/sda  internal hard drive
/dev/sdb  installation USB live media
/dev/sdc through /dev/sdf  memory card reader slots
/dev/sdg  the 80G external drive, to which I was installing

On two install attempts (first with that Gnome media, second with KDE), the install seemed to hang at about the time that it had finished copying the root file system. With the Gnome install, I saw popups about an inserted drive, and the system was in a hard hang. With the KDE install, there was eventually a message about copy failure.

When I checked, "/dev/sdg" no longer existed. The device now appeared to be "/dev/sdh". The system still showed "/dev/sdg1" and "/dev/sdg3" as mounted, but the device files did not exist. Power off was the only way I could shut down.

A third try got me installed with the live KDE.
After reboot, the devices were:

/dev/sda  internal hard drive
/dev/sdb  external drive to which I had installed
/dev/sdc through /dev/sdf  memory card slots

After the final stages of install (which had errors), I started having "command not found" errors. It was very difficult to investigate, since the basic shell commands were not available. I did:

  cd /dev
  echo sd?

And that showed:

  sda sdc sdd sde sdf sdg

So it looks as if what was "/dev/sdb" at boot had become "/dev/sdg".

I powered off and rebooted. This happened twice more.

---------

My previous install on this external drive was 20140721, installed with the 64-bit DVD. I did not have any problems with that.

It is possible that my external drive is suddenly failing, but I don't see any other symptoms of disk failure. I guess I could install 13.1 or 20140721 and see if the problem still occurs.

Reproducible: Always

Steps to Reproduce:
1.
2.
3.

--
Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
https://bugzilla.novell.com/show_bug.cgi?id=891168#c1

--- Comment #1 from Neil Rickert <nrickert@ameritech.net> 2014-08-09 17:19:11 UTC ---

First, a correction. My previous install on this external drive was 20140728 (not 20140721).

This morning, I booted up the system. Within 30 minutes, I had lost access to the boot device. "/dev/sdb" no longer existed, but there was a "/dev/sdg". At that stage, I had no commands available other than shell builtins, so it was hard to investigate further.

Subsequently, I reinstalled 20140728, and that has been running fine. The install went without a problem. After booting into the system, the system disk "/dev/sdb" (the external drive) remains available.

So I'm pretty sure that the problem is due to a change between 20140728 and 20140807.
https://bugzilla.novell.com/show_bug.cgi?id=891168#c2

Bernhard Wiedemann <bwiedemann@suse.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |bwiedemann@suse.com,
                   |                            |coolo@suse.com

--- Comment #2 from Bernhard Wiedemann <bwiedemann@suse.com> 2014-08-13 16:19:45 CEST ---

Sounds a bit like some kernel/udev/udisk screwup. I wonder what updates we had in that area...
https://bugzilla.novell.com/show_bug.cgi?id=891168#c3

--- Comment #3 from Neil Rickert <nrickert@ameritech.net> 2014-08-13 15:13:51 UTC ---

I've kept the 20140728 install on the external drive for the present, and I'm holding onto the install media for it (I used the DVD installer). So I can compare.

20140728 used kernel-desktop-3.16.rc6-1.2. The repos currently have 3.16.rc7-1.2. I doubt that the kernel is the problem. I already updated to rc7 a few days ago, and the system behaved properly that way. I figured it was a safe update, because I could boot the old kernel if that became necessary.

20140728 used udisks2-2.1.3-1.3, and the repos currently contain the same version. So that is also unlikely.

20140728 has udev-210-19.1, while the repos currently have 210-22.1. My bet would be on the udev change as the cause of the problem. I wonder what changed there?

If you want, as a test, I can update udev to the repo version and see if it breaks. As long as I have my install media, I should be able to go back.
https://bugzilla.novell.com/show_bug.cgi?id=891168#c4

Bernhard Wiedemann <bwiedemann@suse.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|bnc-team-screening@forge.pr |systemd-maintainers@suse.de
                   |ovo.novell.com              |

--- Comment #4 from Bernhard Wiedemann <bwiedemann@suse.com> 2014-08-13 17:23:51 CEST ---

There were some changes recently:
https://build.opensuse.org/package/revisions/Base:System/systemd

You should also be able to see them with:

  rpm -qp --changelog /path/to/udev.rpm
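To narrow down what changed between udev-210-19.1 and 210-22.1, the two changelogs could be compared directly. The following is only a minimal sketch, not something posted in this bug: the function name and the dump file names are hypothetical, and it assumes each changelog was first saved to a text file with the rpm command above.

```shell
#!/bin/sh
# Sketch: show changelog lines that appear only in the newer package.
# Assumed preparation (file names are hypothetical):
#   rpm -qp --changelog /path/to/old-udev.rpm > old-changelog.txt
#   rpm -qp --changelog /path/to/new-udev.rpm > new-changelog.txt

new_changelog_lines() {
    old_dump=$1
    new_dump=$2
    # -F: fixed strings, -x: whole-line match, -v: invert,
    # -f: read patterns from the old dump.
    # Result: every line of the new dump that is not present verbatim in the old one.
    grep -F -x -v -f "$old_dump" "$new_dump"
}

# Usage:
#   new_changelog_lines old-changelog.txt new-changelog.txt
```

Because rpm prints the newest changelog entries first, the lines reported here should correspond to the entries added since the older build.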
https://bugzilla.novell.com/show_bug.cgi?id=891168#c5

--- Comment #5 from Neil Rickert <nrickert@ameritech.net> 2014-08-13 16:35:19 UTC ---

That link you provided makes me wonder if it could be a "systemd" change that causes the problem.

I first saw this during an install attempt. I was running

  openSUSE-Factory-KDE-Live-x86_64-Snapshot20140807-Media.iso

and did the install from there. The live media was running from a USB flash drive. I still have that flash drive. Could there be some logs saved in the hybrid partition that might be useful for analysis?
https://bugzilla.novell.com/show_bug.cgi?id=891168#c6

--- Comment #6 from Neil Rickert <nrickert@ameritech.net> 2014-08-13 16:51:45 UTC ---

I got that last bit a little wrong. I now recall that I recreated the hybrid partition after the failure. The second install from that USB stick worked, so there might be nothing there.

My original install was from a Gnome iso

  openSUSE-13.2-livecd-gnome.x86_64-2.8.0.iso

(the information about this came from the factory mailing list). I have not touched the hybrid partition on the USB I used for that. Since that install also appeared to fail in the same way, perhaps there is something in the hybrid partition there.
https://bugzilla.novell.com/show_bug.cgi?id=891168#c7

--- Comment #7 from Neil Rickert <nrickert@ameritech.net> 2014-08-15 14:39:08 UTC ---

Created an attachment (id=602500)
 --> (http://bugzilla.novell.com/attachment.cgi?id=602500)
transcript ("script" command) while gathering information

I've changed strategy for investigating this bug.

I have now installed factory (20140814) on the internal drive. While running that, I am mounting the external drive:

  mount /dev/sdf1 /mnt/1
  mount /dev/sdf3 /mnt/1/home

The external drive came up as "/dev/sdf". After a while, "/dev/sdf" disappeared, and the external drive was now "/dev/sdg".

The typescript file shows the output of:

  ls -l /dev/sd?
  fdisk -l /dev/sdg    ## to confirm that it is the external drive
  df                   ## to show that partitions of "/dev/sdf" are what are mounted
  dmesg | grep sdg     # messages related to /dev/sdg
  dmesg | grep sdf     # messages related to /dev/sdf

plus some other commands that didn't reveal anything.

Doing it this way, I still see the bug show up, but I now have commands to work with to investigate it. Let me know if there is other information I should be checking.
https://bugzilla.novell.com/show_bug.cgi?id=891168#c8

--- Comment #8 from Neil Rickert <nrickert@ameritech.net> 2014-08-15 20:56:24 UTC ---

Created an attachment (id=602542)
 --> (http://bugzilla.novell.com/attachment.cgi?id=602542)
USB reconnects (from dmesg output)

I tried some more tests.

I plugged in the external drive, but did not mount it. It showed up as "/dev/sdf". The attachment consists of the lines from "dmesg" output showing reconnects for this external drive. The KDE device notifier popped up each time that it reconnected.

I then tried two different devices -- a USB flash drive, and a different external hard drive. Neither of them had a similar reconnect problem.

I next plugged the same external drive (the one with problems) into a different computer (also running factory 20140814). There was no reconnect problem there either.

I still think this is a software bug, since it did not happen up through factory snapshot 20140728. However, it looks as if it is a rare problem and won't affect many people.

The computer where I am seeing this is a Dell Dimension C521, purchased in 2007. The external drive is an I/O Magic IDE/SATA drive container, in which I have mounted an 80G IDE disk.
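The reconnects described above can also be summarized from saved "dmesg" output. The following is only a sketch under stated assumptions: the two function names are hypothetical, and it assumes the kernel log contains the usual "USB disconnect" and "[sdX] Attached SCSI ... disk" lines. The attachment itself is not reproduced here.

```shell
#!/bin/sh
# Sketch: summarize USB reconnects from kernel log text read on stdin.

# Count lines reporting a USB disconnect.
count_usb_disconnects() {
    grep -c 'USB disconnect'
}

# Print, in log order, the sdX name given to each newly attached SCSI disk.
# A drive that keeps reconnecting would show up as e.g. sdf, then sdg.
attached_disk_names() {
    sed -n 's/.*\[\(sd[a-z][a-z]*\)] Attached SCSI.*/\1/p'
}

# Usage:
#   dmesg | count_usb_disconnects
#   dmesg | attached_disk_names
```

Comparing the disconnect count with the sequence of attached names shows whether each reconnect came back under a new device name.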
https://bugzilla.novell.com/show_bug.cgi?id=891168#c9

Bernhard Wiedemann <bwiedemann@suse.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|coolo@suse.com              |
         AssignedTo|systemd-maintainers@suse.de |bnc-team-screening@forge.pr
                   |                            |ovo.novell.com

--- Comment #9 from Bernhard Wiedemann <bwiedemann@suse.com> 2014-08-18 08:38:05 CEST ---

I could imagine it to be problems with the external power supply on the USB disk container (that depend on the software's access patterns), or it could be some strange interaction between the USB host adapter / driver and the device.

In any case, this is most likely not an openSUSE-specific bug, and thus we can't help much.
https://bugzilla.novell.com/show_bug.cgi?id=891168#c10

--- Comment #10 from Neil Rickert <nrickert@ameritech.net> 2014-08-18 18:05:34 UTC ---
> I could imagine it to be problems with the external power supply on the USB disk container (that depend on the software's access patterns)
That seems unlikely to me. It looks as if this happens when the device is idle, which is when power requirements should be at their lowest. If there's a lot of I/O activity on the device, it doesn't seem to reconnect.

After installing 20140807, I booted, ran Yast and did some further software installing. There was no problem during the software installing. The failure occurred shortly after the installing had completed, leaving the device relatively idle.

Strange interaction between host adapter, device and driver -- that I can buy. But I wonder why the problem did not show up in 20140728, but did occur in 20140807.

In any case, I recognize that this might not ever be solved.
https://bugzilla.novell.com/show_bug.cgi?id=891168#c11

Neil Rickert <nrickert@ameritech.net> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |CLOSED
         Resolution|                            |INVALID

--- Comment #11 from Neil Rickert <nrickert@ameritech.net> 2014-09-01 20:44:27 UTC ---

I'm marking this as closed.

I reseated the cable connectors for the disk drive, and that seemed to fix the problem.

As to what changed in recent factory releases: I had the problem also show up once when running 13.1, but it is harder to see there.

With recent factory releases, if the USB device is "/dev/sdf" and something is mounted, then after whatever error occurs, the device becomes "/dev/sdg".

With 13.1, if the device is "/dev/sdf", then after the error it remains "/dev/sdf" and the file systems remain mounted (but now marked as read-only).
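Since the "/dev/sdX" name can change after such an error, the drive is easier to follow through the persistent symlinks that udev normally keeps under "/dev/disk/by-id". The following is only a sketch and not part of the original report: the function name is hypothetical, and it assumes a "readlink" that supports "-f" (as in GNU coreutils).

```shell
#!/bin/sh
# Sketch: print "stable-name -> current-node" for every symlink in a
# by-id style directory (defaults to /dev/disk/by-id).

list_stable_names() {
    dir=${1:-/dev/disk/by-id}
    for link in "$dir"/*; do
        # Skip anything that is not a symlink (including an unmatched glob).
        [ -L "$link" ] || continue
        printf '%s -> %s\n' "$(basename "$link")" "$(readlink -f "$link")"
    done
}

# Usage:
#   list_stable_names
```

Running it before and after a reconnect would show the same by-id name pointing first at one "/dev/sdX" node and then at another.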