[Bug 371657] New: attempt to access beyond end of device when using dmraid
https://bugzilla.novell.com/show_bug.cgi?id=371657

           Summary: attempt to access beyond end of device when using dmraid
           Product: openSUSE 10.3
           Version: Final
          Platform: x86-64
        OS/Version: openSUSE 10.3
            Status: NEW
          Severity: Critical
          Priority: P5 - None
         Component: Kernel
        AssignedTo: kernel-maintainers@forge.provo.novell.com
        ReportedBy: deanjo@sasktel.net
         QAContact: qa@suse.de
          Found By: ---

Created an attachment (id=202477)
 --> (https://bugzilla.novell.com/attachment.cgi?id=202477)
boot.msg log

OK, this issue seems to be tied to dmraid. If a software raid 0 is set up on these drives, everything functions as it should. When using dmraid, though, you get spammed with hundreds of lines like the ones below during boot. Oddly enough, they seem to be tied to my larger drives, Seagate 7200.11 500 GB SATA 2 drives. I also have 2 Maxtor 250 GB drives running dmraid on the same controller (sda and sdb), and they do not have the same issue. It looks like it can't figure out the drive geometry for the Seagate raids. I'm posting my boot.msg as an attachment.

Mar 17 00:19:22 my kernel: attempt to access beyond end of device
Mar 17 00:19:22 my kernel: sde: rw=0, want=1953535936, limit=976773168
Mar 17 00:19:22 my kernel: printk: 30 messages suppressed.
Mar 17 00:19:22 my kernel: Buffer I/O error on device sde1, logical block 1953535872
Mar 17 00:19:22 my kernel: attempt to access beyond end of device
Mar 17 00:19:22 my kernel: sde: rw=0, want=1953535937, limit=976773168
Mar 17 00:19:22 my kernel: Buffer I/O error on device sde1, logical block 1953535873
Mar 17 00:19:22 my kernel: attempt to access beyond end of device
Mar 17 00:19:22 my kernel: sde: rw=0, want=1953535938, limit=976773168
Mar 17 00:19:22 my kernel: attempt to access beyond end of device
Mar 17 00:19:22 my kernel: sde: rw=0, want=1953535939, limit=976773168
Mar 17 00:19:22 my kernel: attempt to access beyond end of device
Mar 17 00:19:22 my kernel: sde: rw=0, want=1953535940, limit=976773168
Mar 17 00:19:22 my kernel: attempt to access beyond end of device
Mar 17 00:19:22 my kernel: sde: rw=0, want=1953535941, limit=976773168
Mar 17 00:19:22 my kernel: attempt to access beyond end of device
Mar 17 00:19:22 my kernel: sde: rw=0, want=1953535942, limit=976773168
Mar 17 00:19:22 my kernel: attempt to access beyond end of device
Mar 17 00:19:22 my kernel: sde: rw=0, want=1953535943, limit=976773168
Mar 17 00:19:22 my kernel: attempt to access beyond end of device
Mar 17 00:19:22 my kernel: sde: rw=0, want=1953535936, limit=976773168
Mar 17 00:19:22 my kernel: attempt to access beyond end of device
Mar 17 00:19:22 my kernel: sde: rw=0, want=1953535937, limit=976773168

-- 
Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
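The numbers in the log lines above can be put in perspective with a little arithmetic. The following is only an illustrative sketch: it assumes that "want" and "limit" are counted in 512-byte sectors, and the helper name describe() is made up for this note rather than taken from any tool in the report.

```python
SECTOR_BYTES = 512  # assumption: the kernel counts in 512-byte sectors


def describe(want: int, limit: int) -> dict:
    """Summarise one 'want=..., limit=...' pair from the kernel log."""
    return {
        "disk_bytes": limit * SECTOR_BYTES,    # capacity of the device itself
        "overshoot_sectors": want - limit,     # how far past the end the I/O goes
        "ratio": want / limit,                 # requested end relative to the device end
    }


info = describe(want=1953535936, limit=976773168)
print(info)
```

Under that assumption the limit corresponds to a single 500 GB drive, while the requested sector lies at almost exactly twice that, which is what one would expect if a partition sized for the whole two-disk stripe were being read through one member disk.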
https://bugzilla.novell.com/show_bug.cgi?id=371657

User jeffm@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=371657#c1

Jeff Mahoney <jeffm@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jeffm@novell.com
             Status|NEW                         |NEEDINFO
      Info Provider|                            |deanjo@sasktel.net

--- Comment #1 from Jeff Mahoney <jeffm@novell.com>  2008-03-19 11:03:50 MST ---
Can you provide the output of "dmsetup table" and also fdisk -l?
https://bugzilla.novell.com/show_bug.cgi?id=371657

User deanjo@sasktel.net added comment
https://bugzilla.novell.com/show_bug.cgi?id=371657#c2

--- Comment #2 from Dean Hilkewich <deanjo@sasktel.net>  2008-03-19 22:02:54 MST ---
Very well, I'll recreate the array tomorrow night and report back. (I'll have to move the data off one of the sets of drives, as I switched them to softraid.)
https://bugzilla.novell.com/show_bug.cgi?id=371657

User deanjo@sasktel.net added comment
https://bugzilla.novell.com/show_bug.cgi?id=371657#c3

--- Comment #3 from Dean Hilkewich <deanjo@sasktel.net>  2008-03-19 23:07:10 MST ---
Created an attachment (id=203066)
 --> (https://bugzilla.novell.com/attachment.cgi?id=203066)
output of "dmsetup table" and also fdisk -l

Output of "dmsetup table" and fdisk -l
https://bugzilla.novell.com/show_bug.cgi?id=371657

User deanjo@sasktel.net added comment
https://bugzilla.novell.com/show_bug.cgi?id=371657#c4

--- Comment #4 from Dean Hilkewich <deanjo@sasktel.net>  2008-03-26 07:49:20 MST ---
Is there anything else you need, Jeff?
https://bugzilla.novell.com/show_bug.cgi?id=371657

Jeff Mahoney <jeffm@novell.com> changed:

           What    |Removed                                   |Added
----------------------------------------------------------------------------
         AssignedTo|kernel-maintainers@forge.provo.novell.com |jeffm@novell.com
             Status|NEEDINFO                                  |ASSIGNED
      Info Provider|deanjo@sasktel.net                        |
https://bugzilla.novell.com/show_bug.cgi?id=371657

User deanjo@sasktel.net added comment
https://bugzilla.novell.com/show_bug.cgi?id=371657#c5

--- Comment #5 from Dean Hilkewich <deanjo@sasktel.net>  2008-03-28 22:06:32 MST ---
Just thought I'd mention it's also present in openSUSE 11A3.
https://bugzilla.novell.com/show_bug.cgi?id=371657

User jeffm@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=371657#c6

Jeff Mahoney <jeffm@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |NEEDINFO
      Info Provider|                            |deanjo@sasktel.net

--- Comment #6 from Jeff Mahoney <jeffm@novell.com>  2008-03-31 09:55:27 MST ---
OK, so I've set up a test environment that consists of 4 sparse files as loopback devices that match the disks in your system:

$ ls -la sd*
-rw-r--r-- 1 jeffm users 251000194048 2008-03-31 10:37 sda
-rw-r--r-- 1 jeffm users 251000194048 2008-03-31 10:37 sdb
-rw-r--r-- 1 jeffm users 500107863040 2008-03-31 10:38 sde
-rw-r--r-- 1 jeffm users 500107863040 2008-03-31 10:38 sdf

Then I created the dm striping environment to match yours:

$ dmsetup table | grep nvidia|sort
nvidia_bdjbjgci: 0 980469248 striped 2 128 7:0 0 7:1 0
nvidia_bdjbjgci_part1: 0 40965687 linear 253:20 63
nvidia_bdjbjgci_part2: 0 204812685 linear 253:20 40965750
nvidia_bdjbjgci_part3: 0 208845 linear 253:20 245778435
nvidia_bdjbjgci_part4: 0 734475735 linear 253:20 245987280
nvidia_bdjbjgci_part5: 0 41945652 linear 253:24 63
nvidia_bdjbjgci_part6: 0 1028097 linear 253:24 41945778
nvidia_bdjbjgci_part7: 0 691501797 linear 253:24 42973938
nvidia_fjfafgaa: 0 1953546240 striped 2 128 7:2 0 7:3 0
nvidia_fjfafgaa_part1: 0 1953536067 linear 253:28 63

... and I can't recreate the problem. I seek to the end -3k and write 4k, and get a short write as expected.

Can you provide more information as to how you're triggering this?
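To relate the dmsetup table in comment #6 to the boot log, here is a small sketch that parses a "linear" line and compares the partition's end sector with the striped array length and with the per-disk limit from the log. The Linear type and the helper names are hypothetical, introduced only for illustration; the numbers are copied from the thread.

```python
from typing import NamedTuple


class Linear(NamedTuple):
    name: str
    length: int   # sectors mapped by this target
    device: str   # major:minor of the underlying device
    offset: int   # first sector used on the underlying device


def parse_linear(line: str) -> Linear:
    """Parse one 'name: start length linear major:minor offset' line."""
    name, rest = line.split(":", 1)
    _start, length, target, device, offset = rest.split()
    if target != "linear":
        raise ValueError(f"not a linear target: {target}")
    return Linear(name.strip(), int(length), device, int(offset))


def end_sector(t: Linear) -> int:
    """First sector past the end of the mapped range on the underlying device."""
    return t.offset + t.length


part = parse_linear("nvidia_fjfafgaa_part1: 0 1953536067 linear 253:28 63")
ARRAY_SECTORS = 1953546240   # length of the striped nvidia_fjfafgaa device
MEMBER_SECTORS = 976773168   # "limit" reported for sde in the boot log

print(end_sector(part))
print("fits in the striped array:", end_sector(part) <= ARRAY_SECTORS)
print("fits on a single member:  ", end_sector(part) <= MEMBER_SECTORS)
```

The partition fits inside the striped array but not inside one member disk, so the same partition table entry is harmless on the mapper device and out of range when applied to sde alone.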
https://bugzilla.novell.com/show_bug.cgi?id=371657

User deanjo@sasktel.net added comment
https://bugzilla.novell.com/show_bug.cgi?id=371657#c7

--- Comment #7 from Dean Hilkewich <deanjo@sasktel.net>  2008-03-31 20:14:39 MST ---
I'm not sure how much more information (or what) I can give you to reproduce the issue. Here is what I do: I simply enable the fakeraid in the BIOS and set the Seagates up as a striped set. I then go into the YaST Partitioner, see the mapper device of the raid set, format it with a file system (ext3 or xfs, it does not matter), set it to mount at /local/myraid, and hit apply. That's it. Upon reboot I get the "attempt to access beyond end of device" messages on the Seagate part of the raid.

The Maxtors, like I said before, behave as expected and do not give that error, nor do I get the message if I set up a pair of older Maxtor 120 GB drives in place of the Seagates, following the same procedure to get them raided. Now, the Maxtors are SATA 1 drives and the Seagates are SATA 2 drives. Could it perhaps be because the Seagates are NCQ-enabled drives?
https://bugzilla.novell.com/show_bug.cgi?id=371657

User cthiel@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=371657#c8

Christoph Thiel <cthiel@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEEDINFO                    |ASSIGNED
      Info Provider|deanjo@sasktel.net          |

--- Comment #8 from Christoph Thiel <cthiel@novell.com>  2008-04-25 02:39:09 MST ---
Info provided in comment #7?
https://bugzilla.novell.com/show_bug.cgi?id=371657

Dean Hilkewich <deanjo@sasktel.net> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|Kernel                      |Kernel
         OS/Version|openSUSE 10.3               |openSUSE 11.0
            Product|openSUSE 10.3               |openSUSE 11.0
   Target Milestone|---                         |Beta 2
            Version|Final                       |Factory
https://bugzilla.novell.com/show_bug.cgi?id=371657

User deanjo@sasktel.net added comment
https://bugzilla.novell.com/show_bug.cgi?id=371657#c9

Dean Hilkewich <deanjo@sasktel.net> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P5 - None                   |P2 - High
   Target Milestone|Beta 2                      |---
            Version|Factory                     |Beta 2

--- Comment #9 from Dean Hilkewich <deanjo@sasktel.net>  2008-05-05 09:04:35 MST ---
Still present in beta 2.
https://bugzilla.novell.com/show_bug.cgi?id=371657

User gregkh@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=371657#c10

Greg Kroah-Hartman <gregkh@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|Critical                    |Major

--- Comment #10 from Greg Kroah-Hartman <gregkh@novell.com>  2008-05-09 10:20:19 MST ---
hardware raid is not "CRITICAL", we always recommend using Linux's software raid instead (much more future proof and we can actually support it...)
https://bugzilla.novell.com/show_bug.cgi?id=371657

User deanjo@sasktel.net added comment
https://bugzilla.novell.com/show_bug.cgi?id=371657#c11

--- Comment #11 from Dean Hilkewich <deanjo@sasktel.net>  2008-05-09 11:05:37 MST ---
I would have to disagree with the assessment that software raid is a better solution. It has many downsides when a system has multiple OSes installed. After all, isn't "interoperability" one of the goals of Novell/openSUSE? Had this been an issue like grub not being able to load Windows, I'm pretty sure it would have been marked Critical.
https://bugzilla.novell.com/show_bug.cgi?id=371657

User deanjo@sasktel.net added comment
https://bugzilla.novell.com/show_bug.cgi?id=371657#c12

--- Comment #12 from Dean Hilkewich <deanjo@sasktel.net>  2008-06-08 10:23:00 MDT ---
Is this being addressed? The issue still exists. I've tried the drives on 3 different motherboards with the same issue.
https://bugzilla.novell.com/show_bug.cgi?id=371657

User deanjo@sasktel.net added comment
https://bugzilla.novell.com/show_bug.cgi?id=371657#c13

Dean Hilkewich <deanjo@sasktel.net> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|                            |REMIND
            Version|Beta 2                      |RC 1

--- Comment #13 from Dean Hilkewich <deanjo@sasktel.net>  2008-06-08 10:26:05 MDT ---
An update would be nice. Controllers that this has been found on are:

Nforce 3 250
Nforce 570
Nforce 780a
Silicon Image 3114
https://bugzilla.novell.com/show_bug.cgi?id=371657

User deanjo@sasktel.net added comment
https://bugzilla.novell.com/show_bug.cgi?id=371657#c14

Dean Hilkewich <deanjo@sasktel.net> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |REOPENED
         Resolution|REMIND                      |

--- Comment #14 from Dean Hilkewich <deanjo@sasktel.net>  2008-06-08 10:27:26 MDT ---
Reopened
https://bugzilla.novell.com/show_bug.cgi?id=371657

User deanjo@sasktel.net added comment
https://bugzilla.novell.com/show_bug.cgi?id=371657#c15

--- Comment #15 from Dean Hilkewich <deanjo@sasktel.net>  2008-06-09 08:57:58 MDT ---
OK, I have some additional info on this. It seems that it is not isolated to the drives. The issue occurs when an additional raid set is added. If you create a raid 0 set with only two drives (any of the drives) hooked up, everything is fine. If you create a dmraid partition afterwards, the system gives the above-mentioned errors. If you do an install on each of the raid setups individually and then plug them all in, they all register successfully and no errors occur when the partitions are mounted. If you try to delete the partitions using disk utility, however, you cannot remove the extended partition or the /boot partition.
https://bugzilla.novell.com/show_bug.cgi?id=371657

User deanjo@sasktel.net added comment
https://bugzilla.novell.com/show_bug.cgi?id=371657#c16

--- Comment #16 from Dean Hilkewich <deanjo@sasktel.net>  2008-06-09 09:04:12 MDT ---
"If you try to delete the partitions using disk utility however you cannot remove the extended partition or the /boot partition."

I should mention that this is on the slave raids that had to have the OS installed to them individually to avoid getting the above-mentioned errors.
https://bugzilla.novell.com/show_bug.cgi?id=371657

User deanjo@sasktel.net added comment
https://bugzilla.novell.com/show_bug.cgi?id=371657#c17

--- Comment #17 from Dean Hilkewich <deanjo@sasktel.net>  2008-06-11 08:56:02 MDT ---
Here is the solution to this. The partitioner should have this process automated.

http://le-gall.net/sylvain+violaine/blog/index.php?2007/12/04/32-solving-the...
https://bugzilla.novell.com/show_bug.cgi?id=371657

Dean Hilkewich <deanjo@sasktel.net> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P2 - High                   |P1 - Urgent
https://bugzilla.novell.com/show_bug.cgi?id=371657

User jeffm@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=371657#c18

Jeff Mahoney <jeffm@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |mkoenig@novell.com
         AssignedTo|jeffm@novell.com            |kasievers@novell.com
             Status|REOPENED                    |NEW

--- Comment #18 from Jeff Mahoney <jeffm@novell.com>  2008-06-18 10:51:10 MDT ---
Great, thanks for the research. I don't think the partitioner is the right place to do this, though. YaST should make things easier, but not be the only method of partitioning sanely. udev should probably just be smarter about scanning the partition tables on device-spanning fake raids like this.

I'm reassigning this to the udev maintainer and adding the dmraid maintainer to the CC to coordinate methods of automatically identifying dmraid component devices.
https://bugzilla.novell.com/show_bug.cgi?id=371657

User kasievers@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=371657#c19

--- Comment #19 from Kay Sievers <kasievers@novell.com>  2008-06-18 11:23:21 MDT ---
It's the kernel that creates the wrong block devices; udev does not look at partition tables at all. We could possibly work around such things, but I guess the obviously right fix would be for the kernel not to announce broken and unusable block devices.
https://bugzilla.novell.com/show_bug.cgi?id=371657

User deanjo@sasktel.net added comment
https://bugzilla.novell.com/show_bug.cgi?id=371657#c20

--- Comment #20 from Dean Hilkewich <deanjo@sasktel.net>  2008-09-27 11:38:12 MDT ---
Any update? Has this been fixed in 11.1? I couldn't tell, because the 11B1 64 ISO was bad and uninstallable.
https://bugzilla.novell.com/show_bug.cgi?id=371657

User kasievers@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=371657#c21

--- Comment #21 from Kay Sievers <kasievers@novell.com>  2008-09-27 15:06:35 MDT ---
There are no changes I know of. I still think the kernel is wrong in advertising invalidly sized block devices. Userspace always needs to look at the end of devices to find raid signatures.

I guess the kernel should notice that the disk is smaller than the partition entry states, and refuse to add partitions that pretend to be larger than the whole disk itself.
https://bugzilla.novell.com/show_bug.cgi?id=371657

User kasievers@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=371657#c22

Kay Sievers <kasievers@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |NEEDINFO
      Info Provider|                            |jeffm@novell.com

--- Comment #22 from Kay Sievers <kasievers@novell.com>  2008-10-07 04:48:28 MDT ---
The "attempt to access beyond end of device" message is caused by the kernel partition table parser allowing partitions to be created that pretend to be larger than the entire physical disk. BLKGETSIZE64 and "size" in sysfs then return invalid values.

Userspace could find that out by doing the consistency check the kernel didn't do: read the disk size, read the start and size of the partition, and check whether the partition is within the limits of the disk. But doing that really sounds like the wrong solution.

Shouldn't the kernel partition table scanner just prevent the entire invalid partition from showing up, or at least limit the partition size to the physical end of the disk?

Jeff, any idea?
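The userspace consistency check described in comment #22 could look roughly like the sketch below. It is not code from udev or any other tool in this thread; it assumes the usual sysfs layout in which a disk directory such as /sys/block/sde holds a "size" file and each partition subdirectory holds "start" and "size" files, all counted in sectors. The function name oversized_partitions() is hypothetical.

```python
from pathlib import Path


def read_int(path: Path) -> int:
    """Read a sysfs-style attribute file containing a single integer."""
    return int(path.read_text().strip())


def oversized_partitions(disk_dir) -> list:
    """Return (name, start, size) for partitions that end past the disk.

    disk_dir is a directory laid out like /sys/block/<disk>: a 'size'
    file for the disk, plus one subdirectory per partition containing
    'start' and 'size' files (all values in sectors).
    """
    disk_dir = Path(disk_dir)
    disk_size = read_int(disk_dir / "size")
    bad = []
    for part in sorted(disk_dir.iterdir()):
        # Only partition directories carry both attribute files.
        if not part.is_dir():
            continue
        if not ((part / "start").is_file() and (part / "size").is_file()):
            continue
        start = read_int(part / "start")
        size = read_int(part / "size")
        if start + size > disk_size:
            bad.append((part.name, start, size))
    return bad
```

On an affected system, calling oversized_partitions("/sys/block/sde") would be expected to list sde1. The directory is passed in as a parameter so the check can also be exercised against a fake tree in a temporary directory.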
https://bugzilla.novell.com/show_bug.cgi?id=371657

User kasievers@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=371657#c23

Kay Sievers <kasievers@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
      Info Provider|jeffm@novell.com            |deanjo@sasktel.net

--- Comment #23 from Kay Sievers <kasievers@novell.com>  2008-10-09 08:55:58 MDT ---
Posted a patch to lkml that keeps partitions within the limits of the disk:
http://lkml.org/lkml/2008/10/9/219

For a possible additional userspace special-casing of software raid setups, Dean, can you please paste the output of:
/lib/udev/vol_id /dev/sde

If possible, I'd like to see the output of the above command for all disks belonging to the dmraid setup.

Thanks!
https://bugzilla.novell.com/show_bug.cgi?id=371657

User deanjo@sasktel.net added comment
https://bugzilla.novell.com/show_bug.cgi?id=371657#c24

--- Comment #24 from Dean Hilkewich <deanjo@sasktel.net>  2008-10-09 10:41:39 MDT ---
Yes, I can do that. You will just have to give me a day to clear off those hard drives and set up the dmraid on them again.
https://bugzilla.novell.com/show_bug.cgi?id=371657

User kasievers@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=371657#c28

Kay Sievers <kasievers@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEEDINFO                    |NEW
      Info Provider|deanjo@sasktel.net          |

--- Comment #28 from Kay Sievers <kasievers@novell.com>  2008-10-13 07:42:41 MDT ---
Thanks for the testing! So, it would be possible to work around some issues here, as we detect the raid setup properly. I'm still not sure, though, if we should make userspace ignore these partitions.
https://bugzilla.novell.com/show_bug.cgi?id=371657

User deanjo@sasktel.net added comment
https://bugzilla.novell.com/show_bug.cgi?id=371657#c29

--- Comment #29 from Dean Hilkewich <deanjo@sasktel.net>  2008-10-13 09:47:47 MDT ---
What I don't understand is why this doesn't occur if I have 2 separate raid 0 setups with an OS installed to each array. For example: I install openSUSE 11 to one set, leaving the other set disconnected. I disconnect the first set, reconnect the second set, and install openSUSE 11 onto it. When I reconnect all drives and mount them, the errors disappear.
https://bugzilla.novell.com/show_bug.cgi?id=371657

User kasievers@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=371657#c30

--- Comment #30 from Kay Sievers <kasievers@novell.com>  2008-10-23 15:03:16 MDT ---
The patch to prevent the creation of invalid block devices went into the upstream kernel:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=...

A proper, smarter handling of dmraid devices and the (wrong) in-kernel partition table parsing would still be nice. But at least the "attempt to access beyond end of device" warnings should be gone in a future kernel.

Thanks again for your testing!
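The idea of keeping partitions within the limits of the disk can be illustrated with a few lines of arithmetic. This is a sketch of the general technique only, not the kernel patch itself; the function name clamp_partition() and the choice to drop partitions that start beyond the disk are assumptions made for illustration.

```python
from typing import Optional, Tuple


def clamp_partition(start: int, size: int, disk_size: int) -> Optional[Tuple[int, int]]:
    """Limit a partition to the sectors that actually exist on the disk.

    Returns None when the partition starts at or beyond the end of the
    disk (nothing usable remains); otherwise returns (start, size) with
    the size truncated at the physical end of the disk.
    """
    if start >= disk_size:
        return None
    return start, min(size, disk_size - start)


# Values from this thread: partition entry from the dmsetup table,
# disk size from the "limit" in the boot log.
print(clamp_partition(start=63, size=1953536067, disk_size=976773168))
```

With the values from this report, the oversized partition would be cut back to end exactly at the last sector of the member disk, so reads near its end could no longer request sectors beyond the device.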
https://bugzilla.novell.com/show_bug.cgi?id=371657

User deanjo@sasktel.net added comment
https://bugzilla.novell.com/show_bug.cgi?id=371657#c31

--- Comment #31 from Dean Hilkewich <deanjo@sasktel.net>  2008-10-26 00:52:26 MDT ---
Thanks for the patch, Kay, but do you have an idea why, if you have two installations of the OS on separate raid sets, they can easily mount each other's raid volumes without the messages appearing? What does the partitioner do differently on install to a raid set volume that it does not do for a secondary raid set?
https://bugzilla.novell.com/show_bug.cgi?id=371657

User kasievers@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=371657#c32

--- Comment #32 from Kay Sievers <kasievers@novell.com>  2008-10-26 09:42:28 MDT ---
(In reply to comment #31 from Dean Hilkewich)
> Thanks for the patch, Kay, but do you have an idea why, if you have two installations of the OS on separate raid sets, they can easily mount each other's raid volumes without the messages appearing? What does the partitioner do differently on install to a raid set volume that it does not do for a secondary raid set?

No, I have no idea. The messages should happen when probing for a raid setup (signature at the end of the volume) on any partition that is larger than the disk itself (table on the first raid 0 disk, pointing to sectors of the second raid 0 disk).

Udev will try to read the end of all the partitions the kernel announces. I have no idea why this does not happen in your case.
https://bugzilla.novell.com/show_bug.cgi?id=371657

User deanjo@sasktel.net added comment
https://bugzilla.novell.com/show_bug.cgi?id=371657#c33

--- Comment #33 from Dean Hilkewich <deanjo@sasktel.net>  2008-10-26 11:40:33 MDT ---
(In reply to comment #32 from Kay Sievers)
> (In reply to comment #31 from Dean Hilkewich)
> > Thanks for the patch, Kay, but do you have an idea why, if you have two installations of the OS on separate raid sets, they can easily mount each other's raid volumes without the messages appearing? What does the partitioner do differently on install to a raid set volume that it does not do for a secondary raid set?
>
> No, I have no idea. The messages should happen when probing for a raid setup (signature at the end of the volume) on any partition that is larger than the disk itself (table on the first raid 0 disk, pointing to sectors of the second raid 0 disk).
>
> Udev will try to read the end of all the partitions the kernel announces. I have no idea why this does not happen in your case.

Right, therefore the partitioner has to be doing something different when creating a secondary raid set than it does on the primary raid set, which leads us back to the partitioner being the cause of the issue: it is not writing the table properly on the first disk of the second raid set.
https://bugzilla.novell.com/show_bug.cgi?id=371657

User deanjo@sasktel.net added comment
https://bugzilla.novell.com/show_bug.cgi?id=371657#c34

--- Comment #34 from Dean Hilkewich <deanjo@sasktel.net>  2008-10-26 12:50:06 MDT ---
OK, this is most definitely a partitioner bug. To verify my hunch, I created the secondary array in Windows XP64's partitioner with an empty extended partition. I then booted back into SUSE, opened up the partitioner, and created a logical partition on the extended partition that Windows set up. Applied and rebooted. No warning messages at all and a perfectly running secondary raid set. This is a partitioner bug.
https://bugzilla.novell.com/show_bug.cgi?id=371657

Kay Sievers <kasievers@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P1 - Urgent                 |P4 - Low
https://bugzilla.novell.com/show_bug.cgi?id=371657

User deanjo@sasktel.net added comment
https://bugzilla.novell.com/show_bug.cgi?id=371657#c35

--- Comment #35 from Dean Hilkewich <deanjo@sasktel.net>  2008-11-17 13:16:06 MST ---
So, despite this having been around forever as a P1 bug, it is now marked as P4 and will not be fixed for 11.1?
https://bugzilla.novell.com/show_bug.cgi?id=371657

User kasievers@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=371657#c36

--- Comment #36 from Kay Sievers <kasievers@novell.com>  2008-11-17 17:39:13 MST ---
(In reply to comment #35 from Dean Hilkewich)
> So, despite this having been around forever as a P1 bug, it is now marked as P4 and will not be fixed for 11.1?

I do not think that the general problem causing this "bug" will be fixed.

The problem is that the kernel parses partitions and creates block devices without knowing about the context or the metadata of the volume. All of these partitions should not exist in the first place, because the disk is part of a raid setup. But the kernel does not know that, and creates completely invalid partitions.

There are two possible fixes. One would be to teach the kernel about all possible raid metadata, if the kernel wants to continue parsing partition tables. The other option would be to parse partitions only in userspace and not in the kernel. Both options are unlikely to happen _now_.

In the upstream kernel we made sure (it's merged, but still being tested and not in a released version) that partitions point only to a valid storage area, which will prevent the "access beyond end of device" warning. But still, it's only a cosmetic change to the underlying problem.

We can port the upstream "cosmetic" fix to 11.1, if it survives the released upstream kernel (the former fix got removed because it broke some setups), but I do not see any "nice" fix to the general problem. The problem is known and has existed for a while, without any good idea to fix it so far.

As for your specific behavior, where your second array behaves differently from the first one, nobody has an idea why this could be, so unfortunately, for now, I'm not sure what we can expect to be fixed.
https://bugzilla.novell.com/show_bug.cgi?id=371657 User deanjo@sasktel.net added comment https://bugzilla.novell.com/show_bug.cgi?id=371657#c37 --- Comment #37 from Dean Hilkewich <deanjo@sasktel.net> 2008-11-17 17:55:50 MST --- (In reply to comment #36 from Kay Sievers)
(In reply to comment #35 from Dean Hilkewich)
So despite this being around forever as a bug @ a P1 and now being marked as a P4 that this bug will not be fixed for 11.1?
I do not think that the general problem causing this "bug" will be fixed.
The problem is that the kernel parses partitions and creates block devices without knowing about the context or the metadata of the volume. All of these partitions should not exist in the first place, because the disk is part of a raid setup. But the kernel does not know that, and creates completely invalid partitions.
There are two possible fixes, one would be to teach the kernel about all possible raid metadata, if the kernel wants to continue parsing partition tables. The other option would be to parse partitions only in userspace and not in the kernel. Both options are unlikely to happen _now_.
In the upstream kernel we made sure (it's merged, but still tested and not in a released version) that partitions point only to a valid storage area, which will prevent the "access beyond end of device" warning. But still, it's only a cosmetic change to the underlying problem.
We can port the upstream "cosmetic" fix to 11.1, if it survives the released upstream kernel (the former fix got removed because it broke some setups), but I do not see any "nice" fix to the general problem. The problem is known and exists since a while, without any good idea to fix it so far.
Your specific behavior, that your second array behaves differently from the first one, nobody has an idea why this could be, so unfortunately, for now, I'm not sure what we can expect to be fixed.
Well, as I mentioned before in comment 34, I believe we are barking up the wrong tree here. The partitioner is setting up the second array's partition scheme differently than the first. All errors disappear when the array is partitioned with another OS's partitioner. I have even verified this on Fedora, and it creates the secondary raid set fine on its own. While the partitioner does mark the extended partition as "f W95 Ext'd (LBA)" when creating the primary set, it does not do so for a secondary set in openSUSE. It does, however, do so when the set is partitioned in Fedora or Windows.
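To compare how two arrays were partitioned, the type byte of each primary partition entry can be read straight from the first sector. The sketch below assumes a classic MBR layout (four 16-byte entries at offset 446, type byte at offset 4 of each entry, 0x55 0xAA signature at offset 510); the function names are hypothetical and it does not follow logical partitions inside the extended partition.

```python
# Names for the two extended partition types discussed in this report.
EXTENDED_TYPES = {0x05: "Extended", 0x0F: "W95 Ext'd (LBA)"}


def partition_types(sector0):
    """Return the type bytes of the four primary entries in an MBR sector."""
    if len(sector0) < 512 or sector0[510:512] != b"\x55\xaa":
        raise ValueError("not a valid MBR sector")
    types = []
    for index in range(4):
        entry = sector0[446 + 16 * index: 446 + 16 * (index + 1)]
        types.append(entry[4])  # byte 4 of each entry is the partition type
    return types


def extended_entries(sector0):
    """Return (entry number, type name) for every extended partition entry."""
    return [
        (number, EXTENDED_TYPES[ptype])
        for number, ptype in enumerate(partition_types(sector0), start=1)
        if ptype in EXTENDED_TYPES
    ]
```

The 512 bytes would be read as root from the array's device node, opened in binary mode. Running it on both arrays would show whether one extended partition carries 0x05 and the other 0x0f.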
https://bugzilla.novell.com/show_bug.cgi?id=371657
User kasievers@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=371657#c38
Kay Sievers <kasievers@novell.com> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |kasievers@novell.com
         AssignedTo|kasievers@novell.com        |bnc-team-screening@forge.provo.novell.com
          Component|Kernel                      |YaST2
          QAContact|qa@suse.de                  |jsrain@novell.com
--- Comment #38 from Kay Sievers <kasievers@novell.com> 2008-11-17 18:07:08 MST ---
Yeah, but the problem is that none of the things which access the partitions read the partition type value, so they should still open the device and cause the "access beyond end of device" errors. I still have no idea why the messages disappear. But if it helps to change the partition type, let's see if the partitioner can do something. Reassigning to YaST.
Summary: dmraid setups create kernel partitions which are completely invalid, and cause "access beyond end of device" errors on probing for metadata. Dean, the bug reporter, sees that setting a different partition type in the partition table makes the bug disappear. This may not be a sufficient fix for the general problem with dmraid setups, but if it helps, please check whether we can do something here. Details in comment #37.
https://bugzilla.novell.com/show_bug.cgi?id=371657
Kay Sievers <kasievers@novell.com> changed:
           What    |Removed                                   |Added
----------------------------------------------------------------------------
         AssignedTo|bnc-team-screening@forge.provo.novell.com |yast2-maintainers@suse.de
https://bugzilla.novell.com/show_bug.cgi?id=371657
User deanjo@sasktel.net added comment
https://bugzilla.novell.com/show_bug.cgi?id=371657#c39
Dean Hilkewich <deanjo@sasktel.net> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|YaST2                       |Kernel
--- Comment #39 from Dean Hilkewich <deanjo@sasktel.net> 2008-11-17 18:55:34 MST ---
It should be noted that this already happens automatically in the SUSE partitioner on the primary raid. It gets marked as a type f. The problem only occurs on a secondary set, where it does not create the same type f partition.
https://bugzilla.novell.com/show_bug.cgi?id=371657
User jkupec@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=371657#c40
Ján Kupec <jkupec@novell.com> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |aschnell@novell.com
             Status|NEW                         |NEEDINFO
      Info Provider|                            |kernel-maintainers@forge.provo.novell.com
--- Comment #40 from Ján Kupec <jkupec@novell.com> 2008-11-18 06:41:23 MST ---
So is it a kernel issue or not?
https://bugzilla.novell.com/show_bug.cgi?id=371657
User deanjo@sasktel.net added comment
https://bugzilla.novell.com/show_bug.cgi?id=371657#c41
--- Comment #41 from Dean Hilkewich <deanjo@sasktel.net> 2008-11-18 09:58:09 MST ---
I don't see how it is when using another partitioner resolves the issue.
https://bugzilla.novell.com/show_bug.cgi?id=371657
User kasievers@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=371657#c42
Kay Sievers <kasievers@novell.com> changed:
           What    |Removed                                   |Added
----------------------------------------------------------------------------
             Status|NEEDINFO                                  |NEW
          Component|Kernel                                    |YaST2
      Info Provider|kernel-maintainers@forge.provo.novell.com |
--- Comment #42 from Kay Sievers <kasievers@novell.com> 2008-11-18 10:12:53 MST ---
So is it a kernel issue or not?
I don't see how it is when using another partitioner resolves the issue.
Dean, you changed it to "kernel" yesterday:
Dean Hilkewich <deanjo@sasktel.net> changed:
           What    |Removed                     |Added
----------------------------------------
          Component|YaST2                       |Kernel
Please do not change these fields; they are used to coordinate who's looking at the bug internally.
https://bugzilla.novell.com/show_bug.cgi?id=371657
User deanjo@sasktel.net added comment
https://bugzilla.novell.com/show_bug.cgi?id=371657#c43
--- Comment #43 from Dean Hilkewich <deanjo@sasktel.net> 2008-11-18 10:44:35 MST ---
(In reply to comment #42 from Kay Sievers)
So is it a kernel issue or not?
I don't see how it is when using another partitioner resolves the issue.
Dean, you changed it to "kernel" yesterday:
Dean Hilkewich <deanjo@sasktel.net> changed:
           What    |Removed                     |Added
----------------------------------------
          Component|YaST2                       |Kernel
Please do not change these fields; they are used to coordinate who's looking at the bug internally.
I swear I did not do that. If I did it was purely unintentional.
https://bugzilla.novell.com/show_bug.cgi?id=371657
User aschnell@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=371657#c44
Arvin Schnell <aschnell@novell.com> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |NEEDINFO
      Info Provider|                            |puzel@novell.com
--- Comment #44 from Arvin Schnell <aschnell@novell.com> 2008-11-18 14:52:53 MST ---
YaST uses parted to create extended partitions and does not specify whether 0x05 or 0x0F should be used as type.
So if this really differs on several dmraid volumes the behavior must come from parted (bug or feature?). Maybe our parted maintainer knows more.
https://bugzilla.novell.com/show_bug.cgi?id=371657
User puzel@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=371657#c45
Petr Uzel <puzel@novell.com> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |puzel@novell.com
             Status|NEEDINFO                    |NEW
      Info Provider|puzel@novell.com            |
--- Comment #45 from Petr Uzel <puzel@novell.com> 2008-11-19 05:16:27 MST ---
(In reply to comment #44 from Arvin Schnell)
YaST uses parted to create extended partitions and does not specify whether 0x05 or 0x0F should be used as type.
The type can be forced with 'parted /dev/hdx set 1 type 0xf' in case it's needed.
So if this really differs on several dmraid volumes the behavior must come from parted (bug or feature?). Maybe our parted maintainer knows more.
AFAIU the parted code, it should set the type of extended partition to 0xf if the storage device supports LBA and 0x5 otherwise.
https://bugzilla.novell.com/show_bug.cgi?id=371657
User aschnell@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=371657#c46
Arvin Schnell <aschnell@novell.com> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |NEEDINFO
      Info Provider|                            |deanjo@sasktel.net
--- Comment #46 from Arvin Schnell <aschnell@novell.com> 2008-11-19 06:31:44 MST ---
That sounds like reasonable behaviour for parted (0x0f is also called "Extended partition, LBA-mapped"). Please provide the content of /sys/firmware/edd/*/extensions. There we can see which drives support LBA.
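One way to collect the requested files in a single pass is sketched below. It assumes only the path pattern given in the comment above; the function name and the `root` parameter are hypothetical (the parameter exists so the sketch can be tried against a copy of the directory), and the directory is only present when the kernel exposes EDD data.

```python
from pathlib import Path


def read_edd_extensions(root="/sys/firmware/edd"):
    """Map each BIOS disk directory name to the non-empty lines of its extensions file."""
    report = {}
    base = Path(root)
    if not base.is_dir():
        # No EDD information exposed on this system.
        return report
    for ext_file in sorted(base.glob("*/extensions")):
        lines = [ln.strip() for ln in ext_file.read_text().splitlines() if ln.strip()]
        report[ext_file.parent.name] = lines
    return report
```

Printing the returned dictionary gives one entry per BIOS disk, which can then be pasted into the bug report instead of attaching each file separately.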
https://bugzilla.novell.com/show_bug.cgi?id=371657
User deanjo@sasktel.net added comment
https://bugzilla.novell.com/show_bug.cgi?id=371657#c47
--- Comment #47 from Dean Hilkewich <deanjo@sasktel.net> 2008-11-19 13:04:36 MST ---
Created an attachment (id=253613) --> (https://bugzilla.novell.com/attachment.cgi?id=253613) contents of sys/firmware/edd
Here are the contents of sys/firmware/edd you requested.
https://bugzilla.novell.com/show_bug.cgi?id=371657
User aschnell@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=371657#c48
--- Comment #48 from Arvin Schnell <aschnell@novell.com> 2008-11-19 15:15:55 MST ---
So there are 6 BIOS disks. From boot.msg I see that you have 6 hard disks and three BIOS RAIDs. The mapping between those is still missing (unclear). Please provide hwinfo output and say which BIOS RAID you consider primary and secondary (or where extended partitions have 0x05 or 0x0f).
https://bugzilla.novell.com/show_bug.cgi?id=371657
User deanjo@sasktel.net added comment
https://bugzilla.novell.com/show_bug.cgi?id=371657#c49
--- Comment #49 from Dean Hilkewich <deanjo@sasktel.net> 2008-11-19 15:56:10 MST ---
Created an attachment (id=253669) --> (https://bugzilla.novell.com/attachment.cgi?id=253669) hwinfo output
The two Maxtors are running Raid 0 on dmraid and the 4 Seagates are running as Raid 5 using the fakeraid controller. The Maxtors are the primary, the Seagates the secondary.
https://bugzilla.novell.com/show_bug.cgi?id=371657
Arvin Schnell <aschnell@novell.com> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|yast2-maintainers@suse.de   |aschnell@novell.com
             Status|NEEDINFO                    |ASSIGNED
      Info Provider|deanjo@sasktel.net          |
   Target Milestone|---                         |Future/Later
https://bugzilla.novell.com/show_bug.cgi?id=371657
User deanjo@sasktel.net added comment
https://bugzilla.novell.com/show_bug.cgi?id=371657#c51
--- Comment #51 from Dean Hilkewich <deanjo@sasktel.net> 2008-11-21 13:48:11 MST ---
Arvin Schnell <aschnell@novell.com> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|yast2-maintainers@suse.de   |aschnell@novell.com
             Status|NEEDINFO                    |ASSIGNED
      Info Provider|deanjo@sasktel.net          |
   Target Milestone|---                         |Future/Later
So I'll have to dig up this bug report AGAIN and pray that it actually gets looked at in a future release? This bug was reported back in 10.3.
https://bugzilla.novell.com/show_bug.cgi?id=371657
User aschnell@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=371657#c52
--- Comment #52 from Arvin Schnell <aschnell@novell.com> 2008-11-22 06:56:48 MST ---
Sorry, but we had the deadline for RC1 on Friday. And the issue was marked as "P4 - Low", so I have about 20 bugs to fix before I get to this one. Sure, new ones are coming in every day. I'm not even sure I will add a workaround in YaST for some kernel problem.
https://bugzilla.novell.com/show_bug.cgi?id=371657
User deanjo@sasktel.net added comment
https://bugzilla.novell.com/show_bug.cgi?id=371657#c53
--- Comment #53 from Dean Hilkewich <deanjo@sasktel.net> 2008-11-22 14:33:36 MST ---
It was marked as P1 forever until Kay decided, for some unexplained reason, to knock it down to P4 on the 17th.
https://bugzilla.novell.com/show_bug.cgi?id=371657
User swamp@suse.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=371657#c54
Swamp Script User <swamp@suse.com> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
  Status Whiteboard|                            |maint:released:11.0:21569
--- Comment #54 from Swamp Script User <swamp@suse.com> 2009-01-20 04:58:26 MST ---
Update released for: kernel-debug, kernel-default, kernel-docs, kernel-kdump, kernel-pae, kernel-ppc64, kernel-ps3, kernel-rt, kernel-rt_debug, kernel-source, kernel-syms, kernel-vanilla, kernel-xen
Products: openSUSE 11.0 (debug, i386, ppc, x86_64)