[Bug 708712] New: kiwi: add support for dmraid fake-raid controllers
https://bugzilla.novell.com/show_bug.cgi?id=708712

https://bugzilla.novell.com/show_bug.cgi?id=708712#c0

           Summary: kiwi: add support for dmraid fake-raid controllers
    Classification: openSUSE
           Product: openSUSE.org
           Version: unspecified
          Platform: Other
        OS/Version: Other
            Status: NEW
          Severity: Enhancement
          Priority: P5 - None
         Component: System Imaging
        AssignedTo: ms@novell.com
        ReportedBy: ms@novell.com
         QAContact: adrian@novell.com
          Found By: ---
           Blocker: ---

It would be great if kiwi could deal with such fake-raid devices.
Here is the report from a user:

Hi Marcus,

A bit of a delay as there were a couple of things I had to identify, but
it's all working now. It's a long story so I'll try to keep it as concise
as possible.

1. The RAID controller on these servers is an LSI controller, and is
affectionately known, as I discovered, as "fake raid". Fake raid is
neither a pure hardware controller nor a pure software raid, but a
mixture of the two. Some things are addressed by the BIOS, but the
majority of the work is handled essentially by dmraid, at least as far as
I can ascertain.

2. There are several kernel modules that need to be loaded for this to
work. These are dm-log.ko, dm-region-hash.ko and dm-mirror.ko - there may
be more in other server configurations but these were the ones used on
our server. Adding these modules to the config.xml wasn't sufficient
because they are only loaded when '/sbin/dmraid' is in the initrd. For
some reason I still have not been able to identify, the 'dmraid' package
is not getting included in the first initrd package, even though
'<package name="dmraid" bootinclude="true"/>' is included in the
config.xml. So once the ISOs are built, I have to explode the initrd, add
the binary and its shared libraries and then put it back together to make
the initrd that is loaded over PXE.

3. Loading the modules, however, does not create the necessary devices
for KIWI to install the partition into.
In the '--prepare' directory, the file
[prepare-root]/lib/mkinitrd/boot/21-dmraid.sh has the command
'/sbin/dmraid -a y -p', which creates the initial device
'/dev/mapper/lsi_chedfhiceaa'. This is the underlying device that should
appear on the disk selection screen. Unfortunately, the '21-dmraid.sh'
file doesn't seem to be included in the initrd either. Again, this is
part of the 'dmraid' package, so having that package properly installed
may solve everything up to this point.

4. Having done all the above, the installer still failed. In the file
'/usr/share/kiwi/image/oemboot/suse-dump', there is a check to find
appropriate disk devices like so:

    deviceDisks=`$hwinfo --disk |\
        grep "Device File:" | cut -f2 -d: |\
        cut -f1 -d"("`

Unfortunately, the fake-raid device does not show up with the
'hwinfo --disk' command. You can add '--listmd' to it to show raid
devices, but as the device hasn't yet been created as a raid device, that
won't help either. Eventually, I settled on the following replacement,
which works for our purposes (and could probably be shortened), but may
not cover all cases:

    deviceDisks=`fdisk -l|grep Disk|grep :|grep -v identifier|cut -f1 -d:|sed -e 's/Disk //'`

This got us to the point that the disk selection showed '/dev/sda',
'/dev/sdb' and '/dev/mapper/lsi_chedfhiceaa' - our fake raid device.
Awesome! I thought I was done. Alas, no!

5. At this point, the image is successfully downloaded and written to the
disk. But the final step is to resize the partition, and for that to
happen, the system needs to find the partition. Something odd happens
here and I'm not entirely sure what it is, but I did work around the
problem again. The partition device doesn't appear until 'dmraid -a y' is
run. At that point, the device shows up as
'/dev/mapper/lsi_chedfhiceaap1' (note p1 added on the end). The KIWI
scripts then run another command to check the disk partitions, and the
device is now '/dev/mapper/lsi_chedfhiceaa_part1'.
After the 'dmraid -a y' command is run, the function ddn()
(KIWILinuxRC.sh) is called from waitForStorageDevice(), and then
waitForStorageDevice() is called once more after the device has been
renamed with _part1 at the end. As I could not get the device name to be
consistent, I modified the function ddn() to add the following:

    if echo $1 | grep -q "^\/dev\/mapper\/" ; then
        mapper=`echo $1"p"$2`
        if [ -e $mapper ] ; then
            echo $mapper
            return
        fi
        echo $1"_part"$2
        return
    fi

This tests whether the first form of the name (with "p") exists and, if
it does, returns it; otherwise it returns the second form (with "_part").

6. Having done all that, the blade was finally able to install just fine.
There's only one more hiccup - some of the other blades have four NICs in
them. These NICs are weird. The PXE boot acquires its address from the
DHCP server on eth0, but that interface is only used for the PXE portion.
After that, eth2 is used instead. I'm not sure if there's a "proper way"
to deal with this, but to address it here, in the file 'KIWILinuxRC.sh' I
changed the following lines in function setupNetwork() from:

    if ! dhcpcd $opts $PXE_IFACE 1>&2;then
        systemException \
            "Failed to setup DHCP network interface !" \
        "reboot"
    fi

to the following:

    PXE_IFACE=""
    for iface in ${dev_list[*]};do
        if dhcpcd $opts $iface 1>&2;then
            PXE_IFACE=$iface
            break
        fi
    done
    if [ -z "$PXE_IFACE" ] ;then
        systemException \
            "Failed to setup DHCP network interface !" \
        "reboot"
    fi

That finally gets all the blade servers booting just fine.

7. Other things to note:

a) the fake raid controller will not properly overwrite data on the disk
when writing out the OEM file downloaded from the server. I think this
may have been the cause of the busybox problem I was mentioning before. I
think whenever I was using TFTP to download the image, I was not working
from a clean disk. For some reason, the write fails and an exception is
raised, crashing the install.
b) insufficient memory does not seem to cause the download checks to
fail. The downloaded image always passes the check, but the system of
course crashes shortly afterwards; exactly where depends on how much it
was able to download before it ran out of memory.

So the final questions I have are:

1. Any idea why the dmraid package is not getting included in the boot
initrd?

2. Does it seem feasible that the above changes can somehow be
incorporated into KIWI, and/or am I missing something in my config?

Anyway, I hope all that makes sense. Let me know if you've got any more
questions or suggestions. Thanks for your feedback in helping me go down
this path. I now know more than I ever expected (or wanted to) about the
Linux internals!

Daniel.

-- 
Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
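For anyone who wants to try the ideas from items 4 and 5 outside of an
initrd, they can be sketched as two standalone shell functions. This is
only a sketch under assumptions, not the code that is in KIWI: the
function name disks_from_fdisk and the variable DDN_MAPPER_DIR are
hypothetical (the latter exists only so the lookup can be exercised in a
scratch directory), the fallback of simply appending the partition number
for non-mapper disks is an assumption, and the parser assumes that every
disk is reported on a line of the form "Disk /dev/...:".

```shell
#!/bin/bash
# Sketch only, not the KIWI implementation.

# Print the disk device paths found in 'fdisk -l' style text on stdin.
# Assumption: each disk appears as a line "Disk /dev/<name>: ...".
# Lines such as "Disk identifier: ..." do not match and are skipped.
disks_from_fdisk() {
    sed -n 's|^Disk \(/dev/[^:]*\):.*|\1|p'
}

# Print the partition device name for disk $1 and partition number $2.
# DDN_MAPPER_DIR is a hypothetical override for testing; it defaults to
# /dev/mapper.
ddn() {
    local disk="$1" part="$2"
    local mapper_dir="${DDN_MAPPER_DIR:-/dev/mapper}"
    case "$disk" in
        "$mapper_dir"/*)
            if [ -e "${disk}p${part}" ]; then
                # Name form seen right after 'dmraid -a y': <disk>p<N>
                echo "${disk}p${part}"
            else
                # Name form seen after the partitions are re-checked
                echo "${disk}_part${part}"
            fi
            ;;
        *)
            # Assumption: plain disks just get the number appended
            echo "${disk}${part}"
            ;;
    esac
}
```

Typical use would be 'fdisk -l | disks_from_fdisk' to build the disk
list, and 'ddn /dev/mapper/lsi_chedfhiceaa 1' to get whichever partition
name currently exists. Matching on the "Disk /dev/" prefix replaces the
chain of grep and cut calls with one sed expression, at the cost of
ignoring anything fdisk reports outside /dev.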
https://bugzilla.novell.com/show_bug.cgi?id=708712#c1
--- Comment #1 from Marcus Schaefer
https://bugzilla.novell.com/show_bug.cgi?id=708712#c
Marcus Schaefer
https://bugzilla.novell.com/show_bug.cgi?id=708712#c2
Marcus Schaefer
https://bugzilla.novell.com/show_bug.cgi?id=708712#c
Swamp Workflow Management
https://bugzilla.novell.com/show_bug.cgi?id=708712#c4
Swamp Workflow Management
https://bugzilla.novell.com/show_bug.cgi?id=708712#c5
Swamp Workflow Management
https://bugzilla.novell.com/show_bug.cgi?id=708712#c
Swamp Workflow Management