[Bug 623470] New: iscsi problem when starting xen domU
http://bugzilla.novell.com/show_bug.cgi?id=623470
http://bugzilla.novell.com/show_bug.cgi?id=623470#c0

           Summary: iscsi problem when starting xen domU
    Classification: openSUSE
           Product: openSUSE 11.2
           Version: Final
          Platform: x86-64
        OS/Version: Other
            Status: NEW
          Severity: Major
          Priority: P5 - None
         Component: Xen
        AssignedTo: jdouglas@novell.com
        ReportedBy: koenig@linux.de
         QAContact: qa@suse.de
          Found By: ---
           Blocker: ---

setup: my xen dom0 server uses 2 iscsi servers for domU disks (192.168.178.3 and 192.168.178.4).

xen domU "os-suse103" with disk from 192.168.178.3 does *not* start up if iscsi server on 192.168.178.4 does not start (did not start up because of missing kernel module):

# iscsiadm -m node | sort | cut -d, -f1 | uniq -c
     28 192.168.178.3:3260
     19 192.168.178.4:3260
# iscsiadm -m node | grep os-suse103
192.168.178.3:3260,1 iqn.2010-04.de.science-computing:os-suse103-flat.vmdk
192.168.178.3:3260,1 iqn.2010-04.de.science-computing:os-suse103-0-flat.vmdk

iscsi server on 192.168.178.3 is up and running, 192.168.178.4 is down.
trying to "xm create -c os-suse103" gives errors in /var/log/xen/xen-hotplug.log :

-------------------------------------------------------------------------------
iscsiadm: cannot make connection to 192.168.178.4:3260 (111)
iscsiadm: connection to discovery address 192.168.178.4 failed
iscsiadm: cannot make connection to 192.168.178.4:3260 (111)
iscsiadm: connection to discovery address 192.168.178.4 failed
iscsiadm: cannot make connection to 192.168.178.4:3260 (111)
iscsiadm: connection to discovery address 192.168.178.4 failed
iscsiadm: cannot make connection to 192.168.178.4:3260 (111)
iscsiadm: connection to discovery address 192.168.178.4 failed
iscsiadm: cannot make connection to 192.168.178.4:3260 (111)
iscsiadm: connection to discovery address 192.168.178.4 failed
iscsiadm: connection login retries (reopen_max) 5 exceeded
-------------------------------------------------------------------------------

and from /var/log/xen/xend.log:

-------------------------------------------------------------------------------
[2010-07-19 15:23:00 4555] DEBUG (XendDomainInfo:514) XendDomainInfo.shutdown(poweroff)
[2010-07-19 15:23:00 4555] DEBUG (XendDomainInfo:1733) XendDomainInfo.handleShutdownWatch
[2010-07-19 15:23:00 4555] DEBUG (XendDomainInfo:1733) XendDomainInfo.handleShutdownWatch
[2010-07-19 15:23:46 4555] INFO (XendDomainInfo:1919) Domain has shutdown: name=os-suse103 id=68 reason=poweroff.
[2010-07-19 15:23:47 4555] DEBUG (XendDomainInfo:2757) XendDomainInfo.destroy: domid=68
[2010-07-19 15:23:47 4555] DEBUG (XendDomainInfo:2227) Destroying device model
[2010-07-19 15:23:47 4555] DEBUG (XendDomainInfo:2234) Releasing devices
[2010-07-19 15:23:47 4555] DEBUG (XendDomainInfo:2247) Removing vif/0
[2010-07-19 15:23:47 4555] DEBUG (XendDomainInfo:1137) XendDomainInfo.destroyDevice: deviceClass = vif, device = vif/0
[2010-07-19 15:23:47 4555] DEBUG (XendDomainInfo:2247) Removing console/0
[2010-07-19 15:23:47 4555] DEBUG (XendDomainInfo:1137) XendDomainInfo.destroyDevice: deviceClass = console, device = console/0
[2010-07-19 15:23:47 4555] DEBUG (XendDomainInfo:2247) Removing vbd/768
[2010-07-19 15:23:47 4555] DEBUG (XendDomainInfo:1137) XendDomainInfo.destroyDevice: deviceClass = vbd, device = vbd/768
[2010-07-19 15:23:47 4555] DEBUG (XendDomainInfo:2247) Removing vbd/832
[2010-07-19 15:23:47 4555] DEBUG (XendDomainInfo:1137) XendDomainInfo.destroyDevice: deviceClass = vbd, device = vbd/832
[2010-07-19 15:23:57 4555] DEBUG (XendDomainInfo:94) XendDomainInfo.create(['vm', ['name', 'os-suse103'], ['memory', 1000], ['maxmem', 2048], ['on_poweroff', 'destroy'], ['on_reboot', 'restart'], ['on_crash', 'restart'], ['vcpus', 4], ['on_xend_start', 'ignore'], ['on_xend_stop', 'ignore'], ['bootloader', '/usr/lib/xen/boot/domUloader.py'], ['bootloader_args', '--entry=hda2:/boot/vmlinuz-xenpae,/boot/initrd-xenpae'], ['image', ['linux', ['videoram', 4], ['args', 'root=/dev/hda2']]], ['s3_integrity', 1], ['device', ['vbd', ['uname', 'iscsi:iqn.2010-04.de.science-computing:os-suse103-flat.vmdk'], ['dev', 'hda'], ['mode', 'w']]], ['device', ['vbd', ['uname', 'iscsi:iqn.2010-04.de.science-computing:os-suse103-0-flat.vmdk'], ['dev', 'hdb'], ['mode', 'w']]], ['device', ['vif', ['bridge', 'br0'], ['mac', '00:0c:29:c1:cf:ef'], ['model', 'rtl8139']]]])
[2010-07-19 15:23:57 4555] DEBUG (XendDomainInfo:2324) XendDomainInfo.constructDomain
[2010-07-19 15:23:57 4555] DEBUG (balloon:185) Balloon: 4499012 KiB free; need 4096; done.
[2010-07-19 15:23:59 4555] DEBUG (XendDomain:453) Adding Domain: 69
[2010-07-19 15:23:59 4555] DEBUG (XendDomainInfo:2525) XendDomainInfo.initDomain: 69 256
[2010-07-19 15:23:59 4555] INFO (XendDomainInfo:2948) Mounting iqn.2010-04.de.science-computing:os-suse103-flat.vmdk on /dev/xvdp.
[2010-07-19 15:23:59 4555] DEBUG (DevController:95) DevController: writing {'backend-id': '0', 'virtual-device': '51952', 'device-type': 'disk', 'state': '1', 'backend': '/local/domain/0/backend/vbd/0/51952'} to /local/domain/0/device/vbd/51952.
[2010-07-19 15:23:59 4555] DEBUG (DevController:97) DevController: writing {'domain': 'Domain-0', 'frontend': '/local/domain/0/device/vbd/51952', 'uuid': '0bad18ee-be5d-bcf5-5988-397fcf313c75', 'bootable': '0', 'dev': '/dev/xvdp', 'state': '1', 'params': 'iqn.2010-04.de.science-computing:os-suse103-flat.vmdk', 'mode': 'w', 'online': '1', 'frontend-id': '0', 'type': 'iscsi'} to /local/domain/0/backend/vbd/0/51952.
[2010-07-19 15:23:59 4555] DEBUG (DevController:144) Waiting for 51952.
[2010-07-19 15:23:59 4555] DEBUG (DevController:654) hotplugStatusCallback /local/domain/0/backend/vbd/0/51952/hotplug-status.
[2010-07-19 15:24:05 4555] DEBUG (DevController:654) hotplugStatusCallback /local/domain/0/backend/vbd/0/51952/hotplug-status.
[2010-07-19 15:24:05 4555] DEBUG (DevController:668) hotplugStatusCallback 2.
[2010-07-19 15:24:05 4555] ERROR (XendDomainInfo:3662) Device 51952 (vbd) could not be connected. /etc/xen/scripts/block failed; error detected.
Traceback (most recent call last):
  File "/usr/lib64/python2.6/site-packages/xen/xend/XendDomainInfo.py", line 3658, in create_vbd
    dev_control.waitForDevice(devid)
  File "/usr/lib64/python2.6/site-packages/xen/xend/server/DevController.py", line 165, in waitForDevice
    "%s" % (devid, self.deviceClass, err))
VmError: Device 51952 (vbd) could not be connected. /etc/xen/scripts/block failed; error detected.
[2010-07-19 15:24:05 4555] ERROR (XendDomainInfo:479) VM start failed
Traceback (most recent call last):
  File "/usr/lib64/python2.6/site-packages/xen/xend/XendDomainInfo.py", line 459, in start
    XendTask.log_progress(31, 60, self._initDomain)
  File "/usr/lib64/python2.6/site-packages/xen/xend/XendTask.py", line 209, in log_progress
    retval = func(*args, **kwds)
  File "/usr/lib64/python2.6/site-packages/xen/xend/XendDomainInfo.py", line 2527, in _initDomain
    self._configureBootloader()
  File "/usr/lib64/python2.6/site-packages/xen/xend/XendDomainInfo.py", line 2957, in _configureBootloader
    vbd_uuid = dom0.create_vbd(vbd, disk)
  File "/usr/lib64/python2.6/site-packages/xen/xend/XendDomainInfo.py", line 3658, in create_vbd
    dev_control.waitForDevice(devid)
  File "/usr/lib64/python2.6/site-packages/xen/xend/server/DevController.py", line 165, in waitForDevice
    "%s" % (devid, self.deviceClass, err))
VmError: Device 51952 (vbd) could not be connected. /etc/xen/scripts/block failed; error detected.
[2010-07-19 15:24:05 4555] DEBUG (XendDomainInfo:2757) XendDomainInfo.destroy: domid=69
[2010-07-19 15:24:05 4555] DEBUG (XendDomainInfo:2232) No device model
[2010-07-19 15:24:05 4555] DEBUG (XendDomainInfo:2234) Releasing devices
[2010-07-19 15:24:05 4555] ERROR (XendDomainInfo:99) Domain construction failed
Traceback (most recent call last):
  File "/usr/lib64/python2.6/site-packages/xen/xend/XendDomainInfo.py", line 97, in create
    vm.start()
  File "/usr/lib64/python2.6/site-packages/xen/xend/XendDomainInfo.py", line 459, in start
    XendTask.log_progress(31, 60, self._initDomain)
  File "/usr/lib64/python2.6/site-packages/xen/xend/XendTask.py", line 209, in log_progress
    retval = func(*args, **kwds)
  File "/usr/lib64/python2.6/site-packages/xen/xend/XendDomainInfo.py", line 2527, in _initDomain
    self._configureBootloader()
  File "/usr/lib64/python2.6/site-packages/xen/xend/XendDomainInfo.py", line 2957, in _configureBootloader
    vbd_uuid = dom0.create_vbd(vbd, disk)
  File "/usr/lib64/python2.6/site-packages/xen/xend/XendDomainInfo.py", line 3658, in create_vbd
    dev_control.waitForDevice(devid)
  File "/usr/lib64/python2.6/site-packages/xen/xend/server/DevController.py", line 165, in waitForDevice
    "%s" % (devid, self.deviceClass, err))
VmError: Device 51952 (vbd) could not be connected. /etc/xen/scripts/block failed; error detected.
-------------------------------------------------------------------------------

as soon as I start iscsitarget on 192.168.178.4 the domU "os-suse103" will start up again.

manual access to that iscsi disk always works ok:

# iscsiadm -m node -T iqn.2010-04.de.science-computing:os-suse103-flat.vmdk --login
Logging in to [iface: default, target: iqn.2010-04.de.science-computing:os-suse103-flat.vmdk, portal: 192.168.178.3,3260]
Login to [iface: default, target: iqn.2010-04.de.science-computing:os-suse103-flat.vmdk, portal: 192.168.178.3,3260]: successful
# iscsiadm -m node -T iqn.2010-04.de.science-computing:os-suse103-flat.vmdk --logout
Logging out of session [sid: 75, target: iqn.2010-04.de.science-computing:os-suse103-flat.vmdk, portal: 192.168.178.3,3260]
Logout of [sid: 75, target: iqn.2010-04.de.science-computing:os-suse103-flat.vmdk, portal: 192.168.178.3,3260]: successful
#

this xen dom0 server does *not* use/access any disks from 192.168.178.4, all disk images come from 192.168.178.3. 192.168.178.4 is the iscsi server for a 2nd dom0 and is configured only to be able to migrate those domUs between xen servers (not used by default, but working if needed).

--
Configure bugmail: http://bugzilla.novell.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.

http://bugzilla.novell.com/show_bug.cgi?id=623470#c

Charles Arnold <carnold@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |carnold@novell.com
         AssignedTo|jdouglas@novell.com         |jfehlig@novell.com
          QAContact|qa@suse.de                  |jdouglas@novell.com

http://bugzilla.novell.com/show_bug.cgi?id=623470#c1

James Fehlig <jfehlig@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |NEEDINFO
       InfoProvider|                            |koenig@linux.de

--- Comment #1 from James Fehlig <jfehlig@novell.com> 2010-07-19 22:52:59 UTC ---
Hmm, I'm not able to reproduce this bug. I do see the messages you indicated in /var/log/xen/xen-hotplug.log but the domU still boots. Is the host a stock 11.2 system? I'm trying to reproduce on a SLES11 SP1 system but the block-iscsi script is the same between the two.

http://bugzilla.novell.com/show_bug.cgi?id=623470#c2

Harald Koenig <koenig@linux.de> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEEDINFO                    |NEW
       InfoProvider|koenig@linux.de             |

--- Comment #2 from Harald Koenig <koenig@linux.de> 2010-07-20 08:20:48 UTC ---
(In reply to comment #1)
> Hmm, I'm not able to reproduce this bug. I do see the messages you indicated in /var/log/xen/xen-hotplug.log but the domU still boots. Is the host a stock 11.2 system?

yes, only stock oss/non-oss (except for mozilla and xfce repos). it was upgraded from an originally 11.1 system... here some rpm versions:

# rpm -qa \*xen\* \*iscsi\* | sort
iscsitarget-0.4.17-4.4.x86_64
iscsitarget-kmp-default-0.4.17_2.6.31.5_0.1-4.4.x86_64
iscsitarget-kmp-xen-0.4.17_2.6.31.5_0.1-4.4.x86_64
kernel-xen-2.6.31.12-0.2.1.x86_64
kernel-xen-devel-2.6.31.12-0.2.1.x86_64
open-iscsi-2.0.870-28.1.x86_64
xen-3.4.1_19718_04-2.1.x86_64
xen-kmp-default-3.4.1_19718_04_2.6.31.5_0.1-2.1.x86_64
xen-libs-3.4.1_19718_04-2.1.x86_64
xen-tools-3.4.1_19718_04-2.1.x86_64
yast2-iscsi-client-2.18.6-2.2.noarch
yast2-iscsi-server-2.18.2-2.2.noarch

> I'm trying to reproduce on a SLES11 SP1 system but the block-iscsi script is the same between the two.

do you have any suggestion what's the best place to get more tracing info about what's going on ? in earlier problems (related to iscsi setup issues and nscd) I used strace on the full xen stack and sometimes instrumented python scripts to get more detailed info, both of which are pretty painful. is there a better way to get better insight ?!

http://bugzilla.novell.com/show_bug.cgi?id=623470#c3

James Fehlig <jfehlig@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |NEEDINFO
       InfoProvider|                            |koenig@linux.de

--- Comment #3 from James Fehlig <jfehlig@novell.com> 2010-07-20 16:38:04 UTC ---
(In reply to comment #2)
> do you have any suggestion what's the best place to get more tracing info about what's going on ?

Try adding a 'set -x' to the top of the block-iscsi script, then monitor /var/log/xen/xen-hotplug.log while starting your domU. We should then be able to see any failures in block-iscsi.

I'm wondering whether the following is needed in the 'add' logic of block-iscsi:

/sbin/iscsiadm -m discovery | sed "s/ .*//g" | while read line; do /sbin/iscsiadm -m discovery -t sendtargets -p $line; done >/dev/null

A long-ago version of block-iscsi did not contain this line and its addition predates my involvement in this script - so I don't know why it was added. Nothing is being done with the output of sendtargets discovery. Does commenting out this line help? Do you have any clue as to how this line improves the script?
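
For illustration, the portal extraction in the discovery line quoted above can be run in isolation. This is only a sketch: the sample records are an assumption about what `iscsiadm -m discovery` prints (one "address:port via sendtargets" record per line), the helper name `portals_from_records` is hypothetical, and the loop only echoes what it would contact instead of calling iscsiadm. It shows that the loop visits every recorded discovery portal, regardless of which target the domU actually needs, which would be consistent with 192.168.178.4 showing up in xen-hotplug.log.

```shell
#!/bin/sh
# Hypothetical helper: same sed expression as the block-iscsi line,
# keeping only the text before the first space of each record.
portals_from_records() {
    sed 's/ .*//g'
}

# Assumed sample of discovery records (stand-in for `iscsiadm -m discovery`).
sample_records='192.168.178.3:3260 via sendtargets
192.168.178.4:3260 via sendtargets'

# Dry run of the loop: every portal is visited, including the one that is down.
printf '%s\n' "$sample_records" | portals_from_records | while read -r portal; do
    echo "would run sendtargets discovery against $portal"
done
```

With a real discovery database, each iteration is a network round trip, so an unreachable portal has to time out before the script can move on.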

https://bugzilla.novell.com/show_bug.cgi?id=623470#c

Ihno Krumreich <ihno@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ihno@novell.com
         OS/Version|Other                       |openSUSE 11.2

https://bugzilla.novell.com/show_bug.cgi?id=623470#c4

--- Comment #4 from James Fehlig <jfehlig@novell.com> 2010-09-02 18:19:14 UTC ---
Harald - ping ... Did you try my trace suggestion above? Also, any thoughts on the discovery actions in current block-iscsi? Whether they need to exist or not? Thanks.

https://bugzilla.novell.com/show_bug.cgi?id=623470#c5

James Fehlig <jfehlig@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEEDINFO                    |RESOLVED
       InfoProvider|koenig@linux.de             |
         Resolution|                            |FIXED
   Target Milestone|---                         |Future 11.3

--- Comment #5 from James Fehlig <jfehlig@novell.com> 2010-11-09 17:19:39 UTC ---
We had a similar report against SLES (bug #552115), but that one focused on the discovery actions noted in comment #3 above. Hannes reported the following wrt the discovery hack in block-iscsi:

---------------------------------------------------------------
The original idea of the discovery was to avoid the STP startup latency. When using static IP addresses, the route to the target IP might not be available directly after establishing the connection as STP hasn't finished. During this time the network stack would return -EHOSTUNREACH, which unfortunately is the same error as if the host is unreachable due to other reasons. So the connection would not be established as the in-kernel code couldn't distinguish here. Using discovery would avoid this scenario.

However, I believe this issue has been fixed for SLES10 SP3 and SLES11 SP1. So it should work now even without the discovery.
---------------------------------------------------------------

I've removed the discovery from block-iscsi for 11.3, SLE11 SP1, and Factory. I'm not sure if it fixed the issue reported here, but I haven't heard back on my suggestions in #3 (first paragraph in particular) for a long time. Closing as fixed now but please reopen if the issue persists. Thanks!