[Bug 1122182] New: multipath broken on power on tumbleweed
http://bugzilla.suse.com/show_bug.cgi?id=1122182

            Bug ID: 1122182
           Summary: multipath broken on power on tumbleweed
    Classification: openSUSE
           Product: openSUSE Tumbleweed
           Version: Current
          Hardware: Other
                OS: Other
            Status: NEW
          Severity: Normal
          Priority: P5 - None
         Component: Basesystem
          Assignee: bnc-team-screening@forge.provo.novell.com
          Reporter: ro@suse.com
        QA Contact: qa-bugs@suse.de
          Found By: ---
           Blocker: ---

looks like the ipr controller strikes again

on sle15 it still works:
(xinomavro) # lsscsi -H
[0]    ipr
[1]    lpfc
[2]    lpfc
[3]    ipr
# multipath -v3 -ll
[...]
===== paths list =====
uuid hcil    dev dev_t pri dm_st chk_st vend/prod/rev      dev_st
     0:2:0:0 sda 8:0   -1  undef undef  IBM,IPR-0 6F46AC00 unknown
     0:2:1:0 sdb 8:16  -1  undef undef  IBM,IPR-0 6F46AC00 unknown
     0:2:2:0 sdc 8:32  -1  undef undef  IBM,IPR-0 6F46AC00 unknown
     0:2:3:0 sdd 8:48  -1  undef undef  IBM,IPR-0 6F425500 unknown
     3:2:0:0 sde 8:64  -1  undef undef  IBM,IPR-0 6F46AC00 unknown
     3:2:1:0 sdf 8:80  -1  undef undef  IBM,IPR-0 6F46AC00 unknown
     3:2:2:0 sdg 8:96  -1  undef undef  IBM,IPR-0 6F46AC00 unknown
     3:2:3:0 sdh 8:112 -1  undef undef  IBM,IPR-0 6F425500 unknown
Jan 16 13:15:58 | libdevmapper version 1.03.01 (2017-12-18)
Jan 16 13:15:58 | DM multipath kernel driver v1.13.0
Jan 16 13:15:58 | params = 1 queue_if_no_path 1 alua 2 1 service-time 0 1 2 8:48 1 1 service-time 0 1 2 8:112 1 1
Jan 16 13:15:58 | status = 2 0 1 0 2 1 A 0 1 2 8:48 A 0 0 1 E 0 1 2 8:112 A 0 0 1
Jan 16 13:15:58 | 1IBM_IPR-0_6F42550000000020: disassemble map [1 queue_if_no_path 1 alua 2 1 service-time 0 1 2 8:48 1 1 service-time 0 1 2 8:112 1 1 ]
Jan 16 13:15:58 | sdd: udev property SCSI_IDENT_LUN_T10 whitelisted
Jan 16 13:15:58 | sdd: mask = 0x8
Jan 16 13:15:58 | sdd: path state = running
Jan 16 13:15:58 | sdd: detect_prio = yes (setting: multipath internal)
Jan 16 13:15:58 | failed to issue vpd inquiry for pgc9
Jan 16 13:15:58 | loading /lib64/multipath/libpriosysfs.so prioritizer
Jan 16 13:15:58 | sdd: prio = sysfs (setting: storage device autodetected)
Jan 16 13:15:58 | sdd: prio args = "" (setting: storage device autodetected)
Jan 16 13:15:58 | sdd: sysfs prio = 50
Jan 16 13:15:58 | sdh: udev property SCSI_IDENT_LUN_T10 whitelisted
Jan 16 13:15:58 | sdh: mask = 0x8
Jan 16 13:15:58 | sdh: path state = running
Jan 16 13:15:58 | sdh: detect_prio = yes (setting: multipath internal)
Jan 16 13:15:58 | failed to issue vpd inquiry for pgc9
Jan 16 13:15:58 | sdh: prio = sysfs (setting: storage device autodetected)
Jan 16 13:15:58 | sdh: prio args = "" (setting: storage device autodetected)
Jan 16 13:15:58 | sdh: sysfs prio = 10
Jan 16 13:15:58 | 1IBM_IPR-0_6F42550000000020: disassemble status [2 0 1 0 2 1 A 0 1 2 8:48 A 0 0 1 E 0 1 2 8:112 A 0 0 1 ]
1IBM_IPR-0_6F42550000000020 dm-6 IBM,IPR-0 6F425500
size=2.6T features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
|-+- policy='service-time 0' prio=50 status=active
| `- 0:2:3:0 sdd 8:48  active ready running
`-+- policy='service-time 0' prio=10 status=enabled
  `- 3:2:3:0 sdh 8:112 active ready running
[...]

on tumbleweed:
(obs-power8-01) # multipath -v3 -ll
[...]
Jan 16 12:10:56 | sdd: size = 6694404096
Jan 16 12:10:56 | sdd: vendor = IBM
Jan 16 12:10:56 | sdd: product = IPR-0 5ECDEA00
Jan 16 12:10:56 | sdd: rev =
Jan 16 12:10:56 | sdd: h:b:t:l = 3:2:1:0
Jan 16 12:10:56 | sdd: tgt_node_name =
Jan 16 12:10:56 | sdd: 61512 cyl, 128 heads, 32 sectors/track, start at 0
Jan 16 12:10:56 | 3:2:1:0: attribute vpd_pg80 not found in sysfs
Jan 16 12:10:56 | failed to read sysfs vpd pg80
Jan 16 12:10:56 | sdd: fail to get serial
Jan 16 12:10:56 | sdd: detect_checker = yes (setting: multipath internal)
Jan 16 12:10:56 | sdd: path_checker = tur (setting: storage device autodetected)
Jan 16 12:10:56 | sdd: checker timeout = 60 s (setting: multipath.conf defaults/devices section)
Jan 16 12:10:56 | sdd: tur state = up
===== paths list =====
uuid hcil    dev dev_t pri dm_st chk_st vend/prod/rev       dev_st
     0:2:0:0 sda 8:0   -1  undef undef  IBM,IPR-10 5ECDEA00 unknown
     0:2:1:0 sdb 8:16  -1  undef undef  IBM,IPR-0 5ECDEA00  unknown
     3:2:0:0 sdc 8:32  -1  undef undef  IBM,IPR-10 5ECDEA00 unknown
     3:2:1:0 sdd 8:48  -1  undef undef  IBM,IPR-0 5ECDEA00  unknown
Jan 16 12:10:56 | libdevmapper version 1.03.01 (2018-07-19)
Jan 16 12:10:56 | DM multipath kernel driver v1.13.0
Jan 16 12:10:56 | unloading const prioritizer
Jan 16 12:10:56 | unloading tur checker

obs-power8-01:~ # dmesg | grep -E "mapper|multipath"
[   15.461123] device-mapper: uevent: version 1.0.3
[   15.461239] device-mapper: ioctl: 4.39.0-ioctl (2018-04-03) initialised: dm-devel@redhat.com
[   15.462946] systemd-modules-load[1317]: Inserted module 'dm_multipath'
[  228.941013] device-mapper: multipath service-time: version 0.3.0 loaded
[  228.941142] device-mapper: table: table load rejected: not all devices are blk-mq request-stackable
[  228.941169] device-mapper: table: unable to determine table type
[  228.943508] device-mapper: table: table load rejected: not all devices are blk-mq request-stackable
[  228.943523] device-mapper: table: unable to determine table type
[  228.946381] device-mapper: table: table load rejected: not all devices are blk-mq request-stackable
[  228.946387] device-mapper: table: unable to determine table type
[  228.949130] device-mapper: table: table load rejected: not all devices are blk-mq request-stackable
[  228.949137] device-mapper: table: unable to determine table type
[  334.250096] systemd[1]: Listening on Device-mapper event daemon FIFOs.
[  334.250195] systemd[1]: Listening on multipathd control socket.
[  336.190785] device-mapper: table: 254:0: multipath: error getting device
[  336.190812] device-mapper: ioctl: error adding target to table
[  336.192423] device-mapper: table: table load rejected: not all devices are blk-mq request-stackable
[  336.192430] device-mapper: table: unable to determine table type
[  336.200096] device-mapper: table: 254:0: multipath: error getting device
[  336.200108] device-mapper: ioctl: error adding target to table
[  336.202816] device-mapper: table: table load rejected: not all devices are blk-mq request-stackable
[  336.202821] device-mapper: table: unable to determine table type

-- 
You are receiving this mail because:
You are on the CC list for the bug.
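The rejected table loads in the dmesg output above are the key symptom. As a minimal sketch (not part of the original report), the following Python snippet picks those lines out of kernel log text. The function name `find_rejected_loads` and the `SAMPLE` constant are hypothetical; the message wording is taken from the log quoted in the report.

```python
import re

# Matches kernel log lines of the form quoted in the report, e.g.
# [  228.941142] device-mapper: table: table load rejected: <reason>
REJECT_RE = re.compile(
    r"^\[\s*(?P<ts>\d+\.\d+)\]\s+device-mapper: table: table load rejected: "
    r"(?P<reason>.+)$"
)


def find_rejected_loads(dmesg_text):
    """Return (timestamp, reason) tuples for rejected device-mapper table loads."""
    hits = []
    for line in dmesg_text.splitlines():
        match = REJECT_RE.match(line.strip())
        if match:
            hits.append((float(match.group("ts")), match.group("reason")))
    return hits


# A few lines copied from the report, used as sample input.
SAMPLE = """\
[   15.462946] systemd-modules-load[1317]: Inserted module 'dm_multipath'
[  228.941013] device-mapper: multipath service-time: version 0.3.0 loaded
[  228.941142] device-mapper: table: table load rejected: not all devices are blk-mq request-stackable
[  228.941169] device-mapper: table: unable to determine table type
"""

if __name__ == "__main__":
    for ts, reason in find_rejected_loads(SAMPLE):
        print(f"{ts:.6f}: {reason}")
```

On a live system the input would come from `dmesg` itself rather than from the sample string.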
http://bugzilla.suse.com/show_bug.cgi?id=1122182

Ruediger Oertel <ro@suse.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hare@suse.com
           Assignee|bnc-team-screening@forge.pr |martin.wilck@suse.com
                   |ovo.novell.com              |
http://bugzilla.suse.com/show_bug.cgi?id=1122182
http://bugzilla.suse.com/show_bug.cgi?id=1122182#c1

Martin Wilck <martin.wilck@suse.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |IN_PROGRESS
                 CC|                            |ro@suse.com
              Flags|                            |needinfo?(ro@suse.com)

--- Comment #1 from Martin Wilck <martin.wilck@suse.com> ---
Can I log onto the affected system?
http://bugzilla.suse.com/show_bug.cgi?id=1122182
http://bugzilla.suse.com/show_bug.cgi?id=1122182#c2

--- Comment #2 from Martin Wilck <martin.wilck@suse.com> ---
Have you booted with "dm_mod.use_blk_mq=1" ?
http://bugzilla.suse.com/show_bug.cgi?id=1122182
http://bugzilla.suse.com/show_bug.cgi?id=1122182#c3

Ruediger Oertel <ro@suse.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
              Flags|needinfo?(ro@suse.com)      |

--- Comment #3 from Ruediger Oertel <ro@suse.com> ---
ad c#1: well, this is one of the obs worker machines, it's reachable via obs-admin

ad c#2: no
obs-power8-01:~ # cat /proc/cmdline
root=/dev/sda3 rw console=hvc0 cma=8192G loop.max_loop=64 vga=normal kiwiserver=192.168.128.17 kiwiservertype=http kiwiimage=workers/next-ppc64le-kvm/image-current.xz kvm_cma_resv_ratio=15 nomodeset kiwidebug=1 quiet ELOG_EXCEPTION=/dev/hvc0 BOOTIF=98:be:94:01:47:94
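Checking a kernel command line for a given parameter, as done by eye above, can also be scripted. The following is a rough Python sketch; `parse_cmdline` and `blk_mq_requested` are hypothetical helper names, and the handling of boolean values is a simplification, not the kernel's exact parsing rules. Quoted values containing spaces are not handled.

```python
def parse_cmdline(cmdline):
    """Split a kernel command line into a {name: value} dict.

    Bare flags such as "quiet" map to None. This sketch does not handle
    quoted values that contain spaces.
    """
    params = {}
    for token in cmdline.split():
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params


def blk_mq_requested(cmdline, name="scsi_mod.use_blk_mq"):
    """Return True if the given boolean parameter is switched on.

    Simplification: only "y", "1" and "on" (case-insensitive) count as true.
    """
    value = parse_cmdline(cmdline).get(name)
    return value is not None and value.lower() in ("y", "1", "on")


# Shortened version of the command line quoted in comment #3.
SAMPLE = "root=/dev/sda3 rw console=hvc0 loop.max_loop=64 nomodeset quiet"

if __name__ == "__main__":
    print(blk_mq_requested(SAMPLE))
    print(blk_mq_requested(SAMPLE + " scsi_mod.use_blk_mq=Y"))
```

On a running system the string would be read from `/proc/cmdline`.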
http://bugzilla.suse.com/show_bug.cgi?id=1122182
http://bugzilla.suse.com/show_bug.cgi?id=1122182#c4

--- Comment #4 from Martin Wilck <martin.wilck@suse.com> ---
(In reply to Ruediger Oertel from comment #3)
> ad c#1: well, this is one of the obs worker machines, it's reachable via obs-admin

Perhaps you can send me some instructions via mail or IRC how to do that. It's notoriously difficult to grab ppc64 systems via orthos. I'll be careful, promised.
http://bugzilla.suse.com/show_bug.cgi?id=1122182
http://bugzilla.suse.com/show_bug.cgi?id=1122182#c5

Martin Wilck <martin.wilck@suse.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
              Flags|                            |needinfo?(ro@suse.com)

--- Comment #5 from Martin Wilck <martin.wilck@suse.com> ---
Forgot to set NEEDINFO.
http://bugzilla.suse.com/show_bug.cgi?id=1122182
http://bugzilla.suse.com/show_bug.cgi?id=1122182#c6

--- Comment #6 from Martin Wilck <martin.wilck@suse.com> ---
Please try rebooting with "scsi_mod.use_blk_mq=Y" (I won't do that on your system).

Since 4.20, device mapper uses only blk_mq. But our kernel still defaults to CONFIG_SCSI_MQ_DEFAULT = "N". This will cause all maps on top of SCSI devices to fail. This is likely a bug, but please let's confirm it by checking if the module parameter works.
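Besides the kernel command line, the effective value of the `scsi_mod.use_blk_mq` parameter can be read back at runtime. A minimal Python sketch, assuming the parameter is exposed at the usual sysfs location `/sys/module/scsi_mod/parameters/use_blk_mq` (where boolean parameters read as `Y` or `N`); the function name `scsi_uses_blk_mq` is hypothetical, and `None` is returned when the file does not exist:

```python
from pathlib import Path

# Assumed sysfs location of the module parameter named in comment #6.
SCSI_MQ_PARAM = Path("/sys/module/scsi_mod/parameters/use_blk_mq")


def scsi_uses_blk_mq(param_path=SCSI_MQ_PARAM):
    """Return True or False from the parameter file, or None if it is unreadable."""
    try:
        text = Path(param_path).read_text().strip()
    except OSError:
        # File missing or unreadable: the kernel does not expose the parameter.
        return None
    return text in ("Y", "1")


if __name__ == "__main__":
    state = scsi_uses_blk_mq()
    if state is None:
        print("scsi_mod.use_blk_mq is not exposed on this kernel")
    else:
        print("SCSI blk-mq enabled:", state)
```

The path is passed as an argument so the function can be exercised against a temporary file instead of the real sysfs entry.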
http://bugzilla.suse.com/show_bug.cgi?id=1122182
http://bugzilla.suse.com/show_bug.cgi?id=1122182#c7

Ruediger Oertel <ro@suse.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
              Flags|needinfo?(ro@suse.com)      |

--- Comment #7 from Ruediger Oertel <ro@suse.com> ---
done ... works.

scsi_mod.use_blk_mq=y

obs-power8-01:~ # multipath -ll
1IBM_IPR-10_5ECDEA0000000020 dm-0 IBM,IPR-10 5ECDEA00
size=532G features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
|-+- policy='service-time 0' prio=50 status=active
| `- 0:2:0:0 sda 8:0  active ready running
`-+- policy='service-time 0' prio=10 status=enabled
  `- 3:2:0:0 sdc 8:32 active ready running
1IBM_IPR-0_5ECDEA0000000060 dm-1 IBM,IPR-0 5ECDEA00
size=3.1T features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
|-+- policy='service-time 0' prio=50 status=active
| `- 0:2:1:0 sdb 8:16 active ready running
`-+- policy='service-time 0' prio=10 status=enabled
  `- 3:2:1:0 sdd 8:48 active ready running
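To confirm in a script that every path came up, the topology printed by `multipath -ll` can be parsed. The sketch below is only an illustration based on the layout shown in this comment; that layout is not a stable interface, and the names `parse_paths`, `GROUP_RE` and `PATH_RE` are hypothetical.

```python
import re

# Path-group line, e.g. "|-+- policy='service-time 0' prio=50 status=active"
GROUP_RE = re.compile(r"prio=(?P<prio>-?\d+)\s+status=(?P<status>\w+)")
# Path line, e.g. "| `- 0:2:0:0 sda 8:0 active ready running"
PATH_RE = re.compile(
    r"(?P<hctl>\d+:\d+:\d+:\d+)\s+(?P<dev>\S+)\s+(?P<devt>\d+:\d+)\s+"
    r"(?P<dm_state>\S+)\s+(?P<chk_state>\S+)\s+(?P<dev_state>\S+)\s*$"
)


def parse_paths(output):
    """Return one dict per path, annotated with its path group's prio and status."""
    paths = []
    group = None
    for line in output.splitlines():
        group_match = GROUP_RE.search(line)
        if group_match:
            group = {
                "prio": int(group_match.group("prio")),
                "group_status": group_match.group("status"),
            }
            continue
        path_match = PATH_RE.search(line)
        if path_match and group is not None:
            paths.append({**path_match.groupdict(), **group})
    return paths


# First map from the output quoted in comment #7.
SAMPLE = """\
1IBM_IPR-10_5ECDEA0000000020 dm-0 IBM,IPR-10 5ECDEA00
size=532G features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
|-+- policy='service-time 0' prio=50 status=active
| `- 0:2:0:0 sda 8:0  active ready running
`-+- policy='service-time 0' prio=10 status=enabled
  `- 3:2:0:0 sdc 8:32 active ready running
"""

if __name__ == "__main__":
    for path in parse_paths(SAMPLE):
        print(path["dev"], path["prio"], path["chk_state"])
```

A caller could then check that every entry reports `ready` as its checker state.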
http://bugzilla.suse.com/show_bug.cgi?id=1122182

Jiri Slaby <jslaby@suse.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jslaby@suse.com
http://bugzilla.suse.com/show_bug.cgi?id=1122182
http://bugzilla.suse.com/show_bug.cgi?id=1122182#c8

--- Comment #8 from Martin Wilck <martin.wilck@suse.com> ---
Proposed a config change to use CONFIG_SCSI_MQ_DEFAULT on the stable kernel on the SUSE kernel ML. Binaries are building in home:mwilck:Bugs:1122182 on OBS.
http://bugzilla.suse.com/show_bug.cgi?id=1122182
http://bugzilla.suse.com/show_bug.cgi?id=1122182#c9

--- Comment #9 from Martin Wilck <martin.wilck@suse.com> ---
FWIW, no config change is necessary for v5.0 and higher, as MQ is always used by the SCSI layer there, too.
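Taken together, comments #6 and #9 bound the affected kernels: from 4.20, where device mapper uses only blk_mq, up to but not including 5.0. A small Python sketch of that version check follows; the function names are hypothetical, and the check only says whether a release falls inside that window, not whether its configuration actually sets CONFIG_SCSI_MQ_DEFAULT to "N".

```python
import re


def kernel_version_tuple(release):
    """Extract (major, minor) from a release string such as '4.20.2-1-default'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if not match:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def in_affected_window(release):
    """True for releases from 4.20 up to, but not including, 5.0."""
    return (4, 20) <= kernel_version_tuple(release) < (5, 0)


if __name__ == "__main__":
    # Example release strings, chosen for illustration only.
    for release in ("4.19.12", "4.20.2-1-default", "5.0.0"):
        print(release, in_affected_window(release))
```

On a running system the release string could be taken from `platform.release()`.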
http://bugzilla.suse.com/show_bug.cgi?id=1122182
http://bugzilla.suse.com/show_bug.cgi?id=1122182#c10

--- Comment #10 from Martin Wilck <martin.wilck@suse.com> ---
The OBS build takes forever today. I've pushed this to users/mwilck/stable-for-next now.
http://bugzilla.suse.com/show_bug.cgi?id=1122182
http://bugzilla.suse.com/show_bug.cgi?id=1122182#c11

--- Comment #11 from Jiri Slaby <jslaby@suse.com> ---
Really IN_PROGRESS:
https://build.opensuse.org/request/show/670130
https://bugzilla.suse.com/show_bug.cgi?id=1122182
https://bugzilla.suse.com/show_bug.cgi?id=1122182#c12

Martin Wilck <martin.wilck@suse.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|IN_PROGRESS                 |RESOLVED
         Resolution|---                         |FIXED

--- Comment #12 from Martin Wilck <martin.wilck@suse.com> ---
Fixed per previous comments.