[Bug 1094575] New: qeth_l2 (s390x): 4.17-rc6 kernel assigns different MAC address than 4.16 or SLE15
http://bugzilla.suse.com/show_bug.cgi?id=1094575

            Bug ID: 1094575
           Summary: qeth_l2 (s390x): 4.17-rc6 kernel assigns different MAC address than 4.16 or SLE15
    Classification: openSUSE
           Product: openSUSE Tumbleweed
           Version: Current
          Hardware: S/390-64
                OS: Other
            Status: NEW
          Severity: Normal
          Priority: P5 - None
         Component: Kernel
          Assignee: kernel-maintainers@forge.provo.novell.com
          Reporter: mkubecek@suse.com
        QA Contact: qa-bugs@suse.de
          Found By: ---
           Blocker: ---

While working on bsc#1083710, I tried to reproduce the issue with 4.17-rc6 kernel from Devel:Kernel:master IBS project (KotD for master branch) on orthos machine s390vsl205.arch.suse.de. With this kernel, network device eth0 (qeth_l2 driver) did not work at all and I found message

  qeth 0.0.0800: MAC address 02:00:00:00:42:cd is not authorized

in kernel log (02:00:00:00:42:cd was MAC address assigned to eth0). With SLE15 or Devel:Kernel:stable (stable branch, 4.16.11) kernel, the device got different address 02:00:00:00:01:08 and eth0 worked fine. Even with 4.17-rc6, eth0 started to work once I assigned 02:00:00:00:01:08 manually (using "ip link").

Checking git log between v4.16 and v4.17-rc6, these two commits caught my eye:

  b7493e91c11a s390/qeth: use Read device to query hypervisor for MAC
  bcacfcbc82b4 s390/qeth: fix MAC address update sequence

and with these two reverted, eth0 got 02:00:00:00:01:08 and worked.

Note: SLE15 kernel has backport of ec61bd2fd2a2 which one of these refers to with Fixes tag (it came in 4.13-rc1) but not any of these two.

As I don't understand how is MAC address assignment supposed to work, I don't know if this is really a regression from these two commits or rather result of misconfiguration (wrong address assigned by SLE15 or stable kernel configured as "correct" so that correct one assigned by 4.17-rc6 is rejected).

-- 
You are receiving this mail because:
You are on the CC list for the bug.
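For readers who want to check a guest for the same symptom, here is a minimal sketch of the log check and the manual workaround from the report. The function name `unauthorized_macs` is hypothetical, the pattern is based only on the single log line quoted above, and the interface name and address in the comments are the ones from this particular machine.

```shell
#!/bin/sh
# Hypothetical helper: read kernel log text on stdin and print
# "<bus-id> <mac>" for every qeth "not authorized" message.
# The message format is assumed from the one line quoted in the report.
unauthorized_macs() {
    sed -n 's/.*qeth \([0-9a-f.]*\): MAC address \([0-9a-f:]*\) is not authorized.*/\1 \2/p'
}

# Typical use on the affected guest:
#   dmesg | unauthorized_macs
#
# Workaround described in the report (needs root): assign the address
# that the older kernels used.
#   ip link set dev eth0 address 02:00:00:00:01:08
```

The `ip link` line is left as a comment because it changes the running configuration and only makes sense on the guest in question.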
Michal Kubeček
http://bugzilla.suse.com/show_bug.cgi?id=1094575#c2
Michal Kubeček
http://bugzilla.suse.com/show_bug.cgi?id=1094575#c3
Johannes Thumshirn
Hanns-Joachim Uhl
http://bugzilla.suse.com/show_bug.cgi?id=1094575#c5
--- Comment #5 from Michal Kubeček
http://bugzilla.suse.com/show_bug.cgi?id=1094575#c6
--- Comment #6 from Michal Kubeček
http://bugzilla.suse.com/show_bug.cgi?id=1094575#c7
Michal Kubeček
http://bugzilla.suse.com/show_bug.cgi?id=1094575#c8
Johannes Thumshirn
> My biggest concern is that with the backports in SLE15 and SLE15-UPDATE, it's likely that first SLE15 maintenance update is going to use different address than GMC kernel. And if that may result in "not authorized" error and broken networking, we may want to avoid such inconsistency.

Agreed.

IBM, I'd like to drop the affected patches from the SLE15-UPDATE branch until this issue is resolved. Would this be OK?
Hanns-Joachim Uhl
> (In reply to Michal Kubeček from comment #7)
> My biggest concern is that with the backports in SLE15 and SLE15-UPDATE, it's likely that first SLE15 maintenance update is going to use different address than GMC kernel. And if that may result in "not authorized" error and broken networking, we may want to avoid such inconsistency.
> Agreed.
> IBM I'd like to drop the affected patches from the SLE15-UPDATE branch until this issue is resolved. Would this be OK
.
Hello SUSE / Johannes,
just for clarification ...
... you are thinking to revert these two patches from the SLES 15-Update kernel for now:
http://bugzilla.suse.com/show_bug.cgi?id=1094575#c9
Hanns-Joachim Uhl
Hanns-Joachim Uhl
Hanns-Joachim Uhl
http://bugzilla.suse.com/show_bug.cgi?id=1094575#c10
Johannes Thumshirn
> (In reply to Michal Kubeček from comment #7)
> My biggest concern is that with the backports in SLE15 and SLE15-UPDATE, it's likely that first SLE15 maintenance update is going to use different address than GMC kernel. And if that may result in "not authorized" error and broken networking, we may want to avoid such inconsistency.
> Agreed.
> IBM I'd like to drop the affected patches from the SLE15-UPDATE branch until this issue is resolved. Would this be OK
> .
> Hello SUSE / Johannes,
> just for clarification ...
> ... you are thinking to revert these two patches from the SLES 15-Update kernel for now:
> (In reply to Johannes Thumshirn from comment #8)
> 1. "
> * Fri May 18 2018 tbogendoerfer@suse.de
> ...
> - s390/qeth: use Read device to query hypervisor for MAC (bsc#1061024 FATE#323301).
> - commit 22f22c5
> "
> 2. "
> * Tue May 15 2018 jthumshirn@suse.de
> ...
> - s390/qeth: fix MAC address update sequence (bnc#1093148, LTC#167307).
> ...
> - commit a741d19
> "
> ... correct ...?
> Please confirm or advise ...
> Thanks for your support.

Correct
http://bugzilla.suse.com/show_bug.cgi?id=1094575#c11
--- Comment #11 from Michal Kubeček
http://bugzilla.suse.com/show_bug.cgi?id=1094575#c12
--- Comment #12 from Johannes Thumshirn
> For the sake of completeness, I tried to boot SLE12-SP3 kernel (latest update, 4.4.131-94.29-default) and eth0 gets address 02:00:00:00:01:08, i.e. the same as with 4.16.11 or SLE15 kernel.

But SLE12-SP3 should contain the same two patches, shouldn't it?
http://bugzilla.suse.com/show_bug.cgi?id=1094575#c13
--- Comment #13 from Michal Kubeček
> (In reply to Michal Kubeček from comment #11)
> For the sake of completeness, I tried to boot SLE12-SP3 kernel (latest update, 4.4.131-94.29-default) and eth0 gets address 02:00:00:00:01:08, i.e. the same as with 4.16.11 or SLE15 kernel.
> But SLE12-SP3 should contain the same two patches, shouldn't it?

SLE12-SP3 has ec61bd2fd2a2 and bcacfcbc82b4 but not b7493e91c11a. I'll try SLE12-SP2-LTSS which has none of the three patches.
http://bugzilla.suse.com/show_bug.cgi?id=1094575#c14
--- Comment #14 from Michal Kubeček
http://bugzilla.suse.com/show_bug.cgi?id=1094575#c16
--- Comment #16 from Michal Kubeček
http://bugzilla.suse.com/show_bug.cgi?id=1094575#c17
--- Comment #17 from Michal Kubeček
http://bugzilla.suse.com/show_bug.cgi?id=1094575#c20
Michal Kubeček
> ------- Comment From ursula.braun@de.ibm.com 2018-06-01 02:55 EDT-------
> Try "vmcp couple 0800 to SYSTEM VSWNL2".

Different error message but it doesn't seem to make much difference:

------------------------------------------------------------------------------
s390vsl205:~/:[1]# vmcp det nic 0800
NIC 0800 is destroyed; devices 0800-0802 detached
s390vsl205:~/:[0]# vmcp def nic 0800 type qdio devices 3
NIC 0800 is created; devices 0800-0802 defined
s390vsl205:~/:[0]# vmcp couple 0800 to SYSTEM VSWNL2
HCPCPL2788E NIC 0800 not connected; already connected to VSWITCH SYSTEM VSWNL2
Error: non-zero CP response for command 'COUPLE 0800 TO SYSTEM VSWNL2': #2788
s390vsl205:~/:[1]# vmcp -b 4k q v nic
Adapter 0700.P00 Type: QDIO Name: UNASSIGNED Devices: 3
  MAC: 02-00-00-00-01-07 VSWITCH: SYSTEM VSWN1
Adapter 0800.P00 Type: QDIO Name: HYD1G1 Devices: 3
  MAC: 02-00-00-00-42-CD VSWITCH: SYSTEM VSWNL2
  MAC: 02-00-00-00-01-08 Device: 0802 Current
------------------------------------------------------------------------------

After a reboot, the system is unreachable through the network (and for some reason, the serial console doesn't work any more, so I couldn't find out more).
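The `vmcp q v nic` listing prints MAC addresses in uppercase with dashes, while Linux prints them in lowercase with colons, which makes the two awkward to compare by eye. A minimal sketch of a conversion follows; the helper names `zvm_mac_to_linux` and `nic_macs` are hypothetical, and the parsing assumes only the `MAC: xx-xx-...` fields visible in the transcript above.

```shell
#!/bin/sh
# Hypothetical helper: convert a z/VM style MAC (02-00-00-00-42-CD)
# into the lowercase colon notation used by Linux (02:00:00:00:42:cd).
zvm_mac_to_linux() {
    printf '%s\n' "$1" | tr 'ABCDEF-' 'abcdef:'
}

# Hypothetical helper: read "vmcp q v nic" output on stdin and print
# every MAC it lists, one per line, in Linux notation.
# grep -o is not in POSIX but is available in GNU, BSD and busybox grep.
nic_macs() {
    grep -o 'MAC: [0-9A-Fa-f-]*' | while read -r label mac; do
        zvm_mac_to_linux "$mac"
    done
}

# Typical use on the guest:
#   vmcp -b 4k q v nic | nic_macs
#   cat /sys/class/net/eth0/address
```

Comparing the two outputs shows which of the listed addresses the interface is currently using.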
http://bugzilla.suse.com/show_bug.cgi?id=1094575#c23
--- Comment #23 from Michal Kubeček
http://bugzilla.suse.com/show_bug.cgi?id=1094575#c25
--- Comment #25 from Michal Kubeček
> ------- Comment From julian.wiedmann@de.ibm.com 2018-06-25 09:11 EDT-------
> Unfortunately not. But the problem is easily reproducible, so we'll most likely revert to old behaviour (diag26c on the DATA device) in the next upstream submission.

Thank you for the information. If mainline commit b7493e91c11a is going to be reverted, that would make SLE15 compatible with mainline again (GM does not have it and it has been disabled in the SLE15 branch). So from the SLE15 point of view, this is good news.
http://bugzilla.suse.com/show_bug.cgi?id=1094575#c28
--- Comment #28 from Swamp Workflow Management
Swamp Workflow Management
http://bugzilla.suse.com/show_bug.cgi?id=1094575#c30
Thomas Bogendoerfer
Swamp Workflow Management
http://bugzilla.suse.com/show_bug.cgi?id=1094575#c31
--- Comment #31 from Swamp Workflow Management
Swamp Workflow Management
Swamp Workflow Management