[Bug 851997] New: Net installation fails if netdevice=eth0, needs netdevice=enp7s13
https://bugzilla.novell.com/show_bug.cgi?id=851997
https://bugzilla.novell.com/show_bug.cgi?id=851997#c0

Summary: Net installation fails if netdevice=eth0, needs netdevice=enp7s13
Classification: openSUSE
Product: openSUSE 13.1
Version: Final
Platform: i686
OS/Version: openSUSE 13.1
Status: NEW
Severity: Normal
Priority: P5 - None
Component: Installation
AssignedTo: bnc-team-screening@forge.provo.novell.com
ReportedBy: jimc@math.ucla.edu
QAContact: jsrain@suse.com
Found By: ---
Blocker: ---

In /boot/grub2/grub.cfg I create and boot from a stanza saying

    linux /m1/boot131/linux install=http://192.9.200.194/SuSE/i586/13.1 manual=0 usessh=1 sshpassword=qwerty addswap=1 netdevice=eth0 hostip=192.9.200.207 netmask=255.255.255.192 gateway=192.9.200.193 netwait=60

This exact procedure was used successfully to upgrade to v12.3 and previous versions. But on OpenSuSE-13.1 the host shows on the local console, "Activating manual setup program", and when you go through manual configuration, where it should bring up the net it says [something like] "Error configuring the network, probably your network card was not recognized by the kernel." Which is no lie.

On virtual machines hosted on KVM/qemu, both i686 and x86_64, the procedure works and eth0 is the name used for the netdevice. I only installed 13.1-RC1 on virtual test machines so I can't say if the problem is seen there. So far, every non-virtual machine I've tried had a different device name, but each time you reboot you get the same name per machine.

Workaround #1: In manual installation, back out to the main menu and get into "Expert". Select "Show Configuration". Look for the "netdevices" line and write down the device name shown. (The configuration does not include the overriding netdevice=eth0 from the command line, so "Set Configuration" is useless at this point.) Reboot. Edit the Grub command line so netdevice=enp7s13 or whatever random-looking device name it's using. Boot, and the network will come up.

Workaround #2: I haven't tried this yet, but almost certainly if I remove netdevice=eth0 completely there will be no problem on most of the workstations, because they have only one network interface. The same should hold on machines with multiple NICs where the desired interface is the one whose driver wins the race to be loaded first. However, at least one of the servers has the main net on eth1, and a few workstations have builtin 802.11 that sometimes appears before eth0, so I still have to specify netdevice= for those.

What I would like the developers to do: First, return to the successful paradigm used for v12.3. Second, some guidance would be really appreciated on how to figure out in advance what device name has to be specified for the netdevice. I looked in driver sources but was not able to figure out where the device name is set (before alteration by /etc/udev/rules.d/70-persistent-net.rules in the not-yet-installed operating system). Some of the servers have flaky K-V-M service and putting on a monitor to do workaround #1 is a royal pain.

--
Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
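The command-line edit in Workaround #1, and the removal in Workaround #2, can be sketched for an automated install script roughly as follows. This is only an illustration: the helper names `set_netdevice` and `drop_netdevice` are hypothetical and not part of linuxrc or grub, and the sketch assumes the options are separated by whitespace, as in the stanza quoted in the report.

```python
def set_netdevice(cmdline: str, device: str) -> str:
    """Return cmdline with every netdevice= option pointing at `device`.

    Hypothetical helper: if the command line has no netdevice= option,
    one is appended at the end.
    """
    options = cmdline.split()
    replaced = False
    for i, opt in enumerate(options):
        if opt.startswith("netdevice="):
            options[i] = f"netdevice={device}"
            replaced = True
    if not replaced:
        options.append(f"netdevice={device}")
    return " ".join(options)


def drop_netdevice(cmdline: str) -> str:
    """Return cmdline with all netdevice= options removed (Workaround #2)."""
    return " ".join(
        opt for opt in cmdline.split() if not opt.startswith("netdevice=")
    )
```

For example, `set_netdevice(stanza, "enp7s13")` turns `netdevice=eth0` into `netdevice=enp7s13` and leaves the other boot options in their original order.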
https://bugzilla.novell.com/show_bug.cgi?id=851997
https://bugzilla.novell.com/show_bug.cgi?id=851997#c

zhang jiajun <jzhang@suse.com> changed:

What       |Removed                                   |Added
----------------------------------------------------------------------------
CC         |                                          |jzhang@suse.com
AssignedTo |bnc-team-screening@forge.provo.novell.com |aj@suse.com

https://bugzilla.novell.com/show_bug.cgi?id=851997
https://bugzilla.novell.com/show_bug.cgi?id=851997#c

Andreas Jaeger <aj@suse.com> changed:

What       |Removed     |Added
----------------------------------------------------------------------------
AssignedTo |aj@suse.com |mchang@suse.com

https://bugzilla.novell.com/show_bug.cgi?id=851997
https://bugzilla.novell.com/show_bug.cgi?id=851997#c2

--- Comment #2 from Steffen Winterfeldt <snwint@suse.com> 2013-11-28 11:35:12 CET ---

You were hit by the new network interface naming scheme:

http://www.freedesktop.org/wiki/Software/systemd/PredictableNetworkInterface...

If you don't specify a network device it will try all interfaces in turn and use the one where it finds the install repo.

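For names of the form seen in this report (enp7s13, enp2s0), the PCI-path part of that naming scheme can be approximated as below. This is a sketch under stated assumptions, not the udev implementation: it assumes the name is built from the PCI bus and slot numbers printed in decimal, with an `f<function>` suffix only when the function number is non-zero, and it does not cover USB paths (such as enp0s18f2u1u1), onboard or hotplug-slot names, or MAC-based names. The function name `pci_path_name` and the example PCI addresses are hypothetical; the report does not give the machines' PCI addresses.

```python
import re

# PCI address as shown by lspci -D or in sysfs: domain:bus:slot.function (hex)
_PCI_ADDR = re.compile(
    r"^([0-9a-fA-F]{4}):([0-9a-fA-F]{2}):([0-9a-fA-F]{2})\.([0-7])$"
)


def pci_path_name(pci_address: str, prefix: str = "en") -> str:
    """Approximate the PCI-path interface name for a PCI address.

    Assumption: bus and slot are rendered in decimal, and the function
    number is appended only when it is greater than zero.
    """
    match = _PCI_ADDR.match(pci_address)
    if match is None:
        raise ValueError(f"not a PCI address: {pci_address!r}")
    domain = int(match.group(1), 16)
    bus = int(match.group(2), 16)
    slot = int(match.group(3), 16)
    function = int(match.group(4))

    name = prefix
    if domain > 0:
        name += f"P{domain}"
    name += f"p{bus}s{slot}"
    if function > 0:
        name += f"f{function}"
    return name
```

Under these assumptions a NIC at PCI address 0000:07:0d.0 would be named enp7s13, since slot 0x0d is 13 in decimal.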
https://bugzilla.novell.com/show_bug.cgi?id=851997
https://bugzilla.novell.com/show_bug.cgi?id=851997#c3

--- Comment #3 from James Carter <jimc@math.ucla.edu> 2013-12-05 05:47:41 UTC ---

Thanks for the pointer; it explains a lot about where the device names are coming from. With luck I could automate creating the device name, which would really help in my automated install/upgrade script.

However, the reported very intelligent probing strategy did not work for me. I had these misfortunes:

On machine #1 I gave it netdevice=enp2s0 but it still got an error configuring the network. This machine also has a USB NIC (unplugged cable to the wild side). I needed to physically remove the USB NIC to get it to boot.

Machine #2 has two wired Ethernet ports (with different hardware) and a wireless NIC, all internal. Only one Ethernet port had a cable. If I told it the wanted netdevice it got an error configuring the network. If I omitted netdevice it may have done the round robin thing, but the error caused by the missing cable was fatal. I got this machine to boot by omitting netdevice and specifying brokenmodules=e1000e brokenmodules=rtl8723ae.

Tomorrow I'm doing a server at work that has a four port multifunction NIC on the motherboard; only 2 ports are connected. Cross fingers... Maybe I'll take a mini-hub down there and connect the remaining ports to it.

Suggestion on the logic of network configuration: If any failure occurs, from failure to bring up the NIC, to failure to get an address, to failure to obtain the installation system from the webserver, report it and go on to the next NIC. Only be fatal if every NIC fails.

Also, if the user has positively told you to use a particular device, I think you should not touch the others (unless it fails).

Also it would be really nice to bring back the dialog in manual net configuration for choosing the netdevice. Preferably also in the F4 dialog in the initrd.
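The fallback logic suggested in comment #3 can be sketched as follows. This is not linuxrc's code; `pick_interface` and the `try_interface` callback are hypothetical names, and the sketch assumes the callback raises OSError for any failure (link down, no address, repository unreachable).

```python
from typing import Callable, Iterable, List, Optional, Tuple


def pick_interface(
    candidates: Iterable[str],
    try_interface: Callable[[str], None],
    preferred: Optional[str] = None,
) -> Tuple[str, List[str]]:
    """Return the first interface that works, plus the failures seen so far.

    If `preferred` is given it is tried first, and the other interfaces
    are touched only if it fails. The result is fatal (RuntimeError) only
    when every interface has failed.
    """
    order = list(candidates)
    if preferred is not None:
        order = [preferred] + [name for name in order if name != preferred]

    errors: List[str] = []
    for name in order:
        try:
            try_interface(name)
        except OSError as exc:
            # Report the failure and move on to the next NIC.
            errors.append(f"{name}: {exc}")
            continue
        return name, errors

    raise RuntimeError("all interfaces failed: " + "; ".join(errors))
```

With this shape, a NIC with no cable produces one entry in the error list rather than aborting the whole network setup.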
https://bugzilla.novell.com/show_bug.cgi?id=851997
https://bugzilla.novell.com/show_bug.cgi?id=851997#c4

Steffen Winterfeldt <snwint@suse.com> changed:

What         |Removed |Added
----------------------------------------------------------------------------
Status       |NEW     |NEEDINFO
InfoProvider |        |jimc@math.ucla.edu

--- Comment #4 from Steffen Winterfeldt <snwint@suse.com> 2013-12-05 10:14:04 CET ---

> Suggestion on the logic of network configuration: If any failure occurs, from failure to bring up the NIC, to failure to get an address, to failure to obtain the installation system from the webserver, report it and go on to the next NIC. Only be fatal if every NIC fails.

This is how it should work.

> Also, if the user has positively told you to use a particular device, I think you should not touch the others (unless it fails).

Also, it should already work this way.

Please add linuxrc.debug=1 linuxrc.log=/logfile to your boot options and attach the log(s).

https://bugzilla.novell.com/show_bug.cgi?id=851997
https://bugzilla.novell.com/show_bug.cgi?id=851997#c5

--- Comment #5 from James Carter <jimc@math.ucla.edu> 2013-12-07 01:57:42 UTC ---

Created an attachment (id=570711) --> (http://bugzilla.novell.com/attachment.cgi?id=570711)
linuxrc.log output with no netdevice= and USB NIC plugged in

https://bugzilla.novell.com/show_bug.cgi?id=851997
https://bugzilla.novell.com/show_bug.cgi?id=851997#c6

James Carter <jimc@math.ucla.edu> changed:

What         |Removed            |Added
----------------------------------------------------------------------------
Status       |NEEDINFO           |NEW
InfoProvider |jimc@math.ucla.edu |

--- Comment #6 from James Carter <jimc@math.ucla.edu> 2013-12-07 02:01:44 UTC ---

I got 2 logfiles, the first with netdevice absent and all 4 NICs available (Ethernet NIC cable unplugged) and the other with the USB ethernet NIC removed entirely and netdevice=enp2s0.

To snarf the logfiles (for someone else reading this bug report) I did:

At the grub menu hit "e" and edit the "linux" line adding/removing command line parameters. F10 or ctrl-X to boot.

It monkeys with the NICs and pops the dialog "Cannot locate SuSE repo". Hit return; accept defaults (or change if desired) for language and keyboard.

In the installer's main menu, scroll down to "Expert", then to "Start Shell".

In the shell:

    mount /dev/sda1 /mnt    (It helps to have memorized your devices.)
    cp /logfile /mnt/tmp/logfile.try1
    umount /mnt
    exit    (or do whatever other investigation).

Indeed, if I specify netdevice=enp2s0 it only tries that device and doesn't touch the others (good).

In the shell session I went through my guess of the setup procedure, successfully. This is from memory and irrelevant or failed command lines are omitted (remember "-c" on arping! No job control in the shell.)

    ip addr add 192.9.200.193/26 dev enp2s0
    (try some stuff)
    ip addr del 192.9.200.193/26 dev enp2s0    (interface stays up)
    arping -b -c 3 -I enp2s0 -s 0.0.0.0 -D 192.9.200.193
        (sent 3 broadcasts, 0 replies, good, -D = duplicate address detection)
    echo $?    (prints 0, that wasn't the error culprit)
    ip addr add 192.9.200.193/26 dev enp2s0
    arping -b -c 3 -I enp2s0 192.9.200.194    (3 unicast replies, exit code 0)
    curl http://192.9.200.194/SuSE/x86_64/13.1/media.1/media

It delivered the media label: name and build date.

So it's very mysterious why the installer thought the network was hosed.

Also, this time unplugging enp0s18f2u1u1 (USB NIC) and specifying netdevice did not get it to boot. (Did I actually omit netdevice? I should have tried it today both ways.) With the

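The manual check above uses CIDR notation (/26) while the boot options in the original report use hostip= and netmask=. A small sketch of the conversion, using Python's standard ipaddress module; the helper names `to_cidr` and `on_same_subnet` are hypothetical, and the addresses in the usage note are the ones quoted in this bug.

```python
import ipaddress


def to_cidr(hostip: str, netmask: str) -> str:
    """Convert hostip= and netmask= values to the address/prefix form
    that `ip addr add` expects."""
    return ipaddress.ip_interface(f"{hostip}/{netmask}").with_prefixlen


def on_same_subnet(hostip: str, netmask: str, other: str) -> bool:
    """Check whether `other` (e.g. the gateway or install server) lies in
    the subnet implied by hostip and netmask."""
    network = ipaddress.ip_interface(f"{hostip}/{netmask}").network
    return ipaddress.ip_address(other) in network
```

For instance, `to_cidr("192.9.200.207", "255.255.255.192")` gives `192.9.200.207/26`, and `on_same_subnet("192.9.200.207", "255.255.255.192", "192.9.200.194")` confirms that the install server is reachable without going through the gateway.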
https://bugzilla.novell.com/show_bug.cgi?id=851997
https://bugzilla.novell.com/show_bug.cgi?id=851997#c7

--- Comment #7 from James Carter <jimc@math.ucla.edu> 2013-12-07 02:03:47 UTC ---

Created an attachment (id=570712) --> (http://bugzilla.novell.com/attachment.cgi?id=570712)
linuxrc.log with netdevice=enp2s0 and USB NIC unplugged completely

https://bugzilla.novell.com/show_bug.cgi?id=851997
https://bugzilla.novell.com/show_bug.cgi?id=851997#c8

--- Comment #8 from James Carter <jimc@math.ucla.edu> 2013-12-13 05:27:21 UTC ---

For the record, I did our first server upgrade to 13.1 today. This is a Dell R710 with four Broadcom NetXtreme II BCM5709 Gigabit Ethernet ports (2x dual port NICs). They should have come out as enp1s0, enp1s0f1, enp2s0, enp2s0f1. However, they *were* seen as em1 em2 em3 em4, hiss, boo! The driver is bnx2.

I'm guessing that device name generation happens in the driver, and not all drivers have been updated to the new paradigm. The paravirtual net driver for KVM also calls its device eth0 rather than using the bus structure. I wish I could predict what the device name was going to be, for writing the installer boot stanza, but it looks like it will be hit and miss this year.

When I specified the correct netdevice=em1, the machine came right up, with no bitching about the two ports that are unused and have no cables. I was too chicken to omit netdevice entirely.

https://bugzilla.novell.com/show_bug.cgi?id=851997
https://bugzilla.novell.com/show_bug.cgi?id=851997#c9

--- Comment #9 from James Carter <jimc@math.ucla.edu> 2013-12-13 05:50:35 UTC ---

I re-read the reference you gave previously, and I'm wrong about drivers not being updated, but em1..4 are not covered by the scheme described or the code it refers to, so I'm mystified.

https://bugzilla.novell.com/show_bug.cgi?id=851997
https://bugzilla.novell.com/show_bug.cgi?id=851997#c10

--- Comment #10 from Steffen Winterfeldt <snwint@suse.com> 2013-12-13 08:28:35 CET ---

https://features.opensuse.org/310896

https://bugzilla.novell.com/show_bug.cgi?id=851997
https://bugzilla.novell.com/show_bug.cgi?id=851997#c11

--- Comment #11 from James Carter <jimc@math.ucla.edu> 2013-12-13 19:42:52 UTC ---

Thanks, that adds a useful piece to the puzzle. Cueing on /sys/class/net/${ETHN}/device/label and .../device/index, I can construct the netdevice. We have several of this series of server, and I'll have to check if the previous model has a similar issue.

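How the name is constructed from those sysfs files is not spelled out in the comment, so the following is only a guess at the idea. It assumes, purely for illustration, that an onboard port's name is the prefix "em" followed by the integer stored in .../device/index, which matches the em1..em4 names seen on the R710 but is not confirmed by this thread. The function name `onboard_name` and its `sysfs_root` parameter are hypothetical; the parameter exists so the lookup can be tried against a copy of the directory tree.

```python
from pathlib import Path
from typing import Optional


def onboard_name(
    iface: str,
    sysfs_root: str = "/sys/class/net",
    prefix: str = "em",
) -> Optional[str]:
    """Guess the onboard-NIC name for `iface` from its firmware index.

    Assumption (not confirmed by the thread): the name is `prefix`
    followed by the number in <sysfs_root>/<iface>/device/index.
    Returns None when the file is missing or does not hold an integer,
    which is taken to mean the NIC is not an onboard device.
    """
    index_file = Path(sysfs_root) / iface / "device" / "index"
    try:
        index = int(index_file.read_text().strip())
    except (OSError, ValueError):
        return None
    return f"{prefix}{index}"
```

A script could call this for each interface seen by the running (pre-upgrade) system and fall back to another naming rule when it returns None.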