[wicked-devel] Question about bringing up devices in a bond
I have an application that’s responsible for configuring the network on the host; it writes configuration files to /etc/sysconfig/network and then invokes ifup, ifdown, etc. to bring interfaces and bonds up and down. Since moving to SLES 12 SP1 with wicked and the wicked nanny daemon, we are seeing an issue I’m hoping someone on this list may be able to help with. This issue does not occur on SLES 11 systems using Network Manager.

The scenario is as follows. To bring up a bond, the application does the following. (The NIC names and file names I’m using are just for the purpose of the example):

  1. Create a config file for p5p2 and copy/write the file to /etc/sysconfig/network/ifcfg-p5p2
  2. ifup p5p2
  3. Create a config file for p5p4 and copy/write the file to /etc/sysconfig/network/ifcfg-p5p4
  4. ifup p5p4
  5. Create a config file for the bond and copy/write the file to /etc/sysconfig/network/ifcfg-bond0cpc1
  6. ifup bond0cpc1

Everything works fine the first time, after a fresh Linux boot.

I then manually bring down the NICs/bonds with ifdown and remove the files from the /etc/sysconfig/network directory.

Then, the second time I attempt to perform the same steps to bring up the same NICs and bond, I see the following:

  1. Create a config file for p5p2 and copy the file to /etc/sysconfig/network/ifcfg-p5p2
  2. ifup p5p2 (works fine)
  3. Create a config file for p5p4 and copy the file to /etc/sysconfig/network/ifcfg-p5p4
  4. ifup p5p4 (works fine)
  5. Create a config file for the bond and copy the file to /etc/sysconfig/network/ifcfg-bond0cpc1
  6. ifup bond0cpc1 (*** fails)

When the ifup of the bond fails the second time, it returns “device-not-running”, and the same failure occurs on every further attempt. So the main question is, why is that?

The NIC I’m using is the Intel Corporation I350 Gigabit Network Connection (4-port, copper).

Here’s some more info.
If I execute the ifups in a shell script so that there’s very little time delay between copying the files into the /etc/sysconfig/network directory and ifup’ing the devices, it works every time.

  cp ifcfg-p5p2 ifcfg-p5p4 /etc/sysconfig/network
  ifup p5p2
  ifup p5p4
  cp ifcfg-bond0cpc0 /etc/sysconfig/network
  ifup bond0cpc0

If you now ifdown the devices and remove their ifcfg- files from /etc/sysconfig/network and re-run the script, it works every time. Doing the commands from the command line, or putting delays in the shell script (that is, putting sleeps between copying the files to the /etc/sysconfig/network directory and ifup’ing the devices), fails every time.

I tried manual configuration of a bond on the Intel Corporation I350 Gigabit Fiber Network Connection (4-port, fiber) and it works every time regardless.

Finally, if I disable the nanny daemon by editing /etc/wicked/common.xml (setting <use-nanny>false</use-nanny> and restarting the wicked service), manual ifup’ing works regardless.

Does anyone have any ideas on what the issue could be?

Thanks in advance.

--
To unsubscribe, e-mail: wicked-devel+unsubscribe@opensuse.org
To contact the owner, e-mail: wicked-devel+owner@opensuse.org
In reply to Jason Schultz's message of 02.02.2017, 19:42:

Yes, ifup resolves what to do and then forgets to send half of it (the deletes) to nanny, and nanny does not "ifdown" the slaves, so the slave config is kept in nanny. A bug has been open for this issue for a while (it is on my "top 10" list, but there are always some "high prio" bugs pushed to the front...).

An "ifup p5p2 ; ifup p5p4" before "ifup bond0cpc1" is useless, as the slave has to call "ip link set master bond0cpc1 dev p5p2"; that is, those ifups configure standalone devices, not slaves.

Just use:

  #add /etc/sysconfig/network/ifcfg-p5p2   # <- optional [1]
  #add /etc/sysconfig/network/ifcfg-p5p4   # <- optional [1]
  add /etc/sysconfig/network/ifcfg-bond0cpc1

and then either:

  ifup bond0cpc1

or

  wicked ifreload all

When you remove all 3 configs, call "wicked ifreload all"; this should work and shut down both the slaves and the bond. Alternatively, an explicit "wicked ifdown bond0cpc1 p5p4 p5p2" call will remove the slave config along with the instruction to enslave into the bond.

[1] Bond slaves automatically get a "STARTMODE=hotplug BOOTPROTO=none" config through ifcfg-bond0cpc1, so ifcfg-$slave is only needed when you want to set e.g. ETHTOOL options or similar. The MTU and most other things set on the bond are propagated to the slaves.

The "ifdown bond0cpc1" is not enough on its own: setting down a master does not automatically set down the slaves. E.g. when you have:

  br0 { eth0 }
  br1 { eth0.1 }

calling "ifdown br0" would break br1 and its eth0.1 vlan if taking down br0 also set down the slave (there are other such cases as well).
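The sequence suggested in the reply can be sketched as a small set of shell functions. This is only an illustration under stated assumptions: NETDIR and RUN are hypothetical knobs invented for the example (RUN defaults to "echo", so the wicked commands are printed rather than executed), and the BOOTPROTO and BONDING_MODULE_OPTS values are placeholders, since the thread does not show the actual bond configuration.

```shell
#!/bin/sh
# Sketch only. NETDIR and RUN are hypothetical knobs for this example,
# not wicked settings. RUN defaults to "echo", so commands are printed
# instead of executed; set RUN to the empty string to run them for real.
NETDIR="${NETDIR:-/etc/sysconfig/network}"
RUN="${RUN-echo}"

# write_bond_cfg BOND SLAVE...
# Writes ifcfg-BOND listing the slaves. No ifcfg-SLAVE files are created;
# per the reply they are only needed for extras such as ETHTOOL options.
# BOOTPROTO and BONDING_MODULE_OPTS are assumed placeholder values.
write_bond_cfg() {
    bond="$1"; shift
    {
        echo "STARTMODE='auto'"
        echo "BOOTPROTO='none'"
        echo "BONDING_MASTER='yes'"
        echo "BONDING_MODULE_OPTS='mode=active-backup miimon=100'"
        i=0
        for slave in "$@"; do
            echo "BONDING_SLAVE$i='$slave'"
            i=$((i + 1))
        done
    } > "$NETDIR/ifcfg-$bond"
}

# bond_up BOND SLAVE...: write the bond config, then a single ifup on
# the bond (no separate ifup of the slaves).
bond_up() {
    write_bond_cfg "$@" || return 1
    $RUN ifup "$1"
}

# bond_down BOND SLAVE...: remove every related config file, then ask
# wicked to reconcile, which should take down the slaves and the bond.
bond_down() {
    for name in "$@"; do
        rm -f "$NETDIR/ifcfg-$name"
    done
    $RUN wicked ifreload all
}
```

After sourcing the file, "bond_up bond0cpc1 p5p2 p5p4" followed later by "bond_down bond0cpc1 p5p2 p5p4" walks through one up/down cycle. Pointing NETDIR at a scratch directory and leaving RUN at its default makes it possible to inspect the generated file and the printed commands without touching the live network.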
Gruesse / Regards,
Marius Tomaschewski <mt@suse.de>, <mt@suse.com>
--
SUSE LINUX GmbH, GF: Felix Imendörffer, Jane Smithard, Dilip Upmanyu, Graham Norton, HRB 21284 (AG Nürnberg), Maxfeldstraße 5, 90409 Nürnberg, Germany