Repost: Bonding on Suse 8.1 SMP crashes system
I was wondering if anyone else has had this problem...

I have tested on numerous machines:

Using e100.o both eth0 and eth1 network cards:

SMP Config:
===========
Single proc with smp kernel -> crashes at ifconfig bond0 x.x.x.x netmask y.y.y.y up
Dual Proc same smp kernel -> crashes at ifconfig bond0 x.x.x.x netmask y.y.y.y up

Tested on Kernels:
Suse8.1 - Linux brutus 2.4.19-64GB-SMP #1 SMP Wed Nov 27 00:56:43 UTC 2002 i686 unknown
United Linux - Linux mail 2.4.19-64GB-SMP #1 SMP Mon Oct 21 18:48:05 UTC 2002 i686 unknown

Single Proc Config:
===================
United Linux - Linux zeus 2.4.19-4GB #1 Mon Oct 21 18:45:41 UTC 2002 i686 unknown

bond0     Link encap:Ethernet  HWaddr 00:02:B3:9A:19:33
          inet addr:x.x.x.x  Bcast:x.x.x.255  Mask:255.255.255.0
          inet6 addr: fe80::200:ff:fe00:0/10 Scope:Link
          UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
          RX packets:110 errors:0 dropped:0 overruns:0 frame:0
          TX packets:72 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:13275 (12.9 Kb)  TX bytes:10707 (10.4 Kb)

eth0      Link encap:Ethernet  HWaddr 00:02:B3:9A:19:33
          inet addr:x.x.x.x  Bcast:x.x.x.255  Mask:255.255.255.0
          inet6 addr: fe80::202:b3ff:fe9a:1933/10 Scope:Link
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:110 errors:0 dropped:0 overruns:0 frame:0
          TX packets:72 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:100
          RX bytes:13275 (12.9 Kb)  TX bytes:10707 (10.4 Kb)
          Interrupt:9 Base address:0xa400 Memory:cf000000-cf000038

eth1      Link encap:Ethernet  HWaddr 00:02:B3:03:7C:1B
          inet addr:x.x.x.x  Bcast:x.x.x.255  Mask:255.255.255.0
          BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:21 errors:0 dropped:0 overruns:0 frame:0
          TX packets:21 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:100
          RX bytes:3208 (3.1 Kb)  TX bytes:2211 (2.1 Kb)
          Interrupt:5 Base address:0xa000 Memory:ce000000-ce000038

root@zeus:~ # ping www.yahoo.com
PING www.yahoo.akadns.net (64.58.76.230) from x.x.x.x : 56(84) bytes of data.
64 bytes from w9.dcx.yahoo.com (64.58.76.230): icmp_seq=1 ttl=239 time=152 ms
64 bytes from w9.dcx.yahoo.com (64.58.76.230): icmp_seq=2 ttl=239 time=347 ms
64 bytes from w9.dcx.yahoo.com (64.58.76.230): icmp_seq=3 ttl=239 time=156 ms

I haven't found anything to suggest that this is not possible on SMP; on
the contrary, /usr/src/linux/Documentation/networking/bonding.txt says:

Questions :
===========
1. Is it SMP safe?

   Yes. The old 2.0.xx channel bonding patch was not SMP safe.
   The new driver was designed to be SMP safe from the start.

I compiled ifenslave.c as per the bonding.txt document:

To install ifenslave.c, do:
    # gcc -Wall -Wstrict-prototypes -O -I/usr/src/linux/include ifenslave.c -o ifenslave
    # cp ifenslave /sbin/ifenslave

ifenslave is obviously not the problem yet, as we haven't run the command
to enslave the devices; the machine hard locks right after ifconfig bond0,
and only on SMP.

Any help or insight with this would be appreciated, as I really have to
get this to work with SMP. My guess is that the bonding.o module on the SMP
kernel has a problem, but I don't know where to go from here...

Has anyone else had this problem???

On a side note, if anyone knows how to simplify the use of the ifenslave
commands for enslaving the devices on startup, that would also be
appreciated. I have followed the directions as per the bonding.txt
document, but it fails to enslave the network devices.

from bonding.txt:
=================
/etc/sysconfig/network-scripts directory that looks like this:

DEVICE=bond0
IPADDR=192.168.1.1
NETMASK=255.255.255.0
NETWORK=192.168.1.0
BROADCAST=192.168.1.255
ONBOOT=yes
BOOTPROTO=none
USERCTL=no

Thanks for any help in advance. If you could reply to my personal address
as well as the list, that would be great.

James
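On the side question about enslaving at startup: one possible approach is a small script called from a local boot script, rather than the ifcfg file quoted from bonding.txt (that /etc/sysconfig/network-scripts layout is, as far as I know, the Red Hat one and may simply not be read on SuSE). The following is only a sketch under assumptions: the address and netmask are the placeholder values from the bonding.txt excerpt, and the names bond_commands and DRY_RUN are hypothetical, made up for this example. By default it only prints the commands so they can be reviewed first.

```shell
#!/bin/sh
# Sketch only: print (or run) the commands that bring up bond0 and
# enslave the NICs. All values are placeholders from the bonding.txt
# excerpt, not a tested SuSE configuration.
BOND=bond0
IPADDR=192.168.1.1
NETMASK=255.255.255.0
SLAVES="eth0 eth1"

# Emit one command per line, in the order bonding.txt describes:
# load the module, bring up the master, then enslave each NIC.
bond_commands() {
    echo "modprobe bonding"
    echo "ifconfig $BOND $IPADDR netmask $NETMASK up"
    for nic in $SLAVES; do
        echo "ifenslave $BOND $nic"
    done
}

# DRY_RUN (hypothetical switch) defaults to 1: print only.
# With DRY_RUN=0 the commands are executed and must be run as root.
if [ "${DRY_RUN:-1}" = "1" ]; then
    bond_commands
else
    bond_commands | sh -e
fi
```

Running it with no arguments just lists the four commands; DRY_RUN=0 would execute them and stop at the first failure. On the SMP kernels described above the ifconfig step is exactly where the lockup occurs, so this only helps on a kernel where bonding itself works.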
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Wednesday 07 May 2003 05:55 pm, Deiknumi Lists wrote:
> I was wondering if anyone else has had this problem...
>
> I have tested on numerous machines:
>
> Using e100.o both eth0 and eth1 network cards:

...description of bonding problem...

Things I would try:

- - Use eepro100 instead of e100. I've seen a couple of cases where this
    fixed odd problems
- - Turn on magic SysRq

Are you getting any "oops" on the system's console (a bunch of numbers)?
If so, you can copy them down and run them through ksymoops after the
machine is rebooted and it will tell you where in the kernel it was when
it crashed.

After it crashes, you may also be able to glean some information from the
SysRq keys described in /usr/src/linux/Documentation/sysrq.txt.

If you need any more information, email back.

- --
James Oakley
Engineering - SolutionInc Ltd.
joakley@solutioninc.com
http://www.solutioninc.com
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.7 (GNU/Linux)

iD8DBQE+ukoX+FOexA3koIgRArLkAJ9XxdnibPMhLWa70SyHOd5zt5a0dgCcC+OK
f4gVFUDC3jgx+S/biMxXJkw=
=dsYN
-----END PGP SIGNATURE-----
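For the SysRq and ksymoops suggestions in the reply, a rough sketch of the preparation steps might look like the following. The helper name sysrq_state is hypothetical, the file name oops.txt is made up, and the System.map path is an assumption that would have to match the kernel that actually crashed. The check is worth running before reproducing the lockup, since a kernel built without magic SysRq support will not have the control file at all.

```shell
#!/bin/sh
# Sketch only: report whether magic SysRq is currently enabled.
# The control file path can be overridden, which makes the helper
# easy to try against a scratch file.
sysrq_state() {
    ctl="${1:-/proc/sys/kernel/sysrq}"
    if [ ! -r "$ctl" ]; then
        # Missing file usually means no magic SysRq support in this kernel.
        echo "unavailable"
        return 0
    fi
    if [ "$(cat "$ctl")" = "0" ]; then
        echo "disabled"
    else
        echo "enabled"
    fi
}

sysrq_state

# To enable it before reproducing the crash (as root):
#   echo 1 > /proc/sys/kernel/sysrq
#
# To decode an oops copied by hand from the console into oops.txt
# (hypothetical file name; the System.map path is an assumption):
#   ksymoops -m /boot/System.map < oops.txt
```

If the helper reports "unavailable", the SysRq keys will not work on that kernel, and a hand-copied oops run through ksymoops is probably the only source of information after a hard lock.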