[Bug 678540] New: eth interface drops packets when vlan + bridge are on top of it
https://bugzilla.novell.com/show_bug.cgi?id=678540
https://bugzilla.novell.com/show_bug.cgi?id=678540#c0

           Summary: eth interface drops packets when vlan + bridge are on top of it
    Classification: openSUSE
           Product: openSUSE 11.4
           Version: Final
          Platform: x86-64
        OS/Version: SuSE Other
            Status: NEW
          Severity: Major
          Priority: P5 - None
         Component: Kernel
        AssignedTo: kernel-maintainers@forge.provo.novell.com
        ReportedBy: mt@novell.com
         QAContact: qa@suse.de
          Found By: ---
           Blocker: ---

  ip link set up dev eth0
  ip a a 192.168.1.1/24 brd + dev eth0

ping to 192.168.1.254 works fine.

  modprobe 8021q
  vconfig add eth0 11
  ip link set up dev eth0.11 address 66:b4:46:3b:c6:11
  brctl addbr br1100
  brctl addif br1100 eth0.11
  brctl stp br1100 on
  brctl setfd br1100 4
  brctl sethello br1100 1
  brctl setmaxage br1100 6
  ip link set up br1100

or also using ifup with:

ifcfg-eth0.11:
  USERCONTROL='no'
  STARTMODE='auto'
  BOOTPROTO='static'
  ETHERDEVICE='eth0'
  LLADDR='66:b4:46:3b:c6:11'

ifcfg-br1100:
  USERCONTROL='no'
  STARTMODE='auto'
  BOOTPROTO='static'
  BRIDGE='yes'
  BRIDGE_PORTS='eth0.11'
  BRIDGE_STP='on'
  BRIDGE_FORWARDDELAY='4'
  BRIDGE_HELLOTIME='1'
  BRIDGE_MAXAGE='6'

ping to 192.168.1.254 drops packets as soon as the bridge is up:

  --- 192.168.1.254 ping statistics ---
  18 packets transmitted, 9 received, 50% packet loss, time 17015ms
  rtt min/avg/max/mdev = 0.718/0.763/1.002/0.085 ms

  --- www.l.google.com ping statistics ---
  8 packets transmitted, 3 received, 62% packet loss, time 7002ms
  rtt min/avg/max/mdev = 29.847/31.515/34.034/1.812 ms

It works fine again after a restart (stopping everything and starting eth0 only).

Of course, it is not limited to ICMP -- everything else stumbles significantly. DNS is almost unusable.

Exactly the same setup always worked in the past (sle11-sp1, 11.3, 11.2).

--
Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
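To compare the loss before and after the bridge comes up, the summary line that ping prints can be reduced to a single number. This is only a minimal sketch; the helper name ping_loss is made up for illustration and is not part of the report.

```shell
#!/bin/sh
# Hypothetical helper: read ping output on stdin and print only the
# packet-loss percentage from the "... N% packet loss ..." summary line.
ping_loss() {
    sed -n 's/.* \([0-9.]*\)% packet loss.*/\1/p'
}

# Example with the summary line quoted in the report above.
echo "18 packets transmitted, 9 received, 50% packet loss, time 17015ms" | ping_loss
```

On a live system it could be used as "ping -c 20 192.168.1.254 | ping_loss", once with only eth0 up and once with the bridge up.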
https://bugzilla.novell.com/show_bug.cgi?id=678540
https://bugzilla.novell.com/show_bug.cgi?id=678540#c1
--- Comment #1 from Marius Tomaschewski
https://bugzilla.novell.com/show_bug.cgi?id=678540
https://bugzilla.novell.com/show_bug.cgi?id=678540#c2
--- Comment #2 from Marius Tomaschewski
https://bugzilla.novell.com/show_bug.cgi?id=678540
https://bugzilla.novell.com/show_bug.cgi?id=678540#c3
--- Comment #3 from Marius Tomaschewski
https://bugzilla.novell.com/show_bug.cgi?id=678540
https://bugzilla.novell.com/show_bug.cgi?id=678540#c4
Igor Shalakhin
https://bugzilla.novell.com/show_bug.cgi?id=678540
https://bugzilla.novell.com/show_bug.cgi?id=678540#c5
--- Comment #5 from Igor Shalakhin
https://bugzilla.novell.com/show_bug.cgi?id=678540
https://bugzilla.novell.com/show_bug.cgi?id=678540#c
Jeff Mahoney
https://bugzilla.novell.com/show_bug.cgi?id=678540
https://bugzilla.novell.com/show_bug.cgi?id=678540#c6
--- Comment #6 from Brandon Philips
I've removed the bonding, changed to use eth0 directly, and booted it with the desktop kernel... No problems with ping so far, but it was unable to stop the network: an "ip link set down br5000" was in D state, and "ip a s" on another still-open console hung in D state as well... ?!
Please file a separate report for this bug. It sounds unrelated to the packet loss.
https://bugzilla.novell.com/show_bug.cgi?id=678540
https://bugzilla.novell.com/show_bug.cgi?id=678540#c7
--- Comment #7 from Brandon Philips
Exactly the same setup always worked in the past (sle11-sp1, 11.3, 11.2).
Can you attach a supportconfig for your machine? Install supportutils. I would like to see the hardware and network setup in detail.
https://bugzilla.novell.com/show_bug.cgi?id=678540
https://bugzilla.novell.com/show_bug.cgi?id=678540#c8
--- Comment #8 from Brandon Philips
I can confirm this bug.
I upgraded from 11.3 to 11.4. My config is a little different:

ifcfg-eth1
  BOOTPROTO='none'
  BROADCAST=''
  ETHTOOL_OPTIONS=''
  IPADDR=''
  MTU=''
  NAME='82574L Gigabit Network Connection'

ifcfg-vlan11
  BOOTPROTO='none'
  BROADCAST=''
  ETHERDEVICE='eth1'
  ETHTOOL_OPTIONS=''
  IPADDR=''
  STARTMODE='auto'

ifcfg-xenbr1
  BOOTPROTO='static'
  BRIDGE='yes'
  BRIDGE_FORWARDDELAY='0'
  BRIDGE_PORTS='eth1'
  BRIDGE_STP='off'
  IPADDR='10.1.1.5/24'
  REMOTE_IPADDR=''
  STARTMODE='auto'

ifcfg-xenbr11
  BOOTPROTO='static'
  BRIDGE='yes'
  BRIDGE_FORWARDDELAY='0'
  BRIDGE_PORTS='vlan11'
  BRIDGE_STP='off'
  IPADDR='10.1.81.11'
  STARTMODE='auto'
Igor- Putting a VLAN and a bridge on the same physical device doesn't make sense and was broken, with good reason, in 2.6.37[1]. I would recommend that you create a bridge device and then hang the vlan off of that. If that doesn't work please file a new bug as your setup is slightly different than Marius's.

[1] http://thread.gmane.org/gmane.linux.network/149864
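One possible reading of "create a bridge device and then hang the vlan off of that" is sketched below. It only prints the commands instead of running them. The device names eth1 and xenbr1 and the VLAN ID 11 are taken from the config quoted above; the function name and the resulting interface name xenbr1.11 are assumptions for illustration, and whether this layout avoids the problem is not confirmed anywhere in this report.

```shell
#!/bin/sh
# Sketch only: print (do not execute) the commands for "bridge first,
# VLAN on top of the bridge". All arguments are example values.
bridge_then_vlan() {
    phys=$1 bridge=$2 vid=$3
    echo "brctl addbr ${bridge}"
    echo "brctl addif ${bridge} ${phys}"
    echo "ip link set up dev ${phys}"
    echo "ip link set up dev ${bridge}"
    # The VLAN device is created on the bridge, not on the physical NIC.
    echo "ip link add link ${bridge} name ${bridge}.${vid} type vlan id ${vid}"
    echo "ip link set up dev ${bridge}.${vid}"
}

bridge_then_vlan eth1 xenbr1 11
```

The printed lines can be reviewed first and then run by hand as root.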
https://bugzilla.novell.com/show_bug.cgi?id=678540
https://bugzilla.novell.com/show_bug.cgi?id=678540#c9
--- Comment #9 from Marius Tomaschewski
(In reply to comment #2)
I've removed the bonding, changed to use eth0 directly, and booted it with the desktop kernel... No problems with ping so far, but it was unable to stop the network: an "ip link set down br5000" was in D state, and "ip a s" on another still-open console hung in D state as well... ?!
Please file a separate report for this bug. It sounds unrelated to the packet loss.
=> Bug 679685.
https://bugzilla.novell.com/show_bug.cgi?id=678540
https://bugzilla.novell.com/show_bug.cgi?id=678540#c13
--- Comment #13 from Igor Shalakhin
Igor- Putting a VLAN and a bridge on the same physical device doesn't make sense and was broken, with good reason, in 2.6.37[1]. I would recommend that you create a bridge device and then hang the vlan off of that. If that doesn't work please file a new bug as your setup is slightly different than Marius's.
OK, I ran new tests. I think there are two problems:

1. I can do this:

        /vlan11--br11
  eth0 -
        \vlan12--br12

This works in my test. But I cannot do this:

        /vlan11--br11
  eth0---------------br0
        \vlan12--br12

Maybe there was a good reason to disable this capability.

2. There is an annoying bug in the "forcedeth" driver for the nVidia MCP77:

  00:0a.0 Ethernet controller: nVidia Corporation MCP77 Ethernet (rev a2)

  34: None 00.0: 10701 Ethernet
    [Created at net.124]
    Unique ID: usDW.ndpeucax6V1
    Parent ID: rBUF.i_i4VydhKh5
    SysFS ID: /class/net/eth0
    SysFS Device Link: /devices/pci0000:00/0000:00:0a.0
    Hardware Class: network interface
    Model: "Ethernet network interface"
    Driver: "forcedeth"
    Driver Modules: "forcedeth"
    Device File: eth0
    HW Address: 00:24:8c:23:0c:8e
    Link detected: yes
    Config Status: cfg=new, avail=yes, need=no, active=unknown
    Attached to: #15 (Ethernet controller)

  ifconfig eth0 inet 10.1.81.15 up

On my switch I set the untagged VLAN ID and the port VID for the eth0 port to 11 and pinged the gateway, which has vlan11 enabled on its LAN interface:

  --- 10.1.81.1 ping statistics ---
  253 packets transmitted, 253 received, 0% packet loss, time 252002ms
  rtt min/avg/max/mdev = 0.073/0.196/2.631/0.170 ms

Reverse ping is also good. Then I set the VLAN settings on the switch back to default and created:

  vconfig add eth0 11
  ifconfig eth0 inet 10.1.1.20 up
  ifconfig eth0.11 inet 10.1.81.15 up

then ping:

  --- 10.1.81.1 ping statistics ---
  335 packets transmitted, 173 received, 48% packet loss, time 333998ms
  rtt min/avg/max/mdev = 0.083/0.179/0.420/0.058 ms

Reverse ping is similar. I also added a bridge on eth0.11 and pinged:

  --- 10.1.81.1 ping statistics ---
  254 packets transmitted, 134 received, 47% packet loss, time 252998ms
  rtt min/avg/max/mdev = 0.094/0.182/0.424/0.077 ms

Of course the physical network is fine; I nevertheless tried changing the patch cord and the switch port. On the Intel 82574L controller in the same server everything works without problems.

Besides, OpenSolaris ran for 2 years in the same server on the same Nvidia network adapter with a similar configuration of VLANs and bridges :)

And finally, if I attach a simple bridge to eth0, everything works.
https://bugzilla.novell.com/show_bug.cgi?id=678540
https://bugzilla.novell.com/show_bug.cgi?id=678540#c14
Brandon Philips
And finally, if I attach a simple bridge to eth0, everything works.
I don't understand what you mean by this. Can you try disabling any features you might have enabled in here:

  ethtool -k eth0
https://bugzilla.novell.com/show_bug.cgi?id=678540
https://bugzilla.novell.com/show_bug.cgi?id=678540#c15
--- Comment #15 from Brandon Philips
https://bugzilla.novell.com/show_bug.cgi?id=678540
https://bugzilla.novell.com/show_bug.cgi?id=678540#c16
--- Comment #16 from Igor Shalakhin
(In reply to comment #13)
And finally, if I attach a simple bridge to eth0, everything works.
I don't understand what you mean by this.
I just create a bridge on eth0 without a VLAN and connect to it the Xen domUs that do not need a VLAN. Network performance is very good in that case.
Can you try disabling any features you might have enabled in here: ethtool -k eth0
Everything is at the defaults from the system install:

  # ethtool -k eth0
  Offload parameters for eth0:
  rx-checksumming: on
  tx-checksumming: on
  scatter-gather: on
  tcp-segmentation-offload: on
  udp-fragmentation-offload: off
  generic-segmentation-offload: on
  generic-receive-offload: on
  large-receive-offload: off
  rx-vlan-offload: off
  tx-vlan-offload: off
  ntuple-filters: off
  receive-hashing: off
https://bugzilla.novell.com/show_bug.cgi?id=678540
https://bugzilla.novell.com/show_bug.cgi?id=678540#c17
--- Comment #17 from Brandon Philips
All by default from system install:
# ethtool -k eth0 Offload parameters for eth0: rx-checksumming: on .. receive-hashing: off
Right, can you disable the "on" features in turn and see if it helps?
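Disabling the features "in turn" could look roughly like the sketch below, which prints one off/ping/on cycle per offload instead of running anything. The short feature names (rx, tx, sg, tso, gso, gro) are the ethtool -K keywords for the entries shown as "on" in the output quoted above; the function name and the ping count of 20 are made up for illustration, and 10.1.81.1 is the gateway address from comment #13.

```shell
#!/bin/sh
# Sketch only: print (do not execute) a test cycle for each offload feature.
# Usage: features_in_turn <device> <ping target> <feature>...
features_in_turn() {
    dev=$1 target=$2
    shift 2
    for feat in "$@"; do
        echo "ethtool -K ${dev} ${feat} off"   # switch one feature off
        echo "ping -c 20 ${target}"            # check whether loss persists
        echo "ethtool -K ${dev} ${feat} on"    # restore before the next one
    done
}

features_in_turn eth0 10.1.81.1 rx tx sg tso gso gro
```

Restoring each feature before moving on keeps the runs independent, so a change in packet loss can be attributed to a single offload.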
https://bugzilla.novell.com/show_bug.cgi?id=678540
https://bugzilla.novell.com/show_bug.cgi?id=678540#c18
Nick Kozubsky
https://bugzilla.novell.com/show_bug.cgi?id=678540
https://bugzilla.novell.com/show_bug.cgi?id=678540#c19
Marius Tomaschewski
Then I can see: the terminal hangs during /etc/init.d/network stop (or restart). Also, I can't kill dhcpcd with the 'killall -9 dhcpcd' command. Reboot doesn't work either (it hangs on stopping the vlan interface).
Yes, see Bug 679685 and the comments above -- this happens with the desktop kernel. You have to reboot using (Ctrl-Alt-)SysRq s, u, b. SysRq is on the Print key: Ctrl+Alt+Print + s, Ctrl+Alt+Print + s, Ctrl+Alt+Print + u, Ctrl+Alt+Print + b
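When no keyboard is attached (for example over a still-working ssh session), the same s, u, b sequence can be sent by root through /proc/sysrq-trigger. The sketch below only prints the commands, because running them reboots the machine immediately; the function name is made up for illustration.

```shell
#!/bin/sh
# Sketch only: print (do not execute) the SysRq sequence
# s = sync, u = remount read-only, b = reboot immediately.
sysrq_reboot_plan() {
    for key in s u b; do
        printf 'echo %s > /proc/sysrq-trigger\n' "$key"
    done
}

sysrq_reboot_plan
```

The printed commands have to be run as root, in the order shown.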
https://bugzilla.novell.com/show_bug.cgi?id=678540
https://bugzilla.novell.com/show_bug.cgi?id=678540#c20
--- Comment #20 from Marius Tomaschewski
Right, can you disable the on features in turn and see if it helps?
No, it does not help. As soon as the vlan11 interface is up, it starts to stall again and again. I'm still able to access it via ssh and provide the following outputs, but it's like working over a very slow and distant link... Once I start the bridge, the open connection quickly becomes unusable. When I stop the bridge again on the console, the commands and outputs I tried to use while the bridge was up start to arrive.

  # rpm -q kernel-default
  kernel-default-2.6.37.6-0.x86_64

  # rpm -q kernel-default --changelog | head -4
  * Tue Mar 29 2011 bphilips@suse.de
  - gro: reset skb_iif on reuse (bnc#682965, CVE-2011-1478).
  - gro: Reset dev pointer on reuse (bnc#682965, CVE-2011-1478).
  - commit ebd85e0

  # ethtool -K eth0 rx off tx off sg off tso off ufo off gso off gro off lro off rxvlan off txvlan off rxhash off
  Cannot set device udp large send offload settings: Operation not supported

  # ethtool -K eth0 gso off

  # ethtool -k eth0
  Offload parameters for eth0:
  rx-checksumming: off
  tx-checksumming: off
  scatter-gather: off
  tcp-segmentation-offload: off
  udp-fragmentation-offload: off
  generic-segmentation-offload: off
  generic-receive-offload: off
  large-receive-offload: off
  rx-vlan-offload: off
  tx-vlan-offload: off
  ntuple-filters: off
  receive-hashing: off

  # ifconfig
  eth0      Link encap:Ethernet  HWaddr 00:23:54:aa:aa:aa
            inet[...]
            UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
            RX packets:8606 errors:0 dropped:585 overruns:0 frame:0
            TX packets:6125 errors:0 dropped:0 overruns:0 carrier:0
            collisions:0 txqueuelen:1000
            RX bytes:905863 (884.6 Kb)  TX bytes:2091686 (1.9 Mb)
            Interrupt:41 Base address:0xe000

  vlan11    Link encap:Ethernet  HWaddr 66:B4:46:3B:C6:11
            UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
            RX packets:3 errors:0 dropped:0 overruns:0 frame:0
            TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
            collisions:0 txqueuelen:0
            RX bytes:216 (216.0 b)  TX bytes:0 (0.0 b)

  # cat /proc/net/vlan/vlan11
  vlan11  VID: 11  REORDER_HDR: 1  dev->priv_flags: 1
           total frames received            5
            total bytes received          464
        Broadcast/Multicast Rcvd            4
        total frames transmitted            0
         total bytes transmitted            0
              total headroom inc            0
             total encap on xmit            0
  Device: eth0
  INGRESS priority mappings: 0:0 1:0 2:0 3:0 4:0 5:0 6:0 7:0
  EGRESS priority mappings:
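The "dropped:585" counter on eth0 can be put in relation to the RX packet count to see how the drops develop over time. The sketch below is a rough aid only: the function names are made up for illustration, and whether a driver includes dropped frames in its RX packet count varies, so the result is a ratio of the two counters rather than an exact loss rate. The sysfs files rx_packets and rx_dropped under /sys/class/net/<device>/statistics hold the same counters that ifconfig prints.

```shell
#!/bin/sh
# Hypothetical helper: print dropped/packets as a percentage with two decimals.
drop_pct() {
    awk -v pkts="$1" -v drop="$2" 'BEGIN {
        if (pkts > 0) printf "%.2f\n", 100 * drop / pkts
        else print "n/a"
    }'
}

# Hypothetical helper: read the live counters of one interface from sysfs.
# Defined here but not called, since the device may not exist everywhere.
iface_drop_pct() {
    stats=/sys/class/net/$1/statistics
    drop_pct "$(cat "$stats/rx_packets")" "$(cat "$stats/rx_dropped")"
}

# Counters from the ifconfig output above: 8606 RX packets, 585 dropped.
drop_pct 8606 585
```

Running "iface_drop_pct eth0" before and after bringing up vlan11 would show whether the drop counter grows only while the VLAN interface exists.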
https://bugzilla.novell.com/show_bug.cgi?id=678540
https://bugzilla.novell.com/show_bug.cgi?id=678540#c21
--- Comment #21 from Marius Tomaschewski
https://bugzilla.novell.com/show_bug.cgi?id=678540
https://bugzilla.novell.com/show_bug.cgi?id=678540#c22
Marius Tomaschewski
https://bugzilla.novell.com/show_bug.cgi?id=678540
https://bugzilla.novell.com/show_bug.cgi?id=678540#c24
Jeff Mahoney