[Bug 458072] New: e1000e: duplicated packets
https://bugzilla.novell.com/show_bug.cgi?id=458072

           Summary: e1000e: duplicated packets
           Product: openSUSE 11.1
           Version: Final
          Platform: x86-64
        OS/Version: openSUSE 11.1
            Status: NEW
          Severity: Normal
          Priority: P5 - None
         Component: Kernel
        AssignedTo: bnc-team-screening@forge.provo.novell.com
        ReportedBy: munderl@tnt.uni-hannover.de
         QAContact: qa@suse.de
          Found By: ---

If I ping my computer, which has an e1000e network card, I get duplicated packets:

PING merlin.tnt.uni-hannover.de (130.75.31.22) 56(84) bytes of data.
64 bytes from merlin.tnt.uni-hannover.de (130.75.31.22): icmp_seq=1 ttl=255 time=0.505 ms
64 bytes from merlin.tnt.uni-hannover.de (130.75.31.22): icmp_seq=1 ttl=64 time=0.510 ms (DUP!)
64 bytes from merlin.tnt.uni-hannover.de (130.75.31.22): icmp_seq=2 ttl=64 time=0.514 ms
64 bytes from merlin.tnt.uni-hannover.de (130.75.31.22): icmp_seq=2 ttl=255 time=0.523 ms (DUP!)

After

  rcnetwork stop; rmmod e1000e; modprobe e1000e; rcnetwork start

everything is back to normal for some time. This also happens with the latest kernels for openSUSE 11.0, but not with the earlier ones, so it was introduced somewhere in the 11.0 kernel line.

-- 
Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.

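To make the pattern in the ping output above easier to see, here is a minimal Python sketch that groups the replies by icmp_seq and lists the TTL of every reply. The function names and the regular expression are illustrative assumptions, not part of any existing tool; the sample text is copied from the report.

```python
import re
from collections import defaultdict
from typing import Dict, List

# Matches the reply lines printed by ping in the report, e.g.
# "64 bytes from host (1.2.3.4): icmp_seq=1 ttl=255 time=0.505 ms (DUP!)"
REPLY_RE = re.compile(r"icmp_seq=(\d+)\s+ttl=(\d+)\s+time=([\d.]+)\s*ms")


def ttls_by_sequence(ping_output: str) -> Dict[int, List[int]]:
    """Map each icmp_seq to the TTLs of every reply seen for it, in order."""
    replies = defaultdict(list)
    for line in ping_output.splitlines():
        match = REPLY_RE.search(line)
        if match:
            replies[int(match.group(1))].append(int(match.group(2)))
    return dict(replies)


def duplicated_sequences(ping_output: str) -> Dict[int, List[int]]:
    """Keep only the sequence numbers that were answered more than once."""
    return {
        seq: ttls
        for seq, ttls in ttls_by_sequence(ping_output).items()
        if len(ttls) > 1
    }


SAMPLE = """\
PING merlin.tnt.uni-hannover.de (130.75.31.22) 56(84) bytes of data.
64 bytes from merlin.tnt.uni-hannover.de (130.75.31.22): icmp_seq=1 ttl=255 time=0.505 ms
64 bytes from merlin.tnt.uni-hannover.de (130.75.31.22): icmp_seq=1 ttl=64 time=0.510 ms (DUP!)
64 bytes from merlin.tnt.uni-hannover.de (130.75.31.22): icmp_seq=2 ttl=64 time=0.514 ms
64 bytes from merlin.tnt.uni-hannover.de (130.75.31.22): icmp_seq=2 ttl=255 time=0.523 ms (DUP!)
"""

print(duplicated_sequences(SAMPLE))  # {1: [255, 64], 2: [64, 255]}
```

Every echo request is answered twice, once with ttl=255 and once with ttl=64, and the order of the two replies is not fixed.
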
https://bugzilla.novell.com/show_bug.cgi?id=458072

User meissner@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=458072#c1

Marcus Meissner <meissner@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |NEEDINFO
      Info Provider|                            |munderl@tnt.uni-hannover.de

--- Comment #1 from Marcus Meissner <meissner@novell.com>  2008-12-10 12:31:10 MST ---
Is there any dmesg output when this happens?

https://bugzilla.novell.com/show_bug.cgi?id=458072

User munderl@tnt.uni-hannover.de added comment
https://bugzilla.novell.com/show_bug.cgi?id=458072#c2

--- Comment #2 from Marco Munderloh <munderl@tnt.uni-hannover.de>  2008-12-11 04:39:20 MST ---
No, there's nothing in the logs. When I start the ping on the machine itself, there are no DUP packets.

https://bugzilla.novell.com/show_bug.cgi?id=458072

User chrubis@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=458072#c3

Cyril Hrubis <chrubis@novell.com> changed:

           What    |Removed                                     |Added
----------------------------------------------------------------------------
         AssignedTo|bnc-team-screening@forge.provo.novell.com   |kernel-maintainers@forge.provo.novell.com
             Status|NEEDINFO                                    |NEW
      Info Provider|munderl@tnt.uni-hannover.de                 |

--- Comment #3 from Cyril Hrubis <chrubis@novell.com>  2008-12-11 07:27:57 MST ---
Next time, please clear the NEEDINFO status by selecting the "This comment/attachment provides ..." checkbox after supplying the needed information.

Reassigning to maintainers.

https://bugzilla.novell.com/show_bug.cgi?id=458072

User munderl@tnt.uni-hannover.de added comment
https://bugzilla.novell.com/show_bug.cgi?id=458072#c4

--- Comment #4 from Marco Munderloh <munderl@tnt.uni-hannover.de>  2008-12-18 05:27:58 MST ---
I did some additional tests. I removed the module and reloaded it. After that, I checked every minute from a different machine whether I got duplicated packets. The network link seems to stay stable for less than an hour. Here are the times after which duplicated packets first occurred:

  52 min
  56 min
  33 min

And there is still absolutely nothing in the logs. I'm using kernel 2.6.27.7-6-default at the moment.

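The once-a-minute check described above could be automated roughly as follows. This is a sketch under stated assumptions, not the procedure actually used in the report: it assumes a Linux ping that accepts `-c` and marks duplicate replies with `(DUP!)` as in the output quoted in this bug, and the function names, the count of 3 pings per check, and the injectable `run_ping`/`sleep` parameters are hypothetical helpers introduced for illustration and testing.

```python
import subprocess
import time
from typing import Callable, Optional


def run_system_ping(host: str) -> str:
    """Send three echo requests with the system ping and return its stdout."""
    result = subprocess.run(
        ["ping", "-c", "3", host],
        capture_output=True,
        text=True,
        check=False,  # a non-zero exit (e.g. packet loss) is not an error here
    )
    return result.stdout


def minutes_until_duplicates(
    host: str,
    checks: int = 120,
    interval_s: float = 60.0,
    run_ping: Callable[[str], str] = run_system_ping,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[float]:
    """Poll `host` every `interval_s` seconds until ping reports a duplicate.

    Returns the elapsed time in minutes (counted in whole intervals) at which
    the first "(DUP!)" marker was seen, or None if none appeared in `checks`
    attempts.
    """
    for check in range(checks):
        if "(DUP!)" in run_ping(host):
            return check * interval_s / 60.0
        sleep(interval_s)
    return None
```

Started from the second machine right after reloading the module, a call such as `minutes_until_duplicates("merlin.tnt.uni-hannover.de")` would block until the first duplicate shows up and return a figure comparable to the 52, 56 and 33 minutes listed above.
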
https://bugzilla.novell.com/show_bug.cgi?id=458072

User munderl@tnt.uni-hannover.de added comment
https://bugzilla.novell.com/show_bug.cgi?id=458072#c5

Marco Munderloh <munderl@tnt.uni-hannover.de> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |munderl@tnt.uni-hannover.de

--- Comment #5 from Marco Munderloh <munderl@tnt.uni-hannover.de>  2009-01-08 09:06:20 MST ---
Is anything happening with this topic? It is really annoying that the network is unreliable and performance is this bad. Can I do any additional tests? It's easy to reproduce (at least on my machine). Just wait about one hour...

https://bugzilla.novell.com/show_bug.cgi?id=458072

User munderl@tnt.uni-hannover.de added comment
https://bugzilla.novell.com/show_bug.cgi?id=458072#c6

--- Comment #6 from Marco Munderloh <munderl@tnt.uni-hannover.de>  2009-01-12 09:08:08 MST ---
FYI: I just downloaded 2.6.27.9 from kernel.org, compiled it using make cloneconfig, and the error is still there - so it's not just a SuSE patch that botched this.

When I ping from another machine, tcpdump on this machine shows just ONE packet received and one transmitted. On the other machine, however, two replies are received and shown in tcpdump.

I disabled all offloading using ethtool, to no avail.

What else can I do?

https://bugzilla.novell.com/show_bug.cgi?id=458072

User kkeil@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=458072#c7

Karsten Keil <kkeil@novell.com> changed:

           What    |Removed                                     |Added
----------------------------------------------------------------------------
           Priority|P5 - None                                   |P4 - Low
             Status|NEW                                         |NEEDINFO
                 CC|                                            |kkeil@novell.com
      Info Provider|                                            |munderl@tnt.uni-hannover.de
         AssignedTo|kernel-maintainers@forge.provo.novell.com   |kkeil@novell.com

--- Comment #7 from Karsten Keil <kkeil@novell.com>  2009-01-12 09:24:41 MST ---
Hmm, I do not see this here with my e1000e hardware (up to now); I will try to stress test it a little bit.

But wait - looking closer at your log:

64 bytes from merlin.tnt.uni-hannover.de (130.75.31.22): icmp_seq=1 ttl=255 time=0.505 ms
64 bytes from merlin.tnt.uni-hannover.de (130.75.31.22): icmp_seq=1 ttl=64 time=0.510 ms (DUP!)
64 bytes from merlin.tnt.uni-hannover.de (130.75.31.22): icmp_seq=2 ttl=64 time=0.514 ms
64 bytes from merlin.tnt.uni-hannover.de (130.75.31.22): icmp_seq=2 ttl=255 time=0.523 ms (DUP!)

You have a routing problem in your network. These are real duplicate packets: the second arrives some milliseconds later and has a reduced ttl (255--->64), so the second packet traveled via 255-64 = 191 hops.

Try traceroute if you see duplicate packets again; maybe it shows something.

https://bugzilla.novell.com/show_bug.cgi?id=458072

User munderl@tnt.uni-hannover.de added comment
https://bugzilla.novell.com/show_bug.cgi?id=458072#c8

Marco Munderloh <munderl@tnt.uni-hannover.de> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEEDINFO                    |NEW
      Info Provider|munderl@tnt.uni-hannover.de |

--- Comment #8 from Marco Munderloh <munderl@tnt.uni-hannover.de>  2009-01-12 10:09:18 MST ---
I thought about a routing problem, too. But the two machines are directly connected through a switch (though there is a gateway). And as I said, the problem starts 30 to 60 minutes after the e1000e module is loaded. And 0.005 ms seems a little bit too fast to hop 191 times...

traceroute shows just one hop and no duplicates. Maybe the second packet is ignored. The routing table looks perfectly fine.

The strange thing is that we have a second computer, slightly different in hardware but also with e1000e and the Intel Q965 chipset. I wrote the root partition bytewise to that machine, and no errors appear there... The routing table is the same on both machines.

I will try connecting only the two machines to a switch without any uplink later and see what happens then.

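The doubt about the 191 hops can be made concrete with a small Python sketch. It rests on an assumption that is not stated anywhere in this thread: senders usually start with one of a few common initial TTL values (64 is the Linux default; 128 and 255 are used by other stacks and devices), so the hop count is estimated from the nearest such value at or above the observed TTL. The constant and function names are hypothetical.

```python
from typing import Tuple

# Assumed set of common initial TTL values. 64 is the Linux default;
# 128 and 255 are used by other operating systems and network devices.
COMMON_INITIAL_TTLS = (64, 128, 255)


def estimate_hops(observed_ttl: int) -> Tuple[int, int]:
    """Return (assumed initial TTL, estimated hops) for a received packet.

    The initial TTL is taken to be the smallest common value that is not
    below the observed TTL; each router on the path decrements it by one.
    """
    if observed_ttl < 0:
        raise ValueError("TTL cannot be negative")
    for initial in COMMON_INITIAL_TTLS:
        if observed_ttl <= initial:
            return initial, initial - observed_ttl
    raise ValueError("TTL exceeds the 8-bit maximum of 255")


print(estimate_hops(255))  # (255, 0)
print(estimate_hops(64))   # (64, 0)
```

Under that assumption both replies are zero hops away, which fits two different senders on the same segment answering the same request better than a single reply looping through 191 routers in 0.005 ms. This is only one possible reading of the TTL values, not a diagnosis.
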
https://bugzilla.novell.com/show_bug.cgi?id=458072

User munderl@tnt.uni-hannover.de added comment
https://bugzilla.novell.com/show_bug.cgi?id=458072#c9

--- Comment #9 from Marco Munderloh <munderl@tnt.uni-hannover.de>  2009-01-12 10:15:29 MST ---
OK, I tried it now. Just one (different) switch and two computers. No uplink, no routing. The DUP packets are still there...

Pinging in the other direction always works, if that rings a bell.

https://bugzilla.novell.com/show_bug.cgi?id=458072

User kkeil@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=458072#c10

Karsten Keil <kkeil@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |NEEDINFO
      Info Provider|                            |munderl@tnt.uni-hannover.de

--- Comment #10 from Karsten Keil <kkeil@novell.com>  2009-01-12 10:39:42 MST ---
OK, to clarify:

1. Some more HW info, please: lspci and lspci -n for the e1000e. Is this an on-board NIC?

2. Which direction shows the duplicates?

https://bugzilla.novell.com/show_bug.cgi?id=458072

User munderl@tnt.uni-hannover.de added comment
https://bugzilla.novell.com/show_bug.cgi?id=458072#c11

Marco Munderloh <munderl@tnt.uni-hannover.de> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEEDINFO                    |NEW
      Info Provider|munderl@tnt.uni-hannover.de |

--- Comment #11 from Marco Munderloh <munderl@tnt.uni-hannover.de>  2009-01-12 11:46:54 MST ---
1. It is an on-board NIC.

lspci:
00:19.0 Ethernet controller: Intel Corporation 82566DM Gigabit Network Connection (rev 02)

lspci -n:
00:19.0 0200: 8086:104a (rev 02)

2. Pinging from a different machine to this one with the e1000e creates duplicated replies. Pinging from this machine to another one works.

In a 100 Mbit network, I can transmit ~12 MB/s to other machines but only ~5 MB/s to this one.

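To put the two throughput figures in proportion, the following Python sketch converts the link speed to bytes per second and expresses each measurement as a fraction of it. It assumes that MB means 10^6 bytes and ignores Ethernet and TCP/IP framing overhead, so the percentages are rough; the function names are illustrative only.

```python
def line_rate_mb_per_s(link_mbit_per_s: float) -> float:
    """Raw link capacity in MB/s (8 bits per byte, no protocol overhead)."""
    return link_mbit_per_s / 8.0


def utilisation(measured_mb_per_s: float, link_mbit_per_s: float) -> float:
    """Fraction of the raw link capacity reached by a measured transfer."""
    return measured_mb_per_s / line_rate_mb_per_s(link_mbit_per_s)


print(line_rate_mb_per_s(100))            # 12.5
print(round(utilisation(12, 100) * 100))  # 96
print(round(utilisation(5, 100) * 100))   # 40
```

Transfers to other machines run at close to the full rate of the link, while transfers to the affected machine reach only about 40% of it.
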
https://bugzilla.novell.com/show_bug.cgi?id=458072

User kkeil@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=458072#c12

Karsten Keil <kkeil@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
                 CC|                            |john.ronciak@intel.com

--- Comment #12 from Karsten Keil <kkeil@novell.com>  2009-01-13 06:40:53 MST ---
Hmm, comment #6 says that you can't see the duplicate packets on the machine itself with tcpdump. This means that if the packets come from this machine, the duplication occurs below the socket level, which may be the card driver or the hardware/firmware. Since the driver itself does not handle the ttl, I guess something in the firmware may do it. This would also explain why you only see it on this HW.

Please report the mainboard brand and type; I will add an Intel developer to this report.

https://bugzilla.novell.com/show_bug.cgi?id=458072

User kkeil@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=458072#c13

Karsten Keil <kkeil@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |NEEDINFO
      Info Provider|                            |john.ronciak@intel.com

--- Comment #13 from Karsten Keil <kkeil@novell.com>  2009-01-13 06:42:33 MST ---
John, do you have any idea here?

https://bugzilla.novell.com/show_bug.cgi?id=458072

User munderl@tnt.uni-hannover.de added comment
https://bugzilla.novell.com/show_bug.cgi?id=458072#c14

Marco Munderloh <munderl@tnt.uni-hannover.de> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEEDINFO                    |ASSIGNED
      Info Provider|john.ronciak@intel.com      |

--- Comment #14 from Marco Munderloh <munderl@tnt.uni-hannover.de>  2009-01-13 07:30:16 MST ---
dmidecode prints out the following:

Base Board Information
        Manufacturer: Intel Corporation
        Product Name: DQ965GF
        Version: AAD41676-601

BIOS Information
        Vendor: Intel Corp.
        Version: CO96510J.86A.6080.2008.0812.1831
        Release Date: 08/12/2008

The board does have a management engine. Maybe that is the source of the problem?

https://bugzilla.novell.com/show_bug.cgi?id=458072

User kkeil@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=458072#c15

Karsten Keil <kkeil@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |NEEDINFO
      Info Provider|                            |john.ronciak@intel.com

--- Comment #15 from Karsten Keil <kkeil@novell.com>  2009-01-13 07:51:06 MST ---
Yes, that's what I think too. I hope John has an idea.

https://bugzilla.novell.com/show_bug.cgi?id=458072

User john.ronciak@intel.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=458072#c16

--- Comment #16 from John Ronciak <john.ronciak@intel.com>  2009-01-13 10:16:33 MST ---
Sorry guys, I do not. There are no other reports of anything like this happening. You say that it happens on driver load, but all kinds of things happen to the network stack when the driver loads. I'm pretty convinced that this is not a driver (or NIC HW) problem. Also, if it's working for some amount of time (like up to an hour), that is also an indication that it is not a driver problem.

You say that the m/b has some management controller on it. Try disabling the management controller to see if this still happens. Does the other machine you talk about have one as well? If so, are they configured the same?

Have you tried to plug the systems back to back without the switch between them?

Also, check to see what processes are running on the system. Maybe something is running that could be causing this.

I'm pretty sure this is a configuration type problem on this particular system. The indication is that different OSes are doing the same thing. Just not sure what it is yet.

What type of switches are being used? Is this only happening at 100Mbps?

https://bugzilla.novell.com/show_bug.cgi?id=458072

User kkeil@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=458072#c17

Karsten Keil <kkeil@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEEDINFO                    |ASSIGNED
      Info Provider|john.ronciak@intel.com      |

--- Comment #17 from Karsten Keil <kkeil@novell.com>  2009-01-15 12:00:02 MST ---
OK, what we can do to rule out that the packets come from the Linux stack is to instrument the TX function with a limited packet log (conditional, so that it can be turned on after the system enters this state).

https://bugzilla.novell.com/show_bug.cgi?id=458072

User kkeil@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=458072#c18

Karsten Keil <kkeil@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |NEEDINFO
      Info Provider|                            |munderl@tnt.uni-hannover.de

--- Comment #18 from Karsten Keil <kkeil@novell.com>  2009-01-16 09:33:33 MST ---
I have some kernel module packages with the above debug code ready. Which kernel are you using (uname -a)?

https://bugzilla.novell.com/show_bug.cgi?id=458072

User munderl@tnt.uni-hannover.de added comment
https://bugzilla.novell.com/show_bug.cgi?id=458072#c19

Marco Munderloh <munderl@tnt.uni-hannover.de> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEEDINFO                    |ASSIGNED
      Info Provider|munderl@tnt.uni-hannover.de |

--- Comment #19 from Marco Munderloh <munderl@tnt.uni-hannover.de>  2009-01-16 11:18:31 MST ---
I have now managed to disable the management engine in the BIOS, and the problem is gone! So it really has something to do with the management engine and not with the Linux stack.

The strange thing is that it appeared one day after a kernel update, with nothing else changed. Maybe there is still some dependency between bad configuration / faulty iAMT firmware / the e1000e module?

https://bugzilla.novell.com/show_bug.cgi?id=458072

User munderl@tnt.uni-hannover.de added comment
https://bugzilla.novell.com/show_bug.cgi?id=458072#c20

--- Comment #20 from Marco Munderloh <munderl@tnt.uni-hannover.de>  2009-01-16 11:20:30 MST ---
If you want me to do any further tests: my kernel version is 2.6.27.7-9-default atm.

Thanks for all your help so far!

https://bugzilla.novell.com/show_bug.cgi?id=458072

User kkeil@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=458072#c21

--- Comment #21 from Karsten Keil <kkeil@novell.com>  2009-01-16 13:14:36 MST ---
In this case my debug module makes no sense.

Please attach the dmidecode output. It should give some info about the BIOS version and so on; maybe Intel has some more hints. Reflashing or updating the BIOS may help as well.

https://bugzilla.novell.com/show_bug.cgi?id=458072

User munderl@tnt.uni-hannover.de added comment
https://bugzilla.novell.com/show_bug.cgi?id=458072#c22

--- Comment #22 from Marco Munderloh <munderl@tnt.uni-hannover.de>  2009-01-19 11:38:09 MST ---
Created an attachment (id=266061)
 --> (https://bugzilla.novell.com/attachment.cgi?id=266061)
Output of dmidecode

Find attached the requested dmidecode output.

https://bugzilla.novell.com/show_bug.cgi?id=458072

User kkeil@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=458072#c23

Karsten Keil <kkeil@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|                            |WORKSFORME

--- Comment #23 from Karsten Keil <kkeil@novell.com>  2009-02-26 03:11:10 MST ---
OK, I will close this as WORKSFORME, since we cannot do anything here. Disabling the management engine is the only solution for now.
