[Bug 211867] New: SuSE Linux IP stack sets Don't Fragment flag on UDP datagrams.
https://bugzilla.novell.com/show_bug.cgi?id=211867

Summary: SuSE Linux IP stack sets Don't Fragment flag on UDP datagrams.
Product: SUSE Linux 10.1
Version: Final
Platform: x86
OS/Version: SuSE Linux 10.1
Status: NEW
Severity: Major
Priority: P5 - None
Component: Network
AssignedTo: bnc-team-screening@forge.provo.novell.com
ReportedBy: fdjongh@novell.com
QAContact: qa@suse.de

The Don't Fragment (DF) flag is meant for Path MTU Discovery. Unlike TCP, UDP does not support Path MTU Discovery, since UDP does not packetize data into segments. Section 6.1 on page 9 of RFC 1191 says the following about this topic:

"We do not want the IP layer to simply set the DF bit in every packet, since it is possible that a packetization layer, perhaps a UDP application outside the kernel, is unable to change its datagram size."

However, the SuSE Linux TCP/IP stack sets the DF bit in every datagram unless Path MTU Discovery is disabled by the sysctl parameter 'ip_no_pmtu_disc'. Of course, disabling Path MTU Discovery is not a solution, because Path MTU Discovery is very useful for the performance of TCP connections: it avoids fragmentation and reassembly. Fragmentation in particular can put a heavy load on routers.

The IP stack should only set the DF bit on datagrams that carry TCP segments, unless a UDP application specifically requests it when it initializes its socket.

From the ip(7) manual page I understand that the IP stack lets applications choose whether they want Path MTU Discovery through the IP_MTU_DISCOVER socket option. This manual page also states that, when 'ip_no_pmtu_disc' is false, the IP stack is supposed to enable Path MTU Discovery only on SOCK_STREAM (TCP) sockets and to disable it on all others. I therefore have no doubt that the actual behaviour of the IP stack does not conform to the intended, documented behaviour or to RFC 1191.
Please note that setting the DF flag on UDP datagrams will cause connectivity problems on UDP connections that cross a network path containing links with a smaller MTU than the local segment of the SuSE Linux host, if the UDP application in question does not support Path MTU Discovery. Most UDP applications do not support Path MTU Discovery.

Although I report this bug for SuSE Linux 10.1, the same defect is in every SuSE Linux distribution.

--
Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.
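For readers who want to check the sysctl named in the report on their own host, the following is a minimal Python sketch. It assumes the usual Linux procfs location /proc/sys/net/ipv4/ip_no_pmtu_disc; the function name pmtu_discovery_disabled is hypothetical and not part of any library. It treats any nonzero value as "not the default discovery mode", which matches the boolean meaning described in the report; newer kernels may define additional nonzero modes.

```python
from pathlib import Path

# Assumed procfs location of the sysctl mentioned in the report.
SYSCTL_PATH = Path("/proc/sys/net/ipv4/ip_no_pmtu_disc")


def pmtu_discovery_disabled(path=SYSCTL_PATH):
    """Return True if the sysctl is nonzero, False if it is 0,
    and None if the file is missing or does not hold an integer."""
    try:
        value = int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None
    return value != 0


if __name__ == "__main__":
    state = pmtu_discovery_disabled()
    if state is None:
        print("ip_no_pmtu_disc is not readable on this system")
    else:
        print("Path MTU Discovery disabled system-wide:", state)
```

The path is a parameter so that the function can be exercised against a temporary file rather than the live system setting.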
https://bugzilla.novell.com/show_bug.cgi?id=211867

aj@novell.com changed:

What       |Removed                                   |Added
----------------------------------------------------------------------------
AssignedTo |bnc-team-screening@forge.provo.novell.com |okir@novell.com
https://bugzilla.novell.com/show_bug.cgi?id=211867

okir@novell.com changed:

What          |Removed |Added
----------------------------------------------------------------------------
CC            |        |ak@novell.com
Status        |NEW     |NEEDINFO
Info Provider |        |ak@novell.com

------- Comment #1 from okir@novell.com 2006-10-13 13:11 MST -------
Oops. That looks strange indeed. I confirmed the stack does this for all UDP packets.

Andi, are you aware of any reasons why we would do that?
https://bugzilla.novell.com/show_bug.cgi?id=211867

ak@novell.com changed:

What          |Removed       |Added
----------------------------------------------------------------------------
Status        |NEEDINFO      |ASSIGNED
Info Provider |ak@novell.com |

------- Comment #2 from ak@novell.com 2006-10-16 05:17 MST -------
Linux does path MTU discovery for UDP/RAW by default. This means it will keep track of the PMTU for the destination and will give you an EMSGSIZE locally if it is exceeded.

You can disable it with the IP_MTU_DISCOVER socket option.
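As an illustration of the last point in Comment #2, here is a minimal Python sketch of how an application might turn off path MTU discovery on a single UDP socket. It is Linux-specific. The numeric fallbacks (10, 0, 1, 2) are the values from the Linux header <linux/in.h> and are used only if the Python socket module does not export the names; the helper names disable_pmtu_discovery and pmtu_mode are hypothetical.

```python
import socket

# Linux-specific option and mode values (from <linux/in.h>).
# Fall back to the numeric values if this Python build lacks the names.
IP_MTU_DISCOVER = getattr(socket, "IP_MTU_DISCOVER", 10)
IP_PMTUDISC_DONT = getattr(socket, "IP_PMTUDISC_DONT", 0)  # never set DF
IP_PMTUDISC_WANT = getattr(socket, "IP_PMTUDISC_WANT", 1)  # use per-route hints
IP_PMTUDISC_DO = getattr(socket, "IP_PMTUDISC_DO", 2)      # always set DF


def disable_pmtu_discovery(sock):
    """Ask the kernel not to set the DF bit on datagrams sent via sock."""
    sock.setsockopt(socket.IPPROTO_IP, IP_MTU_DISCOVER, IP_PMTUDISC_DONT)


def pmtu_mode(sock):
    """Return the socket's current IP_MTU_DISCOVER mode as an integer."""
    return sock.getsockopt(socket.IPPROTO_IP, IP_MTU_DISCOVER)


if __name__ == "__main__":
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        print("mode before:", pmtu_mode(s))
        disable_pmtu_discovery(s)
        print("mode after: ", pmtu_mode(s))
```

Setting the option per socket leaves TCP connections and other applications on the host unaffected, which is the difference from changing the system-wide sysctl.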
https://bugzilla.novell.com/show_bug.cgi?id=211867

okir@novell.com changed:

What       |Removed  |Added
----------------------------------------------------------------------------
Status     |ASSIGNED |RESOLVED
Resolution |         |WONTFIX

------- Comment #3 from okir@novell.com 2006-12-18 01:16 MST -------
No feedback for 2 months.
https://bugzilla.novell.com/show_bug.cgi?id=211867

------- Comment #4 from fdjongh@novell.com 2006-12-19 06:39 MST -------
Hi Olaf,

Thank you very much for the explanation you gave me yesterday by telephone about the way the Linux TCP/IP stack handles PMTU Discovery for UDP. I tested it today and concluded that it really works the way you explained it to me. I am impressed, because fragmenting UDP datagrams into chunks of a size that is learned through PMTU discovery is a much more intelligent and efficient method than just not setting the DF bit on UDP datagrams. By more efficient I mean that the first router connected to a lower-MTU link on the path to the destination host is not burdened with fragmentation, because the originating Linux host already does it properly. Also, the fact that the DF bit is not set on the fragments takes care of "black holes" further along the path.

I am very happy to see that the Linux TCP/IP stack has such a smart solution for PMTU discovery for UDP. I have attached the LAN trace pmtu_udp.cap, which shows that PMTU discovery for UDP works well with the Linux TCP/IP stack.

Having said this, there is still a small chance that something goes wrong with PMTU discovery on UDP, namely when the first router in the path is a "PMTUD black hole". I will explain below.

Suppose the IP stack sends a UDP datagram that is smaller than the local MTU and hence sets the DF bit on this datagram. Suppose a router on the path to the destination is connected to a link with an MTU that is smaller than the datagram, and this router does not return an ICMP 3-4 (type 3, code 4: fragmentation needed and DF set), or its ICMP 3-4 is blocked by a firewall. Then the IP stack will not become aware of the lower MTU and the UDP transaction will fail. The easiest solution for this problem is to never set the DF bit on UDP datagrams.
On TCP, black hole detection and recovery can be done simply by retransmitting data in very small segments (556 bytes) and/or by requesting the IP stack not to set the DF bit on the datagrams that carry the retransmitted and subsequent segments on the connection. I understand that UDP is not a reliable transport and that this problem will not be addressed. I just wanted to make you aware of it; perhaps you have an idea for a solution, or maybe you have already taken care of it in a way that did not occur to me.

Thank you very much for your help. I hope you will respond to me about this potential problem as well. Please also don't hesitate to ask questions if my description of the potential problem is not clear enough.

Thanks and kind regards,
Fons
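To make the local fragmentation described in Comment #4 concrete, here is a minimal Python sketch of the size arithmetic only. It is not the kernel's code: it applies the IPv4 fragmentation rule from RFC 791 (every fragment except the last carries a multiple of 8 payload bytes) and assumes a 20-byte IPv4 header without options and an 8-byte UDP header. The function name fragment_payload_sizes is hypothetical.

```python
IP_HEADER_LEN = 20   # assumption: IPv4 header without options
UDP_HEADER_LEN = 8


def fragment_payload_sizes(udp_payload_len, path_mtu):
    """Return the IP payload size of each fragment needed to carry one
    UDP datagram (UDP header plus payload) over a path with path_mtu."""
    room = path_mtu - IP_HEADER_LEN      # IP payload bytes per packet
    if room < 8:
        raise ValueError("path MTU too small to carry any fragment data")
    if udp_payload_len < 0:
        raise ValueError("payload length must be non-negative")

    remaining = UDP_HEADER_LEN + udp_payload_len
    max_chunk = room // 8 * 8            # non-final fragments: multiple of 8

    sizes = []
    while remaining > room:
        sizes.append(max_chunk)
        remaining -= max_chunk
    sizes.append(remaining)              # final (or only) fragment
    return sizes


if __name__ == "__main__":
    # A 1472-byte payload exactly fills a 1500-byte local MTU.
    print(fragment_payload_sizes(1472, 1500))   # [1480]
    # If the learned path MTU is 1400, the host must send two fragments.
    print(fragment_payload_sizes(1472, 1400))   # [1376, 104]
```

The path MTU values 1500 and 1400 are illustrative and are not taken from the attached trace.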
https://bugzilla.novell.com/show_bug.cgi?id=211867

------- Comment #6 from fdjongh@novell.com 2006-12-19 06:45 MST -------
Created an attachment (id=110289)
 --> (https://bugzilla.novell.com/attachment.cgi?id=110289&action=view)
This LAN trace shows how the Linux TCP/IP stack does path MTU discovery for UDP