Bug ID | 1104824 |
---|---|
Summary | Bug in xen-netfront.c, concerning replacement of MAX_SKB_FRAGS with XEN_NETIF_NR_SLOTS_MIN |
Classification | openSUSE |
Product | openSUSE Distribution |
Version | Leap 15.0 |
Hardware | x86-64 |
OS | Linux |
Status | NEW |
Severity | Normal |
Priority | P5 - None |
Component | Kernel |
Assignee | kernel-maintainers@forge.provo.novell.com |
Reporter | Manfred.Haertel@rz-online.de |
QA Contact | qa-bugs@suse.de |
Found By | --- |
Blocker | --- |
After a recent kernel upgrade in a DomU running on a system with an older kernel in the Dom0, the DomU crashed under heavy incoming network load. I filed https://bugzilla.suse.com/show_bug.cgi?id=1102997 and later invalidated my own bug report, because I could not reproduce the bug with a current Dom0 kernel. However, after some debugging and trying to understand how the driver works, I still think the change that replaced MAX_SKB_FRAGS with XEN_NETIF_NR_SLOTS_MIN is not completely correct. I hope my wording is correct (or at least understandable) in trying to explain what I think is wrong.

The variable max in xennet_get_responses() can now become larger by one than before. BEFORE the patch, the variable was 17 or 18, depending on rx->status <= RX_COPY_THRESHOLD, i.e. on the size of the first fragment: if the first fragment was larger than the threshold, the variable became 17, and 18 otherwise. So, if an (Ethernet GSO) frame with 18 fragments was received AND the first fragment was "large" (larger than the threshold), the condition slots > max in xennet_get_responses() was met, and the whole frame was discarded (and the infamous message "Too many slots" appeared). I don't think the Xen network code allows a frame with 19 fragments, so the condition slots > max could not be met at all when the first fragment was "small".

So, in xennet_fill_frags() we could be sure that if we had a frame with 18 fragments, the first fragment would be "small". In THAT case, if the condition shinfo->nr_frags == MAX_SKB_FRAGS was met (meaning we had 18 fragments), we knew the first fragment was small, and so __pskb_pull_tail() was able to decrement nr_frags by one. Therefore BUG_ON(shinfo->nr_frags >= MAX_SKB_FRAGS) could never lead to a crash, because nr_frags was now smaller than MAX_SKB_FRAGS.

NOW the variable max in xennet_get_responses() can become 18 or 19, again depending on the size of the first fragment.
Again, as (I think) a frame with 19 or even 20 fragments is not possible, the condition slots > max is NEVER met, and the message "Too many slots" is actually impossible.

In xennet_fill_frags(), however, things are different. A frame with 18 fragments is now possible even if the first fragment is "large" (larger than RX_COPY_THRESHOLD). The problem is that in this case __pskb_pull_tail() is NOT able to decrement shinfo->nr_frags by one. I THINK this is coded in the following lines in xennet_poll():

    NETFRONT_SKB_CB(skb)->pull_to = rx->status;
    if (NETFRONT_SKB_CB(skb)->pull_to > RX_COPY_THRESHOLD)
        NETFRONT_SKB_CB(skb)->pull_to = RX_COPY_THRESHOLD;

(I admit that I do not really understand what this actually does.) So, with a "large" first fragment, shinfo->nr_frags is *still* MAX_SKB_FRAGS (18) after the pull, and BUG_ON(shinfo->nr_frags >= MAX_SKB_FRAGS) causes a crash.

It seems that the crash does NOT happen with a recent Dom0 kernel, and I tried to understand why. The xen-netback driver in recent kernels apparently sends at most *16* fragments (at least, I could not provoke it into sending more), so of course the crash can never happen there. On the other hand, I think the code in xen-netfront is not correct in general: a Dom0 kernel that sends 18 fragments with a "large" first fragment is able to crash the DomU. Older Linux kernels in the Dom0 could do this, for example, though I do not know when the change in netback to a smaller number of fragments happened. I also do not know what the universally correct solution would be.