Bug ID 1104824
Summary Bug in xen-netfront.c, concerning replacement of MAX_SKB_FRAGS with XEN_NETIF_NR_SLOTS_MIN
Classification openSUSE
Product openSUSE Distribution
Version Leap 15.0
Hardware x86-64
OS Linux
Status NEW
Severity Normal
Priority P5 - None
Component Kernel
Assignee kernel-maintainers@forge.provo.novell.com
Reporter Manfred.Haertel@rz-online.de
QA Contact qa-bugs@suse.de
Found By ---
Blocker ---

After a recent kernel upgrade in a DomU, which runs on a system with an older
kernel in Dom0, the DomU crashed under heavy incoming network load.

So I filed https://bugzilla.suse.com/show_bug.cgi?id=1102997 and later
invalidated my own bug report, because I could not reproduce the bug with a
current Dom0 kernel.

However, after some debugging and trying to understand how the driver works, I
still think that the change where MAX_SKB_FRAGS was replaced by
XEN_NETIF_NR_SLOTS_MIN is not completely correct. I hope my wording is correct
(or at least understandable) as I try to explain what I think is wrong.

The variable max in xennet_get_responses() can now become one larger than it
was before.

BEFORE the patch, the variable was 17 or 18, depending on whether rx->status <=
RX_COPY_THRESHOLD, i.e. on the size of the first fragment: if the first
fragment was larger than the threshold, the variable became 17, otherwise 18.
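
For reference, here is a small standalone model of that computation. The
constant values are my assumption for a 4 KiB page kernel (MAX_SKB_FRAGS = 17,
RX_COPY_THRESHOLD = 256); the expression itself is how I understand the
pre-patch driver code, not a verbatim quote:

#include <stdio.h>

/* Assumed values for a 4 KiB page kernel (not copied from the source):
   MAX_SKB_FRAGS = 17, RX_COPY_THRESHOLD = 256. */
#define MAX_SKB_FRAGS     17
#define RX_COPY_THRESHOLD 256

/* Pre-patch limit from xennet_get_responses(); first_frag_size stands
   in for rx->status, the size of the first fragment. */
static int max_before(int first_frag_size)
{
        return MAX_SKB_FRAGS + (first_frag_size <= RX_COPY_THRESHOLD);
}

int main(void)
{
        printf("small first fragment: max = %d\n", max_before(100));  /* 18 */
        printf("large first fragment: max = %d\n", max_before(1000)); /* 17 */
        return 0;
}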

So, if an (ethernet GSO) frame with 18 fragments was received AND the first
fragment was "large" (larger than the threshold), the condition slots > max in
xennet_get_responses() was met, and the whole frame was discarded (and the
infamous message "Too many slots" appeared).

I don't think the Xen network code allows a frame with 19 fragments, so the
condition slots > max could not be met at all when the first fragment was
"small".

So, in xennet_fill_frags() we could be sure that if we had a frame with 18
fragments, the first fragment would be "small".

In THIS case, if the condition shinfo->nr_frags == MAX_SKB_FRAGS was met
(meaning the frame had 18 fragments), we knew that the first fragment was small
and so __pskb_pull_tail() was able to decrement nr_frags by one. So,
BUG_ON(shinfo->nr_frags >= MAX_SKB_FRAGS) could never lead to a crash, because
nr_frags was now smaller than MAX_SKB_FRAGS.
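
Put differently, as a toy model of my reading of the pre-patch invariant
(pull_tail() is a stand-in for __pskb_pull_tail() and only models the nr_frags
bookkeeping, nothing else):

#include <assert.h>
#include <stdio.h>

#define MAX_SKB_FRAGS     17
#define RX_COPY_THRESHOLD 256

/* Toy stand-in for __pskb_pull_tail(): frag 0 is released (and
   nr_frags decremented) only if the pull consumes it completely. */
static int pull_tail(int nr_frags, int first_frag_size, int pull_to)
{
        return (pull_to >= first_frag_size) ? nr_frags - 1 : nr_frags;
}

int main(void)
{
        /* Pre-patch: a frame that reaches nr_frags == MAX_SKB_FRAGS must
           have passed the slots check, so its first fragment is small,
           and pull_to equals the full first fragment size. */
        int first = 100;                               /* <= RX_COPY_THRESHOLD */
        int nr_frags = pull_tail(MAX_SKB_FRAGS, first, first);
        assert(nr_frags < MAX_SKB_FRAGS);              /* BUG_ON cannot fire */
        printf("nr_frags after pull: %d\n", nr_frags); /* 16 */
        return 0;
}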

NOW the variable max in xennet_get_responses() can become 18 or 19, again
depending on the size of the first fragment. Again, as (I think) a frame with
19 or even 20 fragments is not possible, the condition slots > max is NEVER met
and the message "Too many slots" is actually impossible.
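
Again as a sketch (XEN_NETIF_NR_SLOTS_MIN = 18 is my assumption for the
constant's value; dropped_after() is again a hypothetical helper):

#include <stdbool.h>
#include <stdio.h>

#define XEN_NETIF_NR_SLOTS_MIN 18
#define RX_COPY_THRESHOLD      256

/* Hypothetical helper modelling the post-patch 'slots > max' check. */
static bool dropped_after(int slots, int first_frag_size)
{
        int max = XEN_NETIF_NR_SLOTS_MIN +
                  (first_frag_size <= RX_COPY_THRESHOLD);
        return slots > max;
}

int main(void)
{
        /* Even the worst realistic case, 18 slots with a large first
           fragment, now passes the check and reaches xennet_fill_frags(): */
        printf("18 slots, large first: %d\n", dropped_after(18, 1000)); /* 0 */
        printf("18 slots, small first: %d\n", dropped_after(18, 100));  /* 0 */
        return 0;
}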

However, in xennet_fill_frags() things are different. A frame with 18 fragments
is now possible even if the first fragment is "large" (larger than
RX_COPY_THRESHOLD). The problem is that in this case __pskb_pull_tail() is NOT
able to decrement shinfo->nr_frags by one. I THINK this is coded in the
following lines in xennet_poll():

NETFRONT_SKB_CB(skb)->pull_to = rx->status;
if (NETFRONT_SKB_CB(skb)->pull_to > RX_COPY_THRESHOLD)
        NETFRONT_SKB_CB(skb)->pull_to = RX_COPY_THRESHOLD;

(I admit that I do not really understand what this actually does).
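
My best guess (and it may well be wrong) is that pull_to records how many bytes
are later pulled from the fragments into the skb's linear area by
__pskb_pull_tail(), and the clamp caps that at RX_COPY_THRESHOLD. A fragment is
only released, and nr_frags only decremented, when the pull consumes the
fragment completely, so a first fragment larger than RX_COPY_THRESHOLD can
never be freed this way. As a toy model (same stand-in pull_tail() as above):

#include <stdio.h>

#define MAX_SKB_FRAGS     17
#define RX_COPY_THRESHOLD 256

/* pull_to as set in xennet_poll(): rx->status, clamped at the
   copy threshold. */
static int clamp_pull_to(int first_frag_size)
{
        return (first_frag_size > RX_COPY_THRESHOLD)
                ? RX_COPY_THRESHOLD : first_frag_size;
}

/* Toy stand-in for __pskb_pull_tail(): frag 0 is released only if
   it is pulled in its entirety. */
static int pull_tail(int nr_frags, int first_frag_size, int pull_to)
{
        return (pull_to >= first_frag_size) ? nr_frags - 1 : nr_frags;
}

int main(void)
{
        int first = 1000;                        /* "large" first fragment */
        int nr = pull_tail(MAX_SKB_FRAGS, first, clamp_pull_to(first));
        printf("nr_frags after pull: %d\n", nr); /* still MAX_SKB_FRAGS */
        return 0;
}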

So, with a "large" first fragment, shinfo->nr_frags is *still* MAX_SKB_FRAGS
after the pull, and so

BUG_ON(shinfo->nr_frags >= MAX_SKB_FRAGS);

causes a crash.

It seems that the crash does NOT happen with a recent Dom0 kernel. I also tried
to understand why.

It seems as if the xen-netback driver in recent kernels only sends a maximum of
*16* fragments (at least, I could not provoke it into sending more). So, of
course, the crash can never happen there.

On the other hand, I think the code in xen-netfront is not correct in general:
a Dom0 kernel which sends 18 fragments with a "large" first fragment is able to
crash the DomU. Older Linux kernels in Dom0 do this, for example, though I do
not know when the change in netback to a smaller number of fragments happened.

I also do not know what the universally correct solution would be.

