[opensuse-virtual] 15.1 Xen DomU freezing under high network/disk load, recoverable with NMI trigger
Dear openSUSE team:

Earlier today I sent a request to the list about a 42.3 DomU crashing. Olaf replied, I've installed the new kernel, and I'll watch and see. I'm very grateful for the help. I'm sorry to post a second question, but I'm having a similar-but-different problem on a different host and guest, and have reached an impasse.

A few weeks ago I took a copy of our crashy 42.3 DomU guest: I simply copied the disk, changed the name and IP address, and booted it on a different physical host. I then did zypper dup from 42.3 -> 15.0 -> 15.1. This was intended as a "test run", if you like, to predict how client software would react to the upgrade. So now I have an upgraded *copy* of my machine, running 15.1, all patches applied. And it's running on a different host, which was a fresh load of 15.1, also with all patches applied:

Linux host 4.12.14-lp151.28.36-default #1 SMP Fri Dec 6 13:50:27 UTC 2019 (8f4a495) x86_64 x86_64 x86_64 GNU/Linux

This guest has a problem as well: under sustained high network/disk load, the guest freezes up completely. This happened twice today. I can pretty much *make* it happen just by starting a local rsync (i.e. over a crossover cable) of its main big data partition (3TB) - about every other attempt to copy the entire partition via rsync over ssh will freeze the guest.

I get the same annoyingly terse message on the physical host:

[92630.531549] vif vif-6-0 vif6.0: Guest Rx stalled
[92630.531613] br0: port 2(vif6.0) entered disabled state

but, unlike my 42.3 guest, this one gives *no* log output or data at all on the guest. No BUG, no CPU lockup, no kernel traceback, nothing. I left a high-priority shell on the hvc0 console (which, when the 42.3 guest had its problem, was still sort of responsive), with "top -n 1; sleep 15" running in a while-true loop on it - but it was completely frozen. I could see the final top before the hang, and there was nothing to suggest a problem. The guest just... hangs.

Unlike the frozen 42.3 guest, which showed a pretty much continuous "run" state, the 15.1 guest shows more-or-less "normal" behavior in xentop - switching between "b" and "r" states, and showing normal utilization patterns. But the guest itself is stuck tight.

I have seen mentions of the grant frames issue, and I did apply the higher value to the host and guests:

# xen-diag gnttab_query_size 0    # Domain-0
domid=0: nr_frames=1, max_nr_frames=64
# xen-diag gnttab_query_size 1    # Xenstore
domid=1: nr_frames=4, max_nr_frames=4
# xen-diag gnttab_query_size 6    # My guest
domid=6: nr_frames=17, max_nr_frames=256

but this is still happening.

Now here's the crazy part: I sat around poking at the frozen guest, trying different things before destroying it, and, skimming down my "xl" choices, I found "xl trigger". I had already tried pausing and unpausing the guest - that did nothing. But when I tried xl trigger (at random I tried the first option, so: xl trigger 6 nmi), the guest CAME BACK ONLINE! It said this on the console:

Uhhuh. NMI received for unknown reason 00 on CPU 0.
Do you have a strange power saving mode enabled?
Dazed and confused, but trying to continue
I also saw it in /var/log/messages, followed by:

clocksource: timekeeping watchdog on CPU0: Marking clocksource 'tsc' as unstable because the skew is too large:
clocksource: 'xen' wd_now: 554c072567f2 wd_last: 54137c19cb3c mask: ffffffffffffffff
clocksource: 'tsc' cs_now: 2d696bb78816d4 cs_last: 2d6640097d695e mask: ffffffffffffffff
tsc: Marking TSC unstable due to clocksource watchdog

On the host, in /var/log/messages, I saw:

[93760.637546] vif vif-6-0 vif6.0: Guest Rx ready
[93760.637595] br0: port 2(vif6.0) entered blocking state
[93760.637598] br0: port 2(vif6.0) entered forwarding state

And, apart from the rsync/sshd processes (which I suspect the remote side had given up on), everything else came right back online. MySQL, for example, was still running on the guest without issue; in fact, apart from the log entries I cite above, there was no indication that the machine had even been broken. The 5- and 10-minute load averages were way up in the 30s, but everything else was fine. Prior to the freeze, the guest had been showing a continuous load average of about 3.0 - with the rsync and sshd processes in run state, and that's it - just as I'd expect.

The guest is provisioned thusly:

name="gggv"
description="gggv"
uuid="13289776-1c74-9ade-4242-8f7453249832"
memory=90112
maxmem=90112
vcpus=26
cpus="4-31"
on_poweroff="destroy"
on_reboot="restart"
on_crash="restart"
on_watchdog="restart"
localtime=0
keymap="en-us"
type="pv"
kernel="/usr/lib/grub2/x86_64-xen/grub.xen"
extra="elevator=noop"
disk=[
    '/b/xen/gggv/gggv.root,raw,xvda1,w',
    '/b/xen/gggv/gggv.swap,raw,xvda2,w',
    '/b/xen/gggv/gggv.xa,raw,xvdb1,w',
]
vif=[
    'mac=00:16:3f:04:05:41,bridge=br0',
    'mac=00:16:3f:04:05:42,bridge=br1',
]
vfb=['type=vnc,vncunused=1']

and it is also the only guest running on its host. The host has:

GRUB_CMDLINE_XEN="dom0_mem=4G dom0_max_vcpus=4 dom0_vcpus_pin gnttab_max_frames=256"

and is in every other respect an essentially fresh 15.1 load.

I'm thinking this is a different problem from my 42.3 guest problem, but I don't know what to do with it. My next move was to make sure my hardware (and data, and OS!) were okay. So I moved the root filesystem of my upgraded guest aside and did a fresh load of 15.1 onto a new root filesystem. When I use *that* to boot my guest, it seems to be stable: high network activity does not appear to stop it - I've done 5 or 6 copies of my huge filesystem in that mode without issue. Of course I'd like to do more cycles to be sure, but it seems stable compared to when the upgraded root is in place, where I can make the machine freeze up on almost every (or every other) copy attempt.

The only difference I can think of, then, is that since the guest has been zypper dup'ed over time all the way back from 13.2 (the last time it was built fresh), maybe it has inherited some old garbage that could be causing this. It seems to me that a zypper dup'ed guest "should" work properly, especially when it is the same version and kernel as the physical host; but, again (sorry), I have these freezes.
So just for laughs, I ran an lsmod in both modes, and sorted and diffed them.

The "clean" guest (which appears to be stable) has these four kernel modules not present on the upgraded guest:

iptable_raw nf_conntrack_ftp nf_nat_ftp xt_CT

The "dup'ped" guest (which seems to be crashable with a large local rsync) has these modules not present on a clean install:

auth_rpcgss br_netfilter bridge grace intel_rapl ipt_MASQUERADE llc lockd nf_conntrack_netlink nf_nat_masquerade_ipv4 nfnetlink nfs_acl nfsd overlay sb_edac stp sunrpc veth xfrm_algo xfrm_user xt_addrtype xt_nat

Both guests share these additional sysctl.conf settings:

kernel.panic = 5
vm.panic_on_oom = 2
vm.swappiness = 0
net.ipv6.conf.all.autoconf = 0
net.ipv6.conf.default.autoconf = 0
net.ipv6.conf.eth0.autoconf = 0
net.ipv4.tcp_fin_timeout = 10
net.ipv4.tcp_tw_reuse = 0

The dup'ped guest has these additional sysctl.conf settings:

net.ipv4.tcp_tw_recycle = 0
net.core.netdev_max_backlog = 300000
net.core.somaxconn = 2048
net.core.rmem_max = 67108864
net.core.wmem_max = 67108864
net.ipv4.ip_local_port_range = 15000 65000
net.ipv4.tcp_sack = 0
net.ipv4.tcp_rmem = 4096 87380 67108864
net.ipv4.tcp_wmem = 4096 65536 67108864

all of which have, more or less, worked well in the past (when everything was on 42.3) and may or may not be relevant here.

I'm sorry - I feel like I'm missing something obvious here, but I can't see it. I would be grateful for any guidance or insights. Yes, in addition to trying to upgrade my client in place to 15.1, I could just build a new guest by hand, but that would be even more time-consuming and seems like it should not be necessary. If I might quote from the kernel, "Dazed and confused, but trying to continue" is exactly how I'm feeling here. Why could this guest be hanging? Why does an NMI bring it back? What should I do next? Anything anyone would be willing to point me to or suggest would be gratefully appreciated.

Glen
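P.S. For the record, the module comparison above was essentially just this - the file names are purely illustrative:

    lsmod | awk 'NR>1 {print $1}' | sort > /tmp/modules.clean     # run on the clean guest
    lsmod | awk 'NR>1 {print $1}' | sort > /tmp/modules.dupped    # run on the dup'ped guest
    diff /tmp/modules.clean /tmp/modules.dupped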
On 21.12.2019 07:31, Glen wrote:
So just for laughs, I ran an lsmod in both modes, and sorted and diffed them:
The "clean" guest (which appears to be stable), has these four kernel modules not present on the upgraded guest:
iptable_raw nf_conntrack_ftp nf_nat_ftp xt_CT
The "dup'ped" guest (which seems to be crashable on a large local rsync) has these modules not present on a clean install:
auth_rpcgss br_netfilter bridge
One of these two is, in my experience, a fair candidate for your problems. I'm not a networking specialist at all, so I can't give any suggestions on how to convert the upgraded guest to a network config not requiring these modules. (Trying to get rid of br_netfilter alone may be easier, but again I'm not really knowledgeable in this area.)

Jan
I'm going to guess that you didn't install Xen on your HostOS using "the" recommended standard procedure, which is to use the YaST Virtualization module. If you had done that, you shouldn't have variations; you would also have been prompted to set up a bridge device.

That said, I don't know how old your HostOS installations are (except for any you say you just installed). I'm not involved with, and don't track, what the Xen team does, but the Linux kernel has undergone several related changes over the past 8 years or so. First there was the "pre-ip" era, when we used user-mode utilities like ifconfig regularly. Then the ip tools were introduced (I don't remember whether they were initially introduced as user-mode tools or not). Today we have version 2 of the ip tools, backed by kernel modules, covering not only the expanded ip commands - which more or less replace everything from the "pre-ip" era - but also bridge devices and more. In other words, practically everything in networking that used to be implemented as its own utility can now likely be done with an "ip" command.

That <might> explain the differences in your two installations, but shouldn't itself be a cause of problems. The "old way" should still be supported, but, as complex as the modifications to the Xen kernel likely are, the differences could be signposts to possible issues. Bottom line: I don't think you should have to get down into the weeds as you've done - it might even be counter-productive. Higher-level methods should work and accommodate whatever is happening at a lower level.

Blabbering away...
Tony
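P.S. To illustrate the "ip" point, the rough old-to-new mapping (from memory, so double-check the exact syntax) is:

    ifconfig eth0      ->  ip addr show dev eth0
    route -n           ->  ip route show
    brctl show         ->  ip link show type bridge
    brctl addbr br0    ->  ip link add name br0 type bridge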
Comment on the initial post... AFAIC, all your /etc/sysctl.conf settings look benign. But be aware that there is an effort to deprecate /etc/sysctl.conf; inspect the file on a newly installed system to see the alternative sysctl files, which will likely not be deprecated. I can't believe this is happening - /etc/sysctl.conf has been the standard place to implement proc settings and more since forever. But that's just tomorrow and isn't today... Your settings look fine to me.

Your post suggests you think your problem might be network or disk related. You have to narrow that down, because those are very different things. Remember that when you reallocate resources from the system to the network, as you've done, you're stealing from the system to grant to networking; if your system needs resources, you'll have to re-optimize.

To monitor and analyze your disk I/O, you should use the utility iotop. For the system, of course, you can use top, htop, or any of a variety of other utilities. If optimization is going to be an ongoing desire, you should consider instrumenting your system and setting up network monitoring - there are many guides on how to do this, both FOSS and commercial, from Nagios and Nagios clones to commercial apps in combination with sensors. You should also consider whether you should be monitoring from within your Guest or on the HostOS.

Don't know if you need a read on what you're doing to optimize; I wrote this article long ago based on an earlier openSUSE, but everything in it, and what you're currently doing, should be just as effective on current openSUSE:
https://sites.google.com/site/4techsecrets/optimize-and-fix-your-network-con...

HTH,
Tony
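For example, something like this shows only the processes actually doing I/O, with timestamps (the interval is arbitrary):

    iotop -o -b -t -d 5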
Jan, Tony - Thank you both for your responses. I have more information now which might be helpful; I'll provide it after I answer your comments.

On Mon, Dec 23, 2019 at 1:26 AM Jan Beulich <jbeulich@suse.com> wrote:
br_netfilter bridge
One of these two is, according to my experience, a fair candidate for your problems.
Thank you. I'll focus in on these and see what I can do.

On Mon, Dec 23, 2019 at 11:00 AM Tony Su <tonysu@su-networking.com> wrote:
I'm going to guess that you didn't install your Xen on your HostOS using "the" recommended standard procedure... Which is to use the YaST Virtualization module. If you did that, then you shouldn't have variations. Also, you would be prompted to install a bridge device.
Sorry, I was not clear - my fault. My HostOS is openSUSE 15.1 across all hosts. On two of the hosts, it is a fresh load from the downloaded ISO, with only the defaults plus the Xen patterns selected. Following the fresh load, I did a zypper update, and then I did the recommended standard procedure, using YaST2 to install the virtualization support. That process did indeed prompt me to create a bridge, and I did. It seemed to me to be the same procedure. The other two hosts were fresh-loaded (in the past) at 42.3, using the same procedure then, and have since been zypper-dup'ped to 15.0 and then 15.1 per the upgrade procedure.

All four hosts seem "clean"... and the problem exists with guests on all four hosts. But to be clear - the hosts are not freezing up or losing network connectivity at all. The hosts are fine. It is only the guests that are having issues.
That said, I don't know how old your HostOS installations are (except for any you say you just installed)
The two fresh hosts were loaded about 6 weeks ago. The other two were dup'ped about 12 weeks ago. All have been zypper-updated to the latest stuff since then. All four hosts run only Xen, nothing else alongside it - just stock openSUSE 15.1, and Xen Dom0.
Blabbering away...
Please continue! I read everything else you said with interest. It's often the case that one thing one person says can trigger something for someone else, and I'm hoping that happens here. I am very grateful for the background, history and detail.

On Mon, Dec 23, 2019 at 11:20 AM Tony Su <tonysu@su-networking.com> wrote:
AFAIC all your /etc/sysctl/ settings look benign,
Thanks.
But be aware that there is an effort to deprecate /etc/sysctl.
I wasn't aware of that *sigh* but thank you.
Your post suggests you think that your problem might be network or disk related...
Maybe. I honestly don't know. All I know is, when I have a guest running on any of my hosts, even if that guest is idle (because, for example, it's a copy of a production machine) and therefore not getting internet traffic or usage, I can make the host crash by rsyncing a lot of data over a crossover cable at maximum speed. I haven't tested letting the host "just sit there" - I suppose that'd be a good bracketing test; for now, it seems like an rsync read triggers the issue, so I assume (perhaps incorrectly) it's network or disk.

Everything else in your email is read, understood, and appreciated, and
https://sites.google.com/site/4techsecrets/optimize-and-fix-your-network-con...
I will read that after I send this.

So where I am now is this: I have a guest machine image running 42.3. There are four copies of this guest, running on four different hosts. The hosts are running 15.1; two are fresh loads, two are zypper-dup'ped from 42.3 fresh loads. The guest machines have had problems on all four hosts.

1. The "production" guest is running 42.3. When its host was also on 42.3, it was rock solid. When I dup'ped the host to 15.1, the guest started going into the weeds every 5-7 days at random. This is the one I first reported in the 42.3 thread. Olaf suggested installing the SLE12-SP5 kernel on that guest; I did that roughly 72 hours ago. So far no issues, but it needs more time. I previously thought that I had to destroy/recreate this guest (as I mentioned in my thread); I now realize (see #2 below) that if it crashes again I should be able to recover it with an xl trigger nmi.

2. One of the backup guests, whose job is just to rsync from production, is a stock 42.3 guest still running the 4.4 (42.3) kernel. Its host has also been dup'ped to 15.1. It had never had a problem until today, when it went into the weeds. I was able to recover it using xl trigger nmi.

3. A third backup guest is also a stock 42.3 guest running the 4.4 kernel. Its host, however, was a fresh load of 15.1. It has locked up once as well.

4. My fourth guest is a copy of the original 42.3 guest which has itself been zypper-dup'ped to 15.1. I have been copying data from this machine, which causes it to freeze up - also recoverable with an NMI.

So....

* The problem exists on any 15.1 host, whether fresh-loaded or dup'ped.
* The problem exists on multiple copies of this particular guest, whether at 42.3 or dup'ped to 15.1.
* The SLE12-SP5 kernel *might* have resolved this, but more time is needed to be sure.
* The problem seems to be confined to (copies of) this particular guest, for this particular client.
* The problem *seems* not to exist on any of my other guests from different origins/other clients, running 42.3 or 15.1, dup'ped or fresh - although I suppose there could still be broken guests that just haven't crashed yet. But testing a fresh-loaded 15.1 guest, I could not get it to crash.
* The problem *seems* to be related to network or disk utilization: the more utilization, the more frequent the hang. No log output on the guest at all to indicate why. The virtual hardware just.... stops.

I'm literally just guessing at this point, but that is why I suspect something in this particular guest. It could be a legacy thing - this guest was last freshly loaded at 13.1, and has been zypper-dup'ped step by step ever since. That could be it, but I have other guests with similar histories that are not malfunctioning. This guest is also running the extra modules I mentioned, and I'm going to look at that. In addition, this guest runs things like Docker, Elasticsearch, Kibana, and other programs that tend to eat CPU and IO even when they're idle (grumble grumble). I can't help but wonder if one of these might be contributing.

But the thing is, when the machine hangs, it literally just... hangs. The host can't detect it; it still shows b/r states and normal usage, and the only thing the host sees is "Guest Rx stalled". But the guest console is frozen, and you'd think you'd have to destroy/recreate it (as I did). Only by chance did I discover that an NMI recovered it. After recovery, the guest literally just starts running again, just as it was, right from where it left off, except for the clock.
So it's as if the guest is literally stopping at a (virtual) hardware level... hanging... and then continuing on when I NMI it. It seems to me that processes on the guest, even Docker, would show some sign of trouble beforehand. But there is none. Loads are normal, iotop is normal; I've literally sat on a regular top with a 1.0-second refresh and had a guest hang on me right while I was looking at it - and there is literally no warning at all.

I mean, at this point I'm toying with an every-minute cronjob on the host like:

* * * * * ping -c4 -w5 [myhostip] &>/dev/null || xl trigger [guestdomid] nmi

meaning that, as soon as the host can't ping the guest, assume it's in the weeds and NMI it. I shouldn't have to live like that, but at least I'd sleep through the night.

So I hope this clarifies. I'm kind of depressed that zypper-dup'ping the guest to 15.1 didn't solve this - I hope the SLES kernel does. But if not, it seems like a fresh load/recreate of the guest is all I can do, and I'll do it if I have to; I'm hoping this rings a bell for someone who can point me to some additional data or a solution.

Thank you all for your patience and support!

With great respect and appreciation,
Glen
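P.S. Fleshed out a little, the watchdog I'm contemplating would be roughly this - the IP and domain name are placeholders, and it's obviously a band-aid, not a fix:

    #!/bin/sh
    # rough sketch - run from cron on the Dom0
    GUEST_IP="[myhostip]"      # placeholder for the guest's address
    DOMNAME="gggv"
    if ! ping -c4 -w5 "$GUEST_IP" >/dev/null 2>&1; then
        logger -t xen-watchdog "$DOMNAME not answering pings - sending NMI"
        xl trigger "$DOMNAME" nmi
    fi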
Argh.

On Mon, Dec 23, 2019 at 2:47 PM Glen <glenbarney@gmail.com> wrote:
Maybe. I honestly don't know. All I know is, when I have a guest running on any of my hosts, even if that guest is idle (because, for example, it's a copy of a production machine) and therefore not getting internet traffic or usage, I can make the host crash by rsyncing a lot of data over a crossover cable at maximum speed. I
Typo. Ugh. The host does not ever crash.

s/host crash/guest crash/ => I can make the GUEST crash by rsyncing a lot of data.

Sorry.
Glen
On Tue, Dec 24, 2019 at 6:10 AM Glen <glenbarney@gmail.com> wrote:
Argh.
On Mon, Dec 23, 2019 at 2:47 PM Glen <glenbarney@gmail.com> wrote:
Maybe. I honestly don't know. All I know is, when I have a guest running on any of my hosts, even if that guest is idle (because, for example, it's a copy of a production machine) and therefore not getting internet traffic or usage, I can make the host crash by rsyncing a lot of data over a crossover cable at maximum speed. I
Typo. Ugh.
The host does not ever crash.
s/host crash/guest crash/ => I can make the GUEST crash by rsyncing a lot of data.
If it's easily reproducible, then the simplest way to troubleshoot is to use another distro as the guest, or at least another kernel.

You said that this guest runs docker, right? Then it should be easy enough to try installing a new guest with another distro (e.g. ubuntu, centos, whatever), install docker on it, and try to replicate your problem.

If the culprit is the bridge/netfilter module, you could probably test by installing another kernel. Try kernel:HEAD?
https://doc.opensuse.org/documentation/leap/reference/html/book.opensuse.ref...

In any case, those two methods can help verify whether the problem is in the kernel (module), or in something in Leap userland (including its docker version).

-- 
Fajar
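From memory, pulling a Kernel:HEAD kernel onto the guest would be roughly this (repo URL and alias are from memory - adjust for your setup), then reboot the guest into the new kernel and re-run the rsync test:

    zypper addrepo --refresh https://download.opensuse.org/repositories/Kernel:/HEAD/standard/ kernel-head
    zypper refresh
    zypper install --from kernel-head kernel-default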
On Mon, Dec 23, 2019 at 1:26 AM Jan Beulich <jbeulich@suse.com> wrote:
On 21.12.2019 07:31, Glen wrote:
So just for laughs, I ran an lsmod in both modes, and sorted and diffed them:
br_netfilter bridge

One of these two is, according to my experience, a fair candidate for your problems. I'm not a networking specialist at all, so I can't give any suggestions on how to convert the upgraded guest to a network config not requiring these modules. (Trying to get rid of br_netfilter alone may be easier, but again I'm not really knowledgeable in this area at all.)
All -

During the past two weeks I've been trying to hunt these down, and I have discovered why these modules are loading: the Xen guest is running Docker, and Docker needs those modules to provide outside-world access for its container networking.

Those two modules aren't referenced in /etc/modprobe.d, and I can rmmod them without damaging the Xen guest at all - but when I do, the Docker containers lose their network connectivity. None of my other guests run Docker, which is why those modules don't exist in the other guests, and why, I'm now guessing, the other guests don't have crash issues.

*sigh*

So I may be in a chicken-and-egg scenario here, but I'll just put this out there: has anyone tried running Docker on a Xen guest, and/or could there be anything in those two modules Jan mentioned, or in Docker, that could be causing or related to these issues?

I'd be grateful for any thoughts anyone might have!

Glen
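(For reference, the checking I did amounted to roughly this - nothing fancy:

    lsmod | egrep '^bridge |^br_netfilter '    # both present on the dup'ped guest
    grep -r br_netfilter /etc/modprobe.d/      # nothing configures it there
    rmmod br_netfilter bridge                  # guest keeps running, but docker containers go dark
)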
I'm really surprised - I've never heard of any Guest, running under any kind of hypervisor, having a networking problem just because it runs docker.

There are two possible approaches to your problem, based on a common idea: all network bridge devices are the same, regardless of how they were created and whether or not they were created by a virtualization tool. Once created, they should be discoverable by any entity that needs to use a bridge device. For this reason, although it's possible to cascade bridge devices, you should avoid it if possible.

So your likely choices:

1. If you create a bridge device in the Guest, it has to be bound to a network interface which communicates correctly with the HostOS network interface - which, in the case of Xen installed on openSUSE/SUSE, likely itself has a bridge device bound to the network interface. Yes, that starts getting complicated, so...

2. You can go "old school" and implement docker networking the old way, without a bridge device. Here is one old piece of documentation that describes how things used to be done and should still work:
https://github.com/putztzu/docker/blob/master/docs/installation/linux/SUSE.m...

Otherwise, if you want to configure the first scenario above, you'll have to post your bridge device configurations for both your Guest and your HostOS, describing the actual physical interfaces they're bound to and what type of bridge device each is (are all bridging? are any NAT? depending on the type of bridge, further details may be needed).

Personally, though, I like to live a life as uncomplicated as possible; I even avoid virtual switches as much as possible unless there is a real need that can't be met by using bridge devices intelligently.

HTH,
TSU
Just saw your other thread.

Verifying that this machine isn't experiencing the same problems as the other one - that only this one has networking problems? I also want to ask whether this machine had functional networking "as is", with its current configuration, before the upgrade.

Also: are you using Dockerfiles to build your docker containers? And are you using any other composition tools?

Tony
Hi everyone -

So my two threads really turn out to be one thread, and I'm replying to both here. I apologize for the mess. First, I really appreciate all the responses and pointers; I'm very grateful for all the help. THANK YOU to all those who responded on either thread!

To recap, essentially what I have is:

1. I've always run a bunch of high-traffic 42.3 Xen PV guests on 42.3 hosts.
2. I upgraded the 42.3 hosts to 15.1, via online upgrade and then via fresh load.
3. When I did that, *one* of my 42.3 guests started hanging at random every 2-7 days.

The hangs seemed to be related to high network and/or disk traffic. I discovered by accident that if I did an "xl trigger nmi" I could "unhang" the guest and make it resume duty, more or less, without a reboot, but I have no idea why the hang occurs or why it's recoverable in that way.

Chasing this down has been painful. It initially looked like sshd was the culprit, but it wasn't. I thought the kernel mismatch might be the issue, but other 42.3 guests run on their 15.1 hosts without a problem, and upgrading the guest to 15.1 didn't solve it. Olaf has been pointing me to new kernels, and that helped somewhat - moving to the SLE kernel extended the guest uptime from a few days to a few weeks (buying me much-needed sleep, thank you!) - but I still don't have a solution.

The problem seems to be in this particular guest... somewhere. The problem seems to travel with the guest: if I clone the guest and bring it up elsewhere, that clone also has the problem. So I've resorted to making a copy of the guest and staging it on a different host just so I can stress-test it.

To stress-test it, I basically initiate lots of high-traffic requests against the troubled guest from an outside source. Initially, the guest was hanging during a single full outbound rsync. To prevent SSD wear I modified the command and used a hack to simulate the traffic. If I boot the guest and, from a different connected machine, do stuff like:

nohup ssh 192.168.1.11 tar cf - --one-file-system /a | cat > /dev/null &
nohup ssh 192.168.1.11 cat /dev/zero | cat > /dev/null &

(where 1.11 is the troubled guest, and /a is a 4TB filesystem full of data), I can make the troubled guest hang in somewhere between 45 minutes and 12 hours.

Thanks to Olaf, Jan, Tony and Fajar, I've been able to try a number of things, but so far I've had no luck:

* Upgrading openssh to the latest version did not solve it.
* Upgrading the guest to 15.1 (unifying the kernels) did not solve it.
* Upgrading the 42.3 guest kernel to a different version helped... but did not solve it.
* Removing some possibly problematic kernel modules did not solve it.
* Removing Docker did not solve it.
* I had optimizations from the Xen best practices page in /etc/sysctl.conf for just this guest - removing those did not solve it.

The only solution I've found seems to be starting the guest over fresh. If I do a fresh load of 15.1 as a guest, mount that same /a filesystem, and run those same tests... the freshly-loaded guest works fine... it's rock solid. I had those same tests running against the freshly-loaded guest for over 24 hours and it did just fine. I can literally just swap out root filesystem images - booting the troubled guest's root filesystem results in the hangs; booting the fresh load seems completely reliable.

In short, it seems to me now that there's something in this particular guest's root filesystem image... something I can't find... that is causing this.
The image started as a 13.1 (thirteen point one) fresh load, years ago, and has been in-place upgraded ever since... so I am concluding that something bad has been brought forward that I'm not aware of.

It seems at this point that I just need to rebuild the guest as a fresh 15.1 load, reinstalling only what I currently need, and go from there - and so that's what I'm going to do. The usual absence of useful log data when the machine crashes is frustrating, and the time it takes (1-12 hours) to make a test machine crash makes the test process slow, so I'm feeling like I should just abandon this and replace the guest.

If any of this triggers anything for anyone, please let me know. Otherwise, I'm continuing to stress-test my freshly-loaded guest for a few more days (just to be sure), and then I'll start the reconnect and replacement process. It really would have been nice to find out what on that troubled guest was causing the issue, but it's probably some legacy thing brought forward that's causing instability, and since each test cycle takes so long, the "process of elimination" could take months or more.

And of course as soon as I send this, something new will break, making this all invalid. :-)

Anyway, THANK YOU ALL for your support and help here - I am very grateful!

Glen
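(To be clear, "swapping out root filesystem images" just means pointing xvda1 at a different image file in the guest config; the fresh-load file name below is purely illustrative:

    disk=[
        '/b/xen/gggv/gggv-fresh151.root,raw,xvda1,w',   # fresh 15.1 root instead of gggv.root
        '/b/xen/gggv/gggv.swap,raw,xvda2,w',
        '/b/xen/gggv/gggv.xa,raw,xvdb1,w',
    ]
)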
Was thinking along the same lines as a last resort... If you can pin the problem down as a HostOS problem, shouldn't your Xen Guests (with the docker containers within them) be easily migratable to a newly built HostOS? I don't know what else is running on your system, but it's typically a recommended best practice to keep the HostOS in a multi-tenant system as simple and uncomplicated as possible.

And a general observation: if this machine was originally installed as 13.1 and survived every upgrade for each openSUSE version from then to now, IMO that's quite an accomplishment.

IMO,
Tony
Hi Tony - Thanks for the reply!

On Wed, Jan 8, 2020 at 9:24 AM Tony Su <tonysu@su-networking.com> wrote:
Was thinking along the same lines as the last resort... If you can pin down the problem as a HostOS problem, Shouldn't your Xen Guests(with docker containers within) be easily migratable to a newly built HostOS? Don't know what else is running on your system, but typically it's a recommended best practice to keep the HostOS in a multi-apartment system as simple and uncomplex as possible.
Yes, but I must have explained it backwards. At this point, I think it's a problem inside the guest, not the host. I have 9 hosts and about 16 guests, and the problem follows this particular guest. All hosts are solid. All the other guests are solid. It's only this one guest (or copies of it) that fails, no matter which host I run it on. So I am going to fresh-load the guest and just reconnect services - and yes, it won't be that complex, it's just "a pain" and I was hoping to avoid it.
And a general observation, If this machine was originally installed as a 13.1 and survived every upgrade for each openSUSE version from then to now, IMO that's quite an accomplishment.
This made me smile. At this point, all of my hosts are freshly-loaded 15.1. But this guest - yes, its last fresh load was back on 13.1, and we've just "zypper dup"'ped it at each new release, all the way up until now. But it's clearly time to bite the bullet and reload the guest fresh.

THANK YOU again for your support and responses as I've tried to figure this out!

Glen
participants (4): Fajar A. Nugraha, Glen, Jan Beulich, Tony Su